Updates announcements and testing
@Andrew It never happened before?
@stormi I do see this now at boot (related to netdata):
[ 49.028835] xenstat.plugin: segfault at 80 ip 000000000040378a sp 00007ffc4f4278a0 error 4 in xenstat.plugin[400000+8000]
[ 49.028842] Code: f4 ff ff 41 b8 68 5d 40 00 b9 d4 00 00 00 ba 30 5f 40 00 be d8 52 40 00 bf 8b 4f 40 00 31 c0 45 31 e4 e8 a9 04 00 00 4c 89 e3 <48> 8b 9b 80 00 00 00 48 85 db 0f 85 be f4 ff ff e9 b7 f7 ff ff 8b
So, I reproduced it, but also with the previous kernel, so it's not related to this kernel update.
Update: same regarding the Xen update candidate. Reverting it does not fix the segfault.
@stormi I have just not seen that error before and it was not in the old logs. I guess it's just netdata getting old and cranky (grincheux). Otherwise things are good in normal operation.
Update published. Thanks for the tests!
New Update Candidates (xen, xapi, templates)
- Xen: Enable AVX-512 by default for EPYC Zen4 (Genoa)
- Xapi: Redirect http requests on the host webpage to https by default.
- Guest templates:
  - Add the following templates: RHEL 9, AlmaLinux 9, Rocky Linux 9, CentOS Stream 8 & 9, Oracle Linux 9
Test on XCP-ng 8.2
From an up to date host:
For Xen, Xapi and Guest templates:
yum clean metadata --enablerepo=xcp-ng-testing
yum update xen-dom0-libs xen-dom0-tools xen-hypervisor xen-libs xen-tools xapi-core xapi-tests xapi-xe guest-templates-json guest-templates-json-data-linux guest-templates-json-data-other guest-templates-json-data-windows --enablerepo=xcp-ng-testing
reboot
- xen-*: 4.13.4-9.29.1.xcpng8.2
- xapi-*: 1.249.26-2.2.xcpng8.2
- guest-templates-json-*: 1.9.6-1.2.xcpng8.2
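One way to confirm after the reboot that the host really runs the versions listed above is to compare them with what `rpm` reports. The sketch below is only an illustration: `check_version` and `installed_version` are hypothetical helper names, not part of XCP-ng, and it checks just one package per group (xen-hypervisor, xapi-core, guest-templates-json) on the assumption that the others in each group carry the same version.

```shell
#!/bin/sh
# Hypothetical helper: compare an expected version-release string
# with the one actually found, and print a one-line verdict.
check_version() {
    pkg="$1"
    expected="$2"
    actual="$3"
    if [ "$actual" = "$expected" ]; then
        echo "OK $pkg $actual"
    else
        echo "MISMATCH $pkg: expected $expected, found $actual"
    fi
}

# Hypothetical helper: ask rpm for VERSION-RELEASE of one package.
installed_version() {
    rpm -q --queryformat '%{VERSION}-%{RELEASE}' "$1" 2>/dev/null
}

# Only query rpm where it exists (i.e. on the host itself).
if command -v rpm >/dev/null 2>&1; then
    check_version xen-hypervisor 4.13.4-9.29.1.xcpng8.2 \
        "$(installed_version xen-hypervisor)"
    check_version xapi-core 1.249.26-2.2.xcpng8.2 \
        "$(installed_version xapi-core)"
    check_version guest-templates-json 1.9.6-1.2.xcpng8.2 \
        "$(installed_version guest-templates-json)"
fi
```

Any MISMATCH line would suggest that the testing repository was not enabled for that package or that the update did not complete.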
What to test
Normal use and anything else you want to test. The closer to your actual use of XCP-ng, the better.
Test window before official release of the updates
No precise ETA, but the sooner the feedback the better.
@gduperrey Mirror error
failure: repodata/6b271e84b07dced2015bb0d835fb0ec1be1d308d92010993d44a1af0c130aa9f-primary.sqlite.bz2 from xcp-ng-testing: [Errno 256] No more mirrors to try.
http://mirrors.xcp-ng.org/8/8.2/testing/x86_64/repodata/6b271e84b07dced2015bb0d835fb0ec1be1d308d92010993d44a1af0c130aa9f-primary.sqlite.bz2: [Errno 14] HTTP Error 404 - Not Found
Name: mirrors.xcp-ng.org
Address: 22.214.171.124
@gduperrey The repository seems to work now.
Unrelated to the above: a security update for sudo was published. I don't think it's very likely to be an actual threat in the context of your use of XCP-ng, but it might be in specific contexts.
@stormi I applied the update through XO and now XO cannot log in to the host. It fails with the error below.
connect ECONNREFUSED 192.168.40.201:443
I rebooted the host and I can no longer log in as root.
ssh: connect to host 192.168.40.201 port 22: Connection refused
Was it the only update applied? Is the stunnel service running?
Oh, I also read that you can't connect as root.
@stormi Yes, it was the only update.
Were you using sudo on the host before?
@stormi No, just a default install.
I can hardly see a cause-and-effect link between the update and the issues (both SSH and XAPI not responding anymore?), but computers are full of surprises.
Did you change the firewall configuration? Could the IP address have changed, or the same address have been assigned to another device?
@stormi I did nothing but apply the update through the XO web console. I have yanked the plug and am making sure it actually reboots.
That fixed it. I can log in via SSH as root and XO sees the host.
Maybe it was still rebooting, stuck on the shutdown phase, waiting for some kind of I/O or something. This would explain why it didn't respond.
I concur. If the shutdown process is stuck somewhere (e.g. on an NFS share), you can't connect at all (connection refused in SSH, no XAPI connection) and it can stay like this for a while.
@stormi It was not responding after the update from XO. I could log in via SSH and restarted the toolstack, but this did not help: XO still could not log in. I issued a reboot command via SSH, which dropped the SSH session, but the host did not reboot, most likely due to running VMs. I then yanked the power, the host rebooted, and everything is working as it should. The update definitely caused the issue.
I updated my home lab without any issue before, during, or after the update (I tested right after the update, without any reboot).