-
@stormi Thanks for the quick reply. That's what I'd figured but I like to be cautious so thought I should ask.
-
Urgent security update candidate (only a few hours to provide feedback)
There's an escalation of privilege vulnerability in Intel CPUs of the last few years. It was silently mitigated by Intel in previous microcode updates for the most recent CPU generations, but older affected CPUs were only fixed in yesterday's microcode.
See https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00950.html
Test on XCP-ng 8.2
yum clean metadata --enablerepo=xcp-ng-testing
yum update microcode_ctl --enablerepo=xcp-ng-testing
reboot
The usual update rules apply: pool coordinator first, etc.
Versions
microcode_ctl: 2.1-26.xs26.2.xcpng8.2
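After the reboot, the installed package can be compared against the version listed above. Here is a minimal sketch: `check_version` is a hypothetical helper (not an XCP-ng tool), and the host-side call in the trailing comment assumes `rpm` is queried with its standard `--qf` format string.

```shell
#!/bin/sh
# check_version NAME EXPECTED INSTALLED
# Prints a one-line verdict and returns non-zero on a mismatch.
# (Hypothetical helper for illustration only.)
check_version() {
    if [ "$3" = "$2" ]; then
        echo "$1: OK ($3)"
    else
        echo "$1: expected $2, found $3"
        return 1
    fi
}

# On a host, after the reboot (version string taken from the announcement):
#   check_version microcode_ctl 2.1-26.xs26.2.xcpng8.2 \
#       "$(rpm -q --qf '%{VERSION}-%{RELEASE}' microcode_ctl)"
```

The installed version is passed in as an argument rather than read inside the function, so the comparison can be tried on any machine before running it on a host.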
What to test
Normal use and anything else you want to test. The closer to your actual use of XCP-ng, the better.
If you don't have time for more, just installing the update, rebooting, and checking that one VM can start will be enough.
Test window before official release of the updates
A few hours.
-
The server (Lenovo System x3650 M5) has started and all 4 VMs are started and functional
-
@stormi Microcode updated on affected Gen11 i7. Running normally.
-
Thanks for the feedback! Update published: https://xcp-ng.org/blog/2023/11/15/november-2023-security-update/
The blog post also contains information about two vulnerabilities in Xen, which don't affect XCP-ng in a supported and/or default configuration.
Users of PV guests who still haven't converted them to HVM should consider it, though.
-
New security update candidates
As promised in the announcement of the previous security update, here's a new one which includes changes for previously missing XSA updates as well as an updated AMD microcode.
Security updates
xen-*:
- Fix XSA-445 - x86/AMD: mismatch in IOMMU quarantine page table levels. On x86 AMD systems with IOMMU hardware, a device in quarantine mode, using dom_io, could access leaked data from previously quarantined pages. This is not enabled by default in XCP-ng, but can still be enabled at Xen boot time.
- Fix XSA-446 - x86: BTC/SRSO fixes not fully effective. A PV guest could infer memory content from other guests. We do not recommend using PV guests and have been suggesting switching to HVM for a while, so we do hope most users were not impacted by this.
linux-firmware: Update AMD microcode to the 2023-10-19 drop, updating family 19h, so Zen 3, Zen 3+ and Zen 4. AMD Advisory here.
Other updates
We plan to also push other, non security, updates at the same time, to pave the way for the upcoming refreshed installation ISOs.
gpumon: suppression of logs which were needlessly written every 5s into /var/log/daemon.log.
tzdata: updated timezones.
vendor-drivers: pull new drivers into XCP-ng:
- igc-module: Intel device drivers for I225/I226
- r8125-module: Realtek r8125 device drivers
- mpi3mr-module: Broadcom mpi3mr RAID device driver
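One way to check the gpumon log change is to count matching lines in /var/log/daemon.log twice, a minute apart: at one line every 5s, the old behaviour would add about 12 lines per minute. This is a rough sketch only. `count_lines` is a hypothetical helper, and it assumes the affected log lines contain the string "gpumon", which the announcement does not state.

```shell
#!/bin/sh
# count_lines PATTERN FILE
# Prints how many lines of FILE contain PATTERN (0 if none).
# (Hypothetical helper for illustration only.)
count_lines() {
    # grep -c prints the count but exits 1 when it is zero; ignore that status.
    grep -c -- "$1" "$2" || true
}

# On a host (assumes the log lines mention "gpumon"):
#   before=$(count_lines gpumon /var/log/daemon.log)
#   sleep 60
#   after=$(count_lines gpumon /var/log/daemon.log)
#   echo "new gpumon lines in 60s: $((after - before))"
```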
Test on XCP-ng 8.2
yum clean metadata --enablerepo=xcp-ng-testing
yum update "xen-*" linux-firmware gpumon vendor-drivers tzdata --enablerepo=xcp-ng-testing
reboot
The usual update rules apply: pool coordinator first, etc.
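The "coordinator first" rule can be sketched as a small ordering helper. `coordinator_first` is a hypothetical function, not part of XCP-ng. The `xe` calls in the trailing comment are an assumption about how the inputs could be gathered on a pool member, so check them against your own pool before relying on them.

```shell
#!/bin/sh
# coordinator_first COORDINATOR HOST...
# Prints the hosts one per line, with the coordinator moved to the front.
# (Hypothetical helper for illustration only.)
coordinator_first() {
    coordinator=$1
    shift
    echo "$coordinator"
    for host in "$@"; do
        # Skip the coordinator here: it has already been printed first.
        [ "$host" = "$coordinator" ] || echo "$host"
    done
}

# On a pool member, the inputs could be gathered with xe (assumed usage):
#   coordinator=$(xe pool-list params=master --minimal)
#   hosts=$(xe host-list --minimal | tr ',' ' ')
#   coordinator_first "$coordinator" $hosts
```

The output is only an update order. Each host still needs the yum update and reboot steps above, one host at a time.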
Versions:
xen: 4.13.5-9.38.1.xcpng8.2
linux-firmware: 20190314-10.1.xcpng8.2 (Update: now 20190314-10.2.xcpng8.2, which adds firmware for rtl8125)
gpumon: 0.18.0-11.2.xcpng8.2
tzdata: 2023c-1.el7
vendor-drivers: 1.0.2-1.6.xcpng8.2
What to test
Normal use and anything else you want to test. The closer to your actual use of XCP-ng, the better.
Test window before official release of the updates
~4 days
Samuel, along with David and Gaël
-
Update done and reboot successful
-
The update is installed and seems to be working without problems on my two test systems.
-
@stormi Updated on several Intel Xeon servers. Updated on new Intel and AMD (zen3) systems with IGC and r8125 chips. One issue... the base install does not include the standard firmware for the 8125.
-
@stormi My two host cluster (HP ProDesk 600 G6) updated without an issue. Let's see how the cluster is performing during the coming days.
-
@Andrew I just pushed an updated linux-firmware to testing, with firmware for rtl8125. Should be available within 10 minutes.
-
@stormi r8125 firmware loads.
-
So, we found out the AMD vulnerability actually doesn't affect XCP-ng directly, because Xen doesn't use AMD's SEV features currently.
The other two vulnerabilities still need fixing, but both can only be exploited if XCP-ng is used in a way that is either unlikely or unsupported. We'll fix them in due course, but won't push the update to everyone today as initially planned. We will delay the fixes slightly to give them a chance to be grouped with future updates and thus cause less maintenance for users.
Thanks for the tests anyway: we will be able to publish these packages whenever we need now.
-
New security update candidates (kernel)
A new XSA was published on the 23rd of January, so we have a new security update to include it.
Security updates
kernel:
* Fix XSA-448 - Linux: netback processing of zero-length transmit fragment. An unprivileged guest can cause Denial of Service (DoS) of the host by sending network packets to the backend, causing the backend to crash. This was discovered through issues when using pfSense with wireguard causing random crashes of the host.
Test on XCP-ng 8.2
yum clean metadata --enablerepo=xcp-ng-testing
yum update kernel --enablerepo=xcp-ng-testing
reboot
The usual update rules apply: pool coordinator first, etc.
Versions:
kernel: 4.19.19-7.0.23.1.xcpng8.2
What to test
Normal use and anything else you want to test. The closer to your actual use of XCP-ng, the better.
Test window before official release of the updates
~2 days due to security updates.
-
Did anyone install it? The 2 days delay is over and we'll publish today.
-
@stormi Yes, I installed it on a few running hosts. I did not have any kernel crashes before, and none after...
-
Installed here, works
-
@NielsH Kind of off topic but figured I'd mention it as I only recently discovered this.
Not sure what VMs you're running, but if they can survive being off for a short time (redundancy of services or planned outage) you can reboot the host using Smart Reboot under the Advanced tab. While it incurs some downtime, it allows for a much faster reboot time than migrating the VMs to another server and back.
I use local storage as well and it's been a game changer for dealing with pool patches.
-
The update has been published, thanks for the feedback and tests.
https://xcp-ng.org/blog/2024/01/26/january-2024-security-update/
-
@CJ said in Updates announcements and testing:
@NielsH Kind of off topic but figured I'd mention it as I only recently discovered this.
Not sure what VMs you're running, but if they can survive being off for a short time (redundancy of services or planned outage) you can reboot the host using Smart Reboot under the Advanced tab. While it incurs some downtime, it allows for a much faster reboot time than migrating the VMs to another server and back.
I use local storage as well and it's been a game changer for dealing with pool patches.
Cheers, thanks for the suggestion. In our case we are actually phasing out XCP-ng and are in the process of migrating to Proxmox, since we can migrate at 30-35Gbit/s there. The disk performance is so much faster there that we can perform all the updates in a single day instead of 2 weeks.
Another issue we had was that migrations of very large VMs (usually 8+ cores) are quite impactful. Because we want to use VMs with 24-48 cores and 128GB RAM as well, it simply was not usable enough for us. There are several seconds, or sometimes even minutes, of downtime during the last phase of the migration with the large VMs.
With Proxmox we have seen very little downtime (<1s) which we are very happy about.