XCP-ng Security Bulletin: MDS hardware vulnerabilities in Intel CPUs

Security May 21, 2019

Several security issues have been found in recent CPUs from Intel (likely yours) that may allow unprivileged processes to read memory data belonging to other processes using the same CPU core. This can even include data from other virtual machines or the hypervisor itself.

Full mitigation of this hardware issue requires a response in several parts. All of them are necessary to mitigate the issue:

Install security updates for XCP-ng and reboot the hosts (pool master first as usual).
Check with your hardware vendors for system firmware updates ("BIOS").
Disable CPU hyper-threading.
Install security updates for the operating systems of your VMs and follow instructions from their vendors, to protect the VM from attacks coming from within the VM.
Stop and restart the VMs to fully apply the mitigating changes (a reboot is not enough).

These steps are detailed below, but first a few words about the security issue itself.

The list of vulnerable CPUs is available at https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00233.html

The "MDS attacks"

Academics found ways to exploit a new class of vulnerabilities in Intel processors. This is once again a side-channel attack related to speculative execution, as were Meltdown, Spectre and Foreshadow.

MDS stands for "Microarchitectural Data Sampling". MDS attacks target very small buffers that the CPU uses in addition to its main cache for faster reads and writes of the data it is processing. Those attacks (four of them, each with its own CVE) allow an unprivileged process to retrieve information belonging to other processes running on the same CPU core. In the case of a hypervisor, an unprivileged process inside a VM can get sensitive information not only from processes in the same VM, but also potentially from other VMs or from the hypervisor itself.

List of CVEs:

CVE-2018-12126 - Microarchitectural Store Buffer Data Sampling (MSBDS) - "Fallout"
CVE-2018-12127 - Microarchitectural Load Port Data Sampling (MLPDS)
CVE-2018-12130 - Microarchitectural Fill Buffer Data Sampling (MFBDS) - "Zombieload" or "RIDL"
CVE-2019-11091 - Microarchitectural Data Sampling Uncacheable Memory (MDSUM)

1. Installing software security updates for the hypervisor

We provide updates for XCP-ng 7.6. We hope to be able to provide updates for XCP-ng 7.5 in the near future.

The updates contain:

Microcode updates from Intel for a list of CPUs. This is the core of the mitigation.
Mitigation patches for Xen.

If your CPU belongs to the list of vulnerable CPUs, after applying our updates and rebooting the host, you can check whether the new microcode covers your CPU by running xl dmesg | grep "Hardware features:". The presence of the text MD_CLEAR in the output indicates that updated microcode is loaded.
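The check above can be wrapped in a small script. This is only a minimal sketch: the has_md_clear function name is ours (hypothetical), and it assumes that MD_CLEAR appears on the same "Hardware features:" line in the Xen boot log.

```shell
# Reads Xen boot log text on stdin and succeeds (exit status 0) only if
# a "Hardware features:" line mentions MD_CLEAR.
# has_md_clear is a hypothetical helper name, not an XCP-ng tool.
has_md_clear() {
    grep -q 'Hardware features:.*MD_CLEAR'
}

# Only run the real check where the xl tool exists (that is, on a Xen host).
if command -v xl >/dev/null 2>&1; then
    if xl dmesg | has_md_clear; then
        echo "MD_CLEAR found: updated microcode is loaded"
    else
        echo "MD_CLEAR not found: host not rebooted yet, or CPU not covered"
    fi
fi
```

Run it in the control domain (dom0) of each host after the reboot.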

Note: Intel has not released updated microcode for vulnerable legacy CPUs, so they are not covered by this update. This document from Intel lists the models with available microcode updates and lists those that won't get any update.

As usual, refer to our Updates Howto for update instructions. In short, you have two options:

using yum update directly on each host
using Xen Orchestra to install them pool wide with one click in the "Patch" tab of the pool view, clicking on the "Install pool patches" button:

[Screenshot: the "Install pool patches" button]

Note: installing the updates won't interrupt anything, so you can update confidently in production. They will take effect only after a host reboot.

It's up to you to decide when to reboot your hosts. As usual, always reboot your pool master first. Just be aware that until you decide to reboot, your hosts aren't protected against these attacks.
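To make the "pool master first" rule concrete, here is a small sketch that prints the hosts of a pool in a suitable reboot order. The reboot_order function is a hypothetical helper of ours. The guarded call at the end assumes that xe pool-list params=master --minimal returns the master's UUID and that xe host-list --minimal returns a comma-separated list of host UUIDs; check the output on your own pool before relying on it.

```shell
# Prints host UUIDs one per line, pool master first.
# $1: UUID of the pool master
# $2: comma-separated list of all host UUIDs in the pool
# reboot_order is a hypothetical helper name, not part of XCP-ng.
reboot_order() {
    master=$1
    printf '%s\n' "$master"
    old_ifs=$IFS
    IFS=,
    for host in $2; do
        # Skip the master: it was already printed first.
        [ "$host" = "$master" ] || printf '%s\n' "$host"
    done
    IFS=$old_ifs
}

# Only query the pool where the xe CLI exists (that is, on an XCP-ng host).
if command -v xe >/dev/null 2>&1; then
    reboot_order "$(xe pool-list params=master --minimal)" \
                 "$(xe host-list --minimal)"
fi
```

The script only prints an order. Rebooting each host, and moving its VMs elsewhere first if needed, is left to you.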

2. System firmware updates ("BIOS")

Your hardware vendor may provide system firmware updates, either now or in the future. They can include newer CPU microcode. They can also be required to adapt to some microcode changes from the update we provided. Make sure to check any instructions they provide related to the mitigation of those issues.

If you skipped section 1 above, have a look at the list of CPUs that are not going to receive microcode updates from Intel. Those will sadly remain vulnerable.

3. Disable CPU hyper-threading

(also known as simultaneous multi-threading)

In previous security bulletins related to hardware CPU vulnerabilities, we already advised disabling hyper-threading. Mitigating this new class of vulnerabilities requires disabling hyper-threading if your CPU is in the list of vulnerable hardware.

Hyper-threading defeats all other attempts to mitigate the issue.

Disabling hyper-threading is likely to impact the performance of your platform. This is an unfortunate consequence of those hardware security flaws.

You may consider keeping hyper-threading enabled only in situations where you have absolute control over the workload. Please note that researchers have demonstrated data leaks through WebAssembly executed in the browser, so in theory simply visiting the wrong website, or maybe even a malicious ad that executes on a "trusted" website, could leak data from other VMs or from the hypervisor itself.

Documents from Citrix, which also apply to XCP-ng, explain how to disable hyper-threading. They also cover the issues you may run into if your hosts are over-provisioned, or if vCPUs are pinned to physical CPUs (pCPUs) that no longer exist once hyper-threading is disabled.
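Once hyper-threading has been disabled, you may want to verify the result from the host. The sketch below reads the threads_per_core field printed by xl info. The threads_per_core shell function is a hypothetical helper of ours. We assume that a value of 1 means no sibling threads are active, which should hold when hyper-threading is turned off in the system firmware. The value reported after other methods of disabling it may differ, so treat this as an indication only.

```shell
# Reads `xl info` output on stdin and prints the threads_per_core value.
# Prints nothing if the field is absent.
# threads_per_core is a hypothetical helper name, not an XCP-ng tool.
threads_per_core() {
    awk -F: '$1 ~ /^threads_per_core[ \t]*$/ {
        gsub(/[ \t]/, "", $2)   # strip the padding around the value
        print $2
        exit
    }'
}

# Only run the real check where the xl tool exists (that is, on a Xen host).
if command -v xl >/dev/null 2>&1; then
    tpc=$(xl info | threads_per_core)
    if [ "$tpc" = "1" ]; then
        echo "threads_per_core is 1: hyper-threading appears to be disabled"
    else
        echo "threads_per_core is '${tpc}': hyper-threading may still be enabled"
    fi
fi
```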

4. Install security updates for the operating systems of your VMs

Even after following steps 1 to 3, you'll still need to update your guest operating systems to protect them from internal data leakage (between processes within the VM).
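For Linux guests, recent kernels (and vendor kernels that carry the MDS patches) report their own view of the mitigation in sysfs. The following sketch, meant to be run inside the guest, classifies that status. The mds_status function and its output labels are our own, hypothetical names. Guests that do not run Linux need the checks documented by their vendor.

```shell
# Classifies the MDS status reported by a Linux kernel.
# $1: path to the status file (defaults to the Linux sysfs location).
# mds_status and the labels it prints are hypothetical, not a standard tool.
mds_status() {
    file=${1:-/sys/devices/system/cpu/vulnerabilities/mds}
    if [ ! -r "$file" ]; then
        # Kernel without MDS reporting (not updated yet), or not Linux.
        echo "unknown"
        return 0
    fi
    case "$(cat "$file")" in
        "Not affected"*)                  echo "not-affected" ;;
        "Mitigation:"*"SMT vulnerable"*)  echo "mitigated-smt-vulnerable" ;;
        "Mitigation:"*)                   echo "mitigated" ;;
        *)                                echo "vulnerable" ;;
    esac
}

mds_status
```

A result of "unknown" or "vulnerable" after updating the guest suggests that the guest kernel update is missing, or that the VM has not yet been stopped and started since the host was updated.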

5. Stop and start your VMs (reboot is not enough)

A reboot of a VM is not enough for it to update its knowledge of the CPU (CPUID), so for full mitigation it is necessary to stop the guest and then start it.
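From the command line of a host, a stop followed by a start could look like the sketch below. The cold_restart_vm function and the XE variable are hypothetical names of ours. The sketch relies on the xe vm-shutdown and xe vm-start commands and takes the VM's UUID as its only argument.

```shell
# The xe command to use. Set XE=echo to preview the commands without
# running them. XE is a hypothetical variable introduced for this sketch.
XE=${XE:-xe}

# Fully stops a VM, then starts it again, so that the guest sees the
# updated CPU features (CPUID). The VM is only started if the shutdown
# succeeded.
# $1: UUID of the VM
cold_restart_vm() {
    "$XE" vm-shutdown uuid="$1" && "$XE" vm-start uuid="$1"
}

# Example (not executed here):
#   cold_restart_vm <vm-uuid>
```

Unlike a reboot triggered from inside the guest, this brings the VM all the way down before starting it again, which is what this step requires.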