XCP-ng

    "CROSSTalk" CPU vulnerability (cross-core data leak)

    • stormi Vates 🪐 XCP-ng Team

      Intel CPUs "CROSSTalk" vulnerability.

      Following the disclosure of the CROSSTalk CPU vulnerabilities and the release of updated microcode by Intel, here are update candidates for XCP-ng 8.0 and 8.1. Prompt feedback from all available testers is welcome.

      The updated microcode has a huge performance impact on specific CPU operations, such as random number generation. Performance impact on common or specific workloads is yet to be evaluated. You are welcome to share your own findings.
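Since the affected operations include hardware random number generation (the RDRAND and RDSEED instructions, per Intel's SRBDS description), a quick first step is to see whether a host exposes those instructions at all. This is only a minimal sketch: the function name `rng_flags` is made up for illustration, and it reads the Linux CPU flags as seen from wherever it is run.

```shell
# List which of the RDRAND/RDSEED CPU flags appear in a cpuinfo-style file.
# Defaults to /proc/cpuinfo; another path can be passed (useful for testing).
rng_flags() {
    cpuinfo="${1:-/proc/cpuinfo}"
    for flag in rdrand rdseed; do
        # Only the first "flags" line is inspected; all cores normally match.
        if grep -m1 '^flags' "$cpuinfo" 2>/dev/null | grep -qw "$flag"; then
            echo "$flag: present"
        else
            echo "$flag: absent"
        fi
    done
}
```

Calling `rng_flags` with no argument checks the running system. A workload that never uses these instructions is unlikely to show the slowdown.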

      The updates include:

      • updated microcode. I think the mitigation is active if you can find SRBDS_CTRL in /var/log/xen/hypervisor.log. No fix is available at the moment for the Ivy Bridge CPU family (desktop Core i3, i5 and i7 and the equivalent Xeons; see the list at https://en.wikipedia.org/wiki/Ivy_Bridge_(microarchitecture)), which Intel no longer supports.
      • updated Xen with new options to "offer boot time information, defaults selection, and opt-out controls" - see srb-lock in https://xenbits.xen.org/docs/unstable-staging/misc/xen-command-line.html#spec-ctrl-x86
      • updated kernel to lower the performance impact of the microcode update
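The log check mentioned in the first bullet can be scripted roughly as follows. This is a sketch, not an official check: the helper name `srbds_ctrl_reported` is hypothetical, and the only thing it tests is whether the string SRBDS_CTRL appears in the log file named in the post.

```shell
# Report whether a Xen hypervisor log mentions SRBDS_CTRL.
# The default path is the one named in the post; pass another file to override.
srbds_ctrl_reported() {
    log="${1:-/var/log/xen/hypervisor.log}"
    if grep -q 'SRBDS_CTRL' "$log" 2>/dev/null; then
        echo "SRBDS_CTRL found in $log"
    else
        echo "SRBDS_CTRL not found in $log"
        return 1
    fi
}
```

Run `srbds_ctrl_reported` after rebooting into the updated Xen and microcode; a non-zero exit status means the string was not found (or the log could not be read).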

      Install them on XCP-ng 8.0 or 8.1 with:

      yum update kernel microcode_ctl xen-dom0-libs xen-dom0-tools xen-hypervisor xen-libs xen-tools --enablerepo=xcp-ng-testing
      

      Downgrade with:

      yum downgrade kernel microcode_ctl xen-dom0-libs xen-dom0-tools xen-hypervisor xen-libs xen-tools
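To confirm what an update or downgrade actually changed, it can help to record the installed versions of the same packages before and after. A minimal sketch, assuming `rpm` is available (as it is on XCP-ng); the variable and function names are illustrative only:

```shell
# The package set named in the install and downgrade commands of this post.
PKGS="kernel microcode_ctl xen-dom0-libs xen-dom0-tools xen-hypervisor xen-libs xen-tools"

# Print the installed version of each package, or a note if rpm is missing.
show_versions() {
    if command -v rpm >/dev/null 2>&1; then
        # Word splitting of $PKGS is intentional: one argument per package.
        rpm -q $PKGS
    else
        echo "rpm not available on this system"
    fi
}
```

For example, `show_versions > before.txt` ahead of the update and `show_versions > after.txt` afterwards gives two files that can be compared with `diff`.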
      

      Related:

      • Xen advisory: http://xenbits.xen.org/xsa/advisory-320.html
      • Citrix advisory: https://support.citrix.com/article/CTX275165
      • gskger Top contributor @stormi

        @stormi updated my homelab / playlab Intel NUC (i5-6260U, Skylake, XCP-ng 8.1) with no problem.

        /var/log/xen/hypervisor.log indeed shows SRBDS_CTRL (log shortened for readability):

        CPU0: Intel machine check reporting enabled
        Speculative mitigation facilities:
        Hardware features: IBRS/IBPB STIBP L1D_FLUSH SSBD MD_CLEAR SRBDS_CTRL
        Compiled-in support: INDIRECT_THUNK SHADOW_PAGING
        Xen settings: BTI-Thunk JMP, SPEC_CTRL: IBRS+ SSBD-, Other: SRB_LOCK+ IBPB L1D_FLUSH VERW BRANCH_HARDEN
           L1TF: believed vulnerable, maxphysaddr L1D 46, CPUID 39, Safe address 8000000000
           Support for HVM VMs: MSR_SPEC_CTRL RSB EAGER_FPU MD_CLEAR
           Support for PV VMs: MSR_SPEC_CTRL RSB EAGER_FPU MD_CLEAR
           XPTI (64-bit PV only): Dom0 enabled, DomU enabled (with PCID)
           PV L1TF shadowing: Dom0 disabled, DomU enabled
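In the log above, SRBDS_CTRL appears under "Hardware features", while the "Xen settings" line carries an SRB_LOCK token. Assuming the trailing + or - means enabled or disabled, as it appears to for the other entries on that line, the token can be pulled out like this. The helper name is hypothetical:

```shell
# Print the last SRB_LOCK token (e.g. "SRB_LOCK+") found in a Xen boot log.
# Prints nothing if the token is absent or the log cannot be read.
srb_lock_state() {
    log="${1:-/var/log/xen/hypervisor.log}"
    grep -o 'SRB_LOCK[+-]' "$log" 2>/dev/null | tail -n 1
}
```

The last match is used because the log may contain output from several boots.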
        

        Running the 7-Zip benchmark in dom0 (installed from the EPEL repository) does not show any significant impact on performance.
        Most likely, this benchmark does not exercise the affected CPU functions (e.g. random number generation).
        Would love to run sysbench but have no idea how to install it on XCP-ng.

        • stormi Vates 🪐 XCP-ng Team @gskger

          @gskger said in "CROSSTalk" CPU vulnerability (cross-core data leak):

          Would love to run sysbench but have no idea how to install it on XCP-ng.

          This will install the version from EPEL:

          yum install sysbench --enablerepo=base,updates,epel
          
          • gskger Top contributor @stormi

            @stormi nice and way too simple - should have read the documentation more carefully 😇

            Anyway, I did a clean install and update of XCP-ng 8.1 and ran sysbench (sysbench cpu --cpu-max-prime=20000 --threads=4 --time=30 run) without and with the microcode update.
            Even with all threads maxed out, there is no noticeable impact on CPU performance on my Intel NUC (i5-6260U, 2 cores, 4 threads).
            But mileage may vary with different CPU types.
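A before/after comparison like this one can be made a little more systematic by saving the sysbench output of each run to a file and comparing one figure. The sketch below assumes the sysbench 1.0 output format, which prints a line containing "events per second:"; the function names are made up for illustration.

```shell
# Extract the "events per second" figure from saved sysbench cpu output.
events_per_sec() {
    awk -F: '/events per second/ { gsub(/[ \t]/, "", $2); print $2 }' "$1"
}

# Percentage change from a baseline figure ($1) to a new figure ($2).
pct_change() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        if (a == 0) { print "n/a"; exit }
        printf "%.1f\n", (b - a) / a * 100
    }'
}
```

Usage would be along the lines of `sysbench cpu --cpu-max-prime=20000 --threads=4 --time=30 run > before.txt`, the same again into `after.txt` once the update is installed, then `pct_change "$(events_per_sec before.txt)" "$(events_per_sec after.txt)"`. Keep in mind that a prime-number test mostly exercises integer arithmetic, so it may not touch the instructions affected by the microcode change.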

            • stormi Vates 🪐 XCP-ng Team

              Yeah, the performance impact is very dependent on the kind of workload.

              • stormi Vates 🪐 XCP-ng Team

                As part of this update candidate, there is now a kernel update to lower the performance impact of the microcode update.

                We only have the testing results from one user for now. Can other users spare some hardware and time today?

                • stormi Vates 🪐 XCP-ng Team

                  Update published: https://xcp-ng.org/blog/2020/06/12/intel-microcode-security-update-crosstalk/

                  • demanzke

                    Hi stormi,

                    I know I am a bit late to report, but I just updated my XCP-ng install from the main repo. I'm using an i3-7350K that shouldn't be vulnerable according to Intel's list you posted.
                    After the update, the default boot settings don't work: the loading screen stalls for a long time, then prints a series of messages containing "dracut-initqueue timeout - starting timing scripts" followed by "could not boot", and stops there (I don't have the exact wording as I don't have access to the system right now).
                    If I select safe boot in GRUB, there is a kernel panic during boot: "couldn't enable IOMMU and iommu=required/forced".
                    Selecting the 4.19.0 kernel during boot works as usual.

                    Is there anything else I could try?

                    • stormi Vates 🪐 XCP-ng Team

                      @demanzke thanks for the report.

                      To be sure I understand, is it like described on the following picture?

                      55f25c88-6c74-4828-ad77-7e644f87492e-image.png

                      The "XCP-ng (Xen 4.13.0 / Linux 4.19.0+1)" entry uses the Xen and Linux kernel versions from the last ISO installation or upgrade.

                      A screenshot of the failed boot could be useful. I think the "dracut-initqueue timeout" is usually followed by some information about what failed. Could you also run xen-bugtool -y from the booted host, upload the resulting tar.gz somewhere and send me the link in a private message?

                      It looks like it's a kernel issue, but since the boot option that works for you also reverts Xen to a previous version, the way to be sure would be to downgrade the kernel and then boot again:

                      # yum downgrade won't work for the kernel because it's a protected package, so let's use rpm
                      yumdownloader kernel-4.19.19-6.0.10.1.xcpng8.1
                      rpm -Uv --oldpackage kernel-4.19.19-6.0.10.1.xcpng8.1.x86_64.rpm
                      
                      • demanzke

                        It is the "Safe Mode" option that results in a kernel panic, not "Serial".
                        I will grab the screenshots and bugtool logs and test the different kernel later today.

                        • stormi Vates 🪐 XCP-ng Team @demanzke

                          @demanzke fixed the picture 🙂

                          • stormi Vates 🪐 XCP-ng Team

                            Do you get an emergency shell after dracut-initqueue timeout? If yes, there are probably logs that you can read from the current filesystem (which is in RAM at this stage of the boot process so probably disappears afterwards).
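If an emergency shell does appear, dracut usually writes a debug report to /run/initramfs/rdsosreport.txt, and that file is lost on reboot unless it is copied to persistent storage. A rough sketch of such a copy follows; the function name is hypothetical, and the destination must be a directory on a filesystem you have mounted yourself (a USB stick, for instance).

```shell
# Copy dracut's debug report to a persistent directory.
# $1: destination directory (required), $2: report path (optional override).
save_rdsosreport() {
    dest_dir="$1"
    src="${2:-/run/initramfs/rdsosreport.txt}"
    if [ ! -d "$dest_dir" ]; then
        echo "destination directory missing: $dest_dir" >&2
        return 1
    fi
    if [ ! -f "$src" ]; then
        echo "no report at $src" >&2
        return 1
    fi
    cp "$src" "$dest_dir/" && echo "saved to $dest_dir/$(basename "$src")"
}
```

For example, after mounting a USB stick on /mnt by hand, `save_rdsosreport /mnt` would keep a copy of the report for later inspection.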

                            • demanzke

                              There is no emergency shell after the failed boot, sadly.

                              This is what happens after the loading bar on default settings:
                              IMG_20200615_192408.jpg

                              Right after selecting "Safe Boot":
                              IMG_20200615_192531.jpg

                              Installing the suggested kernel 6.0.10 changed nothing. Should I try downgrading other packages or an even older kernel version?

                              • stormi Vates 🪐 XCP-ng Team @demanzke

                                @demanzke said in "CROSSTalk" CPU vulnerability (cross-core data leak):

                                Installing the suggested kernel 6.0.10 changed nothing. Should I try downgrading other packages or an even older kernel version?

                                Yes, please try:

                                yum downgrade xen-dom0-libs xen-dom0-tools xen-hypervisor xen-libs xen-tools
                                

                                Then if it still changes nothing:

                                yum downgrade microcode_ctl
                                

                                After all this, you'll be theoretically back to the state from before the update... Though there may be an issue with the initrd generation, which would still not allow you to boot.

                                • demanzke @stormi

                                  @stormi said in "CROSSTalk" CPU vulnerability (cross-core data leak):

                                  Yes, please try:

                                  yum downgrade xen-dom0-libs xen-dom0-tools xen-hypervisor xen-libs xen-tools
                                  

                                  Then if it still changes nothing:

                                  yum downgrade microcode_ctl
                                  

                                  After all this, you'll be theoretically back to the state from before the update... Though there may be an issue with the initrd generation, which would still not allow you to boot.

                                  Sadly nothing changed after downgrading the packages. The only thing I have changed after the base install was installing your ZFS port.
                                  At this point I would try a fresh install on the weekend and see if the problem reappears unless you have another suggestion.

                                  • stormi Vates 🪐 XCP-ng Team

                                    If we want to understand fully what happens, we could compare the contents of the initial ramdisks:

                                    • initrd-4.19.0+1.img => doesn't work anymore
                                    • initrd-fallback.img => still works

                                    One can extract them with:

                                    mkdir initrd-current
                                    cd initrd-current/
                                    /usr/lib/dracut/skipcpio /boot/initrd-4.19.0+1.img | zcat | cpio -ivd
                                    cd ..
                                    mkdir initrd-fallback
                                    cd initrd-fallback/
                                    /usr/lib/dracut/skipcpio /boot/initrd-fallback.img | zcat | cpio -ivd
                                    

                                    I don't know what differences to look for, to be honest. Maybe you could save those files and upload them somewhere for anyone interested to look at?
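As a starting point for the comparison, the two extracted trees can be diffed recursively to list files that exist in only one of them or that differ in content. A minimal sketch, assuming the directory names from the extraction commands in this post; the function name is made up:

```shell
# Summarise differences between two extracted initrd trees.
# Defaults match the directory names used in the extraction commands.
compare_initrds() {
    current="${1:-initrd-current}"
    fallback="${2:-initrd-fallback}"
    # -r: recurse into subdirectories, -q: report only which files differ
    diff -rq "$current" "$fallback" | sort
}
```

Lines starting with "Only in" point at files missing from one image, for example a kernel module or firmware file that did not get included when the initrd was regenerated.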

                                    Reinstalling the host then trying the update again, without ZFS first, then with it (which probably means reinstalling again and redoing the steps), could also be interesting to help precisely understand what happens.

                                    For now, it mainly looks like it's related to the initrd, which is generated by dracut when the kernel or other kernel modules (such as the kernel module for ZFS) are installed. As you may know, the initrd is the initial ramdisk, which contains a minimal system booted before the actual system and which must be able to mount your root filesystem for the boot to continue. Unfortunately, the output you get doesn't tell us what the error is, so it's all conjecture.
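If the initrd is indeed the culprit, regenerating it by hand with dracut is one thing that could be tried. The sketch below deliberately only prints the commands instead of running them, since overwriting the initrd of a host that barely boots is risky. It assumes the /boot/initrd-<version>.img naming seen in this thread; the function name is hypothetical.

```shell
# Print (do not run) the commands to back up and regenerate an initrd.
# $1: kernel version string, e.g. the one shown by `uname -r`.
rebuild_initrd_cmds() {
    kver="$1"
    img="/boot/initrd-$kver.img"
    echo "cp -a $img $img.bak"
    echo "dracut --force $img $kver"
}
```

Calling `rebuild_initrd_cmds "$(uname -r)"` shows what would be run for the current kernel; review the two lines before executing them manually from a working boot entry.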

                                    • Biggen

                                      When this CROSSTalk microcode update hit last week, there was an issue with certain Intel CPUs where we couldn't boot after the patch was applied. I run Linux Mint on my laptop and I couldn't boot it after taking the microcode update. I had to boot into recovery and then apt remove intel-microcode to get it back to a working state. Later that day, Ubuntu (or whoever) released a new intel-microcode update that corrected the problem.

                                      Not sure if this is even remotely close to the same issue but wanted to put this out there.

                                      • Danp Pro Support Team

                                        Has anyone else encountered this issue? Wondering if these patches should be pulled until this gets resolved.

                                        • stormi Vates 🪐 XCP-ng Team

                                          As far as I know, those patches work well on Citrix's test hosts. They also work well on our hosts at Vates. The microcode went through Intel's QA, so I don't expect it to break on the vast majority of hardware, though there are reports of issues with some specific models. In @demanzke's case, reverting to the previous microcode did not fix the issue, so at first glance it doesn't look related to the microcode.

                                          • stormi Vates 🪐 XCP-ng Team

                                            Intel just released updated microcode (actually it's a revert) for some models: https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files/releases

                                            I'll update the microcode_ctl package. The "older" microcode that is used instead is still recent enough to contain the fixes against CROSSTalk / SRBDS. Or so I had understood, but I can't find evidence about it.
