XCP-ng

    Posts

    • RE: XCP-ng 8.2 updates announcements and testing

      Update published: https://xcp-ng.org/blog/2024/09/27/september-2024-security-updates/

      Thank you for the tests!

      posted in News
      bleader
    • RE: XCP-ng 8.2 updates announcements and testing

      New security update candidates (xen)

      Two new XSAs were published on the 30th of January.

      • XSA-449 impacts PCI passthrough users.
      • XSA-450 only impacts the case where Xen is compiled without HVM support, which is not the case in XCP-ng. We therefore chose not to include this fix yet (it will likely be included in future versions, though maybe not as part of a critical security update).

      SECURITY UPDATES

      • xen-*:
            * Fix XSA-449 - pci: phantom functions assigned to incorrect contexts. A malicious VM assigned a PCI device could in some cases access data of a guest that previously used the same PCI device. To be exploitable, this requires PCI passthrough of a device using phantom functions, with the same device then reassigned to a new VM.

      Test on XCP-ng 8.2

      yum clean metadata --enablerepo=xcp-ng-testing
      yum update "xen-*" --enablerepo=xcp-ng-testing
      reboot
      

      The usual update rules apply: pool coordinator first, etc.
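
      If you're unsure which host is the pool coordinator, here is a quick sketch using xe (output formats may vary slightly between versions):

      # Identify the pool coordinator (update and reboot it first)
      MASTER_UUID=$(xe pool-list params=master --minimal)
      xe host-list uuid="$MASTER_UUID" params=name-label --minimal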

      Versions:

      • xen: 4.13.5-9.38.2.xcpng8.2

      What to test

      Normal use and anything else you want to test. If you are using PCI passthrough devices, that's even better, but we would also be glad to have confirmation from others that their normal use cases still work as intended.

      Test window before official release of the updates
      ~2 days because these are security updates.

      posted in News
      bleader
    • RE: XCP-ng 8.2 updates announcements and testing

      New security update candidates (kernel)

      A new XSA was published on the 23rd of January, so we have a new security update to include it.

      Security updates

      • kernel:
          * Fix XSA-448 - Linux: netback processing of zero-length transmit fragment. An unprivileged guest can cause Denial of Service (DoS) of the host by sending network packets to the backend, causing the backend to crash. This was discovered through issues when using pfSense with WireGuard, causing random crashes of the host.

      Test on XCP-ng 8.2

      yum clean metadata --enablerepo=xcp-ng-testing
      yum update kernel --enablerepo=xcp-ng-testing
      reboot
      

      The usual update rules apply: pool coordinator first, etc.

      Versions:

      • kernel: 4.19.19-7.0.23.1.xcpng8.2

      What to test

      Normal use and anything else you want to test. The closer to your actual use of XCP-ng, the better.

      Test window before official release of the updates
      ~2 days due to security updates.

      posted in News
      bleader
    • RE: XCP-ng 8.2 updates announcements and testing

      Update published https://xcp-ng.org/blog/2024/07/18/july-2024-security-updates/

      Thank you everyone for your tests!

      posted in News
      bleader
    • RE: XCP-ng 8.2 updates announcements and testing

      New security update candidate (xen, xapi, xsconsole)

      Two new XSAs were published on the 16th of July.


      • XSA-458: guests which have a multi-vector MSI-capable device passed through to them can leverage the vulnerability.
      • XSA-459: impacts systems running Xapi v1.249.x, which means any up-to-date XCP-ng 8.2. Note this requires heavy crafting and likely social engineering on the attacker's side; see the XSA's "VULNERABLE SYSTEMS" section for more details.

      SECURITY UPDATES

      • xen-*:
        • Fix XSA-458 - double unlock in x86 guest IRQ handling. When passing through a multi-vector MSI-capable device to a guest, an attacker could trigger an error-handling path leading to the issue; no consequences have been ruled out: Denial of Service (DoS), crashes, information leaks, and elevation of privilege are all possible.
      • xapi, xsconsole:
        • Fix XSA-459 - Xapi: Metadata injection attack against backup/restore functionality. A malicious guest can manipulate its disk to appear to be a metadata backup, which then has about a 50% chance of appearing ahead of a legitimate metadata backup. The more disks the guest has, the higher the chance of this happening.

      Test on XCP-ng 8.2

      yum clean metadata --enablerepo=xcp-ng-testing
      yum update "xen-*" "xapi-*" xsconsole --enablerepo=xcp-ng-testing
      reboot
      

      The usual update rules apply: pool coordinator first, etc.

      Versions:

      • xen: xen-4.13.5-9.40.2.xcpng8.2
      • xapi: xapi-1.249.36-1.2.xcpng8.2
      • xsconsole: xsconsole-10.1.13-1.2.xcpng8.2

      What to test

      Normal use and anything else you want to test.

      Test window before official release of the update

      ~1 day because of security updates.

      posted in News
      bleader
    • RE: XCP-ng 8.2 updates announcements and testing

      The update has been published, thanks for testing.

      https://xcp-ng.org/blog/2024/02/02/february-2024-security-update/

      posted in News
      bleader
    • RE: XCP-ng 8.2 updates announcements and testing

      The update has been published, thanks for the feedback and tests.

      https://xcp-ng.org/blog/2024/01/26/january-2024-security-update/

      posted in News
      bleader
    • RE: Epyc VM to VM networking slow

      Hello guys,

      I'll be the one investigating this further; we're trying to compile a list of CPUs and their behavior. First, thank you for your reports and tests: they are very helpful and have already given us some insight.

      Setup

      If some of you can help us cover more ground, that would be awesome, so here is the ideal test setup, to get everyone on the same page:

      • An AMD host, obviously 🙂
        • yum install iperf ²
      • 2 VMs on the same host, with the distribution of your choice¹
        • each with 4 cores if possible
        • 1 GB of RAM should be enough if you don't have a desktop environment to load
        • iperf2²

      ¹: it seems some recent kernels do provide a slight boost, but in any case the performance is pretty low for such high-grade CPUs.
      ²: iperf3 is single-threaded; the -P option will establish multiple connections, but all of them are processed in a single thread, so once it reaches 100% CPU usage you won't get much of an increase, and it won't help identify scaling on such a CPU. For example, on a Ryzen 5 7600 we see about the same low performance, but using multiple threads does scale, which does not seem to be the case for EPYC Zen1 CPUs.

      Tests

      • do not disable mitigations for now: that only acts on the kernel side, mitigations remain active in Xen, from my testing it doesn't seem to help much, and it would multiply the combinations of results to compare
      • for each test, run xentop on the host, and try to get an idea of the top values for each domain while the test is running
      • run iperf -s on VM1 and let it run (no -P X, as that would make the server stop after X established connections)
      • tests (a scripted sketch of the client side follows this list):
        • vm2vm 1 thread: on VM2, run iperf -c <ip_VM1> -t 60, note the result for v2v 1 thread
        • vm2vm 4 threads: on VM2, run iperf -c <ip_VM1> -t 60 -P4, note the result for v2v 4 threads
        • host2vm 1 thread: on the host, run iperf -c <ip_VM1> -t 60, note the result for h2v 1 thread
        • host2vm 4 threads: on the host, run iperf -c <ip_VM1> -t 60 -P4, note the result for h2v 4 threads
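
      Here is that sketch; it assumes iperf 2 on both ends, iperf -s already running on VM1, and IP_VM1 adjusted to your setup (xentop still has to be watched manually on the host):

      #!/bin/sh
      # Run on VM2 for the v2v numbers, then on the host for the h2v numbers.
      IP_VM1=192.168.0.10   # adjust to your VM1 address

      echo "=== 1 thread ==="
      iperf -c "$IP_VM1" -t 60

      echo "=== 4 threads ==="
      iperf -c "$IP_VM1" -t 60 -P4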

      Report template

      Here is an example report template:

      • Host:
        • cpu:
        • number of sockets:
        • cpu pinning: yes (detail) / no (use automated setting)
        • xcp-ng version:
        • output of xl info -n, especially the cpu_topology section, in a code block
      • VMs:
        • distrib & version
        • kernel version
      • Results:
        • v2v 1 thread: throughput / cpu usage from xentop³
        • v2v 4 threads: throughput / cpu usage from xentop³
        • h2v 1 thread: throughput / cpu usage from xentop³
        • h2v 4 threads: throughput / cpu usage from xentop³

      ³: I note the max I see while the test is running, in vm-client/vm-server/host order.

      What was tested

      Mostly for information, here are a few tests I ran which did not seem to improve performance.

      • disabling the mitigations of various security issues at host and VM boot time using kernel boot parameters: noibrs noibpb nopti nospectre_v2 spectre_v2_user=off spectre_v2=off nospectre_v1 l1tf=off nospec_store_bypass_disable no_stf_barrier mds=off mitigations=off. Note this won't disable them at the Xen level, as there are patches that enable the fixes for the related hardware with no flags to disable them.
      • disabling AVX by passing noxsave in kernel boot parameters, as there is a known issue where Zen CPUs avoid boosting when a core is under heavy AVX load; still no change.
      • Pinning: I tried using a single "node" in case the memory controllers are separated, I tried avoiding the "threads" on the same core, and I tried spreading the load across nodes; although that seems to give a slight boost, it is still far from what we should expect from such CPUs.
      • XCP-ng 8.2 and 8.3-beta1: 8.3 seems a tiny bit faster, but tends to jitter a bit more, so I would not deem that relevant either.

      I have not tested it myself, but @nicols tried VMware on the same machine that gives him about 3 Gbps (as we all see): it went to ~25 Gbps single-threaded and about 40 Gbps with 4 threads; with Proxmox he got about 21.7 Gbps (I assume single-threaded). Both are much more in line with what I would expect this hardware to produce.

      @JamesG tested Windows and Debian guests and got about the same results.

      Although we do get a small boost by increasing threads (or connections in the case of iperf3), it is still far from what we can see on other setups with VMware or Proxmox.

      Although Olivier's pool with a Zen4 desktop CPU scales a lot better than the EPYCs when increasing the number of threads, it still does not provide the expected results for such powerful CPUs in single thread (we do not even reach VMware's single-thread performance with 4 threads).

      Although @Ajmind-0's tests show a difference between Debian versions, results even on Debian 11 are still not on par with expectations.

      Disabling AVX only provided an improvement on my home FX CPU, which is known not to have real "threads" and to share compute units between the 2 threads of a core, so that does make sense. (This is not shown in the table.)

      It seems that memcpy in the glibc is not related to the issue: dd if=/dev/zero of=/dev/null has decent performance on these machines (1.2-1.3 GBytes/s). It's worth keeping in mind that both the kernel and Xen have their own implementations, so memcpy could play a small role in filling the ring buffer in iperf, but I feel the libc memcpy() is not at play here.

      Tests table

      I'll update this table with new results, or maybe repost it in a further post.

      Throughputs are in Gbit/s, noted as G for shorter table entries.

      CPU usage is shown as (VMclient/VMserver/dom0), in percent as reported by xentop.

      | user | cpu | family | market | v2v 1T | v2v 4T | h2v 1T | h2v 4T | notes |
      | --- | --- | --- | --- | --- | --- | --- | --- | --- |
      | vates | fx8320-e | piledriver | desktop | 5.64 G (120/150/220) | 7.5 G (180/230/330) | 9.5 G (0/110/160) | 13.6 G (0/300/350) | not a zen cpu, no boost |
      | vates | EPYC 7451 | Zen1 | server | 4.6 G (110/180/250) | 6.08 G (180/220/300) | 7.73 G (0/150/230) | 11.2 G (0/320/350) | no boost |
      | vates | Ryzen 5 7600 | Zen4 | desktop | 9.74 G (70/80/100) | 19.7 G (190/260/300) | 19.2 G (0/110/140) | 33.9 G (0/310/350) | Olivier's pool, no boost |
      | nicols | EPYC 7443 | Zen3 | server | 3.38 G (?) | | | | iperf3 |
      | nicols | EPYC 7443 | Zen3 | server | 2.78 G (?) | 4.44 G (?) | | | iperf2 |
      | nicols | EPYC 7502 | Zen2 | server | similar ^ | similar ^ | | | iperf2 |
      | JamesG | EPYC 7302p | Zen2 | server | 6.58 G (?) | | | | iperf3 |
      | Ajmind-0 | EPYC 7313P | Zen3 | server | 7.6 G (?) | 10.3 G (?) | | | iperf3, debian11 |
      | Ajmind-0 | EPYC 7313P | Zen3 | server | 4.4 G (?) | 3.07 G (?) | | | iperf3, debian12 |
      | vates | EPYC 9124 | Zen4 | server | 1.16 G (16/17/??⁴) | 1.35 G (20/25/??⁴) | N/A | N/A | !xcp-ng, Xen 4.18-rc + suse 15 |
      | vates | EPYC 9124 | Zen4 | server | 5.70 G (100/140/200) | 10.4 G (230/250/420) | 10.7 G (0/120/200) | 15.8 G (0/320/380) | no boost |
      | vates | Ryzen 9 5950x | Zen3 | desktop | 7.25 G (30/35/60) | 16.5 G (160/210/300) | 17.5 G (0/110/140) | 27.6 G (0/270/330) | no boost |

      ⁴: xentop on this host shows 3200% on dom0 all the time; profiling does not seem to show anything actually using the CPU, but this may be related to the extremely poor performance

      last updated: 2023-11-29 16:46

      All help is welcome! For those of you who already provided tests that I integrated in the table, feel free not to rerun them: it looks like following the exact protocol and providing more data won't make much of a difference, and I don't want to waste your time!

      Thanks again to all of you for your insight and your patience. It looks like this is going to be a deep rabbit hole; I'll do my best to get to the bottom of it as soon as possible.

      posted in Compute
      bleader
    • RE: Live migrate of Rocky Linux 8.8 VM crashes/reboots VM

      So, after our investigations, we were able to pinpoint the issue.

      It seems to happen on most RHEL derivative distributions when migrating from 8.7 to 8.8. As suggested, the bug is in the kernel.

      Starting with 4.18.0-466.el8, the patch x86/idt: Annotate alloc_intr_gate() with __init is integrated and creates the issue: the patch x86/xen: Split HVM vector callback setup and interrupt gate allocation, which should have been integrated along with it, is missing.

      The migration to 8.8 will move you to 4.18.0-477.* versions, which also exhibit this issue; that's what you reported.

      We found that 4.18.0-488, which can be found in CentOS 8 Stream, integrates the missing patch, and it does indeed work when installed manually.
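
      As a quick sketch of how to check where a guest stands (package names as in RHEL 8 derivatives):

      # Inside the affected guest:
      uname -r          # currently running kernel
      rpm -q kernel     # installed kernel packages
      # Affected: 4.18.0-466.el8 and later without the follow-up patch;
      # the missing patch is integrated in 4.18.0-488 (CentOS 8 Stream).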

      Your report helped us identify and reproduce the issue, which allowed us to provide a call stack to the Xen devs. Roger Pau Monné then quickly found that this patch was missing, and we were able to determine which versions of the kernel RPMs integrated it and when the fix landed.

      This means the issue was identified on the RH side, and it is now a matter of getting an updated kernel into derivative distributions like Rocky and Alma.

      posted in Compute
      bleader
    • RE: Security Assessments and Hardening of XCP-ng

      > The question that I'm asking here is how does the Vates Team evaluate these vulnerabilities, Qualys, Greenbone, something else?

      I'm not sure what you mean by evaluating these vulnerabilities, especially the list: Qualys, Greenbone…

      If you mean how we track and process them: I cannot talk about the XO side, but I can shed some light on the XCP-ng side:

      • we have an internal dependency-track (DT) with various projects (8.2 default install, 8.2 available packages, same split for 8.3), with custom SBOM generation to feed DT
        • this is based on CVEs and their Common Platform Enumeration (CPE)
        • the main issue here is that not all CVEs fill the CPEs the same way, so there may be some misses
        • we're trying to improve the SBOM generation to minimize this
      • we also monitor the oss-security mailing list, and some other sources
      • DT reports the CVEs that matched, and we can keep them in or mark them internally as not impacted, fixed, etc
      • we evaluate the priority based on their general criticality, but modulate it depending on whether the package is in the base install, whether it is part of software meant to be used as a server, whether it is related to remote access, and more
      • for the ones we're impacted by and feel are important, we either update to the latest package version (now that CentOS 7 is end of life, that's less likely to happen) or try to backport the fix ourselves when possible.
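
      As an illustration, a backported fix can usually be spotted in a package's RPM changelog on the host; a quick sketch (package name and XSA number are examples):

      rpm -q --changelog xen-hypervisor | grep -i "XSA-462"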

      That's for the dom0 side. On the hypervisor side, we're part of the security list of the Xen Project, so we receive the XSAs and integrate them as fast as we can, following our release process, sometimes integrating the patches ourselves, sometimes going with the XenServer fixes. When we integrate them ourselves, we most of the time later drop our own integration and move to the XenServer one, as the people working on those fixes are largely the ones working on the XSAs in the first place, so they have better knowledge and insight than us.

      I hope this answers this question.

      > Is the Vates team open to the community reporting these vulnerabilities openly or would a ticket be best?

      On the XCP-ng side, everything that comes from open source packages would be reports of publicly disclosed CVEs, so you can openly report them. If someone were to find a new vulnerability it would depend, but it should follow a classic private disclosure first:

      • if it's in an open source package, the upstream would be the best place to do so
      • on the same idea, if it's regarding Xen, XAPI, or other Xen Project software, reporting it upstream through the security process is the best way; it could be nice to drop us a ticket as a heads-up too, but that's not mandatory
      • if it is for some of the packages directly coming from us, creating a ticket for us to be able to work on it before a public disclosure would be best.

      Sorry, you asked about the whole ecosystem, but I'm only able to answer from the XCP-ng side of things.

      posted in XCP-ng
      bleader
    • RE: High Fan Speed Issue on Lenovo ThinkSystem Servers

      Could one of you try the kernel-alt package? It is not meant for production, as it is not fully tested and supported, but if a higher patch level of the 4.19 kernel helps, it would give us a better idea of what's happening.

      EDIT: it should be updated to a new patch level soon-ish, so if the current one does not fix the issue, we should soon have another shot with a more recent update.

      posted in Hardware
      bleader
    • RE: XCP-ng 8.2 updates announcements and testing

      Home host, no XOSTOR, updated fine, no issue with my usual VMs.

      posted in News
      bleader
    • RE: XCP-ng 8.2 updates announcements and testing

      New security update candidates (xen, microcode_ctl)

      A new XSA was published on September 24th 2024.
      Intel published a microcode update on September 10th 2024.
      We also included an updated xcp-ng-release for testing, although it is not security related.


      • XSA-462: a malicious HVM or PVH guest can trigger a DoS of the host.

      SECURITY UPDATES

      • xen-*:
            * Fix XSA-462 - x86: Deadlock in vlapic_error(). The handling of x86's APIC (Advanced Programmable Interrupt Controller) allows a guest to configure an illegal vector to handle error interrupts. This causes vlapic_error() to recurse; the recursion itself is guarded against, but the lock used for this protection ends up being taken recursively, leading to a deadlock.
      • microcode_ctl:
            * Latest Intel microcode update, still named IPU 2024.3, including security updates for:
                * INTEL-SA-01103
                * INTEL-SA-01097

      Other updates

      • xcp-ng-release:
            * Update the "XOA Quick deploy" feature in the host's web page.
            * Point at repo.vates.tech for CentOS, since mirrorlist.centos.org was shut down
            * Add "(EOL)" to repo descriptions for EOL repos
            * Drop unused repos

      Test on XCP-ng 8.2

      yum clean metadata --enablerepo=xcp-ng-candidates
      yum update "xen-*" microcode_ctl xcp-ng-release --enablerepo=xcp-ng-candidates
      reboot
      

      The usual update rules apply: pool coordinator first, etc.

      Versions:

      • xen: 4.13.5-9.44.1.xcpng8.2
      • microcode_ctl: microcode_ctl-2.1-26.xs29.5.xcpng8.2
      • xcp-ng-release: xcp-ng-release-8.2.1-13

      What to test

      Normal use and anything else you want to test.

      Test window before official release of the update

      ~1 day because of security updates.

      posted in News
      bleader
    • RE: XCP-ng 8.2 updates announcements and testing

      Update published https://xcp-ng.org/blog/2024/08/16/august-security-update/

      Thank you all for testing 🙂

      posted in News
      bleader
    • RE: XCP-ng 8.2 updates announcements and testing

      My bad, we were a bit late and I tried to be quick and forgot to move it... Just did that; it should be good soon, as it needs some time to sync the repos.

      posted in News
      bleader
    • RE: XCP-ng 8.2 updates announcements and testing

      The update has been published, thank you for testing it out.

      https://xcp-ng.org/blog/2024/03/15/march-2024-security-update/

      posted in News
      bleader
    • RE: XCP-ng 8.2 updates announcements and testing

      New security update candidate (xen, microcode_ctl)

      Two new XSAs were published on the 12th of March, in conjunction with microcode updates from Intel.

      • XSA-452: the mitigation is currently off by default, as it impacts only Atom CPUs, but it can be enabled on the Xen command line.
      • XSA-453: this is a variation of Spectre-v1 which impacts a large panel of recent CPUs and architectures. It does not seem to be really exploitable on Xen without specific changes, and it is not considered an emergency.

      SECURITY UPDATES

      • xen-*:
            * Fix XSA-452 - x86: Register File Data Sampling. Data from floating point, vector and integer registers could be inferred by an attacker on Atom processors, including data from a privileged context.
            * Fix XSA-453 - GhostRace: Speculative Race Conditions. As mentioned, this is a Spectre-v1 variation that can allow an attacker to infer memory contents across host and guests through a use-after-free flaw.
      • microcode_ctl: security updates from Intel:
            * INTEL-SA-00972
            * INTEL-SA-00982
            * INTEL-SA-00898
            * INTEL-SA-00960
            * INTEL-SA-01045

      Test on XCP-ng 8.2

      yum clean metadata --enablerepo=xcp-ng-testing
      yum update "xen-*" microcode_ctl --enablerepo=xcp-ng-testing
      reboot
      

      The usual update rules apply: pool coordinator first, etc.

      Versions:

      • xen: 4.13.5-9.39.1.xcpng8.2
      • microcode_ctl: 2.1-26.xs28.1.xcpng8.2

      What to test

      Normal use and anything else you want to test.

      Test window before official release of the update

      2 days because of security updates.

      posted in News
      bleader
    • RE: Epyc VM to VM networking slow

      We're still actively working on it; unfortunately, we're still not 100% sure what the root cause is.

      It does seem to affect all Zen generations, from what we could gather, slightly differently: it seems to be a bit better on Zen3 and Zen4, but it still always leads to underwhelming network performance for such machines.


      To provide some status/context to you guys: I worked on this internally for a while, then as I had to attend to other tasks we hired external help, which gave us some insight but no solution; now we have @andSmv working on it (though not this week, as he's at the Xen Summit).

      From the contractors we had, we found that grant table and event channel operations occur more often than on an Intel Xeon, which at first looks like more packets being processed, but they then take way more time.

      What Andrei found most recently is that PV and PVH guests (which we do not support officially) get about twice the performance of HVM and PVHVM. Also, having both dom0 and the guest pinned to a single physical core gives better results. This seems to indicate the issue may come from the handling of cache coherency, and could be related to guest memory settings that differ between Intel and AMD. That's what is under investigation right now, but we're unsure there will be any possibility to change that.
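
      For those who want to try the pinning experiment themselves, here is a rough sketch using the generic xl tool (domain ID and pCPU numbers are examples, and this is not an officially supported workflow):

      # On the host: list domains and their current vCPU placement
      xl list
      xl vcpu-list

      # Pin all vCPUs of domain 1 to pCPUs 2-3 (example values)
      xl vcpu-pin 1 all 2-3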


      I hope this helps make things a bit clearer to you guys, and shows we do invest a lot of time and money digging into this.

      posted in Compute
      bleader
    • RE: All NICs on XCP-NG Node Running in Promiscuous Mode

      Running tcpdump switches the interface to promiscuous mode, to allow all traffic that reaches the NIC to be dumped. So I assume the issue you had on your switches allowed traffic to reach the host, which forwarded it to the VMs, and it wasn't dropped because tcpdump had switched the VIF into promiscuous mode.

      If it seems resolved, that's good, otherwise let us know if we need to investigate further on this 🙂

      posted in XCP-ng
      bleader
    • RE: All NICs on XCP-NG Node Running in Promiscuous Mode

      I think the promiscuous mode is due to the fact that the interfaces end up in OVS bridges; without it, traffic coming from the outside to the VMs' MAC addresses would be dropped.

      Once traffic reaches the OVS bridge the interface is in, it is up to OVS to act as a switch and only forward packets to the MACs it knows on its ports, so all the traffic should not be forwarded to all the VIFs.

      I just tested on 8.2 and 8.3:

      • tcpdumping icmp on 2 VMs: pinging VM1 does not show traffic on VM2, pinging VM2 does not show traffic on VM1, and pinging the host shows no traffic on the VMs
      • tcpdumping everything, only ignoring ssh (as I was logged in to both VMs over ssh): the only traffic I see is the multicast traffic on the network.

      So to answer your question: yes, it is normal that the NICs are in promiscuous mode, but that should not lead to all traffic going to all the VMs.
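
      If you want to check this on your own hosts, a small sketch (interface and bridge names are examples; run ovs-appctl in dom0):

      # Which interfaces are currently in promiscuous mode?
      ip -o link | grep -i promisc

      # Capture without putting the interface into promiscuous mode:
      tcpdump -p -i eth0 icmp

      # Show the MAC learning table of an OVS bridge:
      ovs-appctl fdb/show xenbr0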

      posted in XCP-ng
      bleader