XCP-ng
    TeddyAstie

    @TeddyAstie

    Vates 🪐 XCP-ng Team Xen Guru
    Location France


    Best posts made by TeddyAstie

    • Xen ERMS Patch - Call for performance testing

      Hello !

      I am looking to get some feedback and evaluation on a performance-related patch for Xen (XCP-ng 8.3 only).
      This patch changes the memcpy implementation of Xen to use the "ERMS variant" (aka REP MOVSB) instead of the current REP MOVSQ+B implementation.
      This is expected to perform better on the vast majority of Intel CPUs and modern AMD ones (Zen3+), but may perform worse on some older AMD CPUs.

      This change may impact the performance of PV drivers (especially network).

      You can find more details regarding this proposed change in : https://github.com/xcp-ng-rpms/xen/pull/54
      This change may be reworked in the future to better account for the specifics of each CPU (e.g. by checking for the ERMS flag).

      🚧 Keep in mind that this patched version is experimental and not officially supported. 🚧

      Installation :

      # Download repo file for XCP-ng 8.3
      wget https://koji.xcp-ng.org/repos/user/8/8.3/xcpng-users.repo -O /etc/yum.repos.d/xcpng-users.repo
      
      # Installing the patched Xen packages (you should see `.erms` packages)
      yum update --enablerepo=xcp-ng-tae1
      

      You can revert the change by downgrading the Xen packages to the ones in the default repos.

      yum downgrade --disablerepo=xcp-ng-tae1 "xen-*"
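
Since the expected effect depends on whether the CPU advertises ERMS, it can help to note that alongside any benchmark result. A minimal sketch, assuming the `erms` flag shows up in /proc/cpuinfo as seen from Dom0 (this reflects what Xen exposes to Dom0, so treat it as an indication only); `has_erms` is a hypothetical helper name:

```shell
# Hypothetical helper: succeeds if the first "flags" line of a cpuinfo-style
# file lists the "erms" CPU flag. Defaults to /proc/cpuinfo.
has_erms() {
    grep -m1 '^flags' "${1:-/proc/cpuinfo}" | grep -qw 'erms'
}

# Example use on the host:
#   has_erms && echo "ERMS advertised" || echo "ERMS not advertised"
# To see which Xen packages are installed (the patched ones carry `.erms`):
#   rpm -qa 'xen-*'
```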
      
      TSnake41 opened this pull request in xcp-ng-rpms/xen

      draft Use ERMS variant for memcpy #54

      posted in Development
      TeddyAstie
    • RE: Wide VMs on XCP-ng

      @plaidypus I don't know a lot about NUMA on Xen, but we have a section in the docs about it:
      https://docs.xcp-ng.org/compute/#numa-affinity

      There is also other documentation on the subject:
      https://xapi-project.github.io/new-docs/toolstack/features/NUMA/index.html
      There was also a design session about NUMA at the latest Xen Summit: https://youtu.be/KoNwEYMlhyU?list=PLQMQQsKgvLnvjRgDnb-5T51e1kGHgs1SO

      posted in XCP-ng
      TeddyAstie
    • RE: XCP-ng 8.3 & AMD Firepro S7150x2

      @tuxen said (https://xcp-ng.org/forum/topic/3652/no-free-virtual-function-found-vgpu-s7150/4?_=1731502751059)

      After some digging, could be the case of a GPU firmware being incompatible with UEFI. Do you have any spare server for testing XCP-ng boot in legacy/BIOS with this GPU?

      Perhaps that is the issue?

      posted in Hardware
      TeddyAstie
    • RE: XCP-ng 8.3 & AMD Firepro S7150x2

      @ohajek

      Nov 13 11:30:21 xen03 kernel: [10188.720655] AMD IOMMUv2 driver by Joerg Roedel jroedel@suse.de
      Nov 13 11:30:21 xen03 kernel: [10188.720656] AMD IOMMUv2 functionality not available on this system
      

      This is expected: the Dom0 kernel (Linux) is not supposed to access the IOMMU when it is already used by Xen. To check whether AMD-Vi is working, you need to check xl dmesg instead.
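
As a rough sketch of that check, the Xen log can be filtered for IOMMU-related lines. The pattern below is an assumption about what the relevant messages contain, and `iommu_lines` is a hypothetical helper name:

```shell
# Hypothetical helper: keep only lines mentioning AMD-Vi or the IOMMU
# (case-insensitive). Reads from stdin so it can filter any log.
iommu_lines() {
    grep -i -E 'amd-vi|iommu'
}

# Example use on the host (as root in Dom0):
#   xl dmesg | iommu_lines
```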

      I took a quick look at kern_gim_compiled.txt, and it looks like it timed out somewhere:

      Oct 23 20:49:32 xen03 kernel: [   80.657394]        gim error:(wait_cmd_complete:2387)  wait_cmd_complete -- time out after 0.003004460 sec
      Oct 23 20:49:32 xen03 kernel: [   80.657408]        gim error:(wait_cmd_complete:2390)   Cmd = 0x17, Status = 0x0, cmd_Complete=0
      

      3 ms looks like a short timeout to me, but aside from that, it looks like a driver (gim) or hardware issue.

      posted in Hardware
      TeddyAstie
    • RE: XCP-ng 8.3 updates announcements and testing

      @abudef Note that even with this update, nested virtualization is still not really supported in XCP-ng 8.3.
      It's there, and you can enable it at your own risk. It broke due to some change in XAPI (even though the Xen hypervisor had "support" for it).
      It was never actually removed from the Xen hypervisor: it was marked experimental in Xen 4.13 (used in XCP-ng 8.2), and that is still the case in Xen 4.17. Nothing really changed, so it still has the same issues and limitations.

      The current state of nested virtualization in Xen is quite clumsy, and there are plans to redo it properly from the ground up, without taking shortcuts and with proper tests to back it.

      Aside from that, after some experiments, it seems that it is mostly nested EPT that is incomplete/buggy, so your L1 hypervisor should not rely on it. You should add hap=0 to the Xen cmdline of the nested XCP-ng. Beware that this implies a pretty large performance hit, but I had more consistent results with it.
      I am quite surprised that Windows works while Linux doesn't; maybe it is somehow related to PV drivers?
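
A sketch of how that option could be set, assuming the nested host is itself XCP-ng and ships the xen-cmdline helper at /opt/xensource/libexec/xen-cmdline. The `set_xen_opt` wrapper and the `XEN_CMDLINE` override (handy for dry runs) are hypothetical:

```shell
# Hypothetical wrapper around XCP-ng's xen-cmdline helper.
# XEN_CMDLINE can be overridden (e.g. with "echo") for a dry run.
set_xen_opt() {
    "${XEN_CMDLINE:-/opt/xensource/libexec/xen-cmdline}" --set-xen "$1"
}

# Inside the nested (L1) XCP-ng, followed by a reboot of that host:
#   set_xen_opt "hap=0"
```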

      posted in News
      TeddyAstie
    • RE: PCI Passthrough of QAT adapter IQA89601G1P5

      PCIe AER needs proper PCIe, which in practice needs a Q35 chipset in the guest (or some other guest type / PCI passthrough method).

      Q35 support is currently a work in progress.

      posted in Compute
      TeddyAstie
    • RE: Google Coral TPU PCIe Passthrough Woes

      I think it is the same MSI-X/PBA issue, which may be partially fixed by https://gitlab.com/xen-project/xen/-/commit/b2cd07a0447bfa25e96ae13e190225b61a3670cb

      However, with this device, the MSI-X vector table and the PBA are in the same page (vector table at 46800 and PBA at 46068), which is treated a bit differently:

      If PBA lives on the same page, discard writes and log a message.
      Technically, writes outside of PBA could be allowed, but at this moment
      the precise location of PBA isn't saved, and also no known device abuses
      the spec in this way (at least yet).
      

      But Coral appears to abuse this, according to the DKMS driver, by having more than just the MSI-X table and PBA on a single page:
      https://github.com/google/gasket-driver/blob/main/src/apex_driver.c#L103-L140
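
To make the "same page" observation concrete, here is a small arithmetic sketch. It assumes the two offsets quoted above are hexadecimal and that pages are 4 KiB; `same_page` is a hypothetical helper:

```shell
# Hypothetical helper: succeeds if two hex offsets fall in the same 4 KiB page,
# i.e. they are equal once the low 12 bits are shifted out.
same_page() {
    [ $(( 0x$1 >> 12 )) -eq $(( 0x$2 >> 12 )) ]
}

# Vector table and PBA offsets from the post:
#   same_page 46800 46068 && echo "same page"
```

Both offsets reduce to page number 0x46, which is why the "PBA lives on the same page" path applies to this device.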

      posted in Compute
      TeddyAstie
    • RE: Guest receiving passthrough SATA controllers does not see attached drives

      Hello @hvm,

      Can you give the output of xl dmesg on XCP-ng and of dmesg in the guest that has the issue?
      I have the impression that something is going wrong with reserved regions related to the SATA controller.

      posted in Compute
      TeddyAstie

    Latest posts made by TeddyAstie

    • RE: Epyc VM to VM networking slow

      @Forza said in Epyc VM to VM networking slow:

      olivierlambert said in Epyc VM to VM networking slow:

      If we become partners officially, we'll be able to have more advanced accesses with their teams. I still have hope, it's just that the pace isn't on me.

      Hi, is there anything new to report on this? We have very powerful machines, but unfortunately limited by this stubborn issue.

      Can you test https://xcp-ng.org/forum/topic/10862/early-testable-pvh-support ?

      We observe very significant improvements on AMD EPYC with PVH.

      We're still pinpointing the issue with HVM. The current hypothesis is an issue with memory typing (the grant table being accessed as uncacheable (UC), which is very slow) related to grant-table positioning in HVM.

      posted in Compute
      TeddyAstie
    • Early testable PVH support

      Hello !

      Xen supports 3 virtualization modes: PV (deprecated), HVM (used in XCP-ng) and PVH.
      While HVM is supported (and used) in XCP-ng, PVH hasn't been integrated yet, but as of today XCP-ng 8.3 has some early support for it.

      The PVH mode was officially introduced in Xen 4.10 as a leaner, simpler variant of HVM (it was initially named HVM-lite) with little to no emulation, only PV devices, and less overall complexity.
      It aims to be a great and simpler alternative to traditional HVM for modern guests.

      A quick comparison of all modes
      PV mode :

      • needs specific guest support
      • only PV devices (no legacy hardware)
      • relies on PV MMU (less efficient than VT-x EPT/AMD-V NPT overall, but works without virtualization technologies)
      • unsafe against Spectre-style attacks
      • supports: direct kernel boot, pygrub
      • deprecated

      HVM mode :

      • emulates a machine that behaves like real hardware (using QEMU)
        • including legacy platform hardware (IOAPIC, HPET, PIT, PIC, ...)
        • including (maybe legacy) I/O hardware (network card, storage ...)
        • some can be disabled by the guest (PVHVM), but they exist at the start of the guest
      • relies on VT-x/AMD-V
      • traditional PC boot flow (BIOS/UEFI)
      • optional PV devices (opt-in by guest; PVHVM)
      • performs better than PV mode on most machines
      • compatible with pretty much all guests (including Windows and legacy OS)

      PVH mode :

      • relies on VT-x/AMD-V (in that respect, on the Xen side, it uses the same code as HVM)
      • minimal emulation (e.g. no QEMU), much simpler overall, lower overhead
      • only PV devices
      • supports: direct kernel boot (like PV), PVH-GRUB, or UEFI boot (PVH-OVMF)
      • needs guest support (but much less intrusive than PV)
      • works with most Linux distros and most BSD; doesn't work with Windows (yet)

      Installation

      🚧 Keep in mind that this is very experimental and not officially supported. 🚧

      PVH vncterm patches (optional)

      While XCP-ng 8.3 actually has support for PVH, due to a XAPI bug, you will not be able to access the guest console. I provide a patched XAPI with a patched console.

      # Download repo file for XCP-ng 8.3
      wget https://koji.xcp-ng.org/repos/user/8/8.3/xcpng-users.repo -O /etc/yum.repos.d/xcpng-users.repo
      
      # You may need to update to testing repositories.
      yum update --enablerepo=xcp-ng-testing
      
      # Installing the patched XAPI packages (you should see `.pvh` XAPI packages)
      yum update --enablerepo=xcp-ng-tae2
      

      This is optional, but you probably want it in order to see what's going on in your guest without having to rely on SSH or xl console.

      Making/converting into a PVH guest

      You can convert any guest into a PVH guest by modifying its domain-type parameter.

      xe vm-param-set uuid={UUID} domain-type=pvh
      

      You can revert this change by setting it back to HVM:

      xe vm-param-set uuid={UUID} domain-type=hvm
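
A small sketch wrapping the two commands above, with a way to check the current value first. The helper name and the `XE` override (for dry runs) are hypothetical; `vm-param-get` is the standard xe counterpart of `vm-param-set`:

```shell
# Hypothetical helper: switch a VM between HVM and PVH.
# XE can be overridden (e.g. with "echo") for a dry run.
set_domain_type() {
    case "$2" in
        hvm|pvh) ;;
        *) echo "usage: set_domain_type UUID hvm|pvh" >&2; return 1 ;;
    esac
    "${XE:-xe}" vm-param-set uuid="$1" domain-type="$2"
}

# Check the current type, then convert (assumption: the new type is picked up
# the next time the VM is started):
#   xe vm-param-get uuid={UUID} param-name=domain-type
#   set_domain_type {UUID} pvh
```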
      

      PVH OVMF (boot using UEFI)

      You also need a PVH-specific OVMF build that can be used to boot the guest in UEFI mode.

      Currently, there is no package available for getting it, but I provide a custom-built OVMF with PVH support
      https://nextcloud.vates.tech/index.php/s/L8a4meCLp8aZnGZ

      You need to place this file on the host as /var/lib/xcp/guest/pvh-ovmf.elf (creating any missing parent directories).
      Then set it as the guest's PV-kernel:

      xe vm-param-set uuid={UUID} PV-kernel=/var/lib/xcp/guest/pvh-ovmf.elf
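
The file placement step could look like the following sketch, where `mkdir -p` creates the missing parent directories. The function name, and the idea that the downloaded file sits in the current directory, are assumptions:

```shell
# Hypothetical helper: copy a downloaded PVH OVMF build into place,
# creating missing parent directories. The destination defaults to the
# directory used in this post.
install_pvh_ovmf() {
    src="$1"
    dest_dir="${2:-/var/lib/xcp/guest}"
    mkdir -p "$dest_dir" && cp "$src" "$dest_dir/pvh-ovmf.elf"
}

# On the host, after downloading the file:
#   install_pvh_ovmf ./pvh-ovmf.elf
```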
      

      Once done, you can boot your guest as usual.

      Tested guests

      On many Linux distros, you need to add console=hvc0 to the kernel cmdline; otherwise, you may not have access to a PV console.

      • Alpine Linux
      • Debian
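
For guests that boot with GRUB, one way to add console=hvc0 is through the kernel command line in /etc/default/grub. This is only a sketch, assuming a Debian-style setup with a GRUB_CMDLINE_LINUX line and `update-grub`; other distros and bootloaders differ. `add_hvc0` is a hypothetical filter:

```shell
# Hypothetical filter: append console=hvc0 to a GRUB_CMDLINE_LINUX="..."
# line read from stdin, unless it is already present. Other lines pass
# through unchanged.
add_hvc0() {
    sed -E '/^GRUB_CMDLINE_LINUX=/{/console=hvc0/!s/"$/ console=hvc0"/;}'
}

# Inside the guest (Debian-style, as root):
#   add_hvc0 < /etc/default/grub > /tmp/grub.new
#   cp /tmp/grub.new /etc/default/grub && update-grub
```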

      Known limitations

      • Some stats show "no stats" (XAPI bug?)
      • No support for booting from an ISO; you can work around this by importing your ISO as a disk and attaching it as a read-only disk
      • No live migration support (or at least, don't expect it to work properly)
      • No PCI passthrough support
      • No actual display (only PV console)
      posted in Development
      TeddyAstie
    • RE: Is Dynamic Memory Allocation and NVME passthrough supported?

      Hello,
      Unfortunately, the current approach to ballooning (dynamic memory) cannot work with PCI passthrough. I don't think it is possible to work around that limitation (at least not in XCP-ng 8.3).

      If I adjust dynamic to be 48 GiB/48 GiB the machine will then boot. Once booted, I can then once again apply the desired dynamic config of 16 GiB/48 GiB.

      Am I misunderstanding the configuration options and this is just not supported, or have I stumbled across a bug?

      What's probably happening is that the dynamic configuration you set is not effective yet and only applies when you reboot. That's why PCI passthrough worked: at that point you were actually using static memory allocation.
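
If passthrough is needed, a sketch of keeping the dynamic range fixed (minimum equal to maximum, so effectively no ballooning). `xe vm-memory-dynamic-range-set` is an existing xe command; the wrapper name and the `XE` override (for dry runs) are hypothetical:

```shell
# Hypothetical helper: set dynamic-min and dynamic-max to the same value.
# The value must stay within the VM's static memory limits.
# XE can be overridden (e.g. with "echo") for a dry run.
pin_dynamic_memory() {
    "${XE:-xe}" vm-memory-dynamic-range-set uuid="$1" min="$2" max="$2"
}

# Example:
#   pin_dynamic_memory {UUID} 48GiB
```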

      posted in Management
      TeddyAstie
    • RE: Xen ERMS Patch - Call for performance testing

      @olivierlambert

      iperf (-P8) for VM to VM on Xeon Gold 6138
      Before: 25-35 Gbps
      After: 25-39 Gbps

      Seems slightly higher at best, but it is hard to measure a difference as the performance tends to vary a lot between runs.
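
Because results vary this much between runs, it is more informative to repeat the measurement and report a range, as done above. A sketch that summarises one throughput value per line; how the values are extracted from iperf is left out, and `minmax` is a hypothetical helper:

```shell
# Hypothetical helper: read one number per line from stdin and print
# "min max". Prints nothing for empty input.
minmax() {
    awk 'NR == 1 { min = max = $1 + 0 }
         { v = $1 + 0; if (v < min) min = v; if (v > max) max = v }
         END { if (NR) print min, max }'
}

# Example, with one Gbps figure per line in runs.txt:
#   minmax < runs.txt
```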

      posted in Development
      TeddyAstie
    • RE: Citrix tools after version 9.0 removed quiesced snapshot

      @vkeven
      The XCP-ng 8.1 release notes say:

      VSS and quiesced snapshots support is removed, because it never worked correctly and caused more harm than good. Note that Windows guest tools version 9 (the default for recent versions of Windows if you install Citrix drivers) already removed VSS support, even for older versions of CH / XCP-ng

      I am not sure whether this VSS feature is bound to the PV drivers, or if it also needs hypervisor support. In any case, it is not recommended to stay on an old version of the guest agent.

      posted in XCP-ng
      TeddyAstie
    • RE: XCP-NG server crashes/reboots unexpectedly

      @nvs said in XCP-NG server crashes/reboots unexpectedly:

      Thanks. Unfortunately my machine doesnt have IPMI. So can I just connect a serial cable between this machine and another machine

      Yes, though you would still need to boot using the "XCP-ng (Serial)" grub entry.
      (You can also add the serial console options to the Xen cmdline yourself.)

      posted in Hardware
      TeddyAstie
    • RE: XCP-NG server crashes/reboots unexpectedly

      @nvs said in XCP-NG server crashes/reboots unexpectedly:

      @nvs Machine crashed/restarted itself again this morning. I didn't even have all of the usual VMs running this time. Nothing was logged in kern.log when it crashed again. Before it crashed I checked a few times in the hours before xl dmesg but nothing obvious to me (same log as I posted above). Any suggestions highly welcome as I'm sure how to proceed with troubleshooting this. My next step would be replacing the PSU and see if anything changes, but its a long shot.

      OK, so it doesn't seem to be caused by a driver bug corrupting something in Xen/Linux.
      So something is causing Xen to crash, and it's not easy to find out what without using e.g. the serial console (so you can get the actual Xen crash message).
      For that, you need something connected to and monitoring the machine's serial console (or IPMI), and you need to boot XCP-ng in "(serial)" mode.

      posted in Hardware
      TeddyAstie
    • RE: XCP-NG server crashes/reboots unexpectedly

      @nvs you need to reboot, and it should stick across reboots.

      Side question, any chance to tail -f xl dmesg to see real time output? That would allow me to see any last messages before it crashes potentially.

      xl dmesg -c allows you to clear the Xen console buffer while displaying it.
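
That makes it possible to approximate tail -f by polling. A sketch, assuming it is acceptable that each call empties the Xen console buffer; `follow_xl_dmesg` and the `XL` override (for dry runs) are hypothetical:

```shell
# Hypothetical helper: print new Xen console output every INTERVAL seconds.
# Usage: follow_xl_dmesg [INTERVAL] [COUNT]   (COUNT 0 or unset = forever)
follow_xl_dmesg() {
    interval="${1:-1}"
    count="${2:-0}"
    i=0
    while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
        "${XL:-xl}" dmesg -c   # show the buffer, then clear it
        sleep "$interval"
        i=$((i + 1))
    done
}

# Example on the host, keeping a copy of everything seen:
#   follow_xl_dmesg 1 | tee -a /root/xen-console.log
```

Keeping the copy on another machine (for instance by running this over SSH) makes it more likely that the last lines survive a crash.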

      posted in Hardware
      TeddyAstie
    • RE: XCP-NG server crashes/reboots unexpectedly

      Can you try adding dom0-iommu=strict to the Xen command-line?

      /opt/xensource/libexec/xen-cmdline --set-xen "dom0-iommu=strict"
      

      And regularly check if there is something showing up in xl dmesg.
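
After rebooting, it may be worth confirming that the option is active. A sketch, assuming `xl info` reports the running command line in its `xen_commandline` field; `cmdline_has` is a hypothetical helper:

```shell
# Hypothetical helper: succeeds if OPTION appears as a whole word in CMDLINE.
# Usage: cmdline_has "CMDLINE" "OPTION"
cmdline_has() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example on the host:
#   cmdline_has "$(xl info | sed -n 's/^xen_commandline *: *//p')" \
#       "dom0-iommu=strict" && echo "dom0-iommu=strict is active"
```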

      posted in Hardware
      TeddyAstie