XCP-ng

    USB + GPU pass-through issue

    This topic was forked from XCP-ng 8.3 updates announcements and testing stormi
    • gb.123 @olivierlambert

      @olivierlambert @stormi

      Another bug I encountered (I don't know whether to mention it here or open an issue on GitHub).
      This bug may also be present in previous versions, as the current version is the first one I have tried this on.

      Here is the summary:

      If a USB keyboard and mouse are passed through along with the GPU:
      The GPU gets stuck in the D3 state on shutdown/restart of the VM (the classic GPU reset problem).

      If no vUSB is passed but the GPU is passed through:
      The GPU works and resets correctly on shutdown/restart of the VM.

      I will try the workaround of passing the whole USB controller to see how it goes, but in my use case that may not be possible for regular usage (I'll be doing this for testing only).

      Update:
      When I run:

      $> lspci
      Partial output:

      07:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] Device 15b8
      

      However, this controller does not show up when I run:
      xe pci-list

      Is it a bug that lspci and xe pci-list report a different number of devices?

      How can I pass this controller through if xe pci-list does not show it and I therefore can't get its UUID?
      Will kernel parameters (as in XCP-ng 8.2) work in this case?
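
To see exactly which devices differ between the two tools, the BDF lists can be compared directly. This is only a minimal sketch: `bdf_missing` is a hypothetical helper name, and the commented host commands assume that `lspci -D` prints full `domain:bus:device.function` addresses and that `xe pci-list params=pci-id --minimal` returns a comma-separated list in the same form.

```shell
# bdf_missing LIST_A LIST_B
# Prints every BDF that appears in LIST_A but not in LIST_B.
# Both arguments are newline-separated lists of full BDFs (e.g. 0000:07:00.0).
# (Hypothetical helper, not part of XCP-ng.)
bdf_missing() {
    printf '%s\n' "$1" | while IFS= read -r bdf; do
        # Skip empty lines.
        [ -n "$bdf" ] || continue
        # -x: whole-line match, -F: fixed string (the dots are literal).
        printf '%s\n' "$2" | grep -qxF -- "$bdf" || printf '%s\n' "$bdf"
    done
}

# Possible usage on a host (assumptions, see the note above):
#   host_bdfs=$(lspci -D | awk '{print $1}')
#   xapi_bdfs=$(xe pci-list params=pci-id --minimal | tr ',' '\n')
#   bdf_missing "$host_bdfs" "$xapi_bdfs"
```

Any address printed is visible to the dom0 kernel but not exposed as a XAPI PCI object, which would match the situation described for 07:00.0.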

      Is it safe to run the following on an XCP-ng host?

       echo 1 > /sys/bus/pci/rescan
      

      (I'm trying to find a way for the host to reset the PCI card without a complete reboot, though I am aware that the above command will not reset it.)

      Also, is it advisable to use:

      xl pci-assignable-add 07:00.0
      

      in XCP-ng 8.3, or is this method deprecated?

      • olivierlambert Vates 🪐 Co-Founder CEO

        Question for @TeddyAstie maybe

        • TeddyAstie Vates 🪐 XCP-ng Team Xen Guru @gb.123

          @gb.123 said in XCP-ng 8.3 updates announcements and testing:

          Here is the summary:

          If a USB keyboard and mouse are passed through along with the GPU:
          The GPU gets stuck in the D3 state on shutdown/restart of the VM (the classic GPU reset problem).

          If no vUSB is passed but the GPU is passed through:
          The GPU works and resets correctly on shutdown/restart of the VM.

          I have no clue what vUSB may change regarding GPU passthrough.

          When I run:

          $> lspci
          Partial output:

          07:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] Device 15b8
          

          However, this controller does not show up when I run:
          xe pci-list

          Is it a bug that lspci and xe pci-list report a different number of devices?

          How can I pass this controller through if xe pci-list does not show it and I therefore can't get its UUID?
          Will kernel parameters (as in XCP-ng 8.2) work in this case?

          Question for @Team-XAPI-Network regarding the filtering on PCI IDs.
          I don't think XAPI allows using arbitrary BDF, but I may be wrong.

          Is it safe to run the following on an XCP-ng host?

           echo 1 > /sys/bus/pci/rescan
          

          (I'm trying to find a way for the host to reset the PCI card without a complete reboot, though I am aware that the above command will not reset it.)

          Probably, but it won't change anything, as the device doesn't completely leave Dom0 when passed through.
          FYI, Xen systematically performs a function-level reset (FLR) when doing PCI passthrough, so your device should be reset before entering another guest (aside from reset bugs like the one you may have).
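
Whether a given function advertises FLR, and which power state it currently reports, can be read from the verbose `lspci` output. The following is a minimal sketch under stated assumptions: `has_flr` and `pm_state` are hypothetical helper names, and they only parse the text that `lspci -vv -s <BDF>` prints (the `FLReset+`/`FLReset-` flag in the PCIe device capabilities and the `Status: D0`..`D3` line of the power management capability). They do not reset or modify the device.

```shell
# has_flr LSPCI_TEXT
# Succeeds (exit 0) if the given `lspci -vv` output advertises FLR in the
# PCIe device capabilities ("FLReset+"). (Hypothetical helper.)
has_flr() {
    case "$1" in
        *FLReset+*) return 0 ;;
        *)          return 1 ;;
    esac
}

# pm_state LSPCI_TEXT
# Prints the power state (D0, D1, D2 or D3) reported on the power management
# "Status:" line, or nothing if no such line is found. (Hypothetical helper.)
pm_state() {
    printf '%s\n' "$1" | sed -n 's/.*Status: \(D[0-3]\).*/\1/p' | head -n 1
}

# Possible usage on a host (run as root so capabilities are readable;
# 0000:07:00.0 is only an example address, substitute your device's BDF):
#   info=$(lspci -vv -s 0000:07:00.0)
#   has_flr "$info" && echo "FLR advertised" || echo "no FLR advertised"
#   pm_state "$info"
```

A device that reports D3 after the guest has shut down and does not advertise FLR would be consistent with the reset problem described above, though this check alone does not say where the fault lies.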

          Also, is it advisable to use:

          xl pci-assignable-add 07:00.0
          

          in XCP-ng 8.3, or is this method deprecated?

          I don't think XAPI supports this PCI passthrough approach.
          This command dynamically removes a device from Dom0 and puts it into the "quarantine domain", so that it is ready to be passed through.

          XAPI currently works from a set of "passthrough-able" devices defined through the xen-pciback.hide kernel parameter, which does the same thing but at boot time.
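
As a rough illustration of the boot-time approach, the parameter value is a sequence of parenthesised BDFs. The sketch below is an assumption-laden example rather than an official procedure: `build_hide_param` is a hypothetical helper, and the commented commands assume the `xen-cmdline` script at the path used elsewhere in this thread, with its `--set-dom0` and `--get-dom0` options. Hiding a device that Dom0 itself depends on may leave the host unable to boot, so it is worth double-checking the address first.

```shell
# build_hide_param BDF...
# Prints a xen-pciback.hide=... string covering every BDF given, in the
# "(0000:07:00.0)(0000:07:00.1)" form. (Hypothetical helper.)
build_hide_param() {
    param='xen-pciback.hide='
    for bdf in "$@"; do
        param="${param}(${bdf})"
    done
    printf '%s\n' "$param"
}

# Possible usage on a host (a reboot is needed before it takes effect):
#   /opt/xensource/libexec/xen-cmdline --set-dom0 "$(build_hide_param 0000:07:00.0)"
#   /opt/xensource/libexec/xen-cmdline --get-dom0 xen-pciback.hide
# After the reboot, the hidden devices should be listed by:
#   xl pci-assignable-list
```

Note that `--set-dom0` replaces the whole value, so every device that should stay hidden has to be passed to the helper in one call.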

          • gb.123 @TeddyAstie

            @TeddyAstie @olivierlambert

            Thank you sooo much for your prompt response ! 🙂

            FYI, Xen systematically performs a function-level reset (FLR) when doing PCI passthrough, so your device should be reset before entering another guest (aside from reset bugs like the one you may have).

            This is exactly the problem. Dom0 is unable to perform the FLR when I also pass a vUSB to the guest. However, if no USB device is passed (I mean not attached, even though USB passthrough is enabled on the host), the FLR seems to be performed correctly and I am able to restart the guest without problems.

            If the FLR is not performed, the guest (even if it is the same one being restarted) is unable to detect the passed-through cards and waits for about 136 seconds (about 65.5 seconds for each card), after which it continues without adding the cards. This wait time is the kernel default, which I think cannot be changed without rebuilding the kernel.

            I am trying to find out where the conflict is. (Basically, I can't tell whether this is a driver problem or a Xen problem, since the FLR is performed correctly when no USB device is passed through.)
            Ideally, passing USB should not have an impact on PCI passthrough.

            XAPI currently works from a set of "passthrough-able" devices defined through the xen-pciback.hide kernel parameter, which does the same thing but at boot time.

            Will this still work with XCP-ng 8.3?

            /opt/xensource/libexec/xen-cmdline --set-dom0 "xen-pciback.hide=(0000:07:00.0)"
            

            I am unable to use xe pci-disable-dom0-access uuid=<pci uuid> because no UUID is generated for the above PCI device (it is not visible in xe pci-list).

            • gb.123

              @olivierlambert @TeddyAstie
              Whelp!
              I tried using /opt/xensource/libexec/xen-cmdline --set-dom0 "xen-pciback.hide=(0000:07:00.0)" but now XCP-ng refuses to boot!
              Is there any way to reverse the command by booting in safe mode?

              UPDATE:
              I manually edited /boot/efi/EFI/xenserver/grub.cfg and removed the entry; the server now boots.

              I hope that /opt/xensource/libexec/xen-cmdline --set-dom0 "xen-pciback.hide=(0000:07:00.0)" only alters /boot/efi/EFI/xenserver/grub.cfg and not some internal Xen settings.
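
What else the tool touches is not established here, so a cautious way to work is to back up the boot configuration before changing it and to check afterwards whether the parameter is still present. This is a minimal sketch: `hide_entries` is a hypothetical helper that only reads a file, and the commented commands assume the grub.cfg path mentioned above and the `--delete-dom0` option of `xen-cmdline`.

```shell
# hide_entries FILE
# Prints each distinct xen-pciback.hide=... token found in FILE, or nothing
# if the parameter is not present. (Hypothetical helper, read-only.)
hide_entries() {
    grep -o 'xen-pciback\.hide=[^ ]*' "$1" | sort -u
}

# Possible usage on a host:
#   cp -a /boot/efi/EFI/xenserver/grub.cfg /root/grub.cfg.bak   # backup first
#   hide_entries /boot/efi/EFI/xenserver/grub.cfg               # inspect
#   /opt/xensource/libexec/xen-cmdline --delete-dom0 xen-pciback.hide
#   hide_entries /boot/efi/EFI/xenserver/grub.cfg               # should print nothing
```

If the helper prints nothing after the manual edit, the boot configuration at least no longer carries the parameter. It does not rule out changes elsewhere.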

              A clarification on the above would be highly appreciated!

              • stormi Vates 🪐 XCP-ng Team

                I moved this discussion to its own topic, as we need the other one for update candidate testing.

                • gb.123

                  @stormi
                  Thanks!

                  @olivierlambert @TeddyAstie

                  I am unable to test further since I don't have a 'passable'/'assignable' USB controller that can be passed through. I have ordered one and will keep you posted once I get it and test it.

                  I can confirm, though, that the bug (I don't know whether it is in the driver or in XCP-ng) persists.
