XCP-ng


    Nvidia Quadro P400 not working on Ubuntu server via GPU/PCIe passthrough

    Compute
    • T
      TheFrisianClause last edited by

      I'm now back on Proxmox. I also had some RMINIT errors there, but those were related to the hypervisor and were resolved fairly quickly.

      On Proxmox I also have to change some GRUB parameters and such. Isn't that something that has to be done on Xen as well, at the hypervisor level?

      • olivierlambert
        olivierlambert Vates 🪐 Founder & CEO 🦸 last edited by

        Which parameter exactly? I don't think Xen supports hiding the hypervisor information yet.

        • T
          TheFrisianClause @olivierlambert last edited by

          @olivierlambert

          Parameters such as these in /etc/default/grub:

          GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on"
          GRUB_CMDLINE_LINUX="textonly video=astdrmfb video=efifb:off"

          Also, someone replied to the topic I created on the Nvidia forums:
          https://forums.developer.nvidia.com/t/xcp-ng-ubuntu-vm-error-quadro-p400/199084
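To double-check which kernel parameters a host is actually configured with, the two variables quoted above can be parsed and merged. The following is a minimal sketch, not a tool from this thread: the function names `parse_grub_defaults` and `kernel_params` are made up for illustration, and the sample text is simply the two lines from the post. It only reads `/etc/default/grub`-style text and says nothing about whether these settings apply to Xen.

```python
import shlex


def parse_grub_defaults(text):
    """Return a dict of variables from /etc/default/grub-style text."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        # shlex.split removes the surrounding shell quotes.
        parts = shlex.split(raw)
        settings[key.strip()] = parts[0] if parts else ""
    return settings


def kernel_params(settings):
    """Merge both command-line variables into one list of parameters."""
    combined = " ".join(
        settings.get(name, "")
        for name in ("GRUB_CMDLINE_LINUX", "GRUB_CMDLINE_LINUX_DEFAULT")
    )
    return combined.split()


# Sample input: the two lines quoted in the post above.
SAMPLE = '''
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on"
GRUB_CMDLINE_LINUX="textonly video=astdrmfb video=efifb:off"
'''

params = kernel_params(parse_grub_defaults(SAMPLE))
print("IOMMU parameter present:", "amd_iommu=on" in params)
```

On a real host the text would come from reading `/etc/default/grub`; the running kernel's effective parameters can be compared against `/proc/cmdline`.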

          • olivierlambert
            olivierlambert Vates 🪐 Founder & CEO 🦸 last edited by

            That answer is incorrect. Passing through an entire PCIe device shouldn't make a difference.

            Maybe it's a problem on the IOMMU side, I don't know. It will be easier to work on it with an actual card.

            • T
              TheFrisianClause @olivierlambert last edited by

              @olivierlambert In that case, let's hope you can resolve it once the P400 is delivered. I'm curious what you find and whether there is a way to resolve the issue...

              • olivierlambert
                olivierlambert Vates 🪐 Founder & CEO 🦸 last edited by

                I have no idea, and I don't have high hopes given our other priorities, but we'll see.

                • T
                  TheFrisianClause @olivierlambert last edited by

                  @olivierlambert I think the conclusion we can draw is that I need to hide the hypervisor from the Nvidia driver, which is also mentioned here: https://www.reddit.com/r/XenServer/comments/r12p0q/pci_passthrough_quadro_p400_to_ubuntucentos_vm/

                  So it looks like an Error 43 situation. I think that is plausible, because on Proxmox I do the same thing by adding this to the VM's .conf file:

                  cpu: host,hidden=1,flags=+pcid

                  This hides the hypervisor from the VM on the KVM side.
                  Is there an equivalent for XCP-ng?
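As a first diagnostic from inside the guest, one can check whether the Linux kernel itself reports running under a hypervisor. The sketch below looks for the `hypervisor` flag in `/proc/cpuinfo`-style text. This is an assumption-laden illustration: the helper names are hypothetical, the sample strings are invented, and nothing in this thread establishes that this flag is what the Nvidia driver inspects. The flag may also remain set even when a hypervisor's signature is hidden, so treat the result as a hint only.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU flags from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() == "flags":
            return set(value.split())
    return set()


def hypervisor_visible(cpuinfo_text):
    """True if the kernel reports the CPUID 'hypervisor present' bit."""
    return "hypervisor" in cpu_flags(cpuinfo_text)


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the real file on a Linux guest (not called in this sketch)."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


# Invented sample inputs, shortened for illustration.
SAMPLE_GUEST = "processor\t: 0\nflags\t\t: fpu vme de pse hypervisor lahf_lm\n"
SAMPLE_BARE = "processor\t: 0\nflags\t\t: fpu vme de pse lahf_lm\n"

print("guest sample:", hypervisor_visible(SAMPLE_GUEST))
print("bare-metal sample:", hypervisor_visible(SAMPLE_BARE))
```

On an actual VM, `hypervisor_visible(read_cpuinfo())` would run the same check against the live system.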

                  • olivierlambert
                    olivierlambert Vates 🪐 Founder & CEO 🦸 last edited by

                    1. No equivalent in Xen yet
                    2. Nvidia recently changed its policy to stop blocking virtualization in their drivers, so the problem should not be there.
                    • T
                      TheFrisianClause @olivierlambert last edited by

                      @olivierlambert Could it have something to do with the 'VFIO' modules in KVM? I honestly have no clue anymore... I think my best option is to wait for the results of your research.

                      • olivierlambert
                        olivierlambert Vates 🪐 Founder & CEO 🦸 last edited by

                        I just got the card, but my agenda is very busy at the moment. I'll try the PCI passthrough in my spare time (which I don't have much of either).

                        • T
                          TheFrisianClause @olivierlambert last edited by

                          @olivierlambert No problem, take your time. I will check in regularly to see if there has been an update of some sort... 🙂

                          • olivierlambert
                            olivierlambert Vates 🪐 Founder & CEO 🦸 last edited by

                            Okay, I'm doing some tests now and I can reproduce the issue. So the questions are:

                            1. Did it work before (on older versions of XCP-ng)? This would rule regressions in or out.
                            2. Is the P400 limited for PCI passthrough by the NV driver? It's still not clear. If that's the problem, it requires a code change in upstream Xen to be able to hide the hypervisor.
                            • T
                              TheFrisianClause @olivierlambert last edited by

                              @olivierlambert

                              I have no idea if it worked on earlier versions of XCP-ng. I don't think so, as I believe I tested this on XCP-ng 8.0 and 8.1, if I remember correctly. (I also created forum posts about this in 2019/2020.)

                              I don't think it is limited for PCI passthrough, as I am using an Nvidia driver on the VM within Proxmox without any issues.

                              I am pasting a screenshot of the Nvidia driver I am currently using on the VM inside proxmox:
                              96f5b776-37e8-4212-b345-83e9664f7804-image.png

                              • olivierlambert
                                olivierlambert Vates 🪐 Founder & CEO 🦸 last edited by

                                That might be because Proxmox is hiding the hypervisor underneath. Hard to tell because of these fracking drivers 😕

                                • T
                                  TheFrisianClause @olivierlambert last edited by TheFrisianClause

                                  @olivierlambert
                                  Hmm, yeah, it's quite a hassle with these drivers for some reason... If you need extra information that could help, let me know. I can send other details from the Proxmox host and the Ubuntu VM.

                                  How about the VFIO modules I mentioned earlier? Is this something that has to be added to XCP-ng, maybe? I also have a topic on Reddit, and this person has the same problem but with a T400.
                                  https://www.reddit.com/r/XenServer/comments/r12p0q/pci_passthrough_quadro_p400_to_ubuntucentos_vm/hrlaqxl/?context=3

                                  • olivierlambert
                                    olivierlambert Vates 🪐 Founder & CEO 🦸 last edited by

                                    If it's hypervisor detection, the "only" thing needed is a Xen modification, but this is not trivial (if it's really that). I can assume it's the case.

                                    In the meantime, can you double check if XCP-ng 7.6 is affected too? (last hope to check if it's not a regression).

                                    • T
                                      TheFrisianClause @olivierlambert last edited by

                                      @olivierlambert Can try that on my spare server, will try and see if I can do it today. I will update this once I am finished.

                                      • T
                                        TheFrisianClause @TheFrisianClause last edited by

                                        Currently I have no time to test this as the machine itself is also heavily used by other users.... But I believe the 7.6 version has this issue as well, as I remember testing this on version 7.x.

                                        • T
                                          TheFrisianClause last edited by

                                          Alright, I tested it with 7.6.
                                          It doesn't seem to work there either...

                                          [  165.594038] [drm] [nvidia-drm] [GPU ID 0x00000006] Loading driver
                                          [  165.594040] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:00:06.0 on minor 1
                                          [  171.958377] NVRM: GPU 0000:00:06.0: RmInitAdapter failed! (0x22:0x56:667)
                                          [  171.958424] NVRM: GPU 0000:00:06.0: rm_init_adapter failed, device minor number 0
                                          [  171.963805] NVRM: GPU 0000:00:06.0: RmInitAdapter failed! (0x22:0x56:667)
                                          [  171.963848] NVRM: GPU 0000:00:06.0: rm_init_adapter failed, device minor number 0
                                          

                                          Same error....
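When comparing runs across XCP-ng versions, it can help to pull the failure lines out of the kernel log automatically. Here is a minimal sketch that extracts the timestamp, PCI address, and raw code from `RmInitAdapter failed!` lines. The regular expression is written against the exact line format shown in the log above, and the function name `find_rminit_failures` is hypothetical. The sketch does not attempt to interpret the `0x22:0x56:667` code, whose meaning is not explained in this thread.

```python
import re

# Matches lines like:
# [  171.958377] NVRM: GPU 0000:00:06.0: RmInitAdapter failed! (0x22:0x56:667)
NVRM_FAILURE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+NVRM: GPU "
    r"(?P<pci>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]): "
    r"RmInitAdapter failed! \((?P<code>[^)]+)\)"
)


def find_rminit_failures(log_text):
    """Return (timestamp, pci_address, code) for each RmInitAdapter failure."""
    failures = []
    for line in log_text.splitlines():
        match = NVRM_FAILURE.search(line)
        if match:
            failures.append(
                (float(match.group("ts")), match.group("pci"), match.group("code"))
            )
    return failures


# Sample input: lines copied from the log in the post above.
SAMPLE_LOG = """\
[  165.594040] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:00:06.0 on minor 1
[  171.958377] NVRM: GPU 0000:00:06.0: RmInitAdapter failed! (0x22:0x56:667)
[  171.958424] NVRM: GPU 0000:00:06.0: rm_init_adapter failed, device minor number 0
[  171.963805] NVRM: GPU 0000:00:06.0: RmInitAdapter failed! (0x22:0x56:667)
"""

failures = find_rminit_failures(SAMPLE_LOG)
for ts, pci, code in failures:
    print(f"{ts:.6f}s  {pci}  {code}")
```

On a live VM the input would typically be the captured output of `dmesg`, saved to a file or piped in as text.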

                                          • olivierlambert
                                            olivierlambert Vates 🪐 Founder & CEO 🦸 last edited by

                                            Okay so at least "it's good news": it's not a regression 🙂 So I assume it's required to hide the hypervisor underneath.
