XCP-ng
    lightspeed

    @lightspeed

    Latest posts made by lightspeed

    • Passthru GPUs disappearing

      I have raised this topic before without any real resolution, and I am hoping to revisit it.

      • We're on XCP-ng 8.2.1 with the latest XOA
      • In this particular environment, each server can hold a maximum of 9 PCI cards running at x16 and 1 card running at x8.
      • The physical servers are 100% patched, firmware included; everything that can be updated is updated

      What we're seeing is that each host server can lose 1-2 of its GPUs. We're using NVIDIA Quadro T1000 (8GB) cards, with 1 card assigned to 1 VM using passthru.

      What happens is that a user is working and then, poof, their GPU disappears from Windows and they get an alert. That GPU stays "gone" until I reboot the physical host server. After the reboot it comes back and is usable, but within 24 hours of use it disappears again.

      This issue doesn't happen on ALL cards, just a few. I have done some digging to see whether there's a physical card problem, but the cards all still show up in the OS and in lspci. I can see those cards are there, but they essentially get locked and are no longer assignable, even if I restart the toolstack.

      I am at a loss, it's puzzling and causing a lot of issues lol
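One way to narrow down the "visible in lspci but not assignable" state described above is to compare which kernel driver each GPU is bound to before and after a card drops out. The following is a minimal sketch, not a diagnosis tool: it parses text in the format printed by `lspci -k`, and it assumes that a card reserved for passthru shows `pciback` as its driver in dom0. The function name, the `vendor_hint` parameter, and the sample text are all hypothetical and for illustration only.

```python
import re
from typing import Dict, Optional

# A device header line starts with a PCI address such as "3b:00.0" or "0000:3b:00.0".
_HEADER = re.compile(
    r"^((?:[0-9a-fA-F]{4}:)?[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])\s+(.*)$"
)
# Indented detail line printed by `lspci -k` when a driver is bound.
_DRIVER = re.compile(r"^\s+Kernel driver in use:\s*(\S+)")


def drivers_by_device(lspci_k_output: str, vendor_hint: str = "NVIDIA") -> Dict[str, Optional[str]]:
    """Map each matching PCI address to its bound driver (None if no driver line)."""
    result: Dict[str, Optional[str]] = {}
    current = None
    for line in lspci_k_output.splitlines():
        header = _HEADER.match(line)
        if header:
            address, description = header.groups()
            # Only track devices whose description mentions the vendor hint.
            current = address if vendor_hint.lower() in description.lower() else None
            if current is not None:
                result[current] = None
            continue
        driver = _DRIVER.match(line)
        if driver and current is not None:
            result[current] = driver.group(1)
    return result


# Illustrative sample only; real `lspci -k` output will differ in detail.
SAMPLE = """\
3b:00.0 VGA compatible controller: NVIDIA Corporation (illustrative entry)
\tSubsystem: NVIDIA Corporation (illustrative entry)
\tKernel driver in use: pciback
3b:00.1 Audio device: NVIDIA Corporation (illustrative entry)
\tKernel driver in use: pciback
5e:00.0 VGA compatible controller: NVIDIA Corporation (illustrative entry)
\tKernel modules: nouveau
af:00.0 Ethernet controller: Intel Corporation (illustrative entry)
\tKernel driver in use: ixgbe
"""

bindings = drivers_by_device(SAMPLE)
# Assumption: "pciback" is the driver expected for passthru devices.
not_reserved = [addr for addr, drv in bindings.items() if drv != "pciback"]
print(bindings)
print("Not bound to pciback:", not_reserved)
```

Saving the real `lspci -k` output right after a host reboot and again once a GPU has vanished, then running both through the function, would show whether the affected cards change driver binding or stay bound while still being unassignable.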

      posted in Hardware
      lightspeed
    • RE: PCI Passthru Error not working on 8.2 but was 8.1

      @olivierlambert said in PCI Passthru Error not working on 8.2 but was 8.1:

      Hi!

      IOMMU should be enabled in the BIOS. Double check that 🙂

      Also please share your grub config to see if it's correctly written.

      After playing with it more, the issue appears to be with passing through multiple devices of the same type, in this case Radeon WX7100 cards. If I take it down to one card, it works as expected. If I add more than one, in any quantity, I run into the issue.

      The issue goes away once I reboot the host, and then I can assign a card, but the second the VM reboots the error comes back.
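Since the failure only shows up with more than one card, one thing worth checking in the grub config requested above is whether every card's address actually made it into the dom0 hide list. The sketch below assumes passthru was set up with the `xen-pciback.hide=(...)(...)` kernel argument, as the XCP-ng passthrough documentation describes; if the host was configured another way, it does not apply. The function names and the addresses in the example are hypothetical.

```python
import re
from typing import List

# PCI address, with or without the leading 4-digit domain.
_BDF = re.compile(r"^(?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]$")


def hidden_devices(cmdline: str) -> List[str]:
    """Return the PCI addresses listed in xen-pciback.hide, or [] if the argument is absent."""
    match = re.search(r"xen-pciback\.hide=(\S+)", cmdline)
    if not match:
        return []
    value = match.group(1)
    entries = re.findall(r"\(([^()]*)\)", value)
    # The value must consist only of back-to-back "(address)" groups.
    if "".join(f"({entry})" for entry in entries) != value:
        raise ValueError(f"malformed xen-pciback.hide value: {value!r}")
    for entry in entries:
        if not _BDF.match(entry.lower()):
            raise ValueError(f"not a PCI address: {entry!r}")
    return [entry.lower() for entry in entries]


def missing_from_hide(cmdline: str, expected: List[str]) -> List[str]:
    """Return the expected addresses that are not present in the hide list."""
    hidden = set(hidden_devices(cmdline))
    return [addr for addr in expected if addr.lower() not in hidden]


# Hypothetical dom0 kernel line and card addresses, for illustration only.
example = "root=LABEL=root-abc ro xen-pciback.hide=(0000:3b:00.0)(0000:5e:00.0) console=hvc0"
print(hidden_devices(example))
print(missing_from_hide(example, ["0000:3b:00.0", "0000:af:00.0"]))
```

The check is deliberately strict about the parenthesized format, because a missing parenthesis or a dropped second entry is easy to overlook when several identical cards are listed.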

      posted in Xen Orchestra
      lightspeed
    • RE: PCI Passthru Error not working on 8.2 but was 8.1

      @lightspeed Is no one else having this issue?

      posted in Xen Orchestra
      lightspeed
    • PCI Passthru Error not working on 8.2 but was 8.1

      The problem
      I have a single host that I am using to test updates and patches on 8.2 before rolling out to production. I am on the very latest 8.2 stable release with all patches applied, plus the latest XOA with all updates on a brand new XOA image.

      When I attempt to start a VM and pass it a GPU I get the following error:

      INTERNAL_ERROR(xenopsd internal error: (Unix.Unix_error "No such device" single_write ""))
      

      There's no way to get this VM to boot other than removing the GPU passthru.

      A second question I thought should be addressed: is there a base set of BIOS settings that must be in place for PCI passthru to work as intended?

      posted in Xen Orchestra
      lightspeed
    • RE: vGPU - which graphics card supported?

      We have a new expansion project underway. We currently use the FirePro 7150x2 with Xen 7.2 and are using the latest XCP-ng in the new infrastructure build. I just wanted to check whether supporting the MxGPU drivers is on the XCP-ng roadmap as time goes on. I followed the GitHub issue related to this and can see activity happening, but I wasn't sure whether this is "best effort" support or planned support.

      In other words, if we spend $$$ to keep building on the FirePro MxGPU platform, are we eventually going to get stuck without the ability to upgrade?

      posted in Development
      lightspeed