XCP-ng

    Coral TPU PCI Passthrough

    Compute
    • olivierlambert Vates 🪐 Co-Founder CEO

      @andSmv I talked with Marek from Qubes, and he told me this might be relevant (or not): https://lore.kernel.org/xen-devel/20221114192100.1539267-2-marmarek@invisiblethingslab.com/

      What do you think?

      • andSmv Vates 🪐 XCP-ng Team Xen Guru

        Hello, sorry for the late response (I just discovered the topic) 🙏

        Regarding Marek's patches, I actually think they could be worth a try (at least the patch seems to address the problem where the MSI-X PBA page is shared with other registers of the device), but there are some cons too:

        • the patches are quite new (they don't seem to be integrated yet).
        • the patches apply to a more recent Xen (not the XCP-ng Xen), and even though we could probably backport them, that would potentially require significant work
        • we are not 100% sure it's the issue (or the only issue)

        So if this is a must-have, we can do some digging to make it work (but still within the scope of an "experimental" platform, not the production platform)
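
To make the layout issue mentioned above concrete: a device's MSI-X Pending Bit Array (PBA) holds one pending bit per vector, packed into 64-bit words, so it is usually far smaller than a 4 KiB page, and whatever else the device maps on that page shares it. The sketch below only does that layout arithmetic. The function names and the offset in the usage lines are made up for illustration and are not the Coral TPU's real register layout.

```python
PAGE_SIZE = 4096  # page size on x86


def pba_size(num_vectors: int) -> int:
    """Bytes used by the PBA: one bit per vector, rounded up to 64-bit words."""
    qwords = (num_vectors + 63) // 64
    return qwords * 8


def unused_bytes_on_pages(offset: int, size: int) -> int:
    """Bytes on the page(s) touched by [offset, offset + size) that the
    structure itself does not occupy. A non-zero result means other
    registers could sit on the same page(s)."""
    first_page = offset // PAGE_SIZE
    last_page = (offset + size - 1) // PAGE_SIZE
    span = (last_page - first_page + 1) * PAGE_SIZE
    return span - size


# Hypothetical device: 128 vectors, PBA at offset 0x2800 inside its BAR.
size = pba_size(128)
print(size, unused_bytes_on_pages(0x2800, size))
```

With these made-up numbers the PBA uses 16 bytes and leaves 4080 bytes of its page free for other registers, which is the kind of sharing the post refers to.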

        • olivierlambert Vates 🪐 Co-Founder CEO

          We could probably try on a non-XCP-ng platform with a very recent "vanilla" Xen (+ Marek's patches) and see if it's fixed. If it is, then we could think about a potential backport once 8.3 includes a more recent Xen version 🙂

          • andSmv Vates 🪐 XCP-ng Team Xen Guru

            @logical-systems I will check which Xen version the patches apply to easily, and if you want I can give you a hand (if needed) to build and install your own Xen build, so you can test whether this resolves your issue.

            Unfortunately we don't have the related hardware (Coral TPU) to test it ourselves.

            UPDATE: both patches apply to Xen 4.17 (tag RELEASE-4.17.0)

            • Nornode @andSmv

              @andSmv // @logical-systems

              Hi,

              I'm researching XCP-ng as an alternative to my homelab VMware hypervisor.
              One of my goals is to get proper USB passthrough of the Google Coral TPU.

              Did these patches make it work, and is passthrough to a VM confirmed to be working?

              • redakula @andSmv

                @andSmv said in Coral TPU PCI Passthrough:

                @logical-systems I will check which Xen version the patches apply to easily, and if you want I can give you a hand (if needed) to build and install your own Xen build, so you can test whether this resolves your issue.

                Unfortunately we don't have the related hardware (Coral TPU) to test it ourselves.

                UPDATE: both patches apply to Xen 4.17 (tag RELEASE-4.17.0)

                So are the above-mentioned patches included in the 4.17 that is currently available as a test version?

                Or did you mean the patches worked on that version? 🙂

                • olivierlambert Vates 🪐 Co-Founder CEO

                  Now that we have Xen 4.17 in XCP-ng 8.3, that might work (ping @andSmv)

                  • andSmv Vates 🪐 XCP-ng Team Xen Guru @redakula

                    @redakula Hello, unfortunately these patches are not in Xen 4.17 (and were never integrated into more recent Xen versions). So, to test them, you have to apply the patches manually (they should normally apply as-is to 4.17) and rebuild your Xen.

                    • redakula @andSmv

                      @andSmv
                      Damn - I was quick and have a Coral M.2 A+E coming in a few days 😆

                      It's just for fun/learning, so as long as it doesn't break my homelab too much I'm willing to test so we might get it included 👍
                      I've already been using the 4.17 test version without a hitch since it came out.

                      • redakula @andSmv

                        @andSmv
                        As expected, the VM with the Coral M.2 crashes on boot.

                        Where would I start with building a custom Xen? The Koji docs seem directed at authorized package maintainers, so would I need to build the sources directly from Xen?
                        Feeling old admitting it was in the 2.6 days that I last regularly built custom kernels 😊

                        • andSmv Vates 🪐 XCP-ng Team Xen Guru @redakula

                          @redakula I'm on it. I'll keep you posted.

                          • andSmv Vates 🪐 XCP-ng Team Xen Guru @andSmv

                            @andSmv
                            Hello,
                            I integrated Marek's patch and built an RPM that you can install (you may need to force the RPM install, or extract xen.gz from the RPM and install it manually if you prefer).

                            Obviously there's no guarantee it'll work in your case. Moreover, I didn't test the patch, so please back up all your data. It should be harmless, but....

                            Here's the link where you can download the RPM (it should be available until the end of the month): https://nextcloud.vates.fr/index.php/s/gd7kMwxHtNEP329

                            Don't hesitate to ping me if you experience any issue downloading or installing the patched Xen.

                            Hope it helps!

                            P.S. Be sure you're running XCP-ng 8.3, as I only uploaded the Xen hypervisor RPM (and not the libs/tools that come with it)

                            • redakula @andSmv

                              @andSmv Thanks! 👍 😄

                              I tried to be as non-invasive as possible and changed the symbolic link xen.gz to point to the xen.gz from the RPM you created.
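
For anyone wanting to repeat that symlink swap in a way that is easy to undo, here is a minimal sketch. The function name is hypothetical, and the file names in the trailing comments are placeholders (the real names depend on what the installed and extracted packages contain). It returns the previous target so the link can be restored, and it creates the new link under a temporary name before renaming it over the old one, so the link is never missing.

```python
import os


def repoint_symlink(link_path: str, new_target: str) -> str:
    """Point an existing symlink at new_target and return the old target."""
    if not os.path.islink(link_path):
        raise ValueError(f"{link_path} is not a symbolic link")
    old_target = os.readlink(link_path)

    # Build the new link next to the old one, then rename it into place.
    tmp_path = link_path + ".tmp"
    if os.path.lexists(tmp_path):
        os.remove(tmp_path)
    os.symlink(new_target, tmp_path)
    os.replace(tmp_path, link_path)  # rename acts on the link itself
    return old_target


# Example with placeholder names (not run here):
#   previous = repoint_symlink("/boot/xen.gz", "xen-patched.gz")
#   ... reboot and test ...
#   repoint_symlink("/boot/xen.gz", previous)   # roll back
```

Keeping the returned value around is the whole rollback plan: pointing the link back at it restores the stock hypervisor on the next boot.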

                              Unfortunately it's still the same error (it does seem to boot the Xen from the RPM, as this has version 4.17.3-3 vs. the one currently in the repos, which has version 4.17.3-4).

                              [2024-05-24 17:06:33] (XEN) [  674.051176] Domain 14 (vcpu#2) crashed on cpu#22:
                              [2024-05-24 17:06:33] (XEN) [  674.051178] ----[ Xen-4.17.3-3  x86_64  debug=n  Not tainted ]----
                              [2024-05-24 17:06:33] (XEN) [  674.051179] CPU:    22
                              [2024-05-24 17:06:33] (XEN) [  674.051180] RIP:    0010:[<ffffffffa8581584>]
                              [2024-05-24 17:06:33] (XEN) [  674.051180] RFLAGS: 0000000000000286   CONTEXT: hvm guest (d14v2)
                              [2024-05-24 17:06:33] (XEN) [  674.051182] rax: ffffbd9c00149800   rbx: ffff9e9247cc9000   rcx: 0000000000000000
                              [2024-05-24 17:06:33] (XEN) [  674.051182] rdx: 00000000fee77000   rsi: 0000000000000000   rdi: 0000000000000000
                              [2024-05-24 17:06:33] (XEN) [  674.051183] rbp: ffffbd9c00327690   rsp: ffffbd9c00327658   r8:  0000000000000000
                              [2024-05-24 17:06:33] (XEN) [  674.051183] r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
                              [2024-05-24 17:06:33] (XEN) [  674.051184] r12: ffffbd9c003276ac   r13: 0000000000000011   r14: ffff9e92413390c0
                              [2024-05-24 17:06:33] (XEN) [  674.051185] r15: 0000000000000077   cr0: 0000000080050033   cr4: 0000000000750ef0
                              [2024-05-24 17:06:33] (XEN) [  674.051185] cr3: 0000000103806000   cr2: 0000000000000000
                              [2024-05-24 17:06:33] (XEN) [  674.051186] fsb: 00007b6e7a42a8c0   gsb: ffff9e925b500000   gss: 0000000000000000
                              [2024-05-24 17:06:33] (XEN) [  674.051186] ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0018   cs: 0010
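
When comparing dumps like this one across Xen builds, it can help to pull the key fields out mechanically. The sketch below parses console lines in the format shown above. The function name and the shape of the returned dictionary are illustrative choices, and the regular expressions are derived only from this excerpt, so other Xen versions may print things differently.

```python
import re

CRASH = re.compile(r"Domain (\d+) \(vcpu#(\d+)\) crashed on cpu#(\d+)")
BANNER = re.compile(r"----\[ Xen-(\S+)")
RIP = re.compile(r"RIP:\s+([0-9a-f]{4}):\[<([0-9a-f]+)>\]")
REGISTER = re.compile(r"\b([a-z][a-z0-9]{1,2}):\s+([0-9a-f]+)\b")


def parse_xen_crash(log_text: str) -> dict:
    """Extract domain/vcpu/cpu, Xen version, RIP and registers from a dump."""
    info = {"registers": {}}
    for line in log_text.splitlines():
        _, marker, body = line.partition("(XEN)")
        if not marker:
            continue  # not a hypervisor console line
        if m := CRASH.search(body):
            info["domain"], info["vcpu"], info["cpu"] = map(int, m.groups())
        elif m := BANNER.search(body):
            info["xen_version"] = m.group(1)
        elif m := RIP.search(body):
            info["rip"] = int(m.group(2), 16)
        else:
            # Lower-case "name: hex" pairs such as "rax: ffffbd9c00149800"
            for name, value in REGISTER.findall(body):
                info["registers"][name] = int(value, 16)
    return info
```

Run on the dump above, the `xen_version` field would read `4.17.3-3`, the same check the post uses to confirm that the patched hypervisor was the one that booted.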
                              

                              It does appear that there is some movement upstream on this (if I interpret the Xen mailing list correctly).
                              This patch series has the same title as the 2022 patch in this thread, along with a bunch of other related work:
                              https://lore.kernel.org/xen-devel/cover.33fb4385b7dd6c53bda4acf0a9e91748b3d7b1f7.1715313192.git-series.marmarek@invisiblethingslab.com/

                              • andSmv Vates 🪐 XCP-ng Team Xen Guru @redakula

                                @redakula
                                Well, this was unfortunately one of the potential outcomes. We don't have the hardware to do more in-depth debugging. I will talk to Marek next week (at Xen Summit) about this patch series and whether we can expect it to eventually fix the issue with the Coral TPU.
                                I'll keep you posted.

                                • redakula @andSmv

                                  @andSmv

                                  Thanks 🙂
                                  Let me know and I will be happy to continue testing 👍
