XCP-ng

    Non-server CPU compatibility - Ryzen and Intel

    • olivierlambert Vates 🪐 Co-Founder CEO

      You can probably pass the entire USB controller (but then you won't be able to get any USB port used outside the passed-through VM)
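
To pass through a whole USB controller you first need its PCI address. Below is a minimal sketch, assuming a Linux host that exposes PCI devices under `/sys/bus/pci/devices`. It relies on the standard PCI class code 0x0C03 (serial bus controller, USB). The function names are hypothetical, and the code only lists candidates; it does not perform the passthrough.

```python
import glob
import os

# Programming-interface byte of PCI class 0x0C03 (USB controller).
USB_PROG_IF = {0x00: "UHCI", 0x10: "OHCI", 0x20: "EHCI", 0x30: "xHCI"}


def usb_controller_type(class_code):
    """Return the controller type for a sysfs class string like '0x0c0330', else None."""
    value = int(class_code.strip(), 16)
    if value >> 8 != 0x0C03:
        return None  # not a USB controller
    return USB_PROG_IF.get(value & 0xFF, "USB (other)")


def list_usb_controllers(sysfs_root="/sys/bus/pci/devices"):
    """Map PCI address -> controller type for every USB controller found."""
    found = {}
    for path in sorted(glob.glob(os.path.join(sysfs_root, "*", "class"))):
        with open(path) as f:
            kind = usb_controller_type(f.read())
        if kind:
            # The directory name is the PCI address, e.g. 0000:03:00.0
            found[os.path.basename(os.path.dirname(path))] = kind
    return found
```

Running `list_usb_controllers()` on the host returns a dictionary keyed by PCI address. Every port wired to a listed controller goes to the VM together, which is why nothing else can use those ports afterwards.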

      • mgales @olivierlambert

        @olivierlambert - I've got dmesg info captured that I can pass along. My friend also decided to build a new Zen 4 box for XCP-ng. He went with (what I think is) a similar motherboard to what you have? (PRIME B650M-A AX):
        https://www.asus.com/us/motherboards-components/motherboards/prime/prime-b650m-a-ax/

        The BIOS versions are the same (ver 1222 - 2023/02/24), and we loaded the same XCP-ng version (8.3alpha), the same xo-ce version (5.10.0-21-amd64), and the same Linux Mint version (21.1). The processors are similar: Ryzen 9 7900 and 7900X. Both have Local APIC Mode set to X2APIC in the BIOS.

        His box runs Linux Mint 21.1 without issues (as does yours). We captured dmesg output from all 3 (XCP-ng, xo-ce, Linux Mint 21.1 cinnamon) on both boxes, and I've compared them side by side; all 3 show problems/errors on my machine. Below I summarize the differences/errors occurring on my box for 2 of the 3 dmesg captures (XCP-ng and xo-ce).

        I'm hoping that this might lead to some patches in XCP-ng software that will allow the Asus ROG STRIX B650E-F Gaming WIFI motherboard to fully work with XCP-ng. (XCP-ng works fine running Windows VMs, but not Linux VMs).

        If a patch/fix from XCP-ng doesn't seem likely, then I'll probably replace the current motherboard with the PRIME version.

        Below is a summary of what I've noticed in comparisons of XCP-ng dmesg files and xo-ce dmesg files.

        In order of appearance, the differences I'm seeing in XCP-ng dmesg from my box are:

        1. Hypervisor detected: Xen PV
          tsc: Fast TSC calibration failed

          Instead of:
          tsc: Fast TSC calibration using PIT

        2. no TSC line listed

          Instead of:
          tsc: Detected 3693.204 MHz TSC

        3. ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
          ACPI BIOS Error (bug): Could not resolve [_SB.PCI0.GPP7.UP00.DP40.UP00.DP68], AE_NOT_FOUND (20180810/dswload2-160)
          ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20180810/psobject-221)
          ACPI Error: Ignore error and continue table load (20180810/psobject-604)
          ACPI Error: Skip parsing opcode OpcodeName unavailable (20180810/psloop-543)
          ACPI: 14 ACPI AML tables successfully acquired and loaded

          Instead of:
          ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
          ACPI: 13 ACPI AML tables successfully acquired and loaded

        4. This sequence happened:
          [ 0.412184] usbcore: registered new device driver usb
          [ 0.412184] WARNING: CPU: 0 PID: 1 at drivers/i2c/busses/i2c-designware-common.c:245 i2c_dw_clk_rate+0x16/0x30
          [ 0.412184] Modules linked in:
          [ 0.412184] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0+1 #1
          [ 0.412184] Hardware name: ASUS System Product Name/ROG STRIX B650E-F GAMING WIFI, BIOS 0821 11/15/2022
          [ 0.412184] RIP: e030:i2c_dw_clk_rate+0x16/0x30
          [ 0.412184] Code: 00 48 c7 c6 5e ee e7 81 31 c0 5d e9 d4 60 f6 ff 0f 1f 40 00 0f 1f 44 00 00 48 8b 47 48 48 85 c0 74 08 e8 1d 35 43 00 89 c0 c3 <0f> 0b 0f 1f 84 00 00 00 00 00 c3 0f 1f 44 00 00 66 2e 0f 1f 84 00
          [ 0.412184] RSP: e02b:ffffc9004006fcf8 EFLAGS: 00010246
          [ 0.412184] RAX: 0000000000000000 RBX: ffff8881384d2018 RCX: 00000000aeffff00
          INFO DELETED
          [ 0.412184] ---[ end trace eae5bc73295d4325 ]---
          [ 0.412941] pps_core: LinuxPPS API ver. 1 registered
          [ 0.412941] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti giometti@linux.it
          [ 0.412942] PTP clock support registered

          Instead of:
          [ 0.476402] usbcore: registered new device driver usb
          [ 0.476402] pps_core: LinuxPPS API ver. 1 registered
          [ 0.476402] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti giometti@linux.it
          [ 0.476402] PTP clock support registered

        In xo-ce dmesg on my box, I saw:
        rcu_sched self-detected stall on CPU (2 occurrences)

        Would it be useful for me to upload the 3 dmesg capture files from both boxes (i.e. XCP-ng, xo-ce, Linux Mint 21.1)?
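
For anyone who wants to repeat the side-by-side comparison described above, here is a minimal Python sketch. It is not the method used for this post. It strips the `[ 0.412184]`-style timestamps so that timing differences alone do not register, then reports the lines present in only one capture. The function names and the file names in the usage note are hypothetical.

```python
import difflib
import re

# Matches the leading "[    0.412184] " timestamp that dmesg prints.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def strip_timestamps(lines):
    """Drop dmesg timestamps and trailing newlines from each line."""
    return [TIMESTAMP.sub("", line.rstrip("\n")) for line in lines]


def dmesg_diff(good_lines, bad_lines):
    """Return lines found in only one of the two captures.

    Lines only in the 'good' capture are prefixed with '- ',
    lines only in the 'bad' capture with '+ '.
    """
    good = strip_timestamps(good_lines)
    bad = strip_timestamps(bad_lines)
    matcher = difflib.SequenceMatcher(a=good, b=bad, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        changes.extend("- " + line for line in good[i1:i2])
        changes.extend("+ " + line for line in bad[j1:j2])
    return changes
```

Called as `dmesg_diff(open("good-dmesg.txt").readlines(), open("bad-dmesg.txt").readlines())`, it returns the differing lines in order of appearance. Lines that embed addresses or clock frequencies will still differ between machines, so expect some noise.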

        • olivierlambert Vates 🪐 Co-Founder CEO

          Well, a pretty buggy BIOS on your side doesn't help I suppose 😞

          • BlueBadger

            I built a system with a Ryzen 7950x on an ASRock B650M PG Riptide motherboard and was having issues similar to mgales'. I switched to an ASUS Prime B650M-A II-CSM without any improvement.

            With the ASUS Prime, there were no BIOS errors reported.
            (I was able to get rid of 'ACPI BIOS Error (bug): Could not resolve [_SB.PCI0.GPP7.UP00.DP40.UP00.DP68], AE_NOT_FOUND (20180810/dswload2-160)' by enabling the onboard audio.)

            I am able to run imported Windows VMs (Windows 10 and Server 2022) without any apparent issues.
            I can run an imported AlmaLinux 8 VM with the nopv kernel option.
            I can run the AlmaLinux 8 installer with the nopv option.
            I can run Xen Orchestra with the nopv option.
            I can also run an imported CentOS 6 VM without any additional options.

            The main issue seems to be a stuck CPU on the Linux VMs when using PV drivers.
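
`nopv` is a Linux kernel command-line parameter that makes the guest boot as a generic guest without the hypervisor's PV extensions. Below is a small sketch, assuming a Linux guest where `/proc/cmdline` is readable, for confirming whether a VM actually booted with it. The function names are made up for illustration.

```python
def kernel_params(cmdline):
    """Split a kernel command line into {parameter: value}.

    Bare flags such as 'nopv' or 'quiet' map to None.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def boots_with_nopv(cmdline):
    """True if the 'nopv' flag appears as its own parameter."""
    return "nopv" in kernel_params(cmdline)
```

Inside the guest, `boots_with_nopv(open("/proc/cmdline").read())` should return `True` for the VMs that were started with the option and `False` for the ones that run with PV drivers.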

            Could there be issues specific to Ryzen 7900x and 7950x?

            • olivierlambert Vates 🪐 Co-Founder CEO

              So it seems I need to purchase a 7900. I wonder if a non-X will do it 🤔

              • mgales @olivierlambert

                @olivierlambert - my friend with the Ryzen 7900 isn't experiencing the same issues that @BlueBadger and I (with the 7900X) are having. His system is working fine (both of us are running xcp-ng-8.3.testing-2023.02.15-12.19-install.iso).

                • olivierlambert Vates 🪐 Co-Founder CEO

                  With the same motherboard and the same BIOS settings/version?

                  • mgales @olivierlambert

                    His is closer to your board (I believe), and to one that @BlueBadger tried (ASUS Prime B650M-A II-CSM)

                    My friend's board is: ASUS Prime B650M-A AX:
                    https://www.asus.com/us/motherboards-components/motherboards/prime/prime-b650m-a-ax/

                    He's using the Ryzen 9 7900 and isn't experiencing any problems with Linux VMs in XCP-ng.

                    • olivierlambert Vates 🪐 Co-Founder CEO

                      One thing to investigate would be to swap the X and non-X CPUs and see if there's a difference.

                      I'm under the impression it's more a motherboard issue (BIOS, or version) than anything else, however 🤔

                      • mgales @olivierlambert

                        I talked to my friend about doing a processor swap test, but he's happy with the way his system is running and doesn't want to take a chance of messing something up. Sorry about that 😞

                        • olivierlambert Vates 🪐 Co-Founder CEO

                          Maybe there are other people in the community who could bring that info 🙂

                          • BlueBadger

                            I ordered a Ryzen 7900 last night and got it this morning. (Thanks Amazon).
                            I just replaced the 7950x with the 7900 and things seem to work better.

                            I can now run an AlmaLinux 8 VM without the nopv flag.
                            I can now run Xen Orchestra without the nopv flag.

                            I will do some more testing.

                            • mgales @BlueBadger

                              @BlueBadger - Thank you! Looking forward to seeing what else you learn.

                              • BlueBadger

                                I was having issues (download stalls) with the onboard 2.5Gb NIC (RTL8125) on the ASUS Prime B650M-A II-CSM motherboard even after I switched from the Ryzen 7950x to the 7900.
                                My setup also includes a X540 10Gb NIC which seemed to be working well.

                                I swapped the motherboard back to the ASRock B650M PG Riptide and was still having issues with the onboard 2.5Gb NIC.

                                I disabled the onboard NIC and installed a second X540 and have not had any network issues so far.

                                I'm guessing there might be an issue with the r8125 driver.

                                Excluding the onboard 2.5Gb NIC, XCP-ng seems to run well on both motherboards.

                                The BIOS errors in dmesg don't seem to be causing any issues.
                                (The ASRock B650M PG Riptide seems like a nicer motherboard.)
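
Before blaming a particular driver, it helps to confirm which kernel driver is actually bound to each NIC. Below is a minimal sketch, assuming a Linux dom0 or guest with the usual sysfs layout, where `/sys/class/net/<interface>/device/driver` is a symlink to the driver directory. The function names and the interface name in the usage note are hypothetical.

```python
import os


def driver_from_link(link_target):
    """Extract the driver name from a sysfs driver symlink target."""
    return os.path.basename(link_target.rstrip("/"))


def nic_driver(interface, sysfs_root="/sys/class/net"):
    """Return the kernel driver bound to a network interface, or None.

    None covers both a missing interface and a virtual interface
    that has no backing device.
    """
    link = os.path.join(sysfs_root, interface, "device", "driver")
    try:
        return driver_from_link(os.readlink(link))
    except OSError:
        return None
```

For example, `nic_driver("eth0")` returns the driver name for that interface. Comparing the result for the onboard 2.5Gb port with the result for the X540 ports shows whether the stalls line up with one driver.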

                                • olivierlambert Vates 🪐 Co-Founder CEO

                                  That's… interesting 🤔 So the "X" series seems to have some issues in the end? It's weird since it shouldn't be very different from its non-X counterpart.

                                  • andyhhp Xen Guru

                                    So, we've had reports on xen-devel which look a little like this.

                                    @BlueBadger are you able to switch back to your 7950x and try booting Xen with x2apic_phys=true ? It appears that the -X processors are missing a feature in their IOMMU and Xen was getting confused when setting up interrupt handling.

                                    https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=0d2686f6b66b4b1b3c72c3525083b0ce02830054 is at least part of the fix, but so far feedback on the mailing lists suggests it's not a complete fix.
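
After adding a Xen boot option such as `x2apic_phys=true` and rebooting, it is worth checking that the hypervisor actually saw it. The sketch below is hedged: it assumes the output of `xl info` has been captured as text in dom0 and contains a `xen_commandline` line. The function names, the sample text in the usage note, and the list of accepted truthy spellings are illustrative assumptions.

```python
def xen_boot_options(xl_info_text):
    """Parse the xen_commandline line of `xl info` output into a dict.

    Bare flags map to None; for repeated options the last one wins.
    """
    options = {}
    for line in xl_info_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "xen_commandline":
            for token in value.split():
                name, eq, setting = token.partition("=")
                options[name] = setting if eq else None
    return options


def x2apic_phys_enabled(xl_info_text):
    """True if x2apic_phys is present as a bare flag or with a truthy value."""
    options = xen_boot_options(xl_info_text)
    if "x2apic_phys" not in options:
        return False
    # Assumed set of truthy spellings; adjust if your Xen version differs.
    return options["x2apic_phys"] in (None, "true", "yes", "on", "1", "enable")
```

With a captured line such as `xen_commandline : dom0_mem=4096M,max:4096M watchdog x2apic_phys=true`, `x2apic_phys_enabled(...)` returns `True`. If it returns `False` after a reboot, the option never reached Xen and any test result says nothing about the fix.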

                                    • BlueBadger @andyhhp

                                      @andyhhp Thanks for the info.
                                      I plan to leave my current machine (Ryzen 7900) as is since it seems to be running well.
                                      I plan to build a new machine with the extra 7950x. The motherboard is on back order.
                                      I will try the new setting once it is built.

                                      • BlueBadger @andyhhp

                                        @andyhhp I built a new machine with my Ryzen 7950x.
                                        Booting Xen with x2apic_phys=true did not seem to fix any issues.
                                        😞

                                        • olivierlambert Vates 🪐 Co-Founder CEO

                                          Interesting, thanks for the feedback. @andyhhp should we provide a Xen version with the initial fix and see if it's better? (maybe combined with the x2apic param)

                                          • Sam

                                            I'm testing this combo:

                                            • AMD RYZEN 9 7900X
                                            • ASUS PRIME X670-P WIFI bios 1406
                                            • 2x32GB KINGSTON 5600 CL40 (max QLV)
                                            • Boot drive NVME 250gb (chipset) and SN850X 4.0 1TB on CPU.

                                            With Local APIC Mode set to X2APIC and UEFI set to Other OS, I installed 8.3 alpha and updated it, but got errors. A test install of XOA took too long, and booting was painfully slow using only 1 SSD on the chipset NVMe slot.

                                            Tried disabling IOMMU, but the same issue.
