XCP-ng

    Non-server CPU compatibility - Ryzen and Intel

    • olivierlambert Vates 🪐 Co-Founder CEO

      So I installed Linux Mint 21.1 on my Zen4 setup, using a Debian 11 template, with 4vCPUs and 4GiB RAM and 40GiB virtual disk:

      • the ISO booted into the OS in less than 1 minute (probably about 20 seconds)
      • the installation itself took 5 to 6 minutes at most

      After the install, booting the OS took around 15 seconds (to get to the login)

      Note: I'm using a cheap Kingston 120GiB SSD, not even an NVMe.

      Edit: I had to run bash /mnt/Linux/install.sh -d debian -m 11 to install the tools, but that's it.

      • olivierlambert Vates 🪐 Co-Founder CEO

        So it's likely an issue with your physical machine (buggy BIOS? a feature not enabled?). I would start with the usual dmesg in dom0, and also xl dmesg.
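For anyone who has not used dmesg before, here is a minimal sketch of one way to skim a saved capture for suspicious lines. It assumes the output was already saved to text (for example by redirecting dmesg or xl dmesg to a file); the keyword list and the function name are illustrative assumptions, not an official list, and the sample lines are copied from the excerpts quoted further down this thread.

```python
import re

# Hypothetical keyword list; adjust it to whatever you are hunting for.
# Plain substring matching, so "bug" would also match "debug".
SUSPECT = re.compile(r"error|warning|fail|stall|bug", re.IGNORECASE)


def suspicious_lines(log_text: str) -> list[str]:
    """Return the lines of a saved dmesg capture that match SUSPECT."""
    return [line for line in log_text.splitlines() if SUSPECT.search(line)]


# Sample lines copied from the dmesg excerpts quoted in this thread.
sample = """\
tsc: Fast TSC calibration failed
ACPI Error: Ignore error and continue table load (20180810/psobject-604)
[ 0.412184] usbcore: registered new device driver usb
[ 0.412942] PTP clock support registered
"""

for line in suspicious_lines(sample):
    print(line)
```

A keyword filter like this only narrows down where to look; the lines around each hit usually matter as much as the hit itself.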

        • mgales

          Thanks @olivierlambert for going to those lengths to recreate the same steps that I had done.
          It's good to know that the new generation of Ryzen processors and motherboards should work without issues for most.
          I appreciate the time you spent - it's beyond what I expected.

          I'll try your suggestions with dmesg, experiment with BIOS settings, and do the install to a regular hard drive for starters. I'm not familiar with dmesg, but I'll read up on that - looks like it could reveal boot issues.

          It puzzles me why the only issues I'm having are with Linux VMs on my hardware, but hopefully I can figure out what's causing it. Forgot to mention that I had updated to the latest BIOS update from Asus before doing this install, but it didn't change the behavior from what I was experiencing prior to the update.

          If I'm able to get it working, I'll post back here what helped.

          • Mike ( @mgales)
          • olivierlambert Vates 🪐 Co-Founder CEO

            Okay, keep us posted 🙂 Since it works here, it's not obvious what to do next: a problem we can't reproduce is harder to solve 😞

            • mgales

              BTW, here's the full model number of the motherboard (purchased as part of a combo CPU/RAM/MB package):
              ROG STRIX B650E-F Gaming WIFI.
              Product URL: https://rog.asus.com/motherboards/rog-strix/rog-strix-b650e-f-gaming-wifi-model/

              • scot1297 @olivierlambert

                @olivierlambert I have a question about the AMD platform you listed in this thread. I am looking at upgrading my current setup to a newer AMD Zen 4 CPU and motherboard. Are you able to pass through some of the USB ports from this motherboard, or would I still need a dedicated PCI USB device? I'm asking because I currently have a separate PCI USB device for passthrough, but I am hopeful that my next build won't need a slot taken up by a USB PCI card and can just use the USB ports on the motherboard.

                Thanks for any insight. I do know this is an edge case and most don't need the USB pass through like this, but I have a few USB devices I need to provide to a VM.

                Quick edit: I know you can do plain USB passthrough, but I am looking at the PCI level so that I don't have to redo USB passthrough after each host restart. Right now, with my GPU and USB PCI card, I don't have to re-pass through these devices on host restart.

                Thanks,
                Scot

                • olivierlambert Vates 🪐 Co-Founder CEO

                  You can probably pass through the entire USB controller (but then none of its USB ports can be used outside the VM it is passed through to)

                  • mgales @olivierlambert

                    @olivierlambert - I've got dmesg info captured that I can pass along. My friend also decided to build a new Zen 4 box for XCP-ng. He went with (what I think is) a similar motherboard to what you have? (PRIME B650M-A AX):
                    https://www.asus.com/us/motherboards-components/motherboards/prime/prime-b650m-a-ax/

                    The BIOS versions are the same (ver 1222 - 2023/02/24), and we loaded the same XCP-ng version (8.3alpha), the same xo-ce version (kernel 5.10.0-21-amd64), and the same Linux Mint version (21.1). The processors are similar: Ryzen 9 7900 and 7900X. Both have Local APIC Mode set to X2APIC in the BIOS.

                    His box runs Linux Mint 21.1 without issues (as does yours). We captured dmesg output from all 3 (XCP-ng, xo-ce, Linux Mint 21.1 Cinnamon) on both boxes. I compared them side by side, and each of the 3 has problems/errors on my machine. Below, I summarize the differences/errors occurring on my box for 2 of the 3 dmesg captures (XCP-ng and xo-ce).

                    I'm hoping that this might lead to some patches in XCP-ng software that will allow the Asus ROG STRIX B650E-F Gaming WIFI motherboard to fully work with XCP-ng. (XCP-ng works fine running Windows VMs, but not Linux VMs).

                    If a patch/fix from XCP-ng doesn't seem likely, then I'll probably replace the current motherboard with the PRIME version.

                    Below is a summary of what I've noticed in comparisons of XCP-ng dmesg files and xo-ce dmesg files.

                    In order of appearance, the differences I'm seeing in XCP-ng dmesg from my box are:

                    1. Hypervisor detected: Xen PV
                      tsc: Fast TSC calibration failed

                      Instead of:
                      tsc: Fast TSC calibration using PIT

                    2. no TSC line listed

                      Instead of:
                      tsc: Detected 3693.204 MHz TSC

                    3. ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
                      ACPI BIOS Error (bug): Could not resolve [_SB.PCI0.GPP7.UP00.DP40.UP00.DP68], AE_NOT_FOUND (20180810/dswload2-160)
                      ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20180810/psobject-221)
                      ACPI Error: Ignore error and continue table load (20180810/psobject-604)
                      ACPI Error: Skip parsing opcode OpcodeName unavailable (20180810/psloop-543)
                      ACPI: 14 ACPI AML tables successfully acquired and loaded

                      Instead of:
                      ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
                      ACPI: 13 ACPI AML tables successfully acquired and loaded

                    4. This sequence happened:
                      [ 0.412184] usbcore: registered new device driver usb
                      [ 0.412184] WARNING: CPU: 0 PID: 1 at drivers/i2c/busses/i2c-designware-common.c:245 i2c_dw_clk_rate+0x16/0x30
                      [ 0.412184] Modules linked in:
                      [ 0.412184] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0+1 #1
                      [ 0.412184] Hardware name: ASUS System Product Name/ROG STRIX B650E-F GAMING WIFI, BIOS 0821 11/15/2022
                      [ 0.412184] RIP: e030:i2c_dw_clk_rate+0x16/0x30
                      [ 0.412184] Code: 00 48 c7 c6 5e ee e7 81 31 c0 5d e9 d4 60 f6 ff 0f 1f 40 00 0f 1f 44 00 00 48 8b 47 48 48 85 c0 74 08 e8 1d 35 43 00 89 c0 c3 <0f> 0b 0f 1f 84 00 00 00 00 00 c3 0f 1f 44 00 00 66 2e 0f 1f 84 00
                      [ 0.412184] RSP: e02b:ffffc9004006fcf8 EFLAGS: 00010246
                      [ 0.412184] RAX: 0000000000000000 RBX: ffff8881384d2018 RCX: 00000000aeffff00
                      INFO DELETED
                      [ 0.412184] ---[ end trace eae5bc73295d4325 ]---
                      [ 0.412941] pps_core: LinuxPPS API ver. 1 registered
                      [ 0.412941] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
                      [ 0.412942] PTP clock support registered

                      Instead of:
                      [ 0.476402] usbcore: registered new device driver usb
                      [ 0.476402] pps_core: LinuxPPS API ver. 1 registered
                      [ 0.476402] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
                      [ 0.476402] PTP clock support registered

                    In xo-ce dmesg on my box, I saw:
                    rcu_sched self-detected stall on CPU (2 occurrences)

                    Would it be useful for me to upload the 3 dmesg capture files from both boxes (i.e. XCP-ng, xo-ce, Linux Mint 21.1)?
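The side-by-side comparison described above can be partly automated. The following is a minimal sketch, assuming both captures are available as text: it strips the leading timestamps (which differ between boxes even for identical messages) and prints a unified diff of what remains. The function names and the "working-box"/"problem-box" labels are illustrative assumptions; the sample lines are taken from item 4 above.

```python
import difflib
import re

# Matches a leading dmesg timestamp such as "[ 0.412184] ".
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def strip_timestamps(log_text: str) -> list[str]:
    """Split a capture into lines and drop the leading timestamp of each."""
    return [TIMESTAMP.sub("", line) for line in log_text.splitlines()]


def dmesg_diff(good: str, bad: str) -> list[str]:
    """Unified diff (no context lines) of two captures, ignoring timestamps."""
    return list(difflib.unified_diff(
        strip_timestamps(good),
        strip_timestamps(bad),
        fromfile="working-box",
        tofile="problem-box",
        lineterm="",
        n=0,
    ))


good = """\
[ 0.476402] usbcore: registered new device driver usb
[ 0.476402] pps_core: LinuxPPS API ver. 1 registered
"""
bad = """\
[ 0.412184] usbcore: registered new device driver usb
[ 0.412184] WARNING: CPU: 0 PID: 1 at drivers/i2c/busses/i2c-designware-common.c:245 i2c_dw_clk_rate+0x16/0x30
[ 0.412941] pps_core: LinuxPPS API ver. 1 registered
"""

for line in dmesg_diff(good, bad):
    print(line)
```

Lines prefixed with "+" exist only on the problem box and lines prefixed with "-" only on the working one. Values that legitimately differ between machines (addresses, detected frequencies) will still show up and have to be read past by eye.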

                    • olivierlambert Vates 🪐 Co-Founder CEO

                      Well, a pretty buggy BIOS on your side doesn't help I suppose 😞

                      • BlueBadger

                        I built a system with a Ryzen 7950X on an ASRock B650M PG Riptide motherboard and was having issues similar to those mgales described. I switched to an ASUS Prime B650M-A II-CSM without any improvement.

                        With the ASUS Prime, there were no BIOS errors reported.
                        (I was able to get rid of 'ACPI BIOS Error (bug): Could not resolve [_SB.PCI0.GPP7.UP00.DP40.UP00.DP68], AE_NOT_FOUND (20180810/dswload2-160)' by enabling the onboard audio.)

                        I am able to run imported Windows VMs (Windows 10 and Server 2022) without any apparent issues.
                        I can run an imported AlmaLinux 8 VM with the nopv kernel option.
                        I can run the AlmaLinux 8 installer with the nopv option.
                        I can run Xen Orchestra with the nopv option.
                        I can also run an imported CentOS 6 VM without any additional options.

                        The main issue seems to be a stuck CPU on the Linux VMs when using PV drivers.

                        Could there be issues specific to the Ryzen 7900X and 7950X?

                        • olivierlambert Vates 🪐 Co-Founder CEO

                          🙁 so it seems I need to purchase a 7900. I wonder if a non-X will do it 🤔

                          • mgales @olivierlambert

                            @olivierlambert - my friend with the Ryzen 7900 isn't experiencing the issues that @BlueBadger and I (with the 7900X) are having. His system is working fine (my friend and I are both running xcp-ng-8.3.testing-2023.02.15-12.19-install.iso).

                            • olivierlambert Vates 🪐 Co-Founder CEO

                              With the same motherboard and the same BIOS settings/version?

                              • mgales @olivierlambert

                                His is closer to your board (I believe), and to one that @BlueBadger tried (ASUS Prime B650M-A II-CSM)

                                My friend's board is: ASUS Prime B650M-A AX:
                                https://www.asus.com/us/motherboards-components/motherboards/prime/prime-b650m-a-ax/

                                He's using the Ryzen 9 7900 and isn't experiencing any problems with Linux VMs in XCP-ng.

                                • olivierlambert Vates 🪐 Co-Founder CEO

                                  One idea to investigate would be to swap the X and non-X CPUs and see if there's a difference.

                                  I'm under the impression it's more a motherboard issue (BIOS, or version) than anything else however 🤔

                                  • mgales @olivierlambert

                                    I talked to my friend about doing a processor swap test, but he's happy with the way his system is running and doesn't want to take a chance of messing something up. Sorry about that 😞

                                    • olivierlambert Vates 🪐 Co-Founder CEO

                                      Maybe there are other people in the community who could provide that info 🙂

                                      • BlueBadger

                                        I ordered a Ryzen 7900 last night and got it this morning. (Thanks Amazon).
                                        I just replaced the 7950X with the 7900 and things seem to work better.

                                        I can now run an AlmaLinux 8 VM without the nopv flag.
                                        I can now run Xen Orchestra without the nopv flag.

                                        I will do some more testing.

                                        • mgales @BlueBadger

                                          @BlueBadger - Thank you! Looking forward to seeing what else you learn.

                                          • BlueBadger

                                            I was having issues (download stalls) with the onboard 2.5Gb NIC (RTL8125) on the ASUS Prime B650M-A II-CSM motherboard, even after I switched from the Ryzen 7950X to the 7900.
                                            My setup also includes a X540 10Gb NIC which seemed to be working well.

                                            I swapped the motherboard back to the ASRock B650M PG Riptide and was still having issues with the onboard 2.5Gb NIC.

                                            I disabled the onboard NIC and installed a second X540, and have not had any network issues so far.

                                            I'm guessing there might be an issue with the r8125 driver.

                                            Excluding the onboard 2.5Gb NIC, XCP-ng seems to run well on both motherboards.

                                            The BIOS errors in dmesg don't seem to be causing any issues.
                                            (The ASRock B650M PG Riptide seems like a nicer motherboard.)
