XCP-ng

    mgales

    Best posts made by mgales

    • RE: Non-server CPU compatibility - Ryzen and Intel

      It's looking very good on my 7900X since applying the fixes:

      • Rolled back all the "nopv" boot options on the Linux VMs (Ubuntu, Mint, Rocky), and they boot quickly.
      • Xen Orchestra gets its IP address within a minute instead of after around 10 minutes.
      • Xen tools status on the Linux VMs now shows "installed" (vs. "not installed" before the fixes).
      • No "stuck CPU" messages observed anywhere.

      What a difference! Thanks!!

      posted in Compute
      mgales

    Latest posts made by mgales

    • RE: Non-server CPU compatibility - Ryzen and Intel

      @BlueBadger - Thank you! Looking forward to seeing what else you learn.

      posted in Compute
      mgales
    • RE: Non-server CPU compatibility - Ryzen and Intel

      I talked to my friend about doing a processor swap test, but he's happy with the way his system is running and doesn't want to take a chance of messing something up. Sorry about that 😞

      posted in Compute
      mgales
    • RE: Non-server CPU compatibility - Ryzen and Intel

      His is closer to your board (I believe), and to one that @BlueBadger tried (ASUS Prime B650M-A II-CSM)

      My friend's board is: ASUS Prime B650M-A AX:
      https://www.asus.com/us/motherboards-components/motherboards/prime/prime-b650m-a-ax/

      He's using the Ryzen 9 7900 and isn't experiencing any problems with Linux VMs in XCP-ng.

      posted in Compute
      mgales
    • RE: Non-server CPU compatibility - Ryzen and Intel

      @olivierlambert - my friend with the Ryzen 7900 isn't experiencing the issues that @BlueBadger and I (with the 7900X) are having. His system is working fine, and both of us are running xcp-ng-8.3.testing-2023.02.15-12.19-install.iso.

      posted in Compute
      mgales
    • RE: Non-server CPU compatibility - Ryzen and Intel

      @olivierlambert - I've got dmesg info captured that I can pass along. My friend also decided to build a new Zen 4 box for XCP-ng. He went with what I think is a motherboard similar to yours (PRIME B650M-A AX):
      https://www.asus.com/us/motherboards-components/motherboards/prime/prime-b650m-a-ax/

      The BIOS versions are the same (ver 1222 - 2023/02/24), and we loaded the same XCP-ng version (8.3alpha), the same xo-ce version (5.10.0-21-amd64), and the same Linux Mint version (21.1). The processors are similar: Ryzen 9 7900 and 7900X. Both boxes have Local APIC Mode set to X2APIC in the BIOS.

      His box runs Linux Mint 21.1 without issues (as does yours). We captured dmesg output from all 3 systems (XCP-ng, xo-ce, Linux Mint 21.1 Cinnamon) on both boxes. Comparing them side by side, all 3 show problems/errors on my machine. I summarized the differences/errors occurring on my box for 2 of the 3 dmesg captures below (XCP-ng and xo-ce).

      I'm hoping that this might lead to some patches in XCP-ng software that will allow the Asus ROG STRIX B650E-F Gaming WIFI motherboard to fully work with XCP-ng. (XCP-ng works fine running Windows VMs, but not Linux VMs).

      If a patch/fix from XCP-ng doesn't seem likely, then I'll probably replace the current motherboard with the PRIME version.

      Below is a summary of what I've noticed in comparisons of XCP-ng dmesg files and xo-ce dmesg files.

      In order of appearance, the differences I'm seeing in XCP-ng dmesg from my box are:

      1. Hypervisor detected: Xen PV
        tsc: Fast TSC calibration failed

        Instead of:
        tsc: Fast TSC calibration using PIT

      2. no TSC line listed

        Instead of:
        tsc: Detected 3693.204 MHz TSC

      3. ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
        ACPI BIOS Error (bug): Could not resolve [_SB.PCI0.GPP7.UP00.DP40.UP00.DP68], AE_NOT_FOUND (20180810/dswload2-160)
        ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20180810/psobject-221)
        ACPI Error: Ignore error and continue table load (20180810/psobject-604)
        ACPI Error: Skip parsing opcode OpcodeName unavailable (20180810/psloop-543)
        ACPI: 14 ACPI AML tables successfully acquired and loaded

        Instead of:
        ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
        ACPI: 13 ACPI AML tables successfully acquired and loaded

      4. This sequence happened:
        [ 0.412184] usbcore: registered new device driver usb
        [ 0.412184] WARNING: CPU: 0 PID: 1 at drivers/i2c/busses/i2c-designware-common.c:245 i2c_dw_clk_rate+0x16/0x30
        [ 0.412184] Modules linked in:
        [ 0.412184] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0+1 #1
        [ 0.412184] Hardware name: ASUS System Product Name/ROG STRIX B650E-F GAMING WIFI, BIOS 0821 11/15/2022
        [ 0.412184] RIP: e030:i2c_dw_clk_rate+0x16/0x30
        [ 0.412184] Code: 00 48 c7 c6 5e ee e7 81 31 c0 5d e9 d4 60 f6 ff 0f 1f 40 00 0f 1f 44 00 00 48 8b 47 48 48 85 c0 74 08 e8 1d 35 43 00 89 c0 c3 <0f> 0b 0f 1f 84 00 00 00 00 00 c3 0f 1f 44 00 00 66 2e 0f 1f 84 00
        [ 0.412184] RSP: e02b:ffffc9004006fcf8 EFLAGS: 00010246
        [ 0.412184] RAX: 0000000000000000 RBX: ffff8881384d2018 RCX: 00000000aeffff00
        INFO DELETED
        [ 0.412184] ---[ end trace eae5bc73295d4325 ]---
        [ 0.412941] pps_core: LinuxPPS API ver. 1 registered
        [ 0.412941] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti giometti@linux.it
        [ 0.412942] PTP clock support registered

        Instead of:
        [ 0.476402] usbcore: registered new device driver usb
        [ 0.476402] pps_core: LinuxPPS API ver. 1 registered
        [ 0.476402] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti giometti@linux.it
        [ 0.476402] PTP clock support registered

      In xo-ce dmesg on my box, I saw:
      rcu_sched self-detected stall on CPU (2 occurrences)

      Would it be useful for me to upload the 3 dmesg capture files from both boxes (i.e. XCP-ng, xo-ce, Linux Mint 21.1)?
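
For anyone who wants to repeat the side-by-side comparison described above, here is a minimal sketch in Python. It assumes each dmesg capture has been saved as plain text; the function names and the "working-box"/"problem-box" labels are made up for illustration. It strips the leading timestamps first, since those differ between boxes even when the messages are identical, and then prints only the lines that differ.

```python
import difflib
import re

# Matches the leading "[ 0.412184] " timestamp that dmesg prints on each line.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def strip_timestamps(lines):
    """Remove dmesg timestamps so identical messages compare equal."""
    return [TIMESTAMP.sub("", line.rstrip("\n")) for line in lines]


def diff_dmesg(good_lines, bad_lines):
    """Return unified-diff lines (no context) between two dmesg captures."""
    good = strip_timestamps(good_lines)
    bad = strip_timestamps(bad_lines)
    return list(
        difflib.unified_diff(
            good,
            bad,
            fromfile="working-box",   # hypothetical label
            tofile="problem-box",     # hypothetical label
            n=0,                      # no context lines, only differences
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Short excerpts taken from the captures quoted in this post.
    good = [
        "[ 0.476402] usbcore: registered new device driver usb",
        "[ 0.476402] pps_core: LinuxPPS API ver. 1 registered",
        "[ 0.476402] PTP clock support registered",
    ]
    bad = [
        "[ 0.412184] usbcore: registered new device driver usb",
        "[ 0.412184] Modules linked in:",
        "[ 0.412941] pps_core: LinuxPPS API ver. 1 registered",
        "[ 0.412942] PTP clock support registered",
    ]
    for line in diff_dmesg(good, bad):
        print(line)
```

With real files, the two lists would come from something like `open(path).readlines()` for each capture. Lines prefixed with `+` appear only on the problem box, and lines prefixed with `-` appear only on the working box.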

      posted in Compute
      mgales
    • RE: Non-server CPU compatibility - Ryzen and Intel

      BTW, here's the full model number of the motherboard (purchased as part of a combo CPU/RAM/MB package):
      ROG STRIX B650E-F Gaming WIFI.
      Product URL: https://rog.asus.com/motherboards/rog-strix/rog-strix-b650e-f-gaming-wifi-model/

      posted in Compute
      mgales
    • RE: Non-server CPU compatibility - Ryzen and Intel

      Thanks @olivierlambert for going to those lengths to recreate the same steps that I had done.
      It's good to know that the new generation of Ryzen processors and motherboards should work without issues for most.
      I appreciate the time you spent - it's beyond what I expected.

      I'll try your suggestions with dmesg, experiment with BIOS settings, and do the install to a regular hard drive for starters. I'm not familiar with dmesg, but I'll read up on that - looks like it could reveal boot issues.

      It puzzles me why the only issues I'm having are with Linux VMs on my hardware, but hopefully I can figure out what's causing it. Forgot to mention that I had updated to the latest BIOS update from Asus before doing this install, but it didn't change the behavior from what I was experiencing prior to the update.

      If I'm able to get it working, I'll post back here what helped.

      - Mike (@mgales)
      posted in Compute
      mgales
    • RE: Non-server CPU compatibility - Ryzen and Intel

      With the Ryzen 9 7900X, an Asus B650E series motherboard, and 32 GiB DDR5, the results I've gotten aren't favorable. I was able to install xcp-ng-8.3.testing-2023.02.15-12.19-install.iso without enabling X2APIC in the BIOS, and XCP-ng comes up fine. While xo-ce (running on the same box) comes up right away, it takes about 8-10 minutes before it acquires its IP address (both XCP-ng and xo-ce use DHCP to get their IPs from a DHCP/DNS server).

      After xo-ce is up and accessible, Windows 10 works well (install from ISO, boot, and run performance are all good). The various Linux distros I've tried are horrendously slow. It took about 2 hrs to install Linux Mint 21.1 from an ISO. With any of the Linux ISOs I've tried (CentOS Stream 9.1, Ubuntu 22.04.1, Linux Mint 21.1), it takes several minutes for a 2nd screen to appear after the initial install screen (which comes up fine).

      VM configuration for the Mint install: 8 CPUs, 8 GiB RAM, 22 GiB hard drive.

      The Linux Mint OS comes up, but with a similarly slow boot time: 29 minutes before the LM logo appeared, with CPU usage at 100% on the Stats tab for that whole time and, oddly, everything else at 0. Then I noticed that CPU usage dropped and both Network and Disk throughput showed activity, but still 0 B of 8 GiB RAM used. I switched tabs and these 2 messages appeared on the console screen:

      1. [  1.778229] vbd vbd-5696: 19 xenbus_dev_probe on device/vbd/5696
      2. [165.437239] piix4_smbus 0000:00:01.3: SMBus Host Controller not enabled!
      

      (LM logo appeared on the console window shortly after the 2 messages above)

      At 48 minutes (per the VM_Started time on the Logs tab) the arrowhead cursor first appeared (console window), and the login screen appeared at 52 minutes. At 57 minutes the desktop is displayed (still at 0 GiB RAM usage) and system response is extremely slow.

      Another Note - these messages appeared on the xo-ce console window:

      [ 3684.425914] rcu: INFO: rcu_sched self-detected stall on CPU
      [ 3684.425914] rcu: #1-...!: (1 ticks this GP) idle=d22/0/0x1 softirq=3206/3206 fqs=1
      [ 3684.425914] rcu: rcu_sched thread starved for 4463 jiffies! g321 f0x0 RCU_GP_WAIT_FQS(5) -> state=0X402 ->cpu=1
      [ 3684.425914] rcu: #unless rcu_sched thread gets sufficient CPU time, OOM is now expected behavior.
      [ 3684.425914] rcu: RCU grace-period kthread stack dump:
      {Nothing after that on the screen}
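
In case it helps others check their own captures for the same symptoms, here is a rough Python sketch that counts a few of the message types mentioned in this thread. The pattern list is my own selection based on the messages quoted in these posts, not an official list of problem indicators, and the names are made up for illustration.

```python
import re
from collections import Counter

# Hypothetical symptom names mapped to patterns seen in this thread's logs.
PATTERNS = {
    "rcu_stall": re.compile(r"rcu_sched self-detected stall on CPU"),
    "tsc_calibration_failed": re.compile(r"Fast TSC calibration failed"),
    "acpi_error": re.compile(r"ACPI (BIOS )?Error"),
    "kernel_warning": re.compile(r"WARNING: CPU: \d+ PID: \d+"),
}


def count_symptoms(lines):
    """Count how many lines match each known symptom pattern."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "[ 3684.425914] rcu: INFO: rcu_sched self-detected stall on CPU",
        "tsc: Fast TSC calibration failed",
        "ACPI Error: Ignore error and continue table load (20180810/psobject-604)",
        "[ 0.412941] pps_core: LinuxPPS API ver. 1 registered",
    ]
    for name, count in sorted(count_symptoms(sample).items()):
        print(f"{name}: {count}")
```

A count of zero for every pattern doesn't prove a box is healthy; it only means none of these particular messages showed up in the capture.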
      

      Side Note: my original Intel XCP-ng box shows all 4 graphs (CPU, Memory, Network, and Disk) peaking at roughly the same time while the Linux Mint OS boots, but on this new Ryzen box the Stats screen shows 0 B Memory at all times for the Linux Mint VM. The xo-ce VM does show about 712 MiB of memory usage.

      I did try a separate boot with X2APIC enabled in the motherboard BIOS before booting XCP-ng, but it didn't make any noticeable difference compared to having it set to Default.

      Repeating: I didn't experience sluggishness, problems installing, booting or running Windows 10 from an ISO under XCP-ng on this Ryzen box, only Linux distros.

      Far Side Notes: I used the same Linux Mint 21.1 ISO from above and installed it directly on a separate hard drive on the same Ryzen box, and it works well. I also run VMware under Linux Mint 21.1 and Rocky Linux 9, each installed directly on this Ryzen box on separate hard drives, and VMware works well when I boot from those drives (XCP-ng/xo-ce is running on a Samsung 970 plus M.2).

      I hope some of the information will be useful to the XCP-ng development team.
      I'd be glad to test a fix on my Ryzen box if that would help.
      It's a home lab box, so starting over isn't an issue.

      I'm fairly new to XCP-ng. I'm impressed by the support of the XCP-ng team for home users too.

      Addendum - inside the Linux Mint VM, I entered the free command in a terminal window.
      Results:

                    total        used        free      shared  buff/cache   available
      Mem:        8119596      727396     6087584       20004     1304616     7103828
      Swap:       1043340           0     1043340
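
To put those numbers next to the 0 B shown on the Stats tab, here is a small Python sketch that parses the output above and converts it to GiB. It assumes `free` was run without options, so the values are in KiB (the procps default); the function names are made up for illustration.

```python
# Output of `free` inside the Linux Mint VM, as quoted in this post.
SAMPLE = """
        total   used    free    shared  buff/cache  available
Mem:    8119596 727396  6087584 20004   1304616     7103828
Swap:   1043340 0       1043340
"""


def parse_free(text):
    """Parse `free` output into {row label: {column name: int}}."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    columns = lines[0].split()
    table = {}
    for ln in lines[1:]:
        label, *values = ln.split()
        # The Swap row has fewer columns; zip stops at the shorter sequence.
        table[label.rstrip(":")] = dict(zip(columns, map(int, values)))
    return table


def kib_to_gib(kib):
    """Convert KiB to GiB (assumes free's default KiB units)."""
    return kib / (1024 ** 2)


if __name__ == "__main__":
    mem = parse_free(SAMPLE)["Mem"]
    for column in ("total", "used", "free", "buff/cache"):
        print(f"{column:>10}: {kib_to_gib(mem[column]):.2f} GiB")
```

On these figures the guest sees about 7.74 GiB total with about 0.69 GiB used, so memory is in use inside the VM even though the Stats tab reported 0 B.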
      
      posted in Compute
      mgales