Best posts made by R2rho
-
RE: Server Locks Up Periodically with ASRock X570D4I-2T AMD Ryzen 9 3900X and Intel X550-AT2
Latest posts made by R2rho
-
RE: Server Locks Up Periodically with ASRock X570D4I-2T AMD Ryzen 9 3900X and Intel X550-AT2
@dave That's pretty brutal honestly. I'm thinking about just calling it a day and moving away from ASRock servers entirely. I'm looking to set XCP-NG up on some IoT/Edge servers in short-depth racks in a factory environment, so I really liked the form factor of these from OnLogic, but I've had the worst experience, and seeing your feedback definitely makes me want to go in a different direction. I'm looking at some short-depth servers from SuperMicro geared specifically for IoT/Edge that I think will work out much better.
-
RE: Server Locks Up Periodically with ASRock X570D4I-2T AMD Ryzen 9 3900X and Intel X550-AT2
@planedrop @olivierlambert @probain I installed Ubuntu 22.04 on these last night and came back to the same frozen lockup I was having with XCP-NG, so it looks like I somehow received two identical servers from OnLogic that were both faulty to some degree. So this is definitely not an issue with XCP-NG. Thank you for your help. I will be processing a return on these servers and going with a different product altogether.
-
RE: Server Locks Up Periodically with ASRock X570D4I-2T AMD Ryzen 9 3900X and Intel X550-AT2
Thank you guys for the feedback. Strangely enough, I have two of these exact same servers, as I was attempting to configure them as a pool. I installed XCP-NG on them separately and am having the exact same issue on both servers. They just lock up and stop responding. It could be a hardware issue, especially since I did see the memtest failures, but it seems weird if it's happening on both. I initially thought it was a RAM incompatibility issue because I added RAM to these after they arrived and then saw all of these issues. But I've since removed the additional RAM and gone back to what they had originally, and I'm still having the issues.
I'm probably not going to remove the CPU because I will most likely return these, but I am going to install Ubuntu and see if they continue to be problematic. If Ubuntu doesn't have any issues, then I think there's some underlying incompatibility with this ASRock Rack board that probably needs further diagnosing and evaluation. Either way I'll probably go with something else.
-
RE: Server Locks Up Periodically with ASRock X570D4I-2T AMD Ryzen 9 3900X and Intel X550-AT2
I restarted the server and watched the log files up until the crash; they are attached here. This time there definitely seems to be something up: there were a bunch of null entries in the log file right when the crash happened:
Dec 9 12:45:16 xcp-ng-host xapi: [debug||3483 /var/lib/xcp/xapi|post_root|dummytaskhelper] task dispatch:event.from D:66f38c9020de created by task D:9e902ea2f4f9
Dec 9 12:45:24 xcp-ng-host xapi: [debug||3484 /var/lib/xcp/xapi|post_root|dummytaskhelper] task dispatch:session.logout D:89b6b89b97b4 created by task D:b2576741520e
Dec 9 12:45:24 xcp-ng-host xapi: [ info||3484 /var/lib/xcp/xapi|session.logout D:31f3c633c030|xapi_session] Session.destroy trackid=40fcb26a14999de91feb67ecb9771bc4
Dec 9 12:45:24 xcp-ng-host xapi: [debug||3485 /var/lib/xcp/xapi|post_root|dummytaskhelper] task dispatch:session.slave_login D:5d434bb6da87 created by task D:b2576741520e
Dec 9 12:45:24 xcp-ng-host xapi: [ info||3485 /var/lib/xcp/xapi|session.slave_login D:91377f94f6db|xapi_session] Session.create trackid=9c3c9fb8e8cd899990ec90cc939c4a0c pool=true uname= originator=xapi is_local_superuser=true auth_user_sid= parent=trackid=9834f5af41c964e225f24279aefe4e49
Dec 9 12:45:24 xcp-ng-host xapi: [debug||3486 /var/lib/xcp/xapi|post_root|dummytaskhelper] task dispatch:pool.get_all D:d89558a6c493 created by task D:91377f94f6db
Dec 9 12:45:24 xcp-ng-host xapi: [debug||3487 /var/lib/xcp/xapi|post_root|dummytaskhelper] task dispatch:event.from D:9018b4d47aa2 created by task D:b2576741520e
Dec 9 12:45:42 xcp-ng-host xapi: [debug||3490 /var/lib/xcp/xapi|post_root|dummytaskhelper] task dispatch:session.logout D:b3c50aed0bdd created by task D:001a2b86b7e7
Dec 9 12:45:42 xcp-ng-host xapi: [ info||3490 /var/lib/xcp/xapi|session.logout D:182495298773|xapi_session] Session.destroy trackid=f7523433dad5baa1f212e9bf56450726
Dec 9 12:45:42 xcp-ng-host xapi: [debug||356 |watching networks for NBD-related changes D:001a2b86b7e7|network_event_loop] Not updating the firewall, because the set of interfaces to use for NBD did not change: []
Dec 9 12:45:47 xcp-ng-host xapi: [debug||3491 /var/lib/xcp/xapi|post_root|dummytaskhelper] task dispatch:session.slave_login D:654bf5b32b3b created by task D:001a2b86b7e7
Dec 9 12:45:47 xcp-ng-host xapi: [ info||3491 /var/lib/xcp/xapi|session.slave_login D:966d08cb98ae|xapi_session] Session.create trackid=860c6ab7ca617a23222174cf41168464 pool=true uname= originator=xapi is_local_superuser=true auth_user_sid= parent=trackid=9834f5af41c964e225f24279aefe4e49
Dec 9 12:45:47 xcp-ng-host xapi: [debug||3492 /var/lib/xcp/xapi|post_root|dummytaskhelper] task dispatch:pool.get_all D:8dc884754841 created by task D:966d08cb98ae
Dec 9 12:45:47 xcp-ng-host xapi: [debug||3493 /var/lib/xcp/xapi|post_root|dummytaskhelper] task dispatch:event.from D:a92ebd9d4e50 created by task D:001a2b86b7e7
<null><null><null><null><null><null><null><null><null><null><null><null><null><null><null><null>
The line of NULLs doesn't seem to want to show up here, so here's a screenshot of how the log file looks in VS Code. I've also attached the log file here again.
Here is the log file trimmed to the relevant sections; you can see the lines of NULLs on line 9135.
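For anyone who wants to locate these NULLs without scrolling through an editor, here is a minimal Python sketch that scans a log file for runs of NUL bytes and reports the line each run sits on. The function name `find_nul_runs` and the `min_run` threshold are my own choices for illustration, not part of any XCP-NG tooling. As far as I understand, a run of NUL bytes at the point of a crash is often just what is left when a machine is reset before buffered writes reach the disk, so it may mark the time of the freeze rather than its cause.

```python
import re


def find_nul_runs(path, min_run=8):
    """Return a list of (line_number, byte_offset, run_length) tuples,
    one for every run of at least `min_run` consecutive NUL bytes."""
    with open(path, "rb") as fh:
        data = fh.read()

    # Match min_run or more consecutive 0x00 bytes.
    pattern = re.compile(rb"\x00{%d,}" % min_run)

    runs = []
    for match in pattern.finditer(data):
        # 1-based line number: count the newlines before the run starts.
        line_number = data.count(b"\n", 0, match.start()) + 1
        runs.append((line_number, match.start(), match.end() - match.start()))
    return runs


# Example call (the path is the log mentioned in this thread):
# for line_number, offset, length in find_nul_runs("/var/log/xensource.log"):
#     print(f"line {line_number}: {length} NUL bytes at offset {offset}")
```

The whole file is read into memory, which should be fine for a trimmed log but is worth keeping in mind for very large ones.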
xensource_12_09.txt
-
RE: Server Locks Up Periodically with ASRock X570D4I-2T AMD Ryzen 9 3900X and Intel X550-AT2
@planedrop @probain I tried checking for power management and/or C-states in BIOS and I didn't see any settings related to those. I looked in CPU Configuration, Chipset Configuration, and didn't see anything there.
The only settings available under BIOS > Advanced > ACPI Configuration are:
PCIE Devices Power On [Disabled]
RTC Alarm Power On [By OS]
I don't see Active-State Power Management or C-States.
-
RE: Server Locks Up Periodically with ASRock X570D4I-2T AMD Ryzen 9 3900X and Intel X550-AT2
@probain @planedrop Here is the log file, on December 6th it looks like it just froze around line 12190. This morning on Dec 9th I hard rebooted it and copied these log files. Before I hard rebooted it, even though the server was still on and had the XCP-NG console up on the display, it was not responding to any keyboard input and I couldn't find it on the network. I pulled the log file from /var/log/xensource.log and uploaded here. It was a bit long so I trimmed out the end of the Dec 9 portion of the logs to fit the file size upload limit. I'll see if this server has some power management settings that can be disabled. Appreciate your help, I have no idea what to look for in these logs.
Dec 6 10:50:31 xcp-ng-host xapi: [ info||3551 /var/lib/xcp/xapi|session.logout D:633480054271|xapi_session] Session.destroy trackid=11fe74c1adf8a4ab19fff8c5956e6896
Dec 6 10:50:31 xcp-ng-host xapi: [debug||3552 /var/lib/xcp/xapi|post_root|dummytaskhelper] task dispatch:session.slave_login D:1607dbc2d558 created by task D:38c5fbdd7624
Dec 6 10:50:31 xcp-ng-host xapi: [ info||3552 /var/lib/xcp/xapi|session.slave_login D:fc3d92b09f17|xapi_session] Session.create trackid=f33f6493a5cd650abf4a35e26ffc323d pool=true uname= originator=xapi is_local_superuser=true auth_user_sid= parent=trackid=9834f5af41c964e225f24279aefe4e49
Dec 6 10:50:31 xcp-ng-host xapi: [debug||3553 /var/lib/xcp/xapi|post_root|dummytaskhelper] task dispatch:pool.get_all D:f13a7669aea0 created by task D:fc3d92b09f17
Dec 6 10:50:31 xcp-ng-host xapi: [debug||3554 /var/lib/xcp/xapi|post_root|dummytaskhelper] task dispatch:event.from D:253d3c855775 created by task D:38c5fbdd7624
Dec 6 10:50:54 xcp-ng-host xapi: [debug||355 |xapi events D:9f1127ca7f99|dummytaskhelper] task timeboxed_rpc D:4e49110cce3d created by task D:9f1127ca7f99
Dec 6 10:50:54 xcp-ng-host xapi: [debug||3555 /var/lib/xcp/xapi|post_root|dummytaskhelper] task dispatch:event.from D:2e3c7d2e3b04 created by task D:9f1127ca7f99
Dec 9 07:53:26 xcp-ng-host squeezed: [debug||0 ||memory] squeezed version 24.19.2 starting
Dec 9 07:53:26 xcp-ng-host squeezed: [debug||0 ||squeezed] Parsing [http]
Dec 9 07:53:26 xcp-ng-host squeezed: [debug||0 ||squeezed] use-switch = true (true if the message switch is to be enabled)
Dec 9 07:53:26 xcp-ng-host squeezed: [debug||0 ||squeezed] switch-path = /var/run/message-switch/sock (Unix domain socket path on localhost where the message switch is listening)
Dec 9 07:53:26 xcp-ng-host squeezed: [debug||0 ||squeezed] search-path = (Search path for resources)
Dec 9 07:53:26 xcp-ng-host squeezed: [debug||0 ||squeezed] pidfile = /var/run/squeezed.pid (Filename to write process PID)
Dec 9 07:53:26 xcp-ng-host squeezed: [debug||0 ||squeezed] log = syslog:squeezed (Where to write log messages)
Dec 9 07:53:26 xcp-ng-host squeezed: [debug||0 ||squeezed] daemon = false (True if we are to daemonise)
Dec 9 07:53:26 xcp-ng-host squeezed: [debug||0 ||squeezed] disable-logging-for = http (A space-separated list of debug modules to suppress logging from)
Dec 9 07:53:26 xcp-ng-host squeezed: [debug||0 ||squeezed] loglevel = debug (Log level)
[xensource.txt](/forum/assets/uploads/files/1733758359408-xensource.txt)
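In the excerpt above, the last entry before the freeze is at Dec 6 10:50:54 and the next one is the reboot at Dec 9 07:53:26. A rough way to find that kind of silence in a long log is to look for the largest gap between consecutive timestamps. The sketch below is only an illustration: `largest_gap` is a made-up helper name, and because these syslog-style timestamps carry no year, the `year` parameter is an assumption the caller has to supply. It also relies on English month abbreviations being recognised by the current locale.

```python
from datetime import datetime


def largest_gap(lines, year):
    """Return (gap, before, after) for the longest pause between
    consecutive timestamped lines, or None if fewer than two parse.

    Lines are expected to start with 'Mon D HH:MM:SS', as in
    /var/log/xensource.log. `year` is supplied by the caller because
    the log itself does not record it.
    """
    previous = None
    best = None
    for line in lines:
        parts = line.split(None, 3)
        if len(parts) < 3:
            continue
        stamp = f"{year} {parts[0]} {parts[1]} {parts[2]}"
        try:
            current = datetime.strptime(stamp, "%Y %b %d %H:%M:%S")
        except ValueError:
            # Not a timestamped line (for example a run of NULs); skip it.
            continue
        if previous is not None:
            gap = current - previous
            if best is None or gap > best[0]:
                best = (gap, previous, current)
        previous = current
    return best


# Example call (the year is an assumption, adjust as needed):
# with open("/var/log/xensource.log", errors="replace") as fh:
#     print(largest_gap(fh, year=2024))
```

The timestamp just before the largest gap is a reasonable place to start reading backwards for anything unusual, although a quiet log does not by itself show what caused the hang.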
-
RE: Server Locks Up Periodically with ASRock X570D4I-2T AMD Ryzen 9 3900X and Intel X550-AT2
@planedrop No I don't have the ability to see the console output, the server itself becomes completely unresponsive even to keyboard input via USB, so I can't see anything or access any logs immediately after the crash. I do have a display attached, but I didn't have the shell open, just the main console. Once I reboot it then I can navigate back to it and see the logs. I couldn't see anything particularly interesting in the logs the last time I looked, but I also don't know what to look for, so I'll go back and get the log output tomorrow and provide them here. I can also leave the shell open on the display with tail -f /var/log/xensource.log and see if I can capture what happens right at the freeze.
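As a variation on leaving tail -f running on the display, here is a minimal Python sketch that follows the log and copies each new line to a second file, calling fsync after every write so the last lines before a freeze have a better chance of actually reaching the disk. This is my own illustration, not an XCP-NG feature: `follow_and_sync` and all of its parameters are hypothetical names, the destination path is whatever you choose, and it does not handle log rotation. If the freeze also stalls the storage path, it may not capture anything more than the original log does.

```python
import os
import time


def follow_and_sync(src_path, dst_path, poll_seconds=0.5,
                    max_idle_polls=None, from_start=False):
    """Copy lines appended to src_path into dst_path, forcing each
    write to disk with fsync.

    max_idle_polls=None follows forever (like tail -f); a number makes
    the loop stop after that many consecutive polls with no new data.
    """
    idle_polls = 0
    with open(src_path, "r", errors="replace") as src, \
            open(dst_path, "a") as dst:
        if not from_start:
            # Start at the current end of the file, as tail -f does.
            src.seek(0, os.SEEK_END)
        while max_idle_polls is None or idle_polls < max_idle_polls:
            line = src.readline()
            if not line:
                idle_polls += 1
                time.sleep(poll_seconds)
                continue
            idle_polls = 0
            dst.write(line)
            dst.flush()               # push Python's buffer to the OS
            os.fsync(dst.fileno())    # ask the OS to write it to disk


# Example call (the destination path is a placeholder):
# follow_and_sync("/var/log/xensource.log", "/root/xensource-follow.log")
```

Writing the copy to a different disk or a USB stick, if one is available, would separate it from whatever is happening to the main log.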
-
Server Locks Up Periodically with ASRock X570D4I-2T AMD Ryzen 9 3900X and Intel X550-AT2
I’ve been experiencing periodic server lockups with my setup, and I’m at a loss for what’s causing the issue. Here are the details of my configuration:
Hardware:
Server: OnLogic MK150B-40
CPU: AMD Ryzen 9 3900X
Motherboard: ASRockRack X570D4I-2T
Ethernet: 2 x RJ45 10GLAN (Intel X550-AT2)
Software:
XCP-NG version: 8.3 (latest)
Symptoms:
The server becomes completely unresponsive after running for a few hours to a few days.
The console is entirely frozen (no keyboard input works).
The server cannot be accessed via its assigned IP address.
Fans keep running, and the power LED stays on.
I ran memtest on the RAM several times, and also replaced the RAM with new modules. I did get several memtest failures initially, which seemed strange as the RAM modules were all brand new.
I didn't see the motherboard or NIC flagged as problematic on the XCP-NG hardware compatibility list, other than the Intel X550 series not advertising 2.5Gb or 5Gb, but I'm using 1G and 10G anyway.
I checked the logs (/var/log/messages, /var/log/xensource.log) for anything obvious before the crashes but couldn't identify any clear issue.
Are additional or alternate drivers recommended?
Should I try using an alternate kernel?
Are there additional debugging steps or logs I should check to diagnose what's causing the periodic lockups?
Thanks in advance for your help!