Minisforum MS-01 unstable and hangs running Xcp-ng 8.3
-
@manilx Thanks for the info. I thought I had heard that there is a fault with 13th gen Intel, but I was unsure whether it also applied to the mobile version.
I actually feared as much after I contacted Minisforum, explained the problem, and was asked to send it back to them. So I'll probably send it back. I wouldn't have liked sending it back if it had been a software error, but now that I have confirmed it has happened to several other people too, it should be safe to send it back.
-
@steff22 In the last few days I contacted Minisforum for an updated BIOS. They sent me the latest one, but it has no fix for the Intel issue. When I pressed the issue, this was their answer:
"There is no mention in the release notes of how to fix the Intel CPU bug that broke the 13th and 14th generation Raptor Lake CPUs."
No explanation
"Intel has therefore extended its warranty by two years. What is your policy on this?"
We don't have a policy on that
I still have an NPB7 with the 13th gen chip. So if this one burns I'm on my own.
I have moved all XCP-ng VMs off it and removed it from the main pool; it is now in its own pool. It will have 2 unimportant VMs running on it, all else on Protectli.
-
@imaginapix True, BUT this doesn't help if you pick the higher-end one, which you would for running a hypervisor.
AND their stance on this is, well: you're on your own. See my other post.
-
@steff22 You should. It's been very popular on YouTube (most getting their units for free for promotion) BUT nobody speaks of this (to my knowledge).
I had nothing but trouble with it.
-
Speaking of Protectli, good news is coming in the next months. Stay tuned.
-
@olivierlambert NICE!
Can only speak VERY positively of them!!!!
-
E+P cores are not supported in the kernel. Have you tried running with only E or only P cores enabled?
My 13th gen MS-01s with Intel microcode installed work great with Proxmox. Still running XCP-ng on D-1541s, so all good there.
-
@iLix Good for you. Others' results may vary, as we have seen.....
-
Well, to my surprise, XCP-ng works with P/E cores (even if it's more by luck than by design). I'll run extensive tests with a Protectli unit I just received.
-
@iLix No, I have not tried with only E or only P cores.
But I thought I would trigger an error when I tested the CPU by assigning all 20 cores to one Windows VM and running a CPU test program for 15 minutes at 100% CPU usage. This did not cause any problems, and the temperature did not get too high either.
So it is a bit strange that this error occurs when the MS-01 is almost idle. The VMs that are running use very little CPU; the XCP-ng host sits at 0% to 4% CPU usage for most of the day.
I have installed the latest BIOS but have not done anything with the Intel microcode.
How do I see which version of the Intel microcode I have, and how do I update it?
-
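On the microcode question above, a minimal sketch for reading the currently loaded revision, assuming a Linux-based host (dom0 included) that exposes a `microcode` field per CPU in `/proc/cpuinfo`, as x86 Linux kernels normally do. The helper name `microcode_revisions` is made up for illustration, and whether dom0 shows the same revision the hypervisor loaded is an assumption. It only reads the value; updating microcode normally comes through the distribution's packages or a BIOS update.

```python
import re
from pathlib import Path


def microcode_revisions(cpuinfo_text):
    """Return the sorted, distinct 'microcode' values in /proc/cpuinfo-style text."""
    # Each logical CPU block carries a line such as "microcode\t: 0x411c".
    found = re.findall(r"^microcode\s*:\s*(\S+)", cpuinfo_text, flags=re.MULTILINE)
    return sorted(set(found))


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")  # assumption: Linux x86 host
    if path.exists():
        revisions = microcode_revisions(path.read_text())
        print("microcode revision(s):", ", ".join(revisions) or "not reported")
    else:
        print("no /proc/cpuinfo on this system")
```

If more than one revision is listed, the cores are not all running the same microcode, which would be worth mentioning in a support request.
-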
@olivierlambert But am I right that I can't trust the backups?
I must have held the power button in to force a reboot about 5 times now, I think. But I notice no errors on XCP-ng or on the VMs that are running; everything just starts working again after a reboot. How much of this extreme treatment can XCP-ng 8.3 withstand before I should fear corrupt system files on XCP-ng itself?
-
Just received it for testing purposes.
-
@olivierlambert Also running on a VP6670 with 64GB RAM (which seems to be what you got). So anything you want me to test, fire away.
-
I wonder about the power draw difference with and without the P cores (leaving only the E cores).
-
It is great to hear that E/P cores are working. I now have a migration project on my hands.
-
@steff22 You should not need the microcode if XCP-ng works out of the box with hybrid CPUs.
When you force the power off, is the VMs' reset time the same as the host's? For example: does the Kernel-Power event correspond with your host's hard reboot, if you have Windows VMs running?
-
@iLix I don't know how to check the last activity in the XCP-ng host's log before the MS-01 crashes.
But according to the smart home log on the Windows VM, the crash happened 2 hours before I forced the power off. So I expect this time to be the same on all running VMs and on the XCP-ng host itself.
The Windows VM has lost direct contact with the M.2 storage at this point and cannot write anything more to the log. So to me it seems like the whole MS-01 freezes and crashes.
-
@steff22 Mine "crashed" before I added this to the grub.cfg: "pcie_port_pm=off pcie_aspm.policy=performance" (Proxmox)
The Windows event log would show a Kernel-Power error matching the time of the forced reset, so the VMs are still running while the host is in this state.
This happened while idle. I would see a lot of cluster sync errors, as if the nodes had lost connection to the network (using the 2.5GbE NICs).
Maybe in the same realm as this issue:
https://xcp-ng.org/forum/topic/8092/add-kernel-boot-params-for-dom0
-
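For anyone trying the boot parameters quoted above, a minimal sketch for confirming after a reboot that they actually reached the running kernel, assuming a Linux host where `/proc/cmdline` holds the kernel command line. The helper name `missing_params` is made up for illustration; how the parameters get added to the boot configuration differs between Proxmox and XCP-ng and is not covered here.

```python
from pathlib import Path

# The two parameters quoted in the post above.
WANTED = ["pcie_port_pm=off", "pcie_aspm.policy=performance"]


def missing_params(cmdline, wanted=WANTED):
    """Return the wanted parameters that are absent from the kernel command line."""
    active = set(cmdline.split())
    return [p for p in wanted if p not in active]


if __name__ == "__main__":
    path = Path("/proc/cmdline")  # assumption: Linux host
    if path.exists():
        missing = missing_params(path.read_text())
        print("all parameters active" if not missing else f"not active: {missing}")
    else:
        print("no /proc/cmdline on this system")
```

An empty result only shows the kernel was booted with the parameters; it does not show they are what cured the hangs.
-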
@iLix Okay, it is different with me. I have nothing in the Windows event log since the whole MS-01 crashes, and there is no writing to the M.2 storage from the VMs either. So it's the same as just pulling out the power cable when it crashes. This indicates a CPU or motherboard error, if I'm not mistaken.
-
Yes, that sounds like a generally faulty unit, not only the CPU-related stuff.