Minisforum MS-01 unstable and hangs running XCP-ng 8.3
-
I run XCP-ng on a Minisforum MS-01, and it is not quite stable. Roughly once every 14 days to once a month the whole MS-01 hangs completely: it loses network access, the screen goes completely black, and SSH to XCP-ng doesn't work either.
I have to force a reboot by holding the power button to get it working again.
I don't think this error is directly related to XCP-ng 8.3, but rather hardware related. But to the question: how much of this extreme treatment can XCP-ng 8.3 withstand before I should fear corrupt system files on XCP-ng itself and on the VMs running on the same disk? Everything starts running fine again after the reset.
Also, can I trust the new backups of the VMs? Suppose I start getting corrupted system files on, for example, the Windows VMs, but don't notice it for a long time, say 3 months. Won't the problem then have carried over into all my backups, if my backups only go back 3 months? I'm running delta backups.
-
@steff22 I actually had the same issue when I got my MS-01, not only on XCP-ng but also on Proxmox, which I tried before.
I sent the unit back for a refund and got a Protectli VP6670, which is rock solid.
It also has a 12th gen Intel CPU rather than one of the 13th/14th gen ones like the MS-01 (https://www.theverge.com/24216305/intel-13th-14th-gen-raptor-lake-cpu-crash-news-updates-patches-fixes-motherboards)
-
To be fair, the MS-01 is also available with a 12th gen Intel CPU.
-
@manilx Thanks for the info. I thought I had heard that there is a fault with 13th gen Intel, but I was unsure whether it also applied to the mobile version.
But I actually feared it after I contacted Minisforum, explained the problem, and was asked to send the unit back to them. So I'll probably send it back. I wouldn't have liked sending it back if it had been a software error, but now I have confirmation that it has happened to several other people too, so it should be safe to send it back.
-
@steff22 In the last few days I contacted Minisforum for an updated BIOS. They sent me the latest one, but it has no fix for the Intel issue. When I pressed the issue, this was the exchange:
Me: There is no mention in the release notes of how to fix the Intel CPU bug that broke the 13th and 14th generation Raptor Lake CPUs.
Minisforum: No explanation
Me: Intel has therefore extended its warranty by two years. What is your policy on this?
Minisforum: We don't have a policy on that
I still have an NPB7 with the 13th gen chip. So if this one burns, I'm on my own.
I have moved all XCP-ng VMs off it and removed it from the main pool; it is now in its own pool. It will have 2 unimportant VMs running on it, everything else runs on the Protectli.
-
@imaginapix True, BUT this doesn't help if you pick the higher-end one, which you would for running a hypervisor.
AND their stance on this is, well: you're on your own. See my other post.
-
@steff22 You should. It's been very popular on YouTube (most getting their units for free for promotion) BUT nobody speaks of this (to my knowledge).
I had nothing but trouble with it.
-
Speaking of Protectli, good news is coming in the next few months. Stay tuned.
-
@olivierlambert NICE!
I can only speak VERY positively of them!!!!
-
E+P cores are not supported in the kernel. Have you tried running with only the E or only the P cores enabled?
My 13th gen MS-01s with the Intel microcode installed work great with Proxmox. I'm still running XCP-ng on D-1541s, so all good there.
-
@iLix Good for you. Others' experiences may vary, as we have seen.....
-
Well, to my surprise, XCP-ng works with P/E cores (even if it's more by luck than by design). I'll run extensive tests with a Protectli unit I just received.
-
@iLix No, I have not tried with only E or P cores.
But I thought I could trigger the error by assigning all 20 cores to one Windows VM and running a CPU test program for 15 minutes at 100% CPU usage. This did not cause any problems, and the temperature did not get too high either.
So it is a bit strange that this error occurs when the MS-01 is almost idle. The VMs that are running use very little CPU; the XCP-ng host sits at 0% to 4% CPU usage most of the day.
I have installed the latest BIOS but have not done anything with the Intel microcode.
How do I see which version of the Intel microcode I have, and how do I update it?
-
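On the microcode question: a minimal Python sketch, not an official XCP-ng tool. It assumes the host's control domain exposes the usual Linux `microcode` field in `/proc/cpuinfo`, which may not hold on every setup, and the function name `microcode_revisions` is made up for illustration. It only reads the loaded revision and changes nothing. Microcode itself generally arrives either through a BIOS/firmware update or by being loaded at boot by the OS or hypervisor, so check the XCP-ng documentation for how your version handles it.

```python
import re
from pathlib import Path


def microcode_revisions(cpuinfo_text):
    """Return the set of microcode revisions found in /proc/cpuinfo-style text.

    Assumption: each logical CPU block contains a line such as
    'microcode : 0x...'. If the field is missing, the set is empty.
    """
    return set(
        re.findall(r"^microcode\s*:\s*(\S+)", cpuinfo_text, flags=re.MULTILINE)
    )


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")  # standard Linux location; absent elsewhere
    if cpuinfo.exists():
        revisions = microcode_revisions(cpuinfo.read_text())
        print("microcode revision(s):", ", ".join(sorted(revisions)) or "not reported")
    else:
        print("/proc/cpuinfo not found on this system")
```

If the set contains more than one value, different cores are reporting different revisions, which would be worth mentioning when asking for help.
-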
@olivierlambert But am I right that I can't trust the backups?
I must have held the power button in to force a reboot about 5 times now, I think. But I notice no errors on XCP-ng or on the VMs that are running; everything just starts working again after a reboot. How much of this extreme treatment can XCP-ng 8.3 withstand before I should fear corrupt system files on XCP-ng itself?
-
Just received it for testing purposes.
-
@olivierlambert I'm also running on a VP6670 with 64GB RAM (which seems to be what you got). So if there's anything you want me to test, fire away.
-
I wonder about the power draw difference with and without the P cores (leaving only the E cores).
-
It is great to hear that E/P cores are working. I now have a migration project on my hands.
-
@steff22 You should not need the microcode if XCP-ng works out of the box with hybrid CPUs.
When you force the power off, is the VMs' reset time the same as the host's? For example, if you have Windows VMs running, does the Kernel-Power event correspond with your host's hard reboot?
-
@iLix I don't know how to check the last log activity on the XCP-ng host before the MS-01 crashes.
But according to the smart home log on the Windows VM, the crash happened 2 hours before I forced the power off. So I expect this time to be the same on all running VMs and on the XCP-ng host itself.
The Windows VM had lost direct contact with the M.2 storage at that point and could not write anything more to the log. So to me it seems like the whole MS-01 freezes and crashes.
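-
On finding the last host log activity before a freeze: a rough Python sketch, under a few assumptions. It assumes the host log lines start with classic syslog timestamps such as `Sep 12 10:00:30`, and that a hang shows up as the largest silence between two consecutive lines. The function name `largest_gap` and the `year` parameter are invented for illustration (syslog stamps carry no year, so one has to be supplied), and a log that crosses New Year is not handled.

```python
from datetime import datetime


def largest_gap(lines, year=2024):
    """Find the longest silence between consecutive timestamped log lines.

    Returns (gap_seconds, last_line_before_gap, first_line_after_gap),
    or None if fewer than two lines carry a parsable timestamp.
    Assumption: the first 15 characters look like 'Sep 12 10:00:30'.
    """
    previous = None  # (timestamp, text) of the last parsable line
    best = None
    for raw in lines:
        text = raw.rstrip("\n")
        try:
            stamp = datetime.strptime(f"{year} {text[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines without a leading syslog timestamp
        if previous is not None:
            gap = (stamp - previous[0]).total_seconds()
            if best is None or gap > best[0]:
                best = (gap, previous[1], text)
        previous = (stamp, text)
    return best
```

It could be fed any iterable of lines, for example an open file handle on one of the host's logs under `/var/log` (the exact file to look at is an assumption you would have to check on your own host). The line just before the largest gap is then a candidate for the last thing the host logged before it froze, to compare against the 2-hour-earlier timestamp seen in the Windows VM.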