Windows Server 2016 & 2019 freezing on multiple hosts
-
Yeah, but the open source drivers must be getting updated somehow, because there are some people out there using Xen with Windows workloads (AWS? IBM? Rackspace?)
-
The open source drivers from the Xen Project (https://www.xenproject.org/downloads/windows-pv-drivers/winpv-drivers-8/winpv-drivers-821.html) are the same version as ours: https://github.com/xcp-ng/win-pv-drivers/releases/tag/v8.2.1-beta1
Maybe IBM or Rackspace use their own builds?
-
Quick Update
I haven't had any crashes since posting here. Before posting here it was each of my 2016 & 2019 VMs on XCP-NG that would lock up. Now it's none of them. Some of them I updated, while others I did nothing... Strange.
I'll update this post again in a few days, or sooner if anything changes.
-
Thanks a lot for your feedback @michael !
-
I had one lock up on me. Surprisingly, it was the one with the shortest uptime.
EDIT: I just updated the xcp-emu-manager-0.0.9-1 that was posted about a bit ago. Not sure if this will help or not.
Have you heard of anyone else having these issues? I'm debating on doing a fresh install of XCP-NG to see if that helps.
EDIT 2: There are no error logs given in Windows. One minute it's working, the next it's not. I'm going through the XCP-NG logs at the moment to see if there is anything there.
EDIT 3: Here is a list of every line with the UUID of the VM associated with it: https://pastebin.com/ip30uyMN
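For anyone who wants to pull the same kind of list from their own host, here is a minimal sketch of filtering a log for every line that mentions a VM's UUID. The function name `lines_for_vm` is made up for illustration, and the log path in the usage comment (`/var/log/xensource.log`) is an assumption about where the relevant log lives on your host.

```python
from typing import Iterable, Iterator


def lines_for_vm(lines: Iterable[str], vm_uuid: str) -> Iterator[str]:
    """Yield every line that mentions the given VM UUID (case-insensitive)."""
    needle = vm_uuid.lower()
    for line in lines:
        if needle in line.lower():
            # Strip the trailing newline so the result prints cleanly.
            yield line.rstrip("\n")


# Example usage (path and UUID are placeholders, not values from this thread):
#
#   with open("/var/log/xensource.log", errors="replace") as fh:
#       for hit in lines_for_vm(fh, "<vm-uuid>"):
#           print(hit)
```

This is just a plain substring match, so it will also catch lines where the UUID appears as part of a longer string.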
-
@michael can you point to a specific time in the log file when your VM was locked up?
-
@borzel Sometime between 7:40 and 12:12 on the first line. I'm guessing it went down right as I was posting here.
Like I said, whatever it's doing, it doesn't seem to log it. Are there any other log files I should check?
EDIT: I have a theory, testing it now.
EDIT 2: I'm wondering if this has something to do with XCP-NG Center. The KMS server (the one that locked up) was the only one I logged into when checking if everything was working. Trying to duplicate this now.
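To narrow a log down to a window like the 7:40 to 12:12 one mentioned above, something along these lines could help. It is only a sketch: it assumes the first `HH:MM:SS` on each line is that line's timestamp (as in syslog-style logs), compares time of day only, and does not handle a window that crosses midnight. `lines_in_window` is a hypothetical helper name.

```python
import re
from datetime import time
from typing import Iterable, Iterator

# First HH:MM:SS found on a line is treated as its timestamp (an assumption).
_TIME_RE = re.compile(r"\b(\d{1,2}):(\d{2}):(\d{2})\b")


def lines_in_window(lines: Iterable[str], start: time, end: time) -> Iterator[str]:
    """Yield lines whose time of day falls within [start, end], inclusive."""
    for line in lines:
        match = _TIME_RE.search(line)
        if not match:
            continue  # no recognizable timestamp, skip the line
        hour, minute, second = map(int, match.groups())
        if hour > 23 or minute > 59 or second > 59:
            continue  # looks like a time but is not a valid one
        if start <= time(hour, minute, second) <= end:
            yield line.rstrip("\n")


# Example usage (the path is a placeholder):
#
#   with open("/var/log/xensource.log", errors="replace") as fh:
#       for hit in lines_in_window(fh, time(7, 40), time(12, 12)):
#           print(hit)
```

Combining this with a filter on the VM's UUID should leave a much shorter list to read through.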
-
Update: The same VM crashed again; the others are working fine as of this post. I left it open in a remote desktop session last night, and when I came back to my office today the RDP session was still up but not responsive. I checked XCP-NG Center and the same blue screen with a cursor was there.
Still trying to find the cause of this.
-
Were you able to resolve this in the end?
-
@michael this may sound silly, but perhaps this isn’t a software issue? Maybe you have a faulty stick of RAM hiding in the machine? Bad RAM will make all kinds of strange and flaky things happen. This is just a guess, I’m just putting this idea out there because I’ve seen similar weird behavior in machines where RAM failure isn’t easy to spot (no front panel on the server with a fault light).
-
@apayne This is a good suggestion, although it is happening on a lot of different physical machines, if I'm not mistaken.
I'm running a couple of Windows Server 2016 VMs on xcp-ng 7.4 and 7.5 without any issues. I'm hoping this isn't something specific to xcp-ng 7.6, since we're planning on migrating from XenServer to that version.
-
Any updates on this?
I'm planning to run 2 HPE DL20 servers on XCP-ng 7.6 (the install image for 8 currently fails to run on them, reason still unknown).
-
@cg Hello
I ran into the same issue with a Vserver (Windows Server 2016). It looks like XCP isn't the only one with this problem: Citrix, VMware and Hyper-V face it too. I wonder if there is any solution!
-
Could it be related to a bug in Windows itself?
-
IIRC VMware hit problems with Windows 10 1809 and up. There was some finger-pointing, but VMware released a patch in Update 3. Since the reported problem popped up after 1809, I'd suspect it's more of a Microsoft issue...
-
For what it's worth, I was having Windows 2016 freezes regularly (every few days to a week). I could never trace it back to a specific cause, though I noticed a very high number of handles and threads when it was hanging.
Per my records, it stopped hanging after I upgraded the PV drivers using Guest Tool ISO 7.41.0-1.
-
I have been running Windows 10 and Windows Server 2016 for a long time now on xcp-ng.
Last weekend I upgraded one of my test servers to xcp-ng 8.0 and Windows Server 2016 still runs fine.
This is the output of systeminfo:
C:\Windows\system32>systeminfo
Host Name:                 HEL-DC1
OS Name:                   Microsoft Windows Server 2016 Standard
OS Version:                10.0.14393 N/A Build 14393
OS Manufacturer:           Microsoft Corporation
OS Configuration:          Primary Domain Controller
OS Build Type:             Multiprocessor Free
Registered Owner:          Windows User
Registered Organization:
Product ID:                00377-60000-00000-AA934
Original Install Date:     2018-10-20, 17:26:33
System Boot Time:          2019-10-01, 23:13:37
System Manufacturer:       Xen
System Model:              HVM domU
System Type:               x64-based PC
Processor(s):              1 Processor(s) Installed.
                           [01]: Intel64 Family 6 Model 158 Stepping 9 GenuineIntel ~3600 Mhz
BIOS Version:              Xen 4.7.5-5.4.1.xcp, 2018-08-02
I am using these tools:
-
FWIW, I had a similar problem with VMs randomly going down. It wasn't Windows but various BSD and Linux VMs. Sometimes they would run a few days, other times they would lock up every other day. Logs didn't help, because when a VM locked up nothing was written to them. I even had the xcp-ng installation itself lock up a few times, although it was usually one of the VMs. I started searching for faulty hardware and eventually ran SMART tests on all the drives and memtest86 on the RAM. It turns out I had some bad RAM modules, which I later RMA'd. With the new RAM (after thoroughly testing it), I haven't had any lock-ups. I'm not sure this will help you at all.