Windows Server 2016 & 2019 freezing on multiple hosts
-
I mean more simply: install the latest Xen tools (so ours should be OK, I assume), display the VIF driver version number, then do the same with the Citrix driver and compare the two version numbers.
-
@olivierlambert the first three digits of the version number are the same; the last is usually the build number... I assume they backported some code changes from master... but this is nothing we can easily detect.
Edit: maybe someone with IDA can compare the last two versions and do a (graphical -> flowchart?) diff?
Edit2: I'll try to check the git logs in the evening... maybe some change pops up.
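For anyone who wants to script the comparison described above, here is a minimal sketch. It assumes four-part dotted version strings where the last part is the build number; the version strings and function names are made up for illustration and are not real driver versions.

```python
from typing import Tuple


def split_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '8.2.1.100' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def same_release_different_build(a: str, b: str) -> bool:
    """True when the first three parts match and only the build number differs."""
    va, vb = split_version(a), split_version(b)
    return len(va) == len(vb) == 4 and va[:3] == vb[:3] and va[3] != vb[3]


if __name__ == "__main__":
    ours = "8.2.1.100"    # hypothetical version string
    citrix = "8.2.1.155"  # hypothetical version string
    print(same_release_different_build(ours, citrix))
```

A match here only says the build numbers differ; it cannot tell which code changes went into the newer build.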
-
I asked for help: https://bugs.xenserver.org/browse/XSO-928
-
I'm going to try the remote event viewer over the next few days to see if anything of interest comes up. As for remote desktop and ping, I get no response.
I'll pull the latest builds and see if I have any luck. I'm also going to pull the latest Windows Server updates on a few of the VMs (seems they just came out a few hours ago as I checked for updates earlier and had nothing)
-
As you can read here: https://bugs.xenserver.org/browse/XSO-928 there is no legal way to obtain the changes made by Citrix, because the original code is BSD-licensed.
-
Yeah, but the open-source drivers must be updated somehow, because there are some people out there using Xen with Windows workloads (AWS? IBM? Rackspace?)
-
The open-source drivers from the Xen Project (https://www.xenproject.org/downloads/windows-pv-drivers/winpv-drivers-8/winpv-drivers-821.html) are the same version as ours: https://github.com/xcp-ng/win-pv-drivers/releases/tag/v8.2.1-beta1
Maybe IBM or Rackspace use their own builds?
-
Quick Update
I haven't had any crashes since posting here. Before posting here it was each of my 2016 & 2019 VMs on XCP-NG that would lock up. Now it's none of them. Some of them I updated, while others I did nothing... Strange.
I'll update this post again in a few days, or sooner if anything changes.
-
Thanks a lot for your feedback @michael !
-
I had one lock up on me. Surprisingly, it was the one with the shortest uptime.
EDIT: I just installed the xcp-emu-manager-0.0.9-1 update that was posted about a bit ago. Not sure if this will help or not.
Have you heard of anyone else having these issues? I'm debating doing a fresh install of XCP-NG to see if that helps.
EDIT 2: There are no error logs given in Windows. One minute it's working, the next it's not. I'm going through the XCP-NG logs at the moment to see if there is anything there.
EDIT 3: Here is a list of every line with the UUID of the VM associated with it: https://pastebin.com/ip30uyMN
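A list like the one above can be produced by filtering the log for the VM's UUID. Below is a minimal sketch, assuming the log is plain text with one entry per line; the UUID and the sample lines are invented for illustration and are not taken from the pastebin.

```python
from typing import Iterable, List


def lines_for_vm(lines: Iterable[str], vm_uuid: str) -> List[str]:
    """Return every log line that mentions the given VM UUID (case-insensitive)."""
    needle = vm_uuid.lower()
    return [line.rstrip("\n") for line in lines if needle in line.lower()]


if __name__ == "__main__":
    # Hypothetical UUID and log lines, for illustration only.
    uuid = "11111111-2222-3333-4444-555555555555"
    sample = [
        "07:40:01 host: VM 11111111-2222-3333-4444-555555555555 state change\n",
        "07:41:12 host: unrelated message\n",
        "12:12:30 host: VM 11111111-2222-3333-4444-555555555555 state change\n",
    ]
    for line in lines_for_vm(sample, uuid):
        print(line)
```

To run it against a real log, pass an open file object instead of the sample list.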
-
@michael can you point to a specific time in the log file when your VM was locked up?
-
@borzel Sometime between 7:40 and 12:12 on the first line. I'm guessing it went down right as I was posting here.
Like I said, whatever it's doing, it doesn't seem to log it. Are there any other log files I should check?
EDIT: I have a theory, testing it now.
EDIT 2: I'm wondering if this has something to do with XCP-NG Center. The KMS server (the one that locked up) was the only one I logged into when checking if everything was working. Trying to duplicate this now.
-
Update: The same VM crashed again; the others are working fine as of this post. I left it open in a remote desktop session last night, and when I came back to my office today the RDP session was still up but not responsive. I checked XCP-NG Center and the same blue screen with a cursor was there.
Still trying to find the cause of this.
-
Were you able to resolve this in the end?
-
@michael this may sound silly, but perhaps this isn't a software issue? Maybe you have a faulty stick of RAM hiding in the machine? Bad RAM will make all kinds of strange and flaky things happen. This is just a guess, I'm just putting this idea out there because I've seen similar weird behavior in machines where RAM failure isn't easy to spot (no front panel on the server with a fault light).
-
@apayne This is a good suggestion, although it is happening on a lot of different physical machines, if I'm not mistaken.
I'm running a couple of Windows Server 2016 VMs on xcp-ng 7.4 and 7.5 without any issues. I'm hoping this isn't something specific to xcp-ng 7.6, since we're planning on migrating from XenServer to that version.
-
Any updates on this?
I'm planning to run 2 HPE DL20 on XCP-ng 7.6 (the install image for 8 currently fails to run on them, reason still unknown).
-
@cg Hello
I ran into the same issue with Vserver (Windows Server 2016). It looks like XCP is not the only one with this problem; Citrix, VMware and Hyper-V face it too. Wondering if there is any solution!
-
Could it be related to a bug in Windows itself?
-
IIRC VMware had hit problems with Windows 10 1809 and up... there was some finger-pointing, but VMware released a patch in update 3. Since the reported problem popped up after 1809, I'd suspect it is more of a Microsoft issue...