XCP-ng

    phipra

    @phipra

    Best posts made by phipra

    • RE: Windows Server 2019 sporadic reboot

      Hi @andSmv

      thanks for your tips - I tried them for some time and did not have any luck. However, one time the whole XCP-ng host went down and rebooted, and afterwards I found an entry in the Supermicro IPMI log reporting an uncorrectable ECC memory error. Memtest86 did not report any faults, but after replacing that particular DIMM module the reboots and triple faults just stopped. So I think it is safe to assume that these errors occurred because of a hardware failure, and that the VM triple faulted because that memory segment was not available.

      I just wanted to report back for anyone encountering this. Thank you for your help.

      posted in Compute
      phipra

    Latest posts made by phipra

    • RE: Windows Server 2019 sporadic reboot

      Hello @andSmv,

      thanks a lot for your insight. I think I can rule out the second option (normal reboot). One of the VMs (the one the crash dump above is from) does a scheduled reboot every night. Windows updates also cause regular reboots from time to time.

      I think you are right with option 1. As far as my limited understanding of x86 goes, a triple fault occurs when another exception is raised while the CPU is trying to invoke the double-fault handler, i.e. a third fault during exception handling. The interrupt descriptor table should be in the Windows kernel memory space, so only device drivers should be able to corrupt it (if I am not mistaken).
      Does anybody know a way to identify driver problems in a Windows Xen VM? For example, the register dump above includes the instruction pointer. Would it be possible to save the Windows kernel memory space from time to time and, when a fault occurs, trace the instruction pointer back to the faulty driver? I am not sure how Address Space Layout Randomization affects this, but I think it is possible to turn it off.
      The reboots are not very frequent, but they have been a problem for me for the last couple of months - so any help or ideas are greatly appreciated!
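
The idea of tracing an instruction pointer back to a driver can be sketched as a range lookup over a saved list of loaded modules. This is only a minimal sketch: the `Module` type, the `find_module` helper, and every name, base address, and size in `SNAPSHOT` are invented for illustration, and how such a list would be captured from the guest is not covered here. It also assumes the snapshot comes from the same boot as the fault, since randomized load addresses can change between boots.

```python
from bisect import bisect_right
from typing import NamedTuple, Optional, Sequence


class Module(NamedTuple):
    """One loaded module: name plus the address range it occupies."""
    name: str
    base: int
    size: int


def find_module(modules: Sequence[Module], address: int) -> Optional[Module]:
    """Return the module whose [base, base + size) range contains address, if any."""
    ordered = sorted(modules, key=lambda m: m.base)
    bases = [m.base for m in ordered]
    # Index of the last module whose base is <= address.
    index = bisect_right(bases, address) - 1
    if index < 0:
        return None
    candidate = ordered[index]
    if address < candidate.base + candidate.size:
        return candidate
    return None


# Hypothetical snapshot: names, bases, and sizes are made up for this example.
SNAPSHOT = [
    Module("driver_a.sys", 0xFFFFF800_10000000, 0x20000),
    Module("driver_b.sys", 0xFFFFF800_10400000, 0x8000),
]

hit = find_module(SNAPSHOT, 0xFFFFF800_10401234)
miss = find_module(SNAPSHOT, 0x00007FFAE7ED9AA0)  # RIP from the dump in the first post
print(hit.name if hit else None, miss)
```

An address that falls outside every saved range, as the second lookup does against this invented snapshot, simply returns `None`, so a lookup like this only helps when the faulting address lies inside a module that was recorded.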

      Cheers,
      phipra

      posted in Compute
      phipra
    • Windows Server 2019 sporadic reboot

      Hi,
      we have had 3 Windows Server 2019 VMs running on XCP-ng 8.2.1 for a couple of months. We are using local ZFS storage (I think the storage is probably not the problem). While Ubuntu/FreeBSD VMs run without issue, the Windows VMs restart randomly approx. every 4-6 weeks - apart from that, they run fine. I have the impression that restarts occur more frequently when there is CPU load (but that could be wrong). There is no error or kernel dump in the Windows event logs. I installed the citrix-vm-tools 9.2.3 package, and Windows Update is enabled. When the error occurs, xl dmesg shows:

      (XEN) [2116059.042574] d60v4 Triple fault - invoking HVM shutdown action 3
      (XEN) [2116059.042576] *** Dumping Dom60 vcpu#4 state: ***
      (XEN) [2116059.042579] ----[ Xen-4.13.4-9.27.1  x86_64  debug=n   Not tainted ]----
      (XEN) [2116059.042579] CPU:    42
      (XEN) [2116059.042580] RIP:    0033:[<00007ffae7ed9aa0>]
      (XEN) [2116059.042581] RFLAGS: 0000000000000287   CONTEXT: hvm guest (d60v4)
      (XEN) [2116059.042582] rax: 000001d375b62dd0   rbx: 000001d375b62e00   rcx: 000001d375b62dc0
      (XEN) [2116059.042583] rdx: 000001d375b62e00   rsi: 0000000000000000   rdi: 000001d375b64000
      (XEN) [2116059.042584] rbp: 0000006e841fedf0   rsp: 0000006e841fece0   r8:  ba7dc7ea4e7a3d2c
      (XEN) [2116059.042584] r9:  bbb5ec78748d7131   r10: 41be9084b28bde26   r11: 08c1d1fe828bde7b
      (XEN) [2116059.042585] r12: 0000000000000000   r13: 000000000000fdff   r14: 000001d39f1e8000
      (XEN) [2116059.042585] r15: 0000000000000001   cr0: 0000000080050033   cr4: 00000000001506f8
      (XEN) [2116059.042586] cr3: 0000002f78779000   cr2: 0000000000000000
      (XEN) [2116059.042586] fsb: 0000000000000000   gsb: 0000006efbfe8000   gss: ffffd080b289d000
      (XEN) [2116059.042587] ds: 002b   es: 002b   fs: 0053   gs: 002b   ss: 002b   cs: 0033
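
As a rough aid for reading dumps like the one above, the following sketch pulls the domain, vCPU, code segment selector, and instruction pointer out of the log text and derives the privilege level from the selector. The function name `summarize_triple_fault` and the regular expressions are assumptions written against the lines quoted in this post only; other Xen versions may format the dump differently.

```python
import re

# Three lines copied from the `xl dmesg` output quoted in the post.
XEN_LOG = """\
(XEN) [2116059.042574] d60v4 Triple fault - invoking HVM shutdown action 3
(XEN) [2116059.042580] RIP:    0033:[<00007ffae7ed9aa0>]
(XEN) [2116059.042586] cr3: 0000002f78779000   cr2: 0000000000000000
"""

TRIPLE_FAULT = re.compile(r"d(?P<dom>\d+)v(?P<vcpu>\d+) Triple fault")
RIP = re.compile(r"RIP:\s+(?P<cs>[0-9a-fA-F]{4}):\[<(?P<rip>[0-9a-fA-F]{16})>\]")


def summarize_triple_fault(log: str) -> dict:
    """Extract domain, vCPU, CS selector, and RIP from a Xen vCPU state dump."""
    fault = TRIPLE_FAULT.search(log)
    rip = RIP.search(log)
    if fault is None or rip is None:
        raise ValueError("no triple-fault dump found in log text")
    cs = int(rip["cs"], 16)
    address = int(rip["rip"], 16)
    return {
        "domain": int(fault["dom"]),
        "vcpu": int(fault["vcpu"]),
        "cs": cs,
        "rip": address,
        # The low two bits of the CS selector give the privilege level
        # (0 = kernel mode, 3 = user mode).
        "cpl": cs & 0x3,
        # On x86-64, kernel addresses live in the upper canonical half.
        "upper_half": address >= 0xFFFF_8000_0000_0000,
    }


summary = summarize_triple_fault(XEN_LOG)
print(summary)
```

For the values in this dump the selector `0033` yields privilege level 3 and the instruction pointer lies in the lower half of the address space, which suggests the vCPU was running user-mode code when the state was captured. That alone does not identify the cause of the fault.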
      

      I would be very grateful if anyone could point me in the right direction to solve this issue. Could this be a guest-related driver issue? Is there a way to get a memory dump from that VM when the crash occurs, so that I can use WinDbg to find the driver?
      Thanks to all XCP-ng people for their contributions!
      Regards,
      phipra

      posted in Compute
      phipra