XCP-ng

    Windows 2022 VM - Reboot triggered - VM shuts down

    • K Offline
      KPS Top contributor
      last edited by

      Hi!

      I am using a Windows 2022 VM (fully patched) on XCP-ng 8.2.1.
      The VM reboots every evening via a Windows Task Scheduler job (shutdown -r -t 0 -f).

      This has been working for weeks.

      Yesterday, the VM just shut down instead of rebooting.

      The only message in the event log is: Error code 7043. Couldn't stop Hypervisor Tools service after preshutdown event.
      ...but this message is always there, even when the reboot succeeds...

      After the reboot, there was an Error 161 from volmgr.

      --> Have you ever seen a VM shut down instead of rebooting?

      Thank you for your help!

      • DarkbeldinD Offline
        Darkbeldin Vates 🪐 Pro Support Team @KPS
        last edited by

        KPS I suppose you are using Citrix drivers? With management agents? Dynamic memory?

        • K Offline
          KPS Top contributor @Darkbeldin
          last edited by

          Darkbeldin said in Windows 2022 VM - Reboot triggered - VM shuts down:

          KPS I suppose you are using Citrix drivers? With management agents? Dynamic memory?

          Citrix Agent with management agent, but with static memory...

          • DarkbeldinD Offline
            Darkbeldin Vates 🪐 Pro Support Team @KPS
            last edited by

            KPS So quite a normal setup. I'm not sure what can be done. Does it do this every time, or just this once?

            • K Offline
              KPS Top contributor @Darkbeldin
              last edited by KPS

              The problem happened one more time. The server does a daily restart, but last night it just stopped. Same behaviour as last time:

              • Task Scheduler starts "C:\Windows\System32\shutdown.exe -r -t 120 -f"
              • The system starts to shut down and the event log just stops logging
              • XCP-ng shows the VM as off

              The last event prior to the "manual" boot is:

              Event 7036
              Service "VSS Writer for the internal windows database" is now in state "stopped" (translated)

              I see that a dump file was written, but I do not really know how to analyze it.

              Do you have any idea on how to solve this?

              Best wishes

              • K Offline
                KPS Top contributor @KPS
                last edited by

                The analysis of the dump showed:

                UNEXPECTED_KERNEL_MODE_TRAP (7f) / EXCEPTION_DOUBLE_FAULT

                According to Microsoft:

                => A double fault, which is a fault that occurred while processing an earlier fault, which always results in a system failure.

                => Bug check 0x7F typically occurs after you install faulty or mismatched hardware, especially memory, or if installed hardware fails.

                => Check the availability of updates for the ACPI/BIOS, the hard driver controller, or network cards from the hardware manufacturer.

                https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/bug-check-0x7f--unexpected-kernel-mode-trap

                Have you ever seen this behaviour?

                • K Offline
                  KPS Top contributor @KPS
                  last edited by

                  Hi!

                  Today, it happened again. Same behaviour. Nothing special in the logs, but VM is shut down.

                  Any ideas on how to solve this?

                  • K Offline
                    KPS Top contributor @KPS
                    last edited by

                    Hi!

                    It happened one more time. This time on a node with an AMD CPU, on another Windows 2022 VM, and without a memory dump...

                    Have you ever experienced this?

                    • olivierlambertO Offline
                      olivierlambert Vates 🪐 Co-Founder CEO
                      last edited by

                      I would try different combinations in case:

                      1. create a new empty VM, attach the disk from the "old" one to the new one, check if it's the same behavior
                      2. remove Citrix tools, install XCP-ng tools and see if it continues to have the problem
                      • K Offline
                        KPS Top contributor @olivierlambert
                        last edited by

                        I am quite frustrated in trying to "force" the behaviour...

                        I installed 4 Windows 2022 Test-VMs:

                        • BIOS + XCP-Tools
                        • BIOS + XenTools
                        • UEFI + XCP-Tools
                        • UEFI + XenTools

                        All 4 VMs have been rebooting every 6 minutes for the last 20 hours, but none of them got "shut down", while my production VMs, which reboot every 24 hours, are affected. Currently, about once a month, one of the 6 Windows 2022 VMs (a different one each time) ends up shut down.

                        I have no more ideas about the "race condition".

                        • K Offline
                          KPS Top contributor @KPS
                          last edited by KPS

                          Hi!

                          I thought I "understood" the issue now, but I am having another one...

                          About the first one:
                          The problem seemed to stop once the MEMORY.DMP file was deleted. Only the "second dump" triggered the issue.

                          But:

                          Last night, I had another strange issue on a Windows 2022 VM with Citrix Tools 9.3.1.
                          The system did a scheduled reboot, but after that it hung at the UEFI boot manager:

                          Error: 0xc0000225
                          A Required Device Isn't Connected or Can't Be Accessed

                          The event log showed a clean shutdown without errors. The only strange thing is one event: "The system has rebooted without cleanly shutting down first."

                          I started the OS selection and the system booted without issues.

                          I am not sure if this is XCP-ng-related, but it is worse than the first issue, as I cannot "solve" it with a script that checks the status of the VM.
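For readers wondering what a status-check script for the first issue might look like, here is a minimal sketch in Python. It is not the script actually used in this thread. It assumes it runs on the XCP-ng host (dom0) where the `xe` CLI is available, and the function names (`run_xe`, `needs_start`, `ensure_running`) are made up for illustration.

```python
import subprocess
from typing import Callable, List


def run_xe(args: List[str]) -> str:
    """Run the `xe` CLI (assumed to be on PATH in dom0) and return trimmed stdout."""
    result = subprocess.run(
        ["xe", *args], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def needs_start(power_state: str) -> bool:
    """A VM that was shut down instead of rebooted reports power-state 'halted'."""
    return power_state.strip().lower() == "halted"


def ensure_running(vm_uuid: str, xe: Callable[[List[str]], str] = run_xe) -> bool:
    """Start the VM if it is halted. Returns True if a start was issued."""
    state = xe(["vm-param-get", f"uuid={vm_uuid}", "param-name=power-state"])
    if needs_start(state):
        xe(["vm-start", f"uuid={vm_uuid}"])
        return True
    return False
```

Calling `ensure_running("<vm-uuid>")` from a cron job shortly after the scheduled reboot would cover the "VM is off" case. It would not help with the boot manager hang described above, because in that state the VM still reports as running.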

                          I have never seen this before.

                          What do you think?

                          Thank you for your help!

                          • T Offline
                            tuxen Top contributor @KPS
                            last edited by

                            KPS When that forced reboot command is issued, is the VM:

                            1. Under intensive I/O?
                            2. Running a backup job, or has one just started?
                            • K Offline
                              KPS Top contributor @tuxen
                              last edited by

                              tuxen
                              The reboots are scheduled outside office hours, but some hours before the next backup job.

                              So, there is nearly zero load.
                              The hosts are beefy AMD Genoa systems that are not really in use. The problem has already happened when only ONE VM was running on an AMD 9374F.

                              ...it just happened again 15 minutes ago. A dump was written, and it happened even though the previous dump had been deleted. That single VM (daily reboot) last had the problem 1 month ago.

                              • T Offline
                                tuxen Top contributor @KPS
                                last edited by

                                KPS I was thinking of exactly that: an after-hours task doing heavy storage I/O (e.g. data replication or ETL-like workloads). In that scenario, a forced reboot might cause some sort of file system corruption due to uncommitted data being lost.

                                Now another possible source of the issue comes to mind: automatic Windows Update. Is this service active? I'm not a Windows expert, but a forced reboot during a system update might also cause unexpected behavior.

                                Seeing all those errors, it seems that some system file or DLL got corrupted and needs a repair. Taking a snapshot before running a system repair is strongly recommended.

                                • K Offline
                                  KPS Top contributor @tuxen
                                  last edited by

                                  tuxen said in Windows 2022 VM - Reboot triggered - VM shuts down:

                                  Now, other source of issue comes to mind: automatic Windows Update. Is this service active?

                                  Thank you for your answer, but neither is the case. Windows Updates are only triggered manually. There is VERY low I/O and CPU load when the reboot is triggered.

                                  • T Offline
                                    tuxen Top contributor @KPS
                                    last edited by

                                    KPS one thing is clear to me: the reboot ends in a VM shutdown because the system crashes (the kernel errors and memory dump files point to that). Without a detailed stack trace (like a Linux kernel panic), and given how hard the issue is to reproduce, troubleshooting is very difficult. One last thing I'd check is /var/log/daemon.log for the time window of the VM shutdown.
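To narrow daemon.log down to that time window, something like the following Python sketch could help. It assumes the log uses the classic syslog timestamp prefix (for example `Jul 25 22:01:30`), which carries no year, and that month names are in English. The function name `lines_in_window` and its parameters are made up for illustration.

```python
from datetime import datetime, timedelta
from typing import Iterable, Iterator, Optional


def lines_in_window(
    lines: Iterable[str],
    center: datetime,
    minutes: int = 5,
    year: Optional[int] = None,
) -> Iterator[str]:
    """Yield log lines whose syslog timestamp lies within +/- `minutes` of `center`.

    Assumption: each line starts with a 15-character 'Mon DD HH:MM:SS' stamp.
    Syslog stamps have no year, so the year of `center` is used unless given.
    """
    year = year or center.year
    lo = center - timedelta(minutes=minutes)
    hi = center + timedelta(minutes=minutes)
    for line in lines:
        try:
            stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # not a timestamped line (e.g. a continuation line)
        if lo <= stamp <= hi:
            yield line.rstrip("\n")
```

On the host, this could be fed with `open("/var/log/daemon.log")` and the time at which the scheduled reboot was triggered as `center`. The output is only a smaller haystack to read through. It does not by itself say which entries are errors.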

                                    • C Offline
                                      chrisfonte
                                      last edited by

                                      Is the instance of Windows licensed?

                                      • K Offline
                                        KPS Top contributor @tuxen
                                        last edited by

                                        tuxen
                                        I sent the MEMORY.DMP files to a Microsoft specialist and posted the result on July 25 in that thread.
                                        daemon.log is quite hard for me to read. I did not find anything that I would recognize as an error. It looks like a "shutdown", not a reboot.

                                        chrisfonte
                                        Yes, fully licensed. It is not a "shutdown because of missing licenses after 24h".
