VM Failing to Reboot
-
@kagbasi-ngc You should be able to get into debugging mode via the Advanced Boot Options menu (spam F8 at boot). You'll still need to disable Secure Boot. I'm not sure if you'll be able to connect without running the /dbgsettings command, but it's worth a try. Note that you'll need to have WinDbg ready and connect it as soon as you select debugging mode.
If all that fails, can you use Safe Mode or Last Known Good Configuration?
-
@dinhngtu Okay, gonna give it a shot now. Will report back shortly.
-
@dinhngtu Unfortunately, no amount of smashing of the F8 key got me into the Advanced Boot Options menu, so I gave up on that.
Instead, I've booted up with Hiren's Boot Disc, and I'm about to see if perhaps I can uninstall the guest tools this way and try again. Any pointers are welcome.
-
@kagbasi-ngc You can enable debugging from within Hiren's by mounting the Windows EFI system partition to e.g. S: then running
bcdedit /store S:\EFI\Microsoft\Boot\BCD /debug on
You can also try renaming the .sys files but normally XenBootFix should have been sufficient to disable all Xen drivers.
I forgot to ask, how did you install guest tools onto your VM, and did you install anything else to it (e.g. some 3rd-party apps) before rebooting?
-
@dinhngtu I installed the Citrix Tools into my template and then built the VM from the template (as I've always done). However, immediately prior to the reboot, I had just finished installing MailEnable (email server) and was just trying to reboot to get some Group Policies to take effect.
Prior to that, I'd already rebooted the VM many times without any issues. However, I was always rebooting using the buttons in XOA, not directly from within the OS.
I'm gonna go try what you've suggested now and see what happens.
-
@kagbasi-ngc Is it possible that your GPOs or MailEnable are causing the BSOD? Could you try to eliminate those as causes?
-
@dinhngtu Sure, it's possible - though not likely. However, I'm willing to entertain you and I would uninstall them, except that I can't boot into the OS.
Anyway, I tried what you suggested by attempting to enable BCD Debug and it didn't work - got an error (even though the path is correct) :
-
@kagbasi-ngc It's not the BCD database path but the BCD entry identifier. Try
bcdedit /store ...\BCD /enum
to get the entry's identifier, then
bcdedit /store ...\BCD /set <identifier> /debug on
-
Also, one way to verify the issue is to install MailEnable on another VM.
-
@dinhngtu Thanks, I did that but getting an error message that
The system cannot find the file specified
-
@kagbasi-ngc There's no slash in debug while using the /set command:
Also, please don't enable debugging for bootmgr; just do it on {default}.
-
@dinhngtu said in VM Failing to Reboot:
@kagbasi-ngc There's no slash in debug while using the /set command:
I had already tried that, but I think I did it on {bootmgr} - so I'll retry on {default} when I get back in the lab.
Update as of 09:58AM EDT:
I tried the command against the {default} identifier, and it worked. However, I didn't see any change in behavior during the boot process. Should I continue with the instructions for exposing the VM's serial console?
-
@dinhngtu said in VM Failing to Reboot:
Also, one way to verify the issue is to install MailEnable on another VM.
So, I built a new VM from the same template, renamed it and joined it to the domain (both actions required a reboot which was initiated from inside the OS). I then placed it in the correct OU and rebooted so it will grab the correct GPOs (this reboot action was also initiated from inside the OS). Finally, I installed MailEnable and when it finished I initiated a final reboot from inside the OS.
None of the reboots caused the OS to fail to boot, and it was built from the same template (which has the Citrix Guest Tools v9.4 installed). Below is a link to a video I took of the test.
-
@kagbasi-ngc said in VM Failing to Reboot:
I tried the command against the {default} identifier, and it worked. However, I didn't see any change in behavior during the boot process. Should I continue with the instructions for exposing the VM's serial console?
Yes, it won't show anything special on screen during boot, but attaching a debugger should work if everything is configured correctly. Note that it's timing dependent (there's a small time window where you can attach before it reaches the BSOD) so it might take a few tries.
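Since the attach window is short, one way to avoid guessing is to poll the host until the VM's serial port starts accepting TCP connections, then attach the debugger right away. The following is only a rough sketch under assumptions: it presumes the serial port is exposed over TCP on the host, and the names wait_for_port, host and port are placeholders that do not come from any tool mentioned in this thread.

```python
import socket
import time


def wait_for_port(host, port, timeout=120.0, interval=0.5):
    """Poll host:port until a TCP connection succeeds.

    Returns True as soon as one connection attempt succeeds, or False if
    `timeout` seconds pass without success. The probing connection is
    closed immediately so that the debugger can connect afterwards.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # `host` and `port` are hypothetical: use your own host address
            # and whichever TCP port you chose for the serial console.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: the VM is probably not running yet.
            time.sleep(interval)
    return False
```

Start the VM, call wait_for_port with your host and port, and attach WinDbg as soon as it returns True. It may still take a few tries.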
-
@dinhngtu Roger that, I'll try it now and see what happens.
-
@dinhngtu While following the instructions you shared on how to expose the VM's serial console, I'm having trouble with this command:
bcdedit /dbgsettings serial debugport:1 baudrate:115200
I had to modify it slightly to specify the /store parameter (as indicated previously).
Since I don't have the ability to run this command inside the OS, I am running it from a command shell that I accessed via the Troubleshooting menu (from an installation disk). It accepted the first command:
bcdedit /store S:\EFI\Microsoft\Boot\BCD /set {default} debug on
but won't accept:
bcdedit /store S:\EFI\Microsoft\Boot\BCD /set {default} dbgsettings serial debugport:1 baudrate:115200
It throws the following error:
The element data type specified is not recognized or does not apply to the specified entry.
What am I missing here?
Second issue - I attempted to get the host ready by setting the parameters (as per the instructions) and opened the firewall port (I used 7001). However, when I attempt to connect to the host via SSH on TCP port 7001, the connection is refused.
-
@kagbasi-ngc You'll need these commands instead:
bcdedit /store bcd /set {dbgsettings} debugtype serial
bcdedit /store bcd /set {dbgsettings} debugport 1
bcdedit /store bcd /set {dbgsettings} baudrate 115200
The serial port is only accessible when the VM is running.
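To keep the offline-store commands from this thread in one place, here is a small sketch that only prints them for copy-pasting into the recovery shell. It does not run bcdedit itself. The helper name offline_debug_commands and its parameters are made up for illustration, and the store path is the one used earlier in the thread, so adjust it to wherever your EFI system partition is mounted.

```python
def offline_debug_commands(store, entry="{default}", port=1, baudrate=115200):
    """Build the bcdedit command lines discussed in this thread.

    `store` is the path to the offline BCD file. The first command lists
    the entries so the identifier can be confirmed; the rest enable
    debugging on `entry` and set serial debugging on {dbgsettings}.
    """
    prefix = f"bcdedit /store {store}"
    return [
        f"{prefix} /enum",
        f"{prefix} /set {entry} debug on",  # no slash before "debug" with /set
        f"{prefix} /set {{dbgsettings}} debugtype serial",
        f"{prefix} /set {{dbgsettings}} debugport {port}",
        f"{prefix} /set {{dbgsettings}} baudrate {baudrate}",
    ]


if __name__ == "__main__":
    # Assumed store location, taken from the earlier posts.
    for line in offline_debug_commands(r"S:\EFI\Microsoft\Boot\BCD"):
        print(line)
```

Run them in order and check the /enum output first, so that debugging ends up on {default} rather than {bootmgr}.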
-
@dinhngtu Thank you for breaking it down, this worked and I was able to get the commands to be accepted on the VM. Whether they actually took effect, I can't tell.
Here's what I observed:
- I prepped the host with the command xe vm-param-set uuid=<uuid> platform:hvm_serial=tcp::<port>,server,nodelay,nowait and opened the firewall (using TCP port 7001).
- I disabled Secure Boot on the VM and prepped a PuTTY session to connect to my host on TCP port 7001.
- I noticed that whenever the VM starts booting and I simultaneously initiate the PuTTY session, a connection seems to be made but no data is returned to the screen. The session terminates as soon as the VM runs into the BSOD and shuts down.
I recorded a video of it: https://photos.app.goo.gl/wzJSwfoJJJWwtGk57
Next up is WinDbg. This is my first time using it, so please pardon my ignorance. I am unable to attach to the kernel debugger as per the instructions you provided. I see a Kernel Debug menu item, but when I click on it, I don't see any place where I can enter the connection string specified in those instructions.
Here's another video capture: https://photos.app.goo.gl/ZnjVQjo5Hk589dQ96
We're making progress, and I really appreciate your patience, thank you.
-
@kagbasi-ngc That's the old WinDbg. You should use the new WinDbg instead: https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/
The serial connection looks to be working.
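If you'd like to check the serial endpoint without PuTTY, a small probe along these lines can tell "connection refused" apart from "connected but silent". This is a sketch under assumptions: probe_serial is a made-up helper, and the host and port arguments are whatever you configured on your side. A silent but successful connection matches what you saw in PuTTY, so an empty result is not necessarily a failure.

```python
import socket


def probe_serial(host, port, wait_seconds=5.0, max_bytes=4096):
    """Connect to host:port and return any bytes that arrive.

    Reading stops when the peer closes the connection, when no data
    arrives for `wait_seconds`, or once `max_bytes` have been collected.
    ConnectionRefusedError is left to propagate; it usually means nothing
    is listening, for example because the VM is not running.
    """
    chunks = []
    received = 0
    with socket.create_connection((host, port), timeout=wait_seconds) as conn:
        conn.settimeout(wait_seconds)
        try:
            while received < max_bytes:
                data = conn.recv(1024)
                if not data:  # peer closed the connection
                    break
                chunks.append(data)
                received += len(data)
        except (socket.timeout, ConnectionResetError):
            # Silence or a reset (e.g. the VM shut down): return what we have.
            pass
    return b"".join(chunks)
```

For example, probe_serial("<host>", 7001) while the VM is booting. Close the probe before attaching WinDbg, since the serial endpoint likely serves only one client at a time.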
-
@kagbasi-ngc More instructions for you on how to use WinDbg:
In WinDbg, select File - Attach to kernel - Paste connection string and paste in your connection string. Once the debugger finishes attaching, click Go until WinDbg says "BUSY Debuggee is running", then wait until the VM crashes. You should see an error message (see screenshot).
Click the "!analyze -v" link and it will spit out a bunch of analysis info. This will take a while. Once it's done, paste the entire WinDbg output, starting from the initial "Fatal System Error" message and including the entirety of the analysis output.