Imported multiple VMs from ESXi 6 last night, now that ESXi host is inaccessible to vCenter
-
Yesterday evening I did my first test of VM importing, to prepare for the full migration to XCP-ng. I have two XCP-ng hosts, each in its own pool. I used the import tool to connect to the ESXi host that's running less critical VMs. I selected 4 or 5 of the VMs that were powered off, because they are VMs I had set up a long time ago either as tests or for intermittent tasks. As of last night the import seemed to be working fine, so I went home.
This morning when I checked vCenter, it said that host is not responding, and all the VMs on that host are listed as disconnected (of course). I can get to the base web UI for that host, just the one that tells you it's running ESXi and how to get other resources. The management interface will not open on that host. It starts to load but never finishes, so I never get a login to find out what's going on. When I hook a monitor up to that host, it is still running. In other words, "https://hostname/" works but "https://hostname/ui/" partially loads and then stops.
The VMs that were running on that server are still running. I can remote into them, so it appears the host is working and only the various management methods have stopped working. Worst case, I can shut down those VMs remotely and reboot the host through the console. Not surprisingly, my backup software cannot back up VMs on that host, and XCP-ng fails to connect if I go through the Import from VMware tab.
I can reboot the host and hopefully fix it but I'm wondering...
- What happened and how can I know this won't happen when I try to migrate the important VMs?
- Is there anything that can be learned from this, before I reboot that server, to help improve the import tool so this doesn't happen to others?
-
Are you using XOA or XO from sources? Have you tried rebooting the XO VM to see if that had any effect?
-
@Danp XO from source.
I have not tried rebooting the XO VM. I was waiting to hear some advice before I did anything that might limit the ability to find the cause. I figured once I start rebooting things, we might be more limited in figuring out what failed, and right now the diagnostics are more interesting to me than getting it going again.
I'll reboot the XO VM and see if that changes anything.
-
@CodeMercenary Maybe @florent will have some ideas on what caused this.
-
@Danp Good call. Rebooting the XO VM has restored connectivity for vCenter to that host. My backup software is also able to communicate with that host again.
-
@CodeMercenary we only open a read-only channel to the host, so we shouldn't break anything. It is quite resource-expensive though, so maybe it was too much for the host?
-
@florent It is a very old host. In addition to moving to XCP to get away from VMware, we're upgrading to get off the oldest hosts onto a couple new hosts I just added. It's quite possible that the activity was overwhelming that rather weak old host.