Strange issue with booting XCP-NG
-
@stormi this is interesting....
To test whether the network is slow to initialize, I disabled NTP and ran a ping to see how quickly the host starts replying to ICMP after a reboot. I'm pinging its FQDN "xcp-ng-node1..." instead of its IP, and it starts replying as soon as the xsconsole screen appears (I have a monitor attached, for now). After I hit F8, it takes 47 seconds to boot back up and start replying to ICMP.
When I run the same test with NTP enabled, surprisingly the host also starts replying after exactly 47 seconds, but the monitor stays stuck at the EFI_MEMMAP screen for 40 seconds, so in total it takes 86 seconds to show the xsconsole screen (on the attached monitor).
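For anyone who wants to repeat this kind of timing without a stopwatch, here is a rough Python sketch of the idea: start it at the moment you trigger the reboot, and it reports how long it takes for the host to go down and then answer a ping again. The function names `time_until_back` and `ping_once` are made up for illustration, and the `ping` flags (`-c 1 -W 1`) assume a Linux `ping`; other systems use different flags.

```python
import subprocess
import time


def ping_once(host):
    """Send a single ICMP echo using the system ping (Linux-style flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "1", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def time_until_back(probe, timeout=600.0, interval=1.0):
    """Seconds from the call until probe() succeeds again after failing at least once.

    Returns None if the host never goes down and comes back within the timeout.
    """
    start = time.monotonic()
    seen_down = False
    while time.monotonic() - start < timeout:
        if not probe():
            seen_down = True  # the host has stopped answering, so the reboot is underway
        elif seen_down:
            return time.monotonic() - start  # first reply after the outage
        time.sleep(interval)
    return None
```

A call such as `time_until_back(lambda: ping_once("192.168.1.7"))`, started right when the reboot is triggered, would give a number comparable to the ones in this thread, within the one-second polling interval.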
With NTP enabled, as soon as the host started replying to ICMP I was able to SSH into it and ping time.cloudflare.com, while in the background the monitor was still stuck on EFI_MEMMAP.
This does NOT represent an issue in my case because my server will be headless.
So I think I'm good. I don't have any VMs running yet, but I will check whether having NTP enabled or disabled makes a difference to how they start up.
-
I ran into something like this once. I noticed you've got NTP set to use host names. You could see if it's maybe DNS lookups being slow to respond at that point in the boot by setting NTP to use IP addresses instead and seeing if that's much faster or not.
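A quick way to get a feel for how long a name lookup takes is to time it directly. The sketch below uses only Python's standard `socket.getaddrinfo`; the helper name `time_lookup` is made up for illustration. Keep in mind that a lookup timed after the host is fully up will not necessarily reproduce what happens early in boot, so treat it as a rough check rather than proof.

```python
import socket
import time


def time_lookup(hostname):
    """Return (elapsed seconds, sorted unique addresses) for one resolver lookup."""
    start = time.perf_counter()
    infos = socket.getaddrinfo(hostname, None)
    elapsed = time.perf_counter() - start
    # Each entry's sockaddr tuple starts with the address string.
    addresses = sorted({info[4][0] for info in infos})
    return elapsed, addresses
```

For example, `time_lookup("time.cloudflare.com")` returns the lookup time together with the addresses the resolver handed back, which are also the candidates for an IP-based NTP setting.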
-
And I must add that NTP is very important if you are using more than one host in the pool, or if you're using Xen Orchestra.
-
Thanks for the advice @JeffBerntsen, I tried using Cloudflare's and Google's NTP IPs instead of their domains.
192.168.1.8 (debian) is a VM, configured to auto-start, and it's running inside the xcp-ng host 192.168.1.7
I'm rebooting the xcp-ng host (from the xsconsole) and pinging the debian machine 192.168.1.8 to determine how long it takes to start replying to ICMP. Here are the times I got (min:sec):
- NTP DISABLED: 1:14, 1:12, 1:12, 1:12 (avg 1:12)
- NTP enabled FQDN: 1:43, 1:43, 1:48, 1:52 (avg 1:46)
- NTP enabled IP: 1:27, 1:23, 1:18, 1:21 (avg 1:22)
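The averages above can be checked with a few lines of Python. This is only arithmetic on the numbers listed; the dictionary labels are shorthand for the three test cases.

```python
def to_seconds(stamp):
    """Convert a 'min:sec' string such as '1:43' into whole seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


def mean_seconds(stamps):
    """Average a list of 'min:sec' strings, in seconds."""
    return sum(to_seconds(s) for s in stamps) / len(stamps)


# Reboot-to-first-reply times copied from the list above.
runs = {
    "NTP disabled": ["1:14", "1:12", "1:12", "1:12"],
    "NTP enabled, FQDN": ["1:43", "1:43", "1:48", "1:52"],
    "NTP enabled, IP": ["1:27", "1:23", "1:18", "1:21"],
}

averages = {name: mean_seconds(stamps) for name, stamps in runs.items()}
baseline = averages["NTP disabled"]

for name, avg in averages.items():
    print(f"{name}: {avg:.2f} s average, {avg - baseline:+.2f} s vs NTP disabled")
```

Before truncation to whole seconds the averages are 72.5 s, 106.5 s and 82.25 s, so NTP by FQDN adds about 34 seconds over NTP disabled and NTP by IP adds about 10 seconds.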
So using NTP with an IP, and hoping that it doesn't change, seems to be a good option.
Using the FQDN works as well, since 30 seconds won't kill anyone, and this is the option I'll use. I was just curious whether this is expected, @stormi.
-
You're very welcome. My solution to the similar problem I'd had was to set up a couple of internal systems as NTP servers so that I always had something with the right time and static IP addresses and pointed everything needing NTP at them.