Strange issue with booting XCP-NG
-
@stormi Damn... And all that just to 'test' whether the GPU works in BIOS mode.
-
Maybe I asked the wrong question here, but is the NVIDIA Quadro P400 even supported? Could that be an issue as well? If not, I think I'll have to grab something like a Radeon WX 2100...
-
It's not supported for vGPU (https://xcp-ng.org/docs/compute.html#vgpu), but in theory that should not prevent other uses of XCP-ng.
-
@stormi Ah, thanks for the link, though I still don't think BIOS mode will fix this issue. So I will capture the serial output to see what happens.
-
As I said (or tried to say), vGPU support has nothing to do with the ability to display the host console on screen.
-
So BIOS mode might help, but there's no guarantee.
-
@stormi I want to try this first so I can see what is going wrong in UEFI mode and pass it on to you, since you might be able to fix the issue there. Other people might run into this as well.
-
@r1 @stormi I dug into this a lot today. I even replaced xen.gz with the one containing the possible fix mentioned above, only to discover that it made my XCP-ng installation unusable XD. Well, I fixed that, so I am back to normal now (hahaha). I read some more about this issue, and it could also be GRUB-related. I have a source here: https://askubuntu.com/questions/825687/what-could-prevent-an-ubuntu-server-from-booting-without-a-vga-connected-monitor
Specifically the 'nomodeset' option. I haven't tested it yet because my grub.cfg differs from the default one and I don't want to break things again..., but could that be a possible solution?
I have pasted the grub.cfg below; where should I put the 'nomodeset' option?
serial --unit=0 --speed=115200
terminal_input serial console
terminal_output serial console
set default=0
set timeout=5
if [ -s $prefix/grubenv ]; then
    load_env
fi
if [ -n "$override_entry" ]; then
    set default=$override_entry
fi
menuentry 'XCP-ng' {
    search --label --set root root-cybuwv
    multiboot2 /boot/xen.gz dom0_mem=4304M,max:4304M watchdog ucode=scan dom0_max_vcpus=1-16 crashkernel=256M,below=4G,console=vga vga=mode-0x0311
    module2 /boot/vmlinuz-4.19-xen root=LABEL=root-cybuwv ro nolvm hpet=disable console=hvc0 console=tty0 quiet vga=785 splash plymouth.ignore-serial-consoles xen.pciback.hide=(0000:07:00.0)
    module2 /boot/initrd-4.19-xen.img
}
-
@appollonius You can add the nomodeset option at the end of the line which starts with module2 /boot/vmlinuz-4.19-xen. For the other user it did not make any difference, though.
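For illustration, using the grub.cfg posted above, the edited line would then look roughly like this (only nomodeset is appended; everything else is unchanged):

    module2 /boot/vmlinuz-4.19-xen root=LABEL=root-cybuwv ro nolvm hpet=disable console=hvc0 console=tty0 quiet vga=785 splash plymouth.ignore-serial-consoles xen.pciback.hide=(0000:07:00.0) nomodeset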
-
@r1 Didn't work for me either... I think this is where the troubleshooting stops, as I have no idea where the problem lies and we've tried everything. When a monitor is connected, XCP-ng actually sees the GPU, so I think it has something to do with the system parameters in UEFI or GRUB. But time will tell whether someone else finds a solution to this...
-
Well, I just had some time to reinstall XCP-ng in legacy BIOS mode, but unfortunately this didn't solve anything and the problem persists...
-
Alright, well, I lost all the data on one drive by pulling this 'joke' of reinstalling in BIOS mode. So I thought I'd give VMware ESXi a try, and apparently it works there...
-
@olivierlambert I'm running XCP-ng 8.2.0 and everything works fine, but when NTP is enabled the boot stays stuck for 40 seconds at the 'EFI_MEMMAP is not enabled' line.
If I disable NTP it boots immediately without any issues, but as soon as I enable NTP the same thing happens again.
I'm able to reach both NTP servers, time.cloudflare.com and time.google.com; I tested this from Network and Management Interface - Test Network - Ping custom address.
After I installed XCP-ng I SSH'd in as root and ran yum update, so all updates are installed.
Do you have any suggestions?
-
It probably means that the chrony service can't reach the NTP server for 40 seconds. Is the network slow to initialize?
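One way to check that hypothesis (a sketch, using the standard chrony tools present in dom0) is to ask chrony directly once the host is up:

    chronyc tracking          # current sync status and reference source
    chronyc sources -v        # reachability of each configured NTP server
    journalctl -b -u chronyd  # chronyd's log for this boot, with timestamps

If the sources only become reachable well after boot, that points at the network (or DNS) rather than at chrony itself.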
-
@stormi this is interesting....
To test whether the network is slow to initialize, with NTP disabled I ran a ping to see how quickly the host starts replying to ICMP after a reboot. I'm pinging its FQDN "xcp-ng-node1..." instead of its IP, and it starts replying as soon as the xsconsole screen appears (I have a monitor attached, for now). After you hit F8 it takes 47 seconds to boot back up and start replying to ICMP.
When I run the same test with NTP enabled, surprisingly the host starts replying after exactly 47 seconds as well, but on the monitor it stays stuck at the EFI_MEMMAP screen for 40 seconds, so in total it takes 86 seconds to show the xsconsole screen (on the attached monitor).
With NTP enabled, as soon as the host started replying to ICMP I was able to SSH into it and run a ping to time.cloudflare.com, while in the background the monitor was still stuck on EFI_MEMMAP.
This is NOT an issue in my case because my server will be headless.
So I think I'm good. I don't have any VMs running yet, but I will check whether it makes a difference for those to start up with NTP enabled and disabled.
-
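In case anyone wants to repeat the measurement, here is a minimal sketch of the timing loop (run from another machine; the host name below is a placeholder for your host's FQDN or IP):

    #!/bin/bash
    # Measure how long the host takes to answer ICMP again after a reboot.
    HOST="xcp-ng-node1.example.lan"   # placeholder, substitute your own host
    start=$(date +%s)
    # poll once per second until a single ping gets a reply
    until ping -c 1 -W 1 "$HOST" > /dev/null 2>&1; do
        sleep 1
    done
    echo "Host replied after $(( $(date +%s) - start )) seconds"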
I ran into something like this once. I notice you've got NTP set to use host names. You could check whether it's DNS lookups being slow to respond at that point in the boot by setting NTP to use IP addresses instead and seeing whether that's much faster.
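A quick way to sanity-check the DNS side from dom0 itself (just standard tools, nothing XCP-ng-specific):

    time getent hosts time.cloudflare.com   # how long a lookup takes right now
    cat /etc/resolv.conf                    # which resolver dom0 is actually using

This only shows the situation after boot, though; a lookup can still be slow or fail earlier, before the resolver is reachable.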
-
And I must add that NTP is very important if you are using more than one host in the pool, or if you're using Xen Orchestra.
-
Thanks for the advice @JeffBerntsen, I tried using Cloudflare's and Google's NTP IPs instead of their domains.
192.168.1.8 (Debian) is a VM, configured to auto-start, running inside the XCP-ng host 192.168.1.7.
I'm rebooting the XCP-ng host (from xsconsole) and pinging the Debian machine 192.168.1.8 to determine how long it takes to start replying to ICMP. Here are the times I got (min:sec):
- NTP DISABLED: 1:14, 1:12, 1:12, 1:12 (avg 1:12)
- NTP enabled FQDN: 1:43, 1:43, 1:48, 1:52 (avg 1:46)
- NTP enabled IP: 1:27, 1:23, 1:18, 1:21 (avg 1:22)
So using NTP with an IP, and hoping that it doesn't change, seems to be a good option.
Using the FQDN works as well, since an extra 30 seconds won't kill anyone, and that's the option I'll use, but I was just curious whether this is expected, @stormi.
-
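If you do want to pin the servers by IP, the change boils down to the server lines in chrony's config (a sketch; the addresses below are placeholders, use ones you have verified yourself, and if you manage NTP through xsconsole, change it there instead):

    # /etc/chrony.conf (excerpt)
    server 192.0.2.10 iburst    # placeholder IP of an NTP server
    server 192.0.2.11 iburst    # placeholder IP of a second NTP server
    # then restart the service:  systemctl restart chronyd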
You're very welcome. My solution to the similar problem I had was to set up a couple of internal systems as NTP servers, so that I always had something with the right time and a static IP address, and I pointed everything needing NTP at them.
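For reference, the chrony side of such an internal NTP server is only a few lines (a sketch, assuming chrony and example addresses matching the 192.168.1.0/24 network used earlier in the thread):

    # /etc/chrony.conf on the internal NTP server (excerpt)
    server 0.pool.ntp.org iburst   # upstream time source
    allow 192.168.1.0/24           # let LAN clients (e.g. the XCP-ng hosts) sync from this machine
    local stratum 10               # keep serving time even if the upstream is unreachable

The XCP-ng hosts then point their NTP configuration at this machine's static IP.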