Latest posts made by dsmteam
-
RE: XO Console: Modifier keys stuck, unable to enter passwords
@DustinB Thanks. I didn't realize /v6 was available on XOA; I thought it was specific to the host.
Unfortunately, nothing works for users who are not admins (users can't even view their VMs).
There is still a lot of work to do on permissions, as Olivier mentioned.
-
RE: XO Console: Modifier keys stuck, unable to enter passwords
@olivierlambert Thanks Olivier. Unfortunately, XO Lite cannot be used in our configuration.
So it will be XO 6... in 2025?
-
RE: XO Console: Modifier keys stuck, unable to enter passwords
We intend to give customers open access to resource pools, and this can only be done with Xen Orchestra.
This AltGr issue has been present since at least 2020 but hasn't been fixed yet.
-
After days of research and tinkering : a working guide for Debian 12 template with cloud-init and DHCP
I spent days trying to build a Debian template (this probably applies to other Linux distributions too).
The main issue I was facing was that when I created multiple machines, they would all get the same IP from our DHCP server.
The reason is that Debian sends the machine-id (stored in /etc/machine-id) as the DHCP client identifier.
Adding dhcp-client-identifier = hardware; to /etc/dhcp/dhclient.conf did not help. Deleting /etc/machine-id did not work either: for some reason cloud-init did not generate a new ID, and the VM did not request an IP at all.
This is what I did:
Downloaded the latest bookworm raw file from https://cdimage.debian.org/images/cloud/ and imported it as a disk in XO
Booted a VM from a random template with some network attached (internet access will be useful in a few steps)
Deleted the existing disk and attached the raw disk I uploaded
Converted the VM to a template
Created a VM from this template with my SSH key
Once it has booted, install dmidecode (https://packages.debian.org/search?keywords=dmidecode ); do your own due diligence to get the latest .deb, then install it with dpkg -i
Also install the XCP-ng guest tools
Then run:
sudo cloud-init clean
sudo cloud-init clean --logs
rm /home/debian/.ssh/authorized_keys
sudo mkdir -p /var/lib/cloud/scripts/per-once/ (folders get deleted on cloud-init clean)
cd /var/lib/cloud/scripts/per-once/
sudo nano generate-machine-id.sh
(coming from user modem7 on GitHub)
#!/bin/bash
# KVM UUID Recreator
# Use this for new VM's or templates that require a unique machine ID.
if [[ $EUID -ne 0 ]]; then
    echo "This script must be run as root"
    exit 1
fi
UUID=$(dmidecode -s system-uuid | tr -d '-')
if grep -q "$UUID" /etc/machine-id; then
    echo "UUID matches"
else
    echo "UUID does not match. Recreating."
    echo -n > /etc/machine-id && echo -n > /var/lib/dbus/machine-id && systemd-machine-id-setup && reboot
fi
sudo chmod +x generate-machine-id.sh
sudo cat /dev/null > ~/.bash_history && history -c && shutdown now
You can now rename the VM and its disk, delete the network card (so that the template does not get tags with the IPv4 and IPv6 addresses added automatically), and convert the VM to a template.
You should now have a working Debian 12 template that is accessible with your SSH key (if you add it on deploy), with DHCP working and no overlapping addresses. Hopefully, I did not forget anything.
On first start, the VM will reboot once after the first prompt. The reboot is required for the machine-id change to take effect.
This is a lot of work and I have no doubt there is a simpler solution but I couldn't find it.
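The root cause described in this guide is clones sharing one machine-id, which leads the DHCP server to hand out the same address. Below is a minimal sketch for confirming that deployed VMs ended up with distinct IDs. It assumes you have collected one "name machine-id" pair per line, for example by reading /etc/machine-id on each VM. The function name find_duplicate_ids and the input format are hypothetical and not part of the original guide.

```shell
#!/bin/bash
# Hypothetical helper (not from the original guide).
# Reads "vm-name machine-id" pairs on stdin, one per line, and prints
# every machine-id that appears more than once, in sorted order.
find_duplicate_ids() {
    # Skip blank or incomplete lines, keep only the ID column,
    # then let uniq -d report IDs that occur at least twice.
    awk 'NF >= 2 { print $2 }' | sort | uniq -d
}
```

Empty output means every listed VM has its own ID. Any ID that is printed is shared by at least two VMs, which would be expected to collide on DHCP in the way described above.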
-
RE: HA failover reaction time question
@dsmteam I am still browsing the web and various XO forum threads, but it looks like those parameters are defined in the .c and other precompiled files, so the XCP-ng builds are probably using those default parameters.
-
RE: HA failover reaction time question
@olivierlambert Unfortunately, the parameters are reverted back to their default value when I turn on HA. Might be hard coded somewhere.
-
RE: HA failover reaction time question
@olivierlambert I think I found what I need in the following documentation
https://xapi-project.github.io/features/HA/HA.html
Various parameters, which must be the same on every host, are set in /etc/xensource/xhad.conf:
<parameters>
  <HeartbeatInterval>4</HeartbeatInterval>
  <HeartbeatTimeout>30</HeartbeatTimeout>
  <StateFileInterval>4</StateFileInterval>
  <StateFileTimeout>30</StateFileTimeout>
  <HeartbeatWatchdogTimeout>30</HeartbeatWatchdogTimeout>
  <StateFileWatchdogTimeout>45</StateFileWatchdogTimeout>
  <BootJoinTimeout>90</BootJoinTimeout>
  <EnableJoinTimeout>90</EnableJoinTimeout>
  <XapiHealthCheckInterval>60</XapiHealthCheckInterval>
  <XapiHealthCheckTimeout>10</XapiHealthCheckTimeout>
  <XapiRestartAttempts>1</XapiRestartAttempts>
  <XapiRestartTimeout>30</XapiRestartTimeout>
  <XapiLicenseCheckTimeout>30</XapiLicenseCheckTimeout>
</parameters>
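Since these parameters have to match on every host, it can help to read a single value out of the file and compare it across hosts. Below is a minimal sketch, assuming each element is written in the <Name>value</Name> form shown above. The function name xhad_param is hypothetical. The sketch only reads the file and does not change it.

```shell
#!/bin/bash
# Hypothetical helper (not part of XCP-ng or xhad).
# Usage: xhad_param FILE NAME
# Prints the text inside the first <NAME>...</NAME> element found in FILE.
xhad_param() {
    local file=$1 name=$2
    # [^<]* stops at the closing tag, so this also works when several
    # elements share one line.
    sed -n "s:.*<${name}>\([^<]*\)</${name}>.*:\1:p" "$file" | head -n 1
}
```

For example, running xhad_param /etc/xensource/xhad.conf HeartbeatTimeout on each host should print the same number everywhere. An empty result means the element was not found in that file.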
-
RE: HA failover reaction time question
@Danp Oh..................
Indeed, much faster now. Down from 2:00 minutes to 1:20 minutes
Less than 10 seconds might be too aggressive.
This is closer to what we expect.
I can see in the GUI that when I bring a host down, the pool still takes a minute to consider the host down. Is there any way to decrease this timer further, or are there too many dependencies?
-
RE: HA failover reaction time question
@olivierlambert I just tried, but there is no change in reaction time.
After googling this parameter, I found this page you wrote (small world) on the xcp-ng.org website: https://xcp-ng.org/blog/2024/08/22/xcp-ng-high-availability-a-guide/
It indicates that the purpose of this timeout is self-fencing in case of loss of network/storage (I actually had this page open in my browser already but missed this line).
It doesn't seem to influence the restart timer in case of a full host failure.
-
RE: HA failover reaction time question
@olivierlambert Thanks a lot.
We have no SPOF and a full-fiber 100Gb spine/leaf network infrastructure, so I will give it a go (currently we are only on a test platform, so I can do as much as I need).