Issues when upgrading with HA enabled
-
Hi, I have the same problem as @txsastre. I have a two-node system in my lab, and I tried to update one of the nodes via yum while it was in "maintenance mode". Now I can't get it out of maintenance mode. When I check the notifications menu, under the events the server is stuck in a reboot state. Is there a way to force it out of the reboot state? The server is up and running; I can ssh into it and it has internet access. Hopefully it's just a bug in how XCP-ng Center checks the state.
-
You started with the pool master first, right?
-
@olivierlambert
Doh, hmm, unfortunately no. I was planning to upgrade the slave first and then move the VMs over to it while updating the master. I thought that if the master went down, it might cause "more" problems. Nevertheless, the 2nd node should at least start, shouldn't it?
Edit: Does that mean it might work properly once I upgrade the master as well? I mean, if this is some kind of sync problem?
One more thing regarding the update info in this topic: "then either keep the xcp-ng-7.5.repo file to benefit from bugfix updates during the RC phase or remove it, in any case remember to remove it when 7.5 final is released, or it will pollute your repository list for nothing." - Maybe it would be nice if the final version checked whether the file(s) exist and then asked if the user wants to remove them? That way it doesn't cause more problems.
-
Your host is stuck in maintenance because it can't communicate with the master, which is on an older version. That's why we specified in bold that you should ALWAYS START BY UPGRADING THE MASTER:
Reminder: always upgrade the master first (reboot it first, THEN the slaves).
As soon as the master is upgraded/rebooted, everything will come back to life.
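For anyone scripting this, here is a minimal sketch of the "master first" ordering. It assumes the `xe` CLI is available on a host, where `xe pool-list params=master --minimal` prints the master's UUID and `xe host-list --minimal` prints all host UUIDs separated by commas. The function name `upgrade_order` is made up for this example.

```shell
#!/bin/sh
# Print hosts in upgrade order: the pool master first, then every other host.
# $1 = UUID of the pool master, $2 = comma-separated UUIDs of all hosts.
upgrade_order() {
    master=$1
    echo "$master"
    echo "$2" | tr ',' '\n' | while read -r host; do
        # Skip empty entries and the master itself (already printed).
        if [ -n "$host" ] && [ "$host" != "$master" ]; then
            echo "$host"
        fi
    done
}

# On a real pool (assumes xe is installed):
# upgrade_order "$(xe pool-list params=master --minimal)" "$(xe host-list --minimal)"
```

The function only decides the order; the upgrade and reboot of each host are still done by hand, one host at a time.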
-
@olivierlambert
Ha, even though I actually read the text, I didn't see any bold text. But now that you point it out, I found it. Well, I'll try that too, sounds fair.
-
No worries, we'll put a very, very big warning for all users who want to upgrade for the final release.
You are not the first to experience this; some of our customers made the same mistake with XenServer upgrades in the past!
-
@olivierlambert
Maybe show a notice during the pre-check? Saying that you have to upgrade the master first (type: ok?). Fail-proof.
-
The only way is to do that in XOA, but we can't prevent people from running
yum update
on the host directly.
-
@olivierlambert @fraggan Hi there, I did start with the master... but I think I should have disabled HA before doing it?
Well, as I said, no worries, 2/3 of the pool is still working. I'm going to wait for the final release of 7.5 and then upgrade all of them with a CD/USB install. By the way, because of a new feature in 7.5 I will have to recreate the storage: it's FC storage hardware, and as I read it, 7.5 will allow thin provisioning on FC storage (until now it only worked on NFS, as far as I remember).
EDIT: I disabled HA on the 2 remaining hosts and upgraded both via yum with no problems.
I'm going to see if I can recover the 1/3 that had the problem; at the moment it says that there is no network card.
-
@txsastre yes, Citrix officially recommends disabling HA before upgrading. And always start with the master.
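A rough sketch of what turning HA off and on again looks like from the host CLI, assuming the standard `xe` commands `pool-ha-disable` and `pool-ha-enable`. The `run` helper, the `DRY_RUN` switch and the function names are made up for this example, and the heartbeat SR UUID is a placeholder you would have to supply. By default the script only prints the commands.

```shell
#!/bin/sh
# DRY_RUN=1 (the default) only prints each command; set DRY_RUN=0 to run them.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Show whether HA is currently enabled on the pool, then turn it off.
ha_off() {
    run xe pool-list params=ha-enabled
    run xe pool-ha-disable
}

# Turn HA back on once every host is upgraded.
# $1 = UUID of the SR used for the HA heartbeat (placeholder).
ha_on() {
    run xe pool-ha-enable heartbeat-sr-uuids="$1"
}

# Usage:
# ha_off
# ... upgrade and reboot the master, then the other hosts ...
# ha_on <heartbeat-sr-uuid>
```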
-
I added a warning in the RC post.
-
@olivierlambert
The point of using HA is kinda lost if I have to disable it every time I'm doing something. My way of thinking: I should enter maintenance mode, so the host moves all running VMs over to host2. Then upgrade it, move everything to node1 and repeat. At least there is no downtime. If I need to disable it... unless I move everything and then disable HA on the master, reboot and upgrade it, and then re-enable HA again. But then I would like to move the master role back to host1 (the 1st master). I couldn't find a way to transfer the master role. Is it possible? If yes, where can I find it?
@txsastre I didn't upgrade the master when going from 7.4 to 7.5-rc1. That's why I lost the connection (stuck in reboot state). I shut down every VM and upgraded the slave too. Then I rebooted both hosts and now 7.5-rc1 is working.
EDIT: Btw, this time I was talking about updating through the CLI, locally on the hosts, with "yum update".
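On the master-role question: the `xe` CLI has a `pool-designate-new-master` command for an orderly handover. Below is a minimal sketch, assuming it is run on a pool member; as far as I know the handover is refused while HA is enabled, so treat that as something to check first. The `run` helper, `DRY_RUN` switch and function name are made up for this example, and by default it only prints the commands.

```shell
#!/bin/sh
# DRY_RUN=1 (the default) only prints each command; set DRY_RUN=0 to run them.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Show the current master, then hand the role to another pool member.
# $1 = UUID of the host that should become the new master.
move_master_to() {
    run xe pool-list params=master --minimal
    run xe pool-designate-new-master host-uuid="$1"
}

# Usage (host UUIDs come from: xe host-list params=uuid,name-label):
# move_master_to <host1-uuid>
```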
-
Hmmm. Correction: it's not "every time I'm doing something", it's "when I'm upgrading between XCP-ng versions", which isn't an operation you'll do every day. This is a Citrix recommendation, not our own; I suppose they have enough experience to know that their own HA does crazy stuff when you upgrade your pool.
Disabling HA is straightforward, and just time to reboot your pool master (first).
-
Hmm, I have a feeling there is a misunderstanding here. Disabling HA doesn't affect the VMs running on the slave at all? And then, once I've rebooted the master and enabled HA again on the "slave" (old master), I can move the VMs back to it and then repeat the steps for all hosts? If that's the case, then it's OK. It's just that I understood it as: if I disable HA, I can't get the connections back between the hosts without having to shut everything down, upgrade all hosts and then start everything up again.
EDIT: "Disabling HA is straightforward, and just time to reboot your pool master (first)." - Yes, I learned my lesson the first time.
If I just want to update packages with the "yum update" command through the CLI, I don't need to disable it every time?
Is there a way to manually move the "master role"?
-
Disabling HA won't do anything to your VMs. Just migrate your VMs off the master (to any slave host) and reboot it. That's pretty much it. Do the same for all your hosts (migrate the VMs away, reboot, migrate the VMs back).
When it's done, just re-enable HA.
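The per-host steps above can be sketched with the `xe` CLI as follows. This assumes HA has already been disabled and that the remaining hosts have room for the evacuated VMs. The `run` helper, `DRY_RUN` switch and function name are made up for this example; by default nothing is executed, the commands are only printed.

```shell
#!/bin/sh
# DRY_RUN=1 (the default) only prints each command; set DRY_RUN=0 to run them.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Move the VMs off one host and reboot it. $1 = host UUID.
evacuate_and_reboot() {
    run xe host-disable uuid="$1"    # stop new VMs from starting on this host
    run xe host-evacuate uuid="$1"   # live-migrate resident VMs to other hosts
    run xe host-reboot uuid="$1"
}

# Usage: pool master first, then each slave once the previous host is back up.
# evacuate_and_reboot <master-uuid>
# evacuate_and_reboot <slave-uuid>
```

If a host does not come back enabled after the reboot, `xe host-enable uuid=<host-uuid>` puts it back in service. Migrating the VMs back afterwards is left as a manual step here.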
-
Don't mix things up.
yum update
can be used to upgrade (from 7.4 to 7.5), but most of the time it's used for simple updates (some don't even require rebooting the host).
HA won't interfere until you decide to reboot a host.
-
@olivierlambert
I was answering txsastre in the same post, hence the misunderstanding, I think. And when I went to disable HA on my host, I got a message indicating that it would break everything.
But the way you describe it, it sounds OK. So basically, every time one needs to upgrade something, major or minor, that's the way to do it, regardless of whether a reboot is necessary or not?
-
If you don't need to reboot, don't touch HA. HA is tricky when hosts have to reboot. I would avoid it in general, BTW… more dangerous than helpful in most cases.
-
@olivierlambert
Okay, it's that part that's tricky. Yeah, maybe so, but it's cool. And it improves uptime, even if it's just a lab. I used VMware for a while at my last job, but the licensing is not nice for a homelab. So this is my first time with XCP-ng (XenServer). I'll try updating tomorrow. Thanks.