Issues when upgrading with HA enabled



  • Hi there.

    I have to say that I upgraded from 7.4.1 to 7.5 and it hasn't worked.

    It installed everything, but once I rebooted the server it stayed in maintenance mode, and if I try to take it out of maintenance mode, it says that the server is still booting (more than 12 hours after the upgrade).

    I can connect via SSH, and I can see it from XOCE and XCP-ng Center, but nothing else works.



  • Maintenance mode probably means that some storage is stuck trying to connect. Check in the host view which SRs are connected to the host and which are not.
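
    The same check can probably be done from the host's command line over SSH. A minimal sketch, assuming the standard xe CLI is available on the host; the function name unplugged_srs is made up for illustration, and the host UUID is whatever `xe host-list` reports for the stuck host:

```shell
# Hypothetical helper: print the UUIDs of the SRs whose PBD (the host-to-SR
# connection) is NOT currently attached on the given host.
# An empty result means every SR known to that host is plugged.
# Usage: unplugged_srs <host-uuid>
unplugged_srs() {
    xe pbd-list host-uuid="$1" currently-attached=false params=sr-uuid --minimal
}
```

    If the list comes back empty, storage is unlikely to be what keeps the host in maintenance mode.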



  • @olivierlambert it looks like all the devices are correctly connected.

    I have disabled HA. I can still connect to the host via SSH, but now I cannot connect to it via XenCenter or XOCE...
    I tried to take it out of maintenance mode, but no luck.

    Well, it's not a big problem, as that is one of the development hosts and the other two still work with 7.4.1.

    0_1533118680016_Selecció_022.jpg



  • How did you upgrade? ISO or yum?

    HA is always a quagmire in XenServer code 😕



  • I upgraded via yum and HA was enabled.



  • That's probably the most dangerous test that you could imagine 😄



  • @olivierlambert well someone had to test it 😁



  • Hi, I have the same problem as @txsastre. I have a two-node system in my lab, and I tried to update one of the nodes via yum while it was in "maintenance mode". Now I can't get it out of maintenance mode. When I check the notifications menu, the events show the server stuck in a reboot state. Is there a way to force it out of the reboot state? The server is up and running; I can SSH into it and it has internet access. Hopefully it's just a bug in how XCP-ng Center checks the state.



  • You started with the pool master first, right?



  • @olivierlambert
    D'oh, hmm, unfortunately no. I was planning to upgrade the slave first and then move the VMs over while updating the master. I thought that if the master went down it might cause "more" problems. Nevertheless, the 2nd node should at least start, shouldn't it? 🙂

    Edit: Does that mean that it might work properly if I upgrade the master as well? I mean, if there are some sync problems?

    One more thing regarding the update info in this topic: "then either keep the xcp-ng-7.5.repo file to benefit from bugfix updates during the RC phase or remove it, in any case remember to remove it when 7.5 final is released, or it will pollute your repository list for nothing." - Maybe it would be nice if the full version checked whether the file(s) exist and then asked if the user wants to remove them? That way it doesn't cause more problems.



  • Your host is stuck in maintenance mode because it can't communicate with the master, which is on an older version. That's why we specified in bold that you must ALWAYS START BY UPGRADING THE MASTER:

    Reminder: always upgrade the master first (reboot it first, THEN the slaves).

    As soon as the master is upgraded and rebooted, everything will come back to life.
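
    To make the "master first" order hard to get wrong, it can help to print the order before touching anything. A rough sketch, assuming the standard xe CLI and run from any host in the pool; upgrade_order is a made-up helper name, not part of XCP-ng:

```shell
# Hypothetical helper: print host UUIDs in the order they should be
# upgraded and rebooted: the pool master first, then every other host.
upgrade_order() {
    # UUID of the current pool master
    master=$(xe pool-list params=master --minimal)
    echo "$master"
    # --minimal returns a comma-separated list; split it and drop the master
    xe host-list --minimal | tr ',' '\n' | grep -vxF -- "$master" || true
}
```

    The first line of output is the host to upgrade and reboot first; the remaining lines are the slaves.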



  • @olivierlambert
    Ha, even though I actually read the text, I didn't see any bold text 🙂 But now that you point it out, I found it. Well, I'll try that too, sounds fair 🙂



  • No worries, we'll put a very, very big warning in the final release announcement for all users who want to upgrade 🙂

    You are not the first to experience this: some of our customers made the same mistake with XenServer upgrades in the past!



  • @olivierlambert
    Maybe show a notice during the pre-check? 🙂 One saying that you have to upgrade the master first (type: ok?). Fail-proof.



  • The only way to do that is in XOA; we can't prevent people from running yum update on the host directly.



  • @olivierlambert @fraggan Hi there, I did start with the master... but I think I should have disabled HA before doing it?

    Well, as I said, it's no worry: 2/3 of the pool is still working. I'm going to wait for the final release of 7.5 and then upgrade all of them with a CD/USB install. By the way, because of a new feature in 7.5 I will have to recreate the storage: it's FC storage hardware, and from what I read 7.5 will allow thin provisioning on FC storage (until now it only worked on NFS, as far as I remember).

    EDIT: I disabled HA on the 2 remaining hosts and upgraded both via yum with no problems 🙂
    I'm going to see if I can recover the one host that had the problem; at the moment it says that there is no network card.



  • @txsastre yes, Citrix officially recommends disabling HA before upgrading, and always starting with the master.
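
    From the command line, turning HA off before the upgrade might look like the sketch below. It assumes the standard xe CLI; disable_ha_if_enabled is a made-up name for illustration:

```shell
# Hypothetical helper: disable HA on the pool, but only if it is
# currently enabled, so the call is safe to repeat.
disable_ha_if_enabled() {
    if [ "$(xe pool-list params=ha-enabled --minimal)" = "true" ]; then
        xe pool-ha-disable
    fi
}

# After every host is upgraded and rebooted, HA can be turned back on with:
#   xe pool-ha-enable heartbeat-sr-uuids=<sr-uuid>
# where <sr-uuid> is the shared SR used for the HA heartbeat.
```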



  • I added a warning in the RC post.



  • @olivierlambert
    Using HA is kind of pointless if I have to disable it every time I'm doing something 🙂 The way I see it: I should enter maintenance mode so the host moves all running VMs over to host2, then upgrade it, move everything back to node1 and repeat. That way there is no downtime. If I need to disable it... unless I move everything and then disable HA on the master, reboot and upgrade it, and then re-enable HA. But then I would like to move the master role back to host1 (the first master). I couldn't find a way to transfer the master role. Is it possible, and if so, where can I find it?

    @txsastre I didn't upgrade the master when going from 7.4 to 7.5-rc1. That's why I lost the connection (stuck in reboot state). I shut down every VM and upgraded the slave too. Then I rebooted both hosts, and now 7.5-rc1 is working.

    EDIT: Btw this time I was talking about updating through CLI, locally on the hosts with "yum update".
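
    For the CLI route, the two steps described above (emptying a host before upgrading it, and handing over the master role) could be sketched as follows. This assumes the standard xe CLI; the function names are made up for illustration, and whether the master role can be moved while HA is still enabled is not something this thread confirms:

```shell
# Hypothetical helper: put a host into maintenance mode from the CLI.
# host-disable stops new VMs from starting on it; host-evacuate then
# live-migrates its running VMs to other hosts in the pool.
enter_maintenance() {
    xe host-disable uuid="$1" && xe host-evacuate uuid="$1"
}

# Hypothetical helper: hand the pool master role to another host.
move_master_to() {
    xe pool-designate-new-master host-uuid="$1"
}

# Once the host is upgraded and rebooted, it can be brought back with:
#   xe host-enable uuid=<host-uuid>
```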



  • Hmmm. Correction: it's not "every time I'm doing something", it's "when I'm upgrading between XCP-ng versions", which isn't an operation you'll do every day. This is a Citrix recommendation, not our own. I suppose they have enough experience to know that their own HA does crazy stuff when you upgrade your pool.

    Disabling HA is straightforward, and it only needs to stay off for the time it takes to reboot your pool master (first).