Change/Remove Master from Pool error - missing column
-
Hi Team,
I want to remove the master from our production pool. I get the following error:
xe pool-designate-new-master host-uuid=<<other pool server uuid>>
The server failed to handle your request, due to an internal error. The given message may give details useful for debugging the problem.
message: missing column
<extra>: host
<extra>: https_only
I have made sure that the current Pool master was fully updated first, and rebooted.
I then updated all other hosts in the pool (5 in total).
yum update
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
Excluding mirror: updates.xcp-ng.org
 * xcp-ng-base: mirrors.xcp-ng.org
Excluding mirror: updates.xcp-ng.org
 * xcp-ng-updates: mirrors.xcp-ng.org
No packages marked for update
I have even shut down the master, which dropped the XO connection, and run the change-master command from the command line on the other servers, which of course did nothing.
I have no VMs on the current pool master, and have tried putting it in maintenance mode first.
In XO (from sources) I have tried to Detach the master, which generates the following error in the logs:
host.detach
{
  "host": "d2d01b9a-e6d1-4dd0-90e6-f101a251c7b1"
}
{
  "code": "INTERNAL_ERROR",
  "params": [
    "Xapi_pool.Cannot_eject_master"
  ],
  "call": {
    "method": "pool.eject",
    "params": [
      "OpaqueRef:90ba839f-61a2-4f96-b485-df6332fb0bce"
    ]
  },
  "message": "INTERNAL_ERROR(Xapi_pool.Cannot_eject_master)",
  "name": "XapiError",
  "stack": "XapiError: INTERNAL_ERROR(Xapi_pool.Cannot_eject_master)
    at Function.wrap (/opt/xo/xo-builds/xen-orchestra-202310302357/packages/xen-api/src/_XapiError.js:16:12)
    at /opt/xo/xo-builds/xen-orchestra-202310302357/packages/xen-api/src/transports/json-rpc.js:35:21
    at runNextTicks (node:internal/process/task_queues:60:5)
    at processImmediate (node:internal/timers:447:9)
    at process.callbackTrampoline (node:internal/async_hooks:130:17)"
}
I have run
xe host-list
to get the other host UUIDs from all 5 servers, and have tried the
xe pool-designate-new-master host-uuid=
command on the current master and on all other servers, with the same error message each time.
I have even tried the Windows controller MSI via the GUI, which appears to time out and fail.
Host list:
XO Errors Screen:
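For anyone retracing the lookup described above, here is a minimal sketch of how the candidate hosts could be picked out. The helper name `candidate_masters` is hypothetical (it is not part of `xe`); the `xe` calls in the trailing comment assume the standard `--minimal` output, which is a comma-separated list of UUIDs.

```shell
#!/usr/bin/env bash
# Sketch only: list pool members other than the current master.
# candidate_masters is a hypothetical helper, not an xe subcommand.

# Print every UUID from a comma-separated list except the master's,
# one UUID per line.
candidate_masters() {
    local all_hosts="$1" master="$2" uuid
    local IFS=','
    for uuid in $all_hosts; do
        if [ "$uuid" != "$master" ]; then
            printf '%s\n' "$uuid"
        fi
    done
}

# On a pool member, the inputs would come from xe, for example:
#   master=$(xe pool-list params=master --minimal)
#   all_hosts=$(xe host-list --minimal)
#   candidate_masters "$all_hosts" "$master"
```

Each printed UUID is a possible value for `host-uuid=` in `xe pool-designate-new-master`.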
-
Okay Team,
For my reference, and for anyone else finding this: I was able to recover the pool by completing the following steps.
As this is our production cluster, I did it late at night, thankfully with no downtime.
1: I shut down the old master, which had no VMs running.
2: On another host that I wanted to become the master, I connected over SSH and ran the following command:
xe pool-emergency-transition-to-master
I got the following response:
Host agent will restart and transition to master in 10.000 seconds...
3: I then went into
XO
=>Settings
=>Server
and connected to the new master.
4: After a good 10-15 seconds, on the new master I ran the following command:
xe pool-recover-slaves
After a few seconds, the GUIDs of the other hosts in the pool appeared:
f600ea3a-cc02-4f24-a15e-938756feb00c
d123525c-6022-43c9-8c23-d0f1ca567219
16c7c955-f65e-4d5b-96e7-787085c2d25f
And all hosts are showing correctly in XO!
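The command-line part of the steps above could be sketched as a small script. This is an illustration under assumptions, not a tested procedure: the `DRY_RUN` switch and the function names `run` and `promote_and_recover` are hypothetical, the 15-second wait is taken from the "10-15 seconds" mentioned in the post, and the XO reconnection step is left out because it is done in the GUI. By default the sketch only prints what it would run.

```shell
#!/usr/bin/env bash
# Sketch only: the xe part of the recovery, printed rather than
# executed unless DRY_RUN=0 is set. DRY_RUN, run and
# promote_and_recover are hypothetical names for this example.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Run on the member chosen as the new master, after the old master
# has been shut down.
promote_and_recover() {
    # Promote this host to pool master.
    run xe pool-emergency-transition-to-master
    # Give the host agent time to restart as master.
    run sleep 15
    # Point the remaining pool members at the new master.
    run xe pool-recover-slaves
}

# Usage (dry run, prints the three commands):
#   promote_and_recover
```

Keeping dry-run as the default is deliberate: an emergency transition changes pool state, so it seems safer to review the printed commands before setting `DRY_RUN=0`.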
-
I've found this post:
https://docs.xenserver.com/en-us/citrix-hypervisor/dr/machine-failures.html#master-failures
It looks like I can run
xe pool-emergency-transition-to-master
on one of the other members of the pool.
Is this the best/safest way to change the pool master?
@olivierlambert Do you think this is a solution to the errors?
Thanks
-
Hi,
Are all your hosts in the pool fully updated? (not just the master)
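One way to answer this across several hosts is to look for the "No packages marked for update" line that `yum update` prints when nothing is pending, as in the output quoted in the first post. The sketch below assumes that wording; the helper name `is_up_to_date`, the host names, and root SSH access in the usage comment are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: decide from captured yum output (read on stdin)
# whether a host has no pending updates.
# is_up_to_date is a hypothetical helper name.
is_up_to_date() {
    grep -q 'No packages marked for update'
}

# Hypothetical usage, assuming root SSH access to each pool member
# (host1 and host2 are placeholders); --assumeno makes yum answer
# "no" to any prompt, so nothing is installed:
#   for h in host1 host2; do
#       if ssh "root@$h" 'yum update --assumeno' 2>&1 | is_up_to_date; then
#           echo "$h: up to date"
#       else
#           echo "$h: updates pending or check failed"
#       fi
#   done
```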
-
@olivierlambert Yes, I updated the master and rebooted it first.
Then updated the other hosts and rebooted each of those.
-
Good to know you made it
What about the old primary after that? Did you start it again, or just remove it for good?
-
@olivierlambert Good question,
Once I booted it back up, it rejoined the pool as a slave without issues. But I have since removed it from the pool and am re-installing it as we speak.
We're moving data centres, so I'm splitting the pool up to do a part-move.
Thanks for your interest and assistance.
-
Okay, glad to know it works! Good luck for the rest of your operations!