XCP host rebooted: VM's wont start anymore :-(
-
@olivierlambert
I can see the VHD file on the local disk of this host, and it has always been there AFAIK. The other host wasn't used for this VM, and the VM ran fine for several months without the other host being present. So I assume the disk is on this host?
-
Well, you can display the VM disk list with:
xe vm-disk-list uuid=<VM UUID>
Then you can find info on each of those disks with:
xe vdi-param-list uuid=<VDI UUID>
Then you'll see which SR each disk is on, and you'll understand why the VM can't boot.
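The lookup described above can be scripted. This is only a minimal sketch: `vm_disk_srs` is a made-up helper name, and it assumes that `xe vbd-list` with `--minimal` returns the VDI UUIDs as one comma-separated line and that empty CD drives show up as `<not in database>`.

```shell
# vm_disk_srs (hypothetical helper): print "<VDI UUID> <SR UUID> <SR name>"
# for every disk attached to the VM whose UUID is passed as $1.
vm_disk_srs() {
    vm_uuid="$1"
    # Assumption: --minimal prints the values as one comma-separated line.
    vdis="$(xe vbd-list vm-uuid="$vm_uuid" params=vdi-uuid --minimal)"
    echo "$vdis" | tr ',' '\n' | while read -r vdi; do
        # Skip blank entries and drives that have no VDI behind them.
        case "$vdi" in
            ""|"<not in database>") continue ;;
        esac
        # </dev/null keeps xe from consuming the loop's input.
        sr_uuid="$(xe vdi-param-get uuid="$vdi" param-name=sr-uuid </dev/null)"
        sr_name="$(xe vdi-param-get uuid="$vdi" param-name=sr-name-label </dev/null)"
        echo "$vdi $sr_uuid $sr_name"
    done
}
```

Usage would be `vm_disk_srs <VM UUID>`; any disk whose SR is not the one you expect is the first place to look.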
-
I'm trying to understand how it is possible that this VM has been running for a few months without the other host being present. Where would the VHD file have been stored?
I have a copy of the VHD file here; can I create a new VM with it?
-
[17:22 xcp ~]# xe vdi-param-list uuid=06f7760e-157f-4a18-83fe-ba48db06a5ef
uuid ( RO) : 06f7760e-157f-4a18-83fe-ba48db06a5ef
name-label ( RW): mailserver
name-description ( RW): Created by XO
is-a-snapshot ( RO): false
snapshot-of ( RO): <not in database>
snapshots ( RO):
snapshot-time ( RO): 19700101T00:00:00Z
allowed-operations (SRO): generate_config; update; forget; destroy; snapshot; copy; clone
current-operations (SRO):
sr-uuid ( RO): e1fb6d59-93c5-72bf-a018-184dd3ea3643
sr-name-label ( RO): Local storage
It says the sr-uuid is e1fb6d59-93c5-72bf-a018-184dd3ea3643. Isn't this the local storage SR of the current host?
-
Your mailserver disk is using SR e1fb6d59-93c5-72bf-a018-184dd3ea3643. This SR seems to belong to host ( RO): xcp, not xcp-ng-01.
-
@olivierlambert said in XCP host rebooted: VM's wont start anymore :
Your mailserver disk is using SR e1fb6d59-93c5-72bf-a018-184dd3ea3643. This SR seems to belong to host ( RO): xcp, not xcp-ng-01.
Yes, that's right: the host xcp is the one currently up, and the host xcp-ng-01 is the one that was 'lost'.
I really can't see the problem.
-
Then check your local SR (if it's correctly connected)
-
@olivierlambert said in XCP host rebooted: VM's wont start anymore :
Then check your local SR (if it's correctly connected)
What is the proper way to do that using the CLI?
-
@olivierlambert said in XCP host rebooted: VM's wont start anymore :
Then check your local SR (if it's correctly connected)
xe sr-scan uuid=e1fb6d59-93c5-72bf-a018-184dd3ea3643
The SR has no attached PBDs
sr: e1fb6d59-93c5-72bf-a018-184dd3ea3643 (Local storage)
How can I connect or attach a PBD?
-
That's your problem, indeed
In XO, it's "connect", otherwise it's
xe pbd-connect
-
I managed to find the PBD, and it doesn't seem to be attached:
# xe pbd-list
uuid ( RO) : 1a9396ae-e59b-9ea7-1d1a-3c5b139a11cb
host-uuid ( RO): f4d5a20d-e7f3-4e62-8804-e2caa6922a43
sr-uuid ( RO): e1fb6d59-93c5-72bf-a018-184dd3ea3643
device-config (MRO): device: /dev/disk/by-id/ata-WDC_WD1003FBYZ-012GB0_WD-WCAW3CYHV0PK
currently-attached ( RO): false
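For reference, finding and plugging the unattached PBD of an SR can be sketched as below. `plug_sr_pbds` is a hypothetical helper name, and the sketch assumes that `xe pbd-list` accepts `sr-uuid=` and `currently-attached=false` as field filters and that `--minimal` returns a comma-separated list. If the plug fails, the error from `xe pbd-plug` is what you want to read.

```shell
# plug_sr_pbds (hypothetical helper): plug every unattached PBD of the SR
# whose UUID is passed as $1.
plug_sr_pbds() {
    sr_uuid="$1"
    # Assumption: field filters and --minimal behave as described above.
    pbds="$(xe pbd-list sr-uuid="$sr_uuid" currently-attached=false params=uuid --minimal)"
    if [ -z "$pbds" ]; then
        echo "no unattached PBD for SR $sr_uuid"
        return 0
    fi
    echo "$pbds" | tr ',' '\n' | while read -r pbd; do
        [ -n "$pbd" ] || continue
        echo "plugging PBD $pbd"
        # Report a failed plug on stderr without aborting the loop.
        xe pbd-plug uuid="$pbd" </dev/null || echo "pbd-plug failed for $pbd" >&2
    done
}
```

It would be called as `plug_sr_pbds e1fb6d59-93c5-72bf-a018-184dd3ea3643` for the SR discussed here.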
-
@olivierlambert said in XCP host rebooted: VM's wont start anymore :
That's your problem, indeed
In XO, it's "connect", otherwise it's
xe pbd-connect
I think it is 'xe pbd-plug', because 'pbd-connect' doesn't seem to exist?
But using this command results in this:
[17:56 xcp ~]# xe pbd-plug uuid=1a9396ae-e59b-9ea7-1d1a-3c5b139a11cb
Error code: SR_BACKEND_FAILURE_40
Error parameters: , The SR scan failed [opterr=uuid=mailserver],
-
Yes indeed. Okay, at least the error message is very visible.
Why do you have a disk with the UUID mailserver? Have you renamed your disk manually?
-
@olivierlambert said in XCP host rebooted: VM's wont start anymore :
Yes indeed. Okay, at least the error message is very visible.
Why do you have a disk with the UUID mailserver? Have you renamed your disk manually?
To be honest, I really have no clue why it is called that.
Is there a way to fix this error, which was probably caused by the disk being shut off the hard way during the power failure?
-
No, the problem is a manual rename of the VHD in your SR.
-
@olivierlambert said in XCP host rebooted: VM's wont start anymore :
No, the problem is a manual rename of the VHD in your SR.
I checked /run/sr-mount/e1fb6d59-93c5-72bf-a018-184dd3ea3643 and there was a small 300 KB mailserver.vhd file dated October 15th. No clue why it was there.
I have removed it, and xe pbd-plug works now.
I also seem to be able to start the VM now.
-
Yes, you should never rename a file manually in the SR. It blocked the rescan, then the PBD plug, then the VM.
Also, please remove the unused host from your pool.
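To spot a stray file like the mailserver.vhd in this thread, something like the following can list suspicious names. It is a sketch under one assumption: on a file-based local SR, every VDI is stored as `<VDI UUID>.vhd`, so any `.vhd` whose name is not a UUID deserves a look. `list_stray_vhds` is a hypothetical helper name; it only lists files and never deletes anything.

```shell
# list_stray_vhds (hypothetical helper): print the .vhd files in directory $1
# whose base name is not a lowercase 8-4-4-4-12 hexadecimal UUID.
list_stray_vhds() {
    sr_dir="$1"
    for f in "$sr_dir"/*.vhd; do
        [ -e "$f" ] || continue    # the glob matched nothing
        name="$(basename "$f" .vhd)"
        if ! printf '%s\n' "$name" | grep -Eq '^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'; then
            echo "$f"
        fi
    done
}
```

Here it would be run as `list_stray_vhds /run/sr-mount/e1fb6d59-93c5-72bf-a018-184dd3ea3643`. Check what a listed file actually is before touching it.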
-
And finally, please consider getting pro support if you are running XCP-ng in production. You would have gotten faster assistance and, more importantly, you would be contributing to the project (helping us get more people involved).
-
@olivierlambert said in XCP host rebooted: VM's wont start anymore :
Yes, you should never rename a file manually in the SR. It blocked the rescan, then the PBD plug, then the VM.
Also, please remove the unused host from your pool.
Thanks for your help so far, I really appreciate it.
The host was removed earlier.
The VM seems to work fine, but I still can't install XOA from the web interface or from the CLI with the script?
-
Please remove the host first. Not physically, as you already did, but from the XAPI perspective.
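From the CLI, removing a dead host from the pool database is normally done with `xe host-forget`. The sketch below only looks up the UUID and prints the command instead of running it, so that it can be reviewed first. `print_host_forget_cmd` is a hypothetical helper name, and it assumes the lost host is still listed under its name-label (xcp-ng-01 in this thread).

```shell
# print_host_forget_cmd (hypothetical helper): look up a host by name-label
# and print, without running it, the command that would forget that host.
print_host_forget_cmd() {
    host_name="$1"
    # Assumption: --minimal prints just the UUID, or nothing if no host matches.
    host_uuid="$(xe host-list name-label="$host_name" params=uuid --minimal)"
    if [ -z "$host_uuid" ]; then
        echo "no host named $host_name in the pool database" >&2
        return 1
    fi
    echo "xe host-forget uuid=$host_uuid"
}
```

For example, `print_host_forget_cmd xcp-ng-01` would print the command to run once you have checked that the UUID really is the lost host.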