XCP host rebooted: VMs won't start anymore :-(
-
I'm trying to understand how this VM could have been running for a few months without the other host being present. Where would the VHD file have been stored?
I have a copy of the VHD file here; can I create a new VM with that?
-
[17:22 xcp ~]# xe vdi-param-list uuid=06f7760e-157f-4a18-83fe-ba48db06a5ef
uuid ( RO): 06f7760e-157f-4a18-83fe-ba48db06a5ef
name-label ( RW): mailserver
name-description ( RW): Created by XO
is-a-snapshot ( RO): false
snapshot-of ( RO): <not in database>
snapshots ( RO):
snapshot-time ( RO): 19700101T00:00:00Z
allowed-operations (SRO): generate_config; update; forget; destroy; snapshot; copy; clone
current-operations (SRO):
sr-uuid ( RO): e1fb6d59-93c5-72bf-a018-184dd3ea3643
sr-name-label ( RO): Local storage
It says the sr-uuid is e1fb6d59-93c5-72bf-a018-184dd3ea3643. Isn't this the local storage SR of the current host?
-
Your mailserver disk is using SR e1fb6d59-93c5-72bf-a018-184dd3ea3643. This SR seems to belong to host ( RO): xcp, not xcp-ng-01.
-
@olivierlambert said in XCP host rebooted: VMs won't start anymore:
Your mailserver disk is using SR e1fb6d59-93c5-72bf-a018-184dd3ea3643. This SR seems to belong to host ( RO): xcp, not xcp-ng-01.
Yes, that's right: the host xcp is the one currently up, and the host xcp-ng-01 is the one that was 'lost'.
I really can't see the problem.
-
Then check your local SR (if it's correctly connected)
-
@olivierlambert said in XCP host rebooted: VMs won't start anymore:
Then check your local SR (if it's correctly connected)
What is the proper way to do that using the CLI?
-
@olivierlambert said in XCP host rebooted: VMs won't start anymore:
Then check your local SR (if it's correctly connected)
xe sr-scan uuid=e1fb6d59-93c5-72bf-a018-184dd3ea3643
The SR has no attached PBDs
sr: e1fb6d59-93c5-72bf-a018-184dd3ea3643 (Local storage)
How can I connect or attach a PBD?
-
That's your problem, indeed.
In XO, it's "connect"; otherwise it's xe pbd-connect
-
I managed to find the PBD, and it doesn't seem to be attached:
# xe pbd-list
uuid ( RO): 1a9396ae-e59b-9ea7-1d1a-3c5b139a11cb
host-uuid ( RO): f4d5a20d-e7f3-4e62-8804-e2caa6922a43
sr-uuid ( RO): e1fb6d59-93c5-72bf-a018-184dd3ea3643
device-config (MRO): device: /dev/disk/by-id/ata-WDC_WD1003FBYZ-012GB0_WD-WCAW3CYHV0PK
currently-attached ( RO): false
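For readers following along, the attachment check above can be narrowed to a single SR. This is a minimal sketch, assuming the `params=` and `--minimal` options of `xe pbd-list` behave as they do for other `xe` list commands; the function name `sr_attached` is hypothetical and not part of the `xe` CLI.

```shell
#!/bin/sh
# Hypothetical helper (not part of xe): report whether any PBD of the
# given SR is currently attached.
sr_attached() {
    sr_uuid="$1"
    # Assumption: --minimal prints only the values, comma-separated
    # when the SR has more than one PBD.
    state=$(xe pbd-list sr-uuid="$sr_uuid" params=currently-attached --minimal)
    case ",$state," in
        *,true,*) echo "attached" ;;
        *)        echo "detached" ;;
    esac
}
```

With the SR from this thread, the call would be `sr_attached e1fb6d59-93c5-72bf-a018-184dd3ea3643`.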
-
@olivierlambert said in XCP host rebooted: VMs won't start anymore:
That's your problem, indeed.
In XO, it's "connect"; otherwise it's xe pbd-connect
I think it is 'xe pbd-plug', because 'pbd-connect' doesn't seem to exist?
But using this command results in this:
[17:56 xcp ~]# xe pbd-plug uuid=1a9396ae-e59b-9ea7-1d1a-3c5b139a11cb
Error code: SR_BACKEND_FAILURE_40
Error parameters: , The SR scan failed [opterr=uuid=mailserver],
-
Yes indeed. Okay, at least the error message is very visible.
Why do you have a disk with a UUID of mailserver? Have you renamed your disk manually?
-
@olivierlambert said in XCP host rebooted: VMs won't start anymore:
Yes indeed. Okay, at least the error message is very visible.
Why do you have a disk with a UUID of mailserver? Have you renamed your disk manually?
To be honest, I really have no clue why it is called that.
Is there a way to fix this error, which was probably caused by the disk being shut off the hard way during the power failure?
-
No, the problem is a manual rename of the VHD in your SR.
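To make the diagnosis concrete: the scan error reported `opterr=uuid=mailserver`, which suggests the SR driver takes each VHD's file name (minus `.vhd`) as a VDI UUID. The sketch below lists VHD files whose names do not look like UUIDs. It only prints candidates and deletes nothing; the function name `list_non_uuid_vhds` is hypothetical, and the directory to pass is an assumption (later in this thread the SR turns out to be mounted under /run/sr-mount/ followed by the SR UUID).

```shell
#!/bin/sh
# Hypothetical helper: print every *.vhd file in a directory whose base
# name is not a lowercase hexadecimal UUID. Read-only; nothing is removed.
list_non_uuid_vhds() {
    dir="$1"
    for f in "$dir"/*.vhd; do
        # Skip the unexpanded glob when the directory has no .vhd files.
        [ -e "$f" ] || continue
        name=$(basename "$f" .vhd)
        if ! printf '%s\n' "$name" | grep -Eq \
            '^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'; then
            printf '%s\n' "$f"
        fi
    done
}
```

Anything it prints deserves a manual look (size, date, origin) before deciding what to do with it.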
-
@olivierlambert said in XCP host rebooted: VMs won't start anymore:
No, the problem is a manual rename of the VHD in your SR.
I checked /run/sr-mount/e1fb6d59-93c5-72bf-a018-184dd3ea3643 and there was a small 300 KB mailserver.vhd file dated October 15th. No clue why it was there.
I have removed it, and xe pbd-plug works now.
I also seem to be able to start the VM now.
-
Yes, you should never rename a file manually in the SR. It blocked the rescan, then the PBD plug, then the VM.
Also, please remove the unused host from your pool.
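The dependency chain described above (SR scan, then PBD plug, then VM start) implies a recovery order once the offending file is gone. A minimal sketch, assuming `xe pbd-list` accepts `currently-attached=false` as a filter alongside `sr-uuid=` and that `--minimal` returns a comma-separated list; `replug_sr` is a hypothetical name, not an `xe` command.

```shell
#!/bin/sh
# Hypothetical helper: plug every detached PBD of an SR, then rescan it.
replug_sr() {
    sr_uuid="$1"
    # Assumption: field filters can be combined on xe list commands.
    pbds=$(xe pbd-list sr-uuid="$sr_uuid" currently-attached=false \
        params=uuid --minimal)
    for pbd in $(printf '%s' "$pbds" | tr ',' ' '); do
        # Stop at the first failure so the error stays visible.
        xe pbd-plug uuid="$pbd" || return 1
    done
    xe sr-scan uuid="$sr_uuid"
}
```

After `replug_sr e1fb6d59-93c5-72bf-a018-184dd3ea3643` succeeds, the VM can be started as usual from XO or with `xe vm-start`.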
-
And finally, please consider getting pro support if you are running XCP-ng in production. You would have gotten faster assistance and, more importantly, you would be contributing to the project (helping us get more people involved).
-
@olivierlambert said in XCP host rebooted: VMs won't start anymore:
Yes, you should never rename a file manually in the SR. It blocked the rescan, then the PBD plug, then the VM.
Also, please remove the unused host from your pool.
Thanks for your help so far, I really appreciate it.
The host was removed earlier.
The VM seems to work fine, but I still can't install XOA from the web interface or from the CLI with the script?
-
Please remove the host first. Not physically, as you already did, but from the XAPI perspective.
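Removing a host "from the XAPI perspective" means deleting its record from the pool database, which on the CLI is done with `xe host-forget`. The sketch below only looks up the UUID and prints the command instead of running it, since forgetting a host is not something to do by accident. The function name `forget_host_cmd` is hypothetical; xcp-ng-01 is the lost host named earlier in the thread.

```shell
#!/bin/sh
# Hypothetical helper: look up a host by name-label and print (not run)
# the xe host-forget command that would drop it from the pool database.
forget_host_cmd() {
    name="$1"
    uuid=$(xe host-list name-label="$name" params=uuid --minimal)
    if [ -z "$uuid" ]; then
        echo "no host named $name in the pool database" >&2
        return 1
    fi
    echo "xe host-forget uuid=$uuid"
}
```

For example, `forget_host_cmd xcp-ng-01` prints the command to review and then run by hand on the pool master.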
-
@olivierlambert said in XCP host rebooted: VMs won't start anymore:
And finally, please consider getting pro support if you are running XCP-ng in production. You would have gotten faster assistance and, more importantly, you would be contributing to the project (helping us get more people involved).
I certainly will consider it, but I'm still in the process of deciding whether XCP-ng is something that suits me. I had some other quirky issues before that didn't convince me to put it in production (yet).
-
Well, renaming a VHD in the SR is a good way to learn NOT to do it.
Also, physically removing a host without removing it in XAPI is something else you have to avoid.