XCP-ng

    XCP host rebooted: VM's wont start anymore :-(

      olivierlambert Vates 🪐 Co-Founder CEO

      Well, you can display the VM's disk list with xe vm-disk-list uuid=<VM UUID>.

      Then you can get details on each of those disks with xe vdi-param-list uuid=<VDI UUID>.

      Then you'll see which SR each disk is on, and you'll understand why the VM can't boot.
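
The lookup described above can be sketched as a small shell helper. This is only a sketch: the function name `vdi_sr_summary` is made up for illustration, and it assumes the `key ( RO): value` layout that `xe vdi-param-list` prints.

```shell
# Hypothetical helper (not part of xe): reads `xe vdi-param-list` output on
# stdin and prints "<sr-uuid> <sr-name-label>" for that VDI.
vdi_sr_summary() {
    awk -F': ' '
        /^ *sr-uuid /       { uuid  = $2 }
        /^ *sr-name-label / { label = $2 }
        END                 { print uuid, label }
    '
}

# Usage on the host, once per VDI UUID shown by `xe vm-disk-list`:
#   xe vdi-param-list uuid=<VDI UUID> | vdi_sr_summary
```

Running it for every disk of the VM should show at a glance whether any disk sits on an SR that is missing or not attached.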

        prensel @olivierlambert

        @olivierlambert

        I'm trying to understand how this VM could have been running for a few months without the other host being present. Where would the VHD file have been stored?
        I have a copy of the VHD file here; can I create a new VM with it?

          prensel @olivierlambert

          @olivierlambert

          [17:22 xcp ~]# xe vdi-param-list uuid=06f7760e-157f-4a18-83fe-ba48db06a5ef
          uuid ( RO)                    : 06f7760e-157f-4a18-83fe-ba48db06a5ef
                        name-label ( RW): mailserver
                  name-description ( RW): Created by XO
                     is-a-snapshot ( RO): false
                       snapshot-of ( RO): <not in database>
                         snapshots ( RO): 
                     snapshot-time ( RO): 19700101T00:00:00Z
                allowed-operations (SRO): generate_config; update; forget; destroy; snapshot; copy; clone
                current-operations (SRO): 
                           sr-uuid ( RO): e1fb6d59-93c5-72bf-a018-184dd3ea3643
                     sr-name-label ( RO): Local storage
          

          It says the sr-uuid is e1fb6d59-93c5-72bf-a018-184dd3ea3643, which is the local storage SR of the current host??

            olivierlambert Vates 🪐 Co-Founder CEO

            Your mailserver disk is using SR e1fb6d59-93c5-72bf-a018-184dd3ea3643. This SR seems to belong to host ( RO): xcp, not xcp-ng-01.

              prensel @olivierlambert

              @olivierlambert said in XCP host rebooted: VM's wont start anymore 😞:

              Your mailserver disk is using SR e1fb6d59-93c5-72bf-a018-184dd3ea3643. This SR seems to belong to host ( RO): xcp, not xcp-ng-01.

              Yes, that's right: host xcp is the one currently up, and host xcp-ng-01 is the one that was 'lost'.
              I really can't see the problem 😞

                olivierlambert Vates 🪐 Co-Founder CEO

                Then check your local SR (if it's correctly connected)

                  prensel @olivierlambert

                  @olivierlambert said in XCP host rebooted: VM's wont start anymore 😞:

                  Then check your local SR (if it's correctly connected)

                  What is the proper way to do that using the CLI?

                    prensel @olivierlambert

                    @olivierlambert said in XCP host rebooted: VM's wont start anymore 😞:

                    Then check your local SR (if it's correctly connected)

                    xe sr-scan uuid=e1fb6d59-93c5-72bf-a018-184dd3ea3643 
                    The SR has no attached PBDs
                    sr: e1fb6d59-93c5-72bf-a018-184dd3ea3643 (Local storage)
                    

                    How can I connect or attach a PBD?

                      olivierlambert Vates 🪐 Co-Founder CEO

                      That's your problem, indeed 😄

                      In XO, it's "connect", otherwise it's xe pbd-connect

                        prensel @olivierlambert

                        @olivierlambert

                        I managed to find the PBD and it doesn't seem to be attached:

                        #xe pbd-list
                        
                        uuid ( RO)                  : 1a9396ae-e59b-9ea7-1d1a-3c5b139a11cb
                                     host-uuid ( RO): f4d5a20d-e7f3-4e62-8804-e2caa6922a43
                                       sr-uuid ( RO): e1fb6d59-93c5-72bf-a018-184dd3ea3643
                                 device-config (MRO): device: /dev/disk/by-id/ata-WDC_WD1003FBYZ-012GB0_WD-WCAW3CYHV0PK
                            currently-attached ( RO): false
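
Spotting the unplugged PBD in the listing above can also be scripted. The helper below is a sketch under assumptions: `unplugged_pbds` is a made-up name, and it relies on each record starting with its `uuid` line, as in the output shown.

```shell
# Hypothetical helper (not part of xe): reads `xe pbd-list` output on stdin
# and prints the UUID of every PBD whose currently-attached field is false.
unplugged_pbds() {
    awk -F': ' '
        /^ *uuid /                               { uuid = $2 }
        /^ *currently-attached / && $2 ~ /^false/ { print uuid }
    '
}

# Usage on the host:
#   xe pbd-list sr-uuid=<SR UUID> | unplugged_pbds
#   xe pbd-plug uuid=<PBD UUID printed above>
```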
                        
                          prensel @olivierlambert

                          @olivierlambert said in XCP host rebooted: VM's wont start anymore 😞:

                          That's your problem, indeed 😄

                          In XO, it's "connect", otherwise it's xe pbd-connect

                          I think it is 'xe pbd-plug', because 'pbd-connect' doesn't seem to exist?

                          But using this command results in this:

                          [17:56 xcp ~]# xe pbd-plug uuid=1a9396ae-e59b-9ea7-1d1a-3c5b139a11cb
                          Error code: SR_BACKEND_FAILURE_40
                          Error parameters: , The SR scan failed  [opterr=uuid=mailserver],
                          
                            olivierlambert Vates 🪐 Co-Founder CEO

                            Yes indeed. Okay at least the error message is very visible.

                            Why do you have a disk with the UUID mailserver? Have you renamed your disk manually??

                              prensel @olivierlambert

                              @olivierlambert said in XCP host rebooted: VM's wont start anymore 😞:

                              Yes indeed. Okay at least the error message is very visible.

                              Why do you have a disk with the UUID mailserver? Have you renamed your disk manually??

                              To be honest, I really have no clue why it is named like this 😞
                              Is there a way to fix this error? It was probably caused by the disk being shut off the hard way during the power failure.

                                olivierlambert Vates 🪐 Co-Founder CEO

                                No, the problem is a manual rename of the VHD in your SR.

                                  prensel @olivierlambert

                                  @olivierlambert said in XCP host rebooted: VM's wont start anymore 😞:

                                  No, the problem is a manual rename of the VHD in your SR.

                                  I checked /run/sr-mount/e1fb6d59-93c5-72bf-a018-184dd3ea3643 and there was a small 300 KB mailserver.vhd file dated October 15th?? No clue why it was there.
                                  I removed it and xe pbd-plug works now.
                                  I also seem to be able to start the VM now 🙂
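
A stray file like this can be found without reading the directory by eye. The sketch below assumes a file-based SR in which every VDI is stored as `<VDI UUID>.vhd`; the function name `stray_vhds` is hypothetical, and it only lists candidates, it does not delete anything.

```shell
# Hypothetical helper: prints every .vhd file in the given directory whose
# name is not of the form <UUID>.vhd.
stray_vhds() {
    for f in "$1"/*.vhd; do
        [ -e "$f" ] || continue          # directory has no .vhd files
        name=$(basename "$f" .vhd)
        printf '%s\n' "$name" |
            grep -Eq '^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$' ||
            printf '%s\n' "$f"
    done
}

# Usage on the host:
#   stray_vhds /run/sr-mount/<SR UUID>
```

Moving a reported file out of the SR mount rather than deleting it outright leaves a way back if it turns out to matter.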

                                    olivierlambert Vates 🪐 Co-Founder CEO

                                    Yes, you should never rename a file manually in the SR 🙂 It blocked the rescan, then the PBD plug, then the VM start.

                                    Also, please remove the unused host from your pool.

                                      olivierlambert Vates 🪐 Co-Founder CEO

                                      And finally, please consider getting pro support if you are running XCP-ng in production. You would have gotten faster assistance and, more importantly, you would be contributing to the project (helping us get more people involved).

                                        prensel @olivierlambert

                                        @olivierlambert said in XCP host rebooted: VM's wont start anymore 😞:

                                        Yes, you should never rename a file manually in the SR 🙂 It blocked the rescan, then the PBD plug, then the VM start.

                                        Also, please remove the unused host from your pool.

                                        Thanks for your help so far, I really appreciate it 🙂

                                        The host was removed earlier.
                                        The VM seems to work fine, but I still can't install XOA from the web interface or from the CLI with the script?

                                          olivierlambert Vates 🪐 Co-Founder CEO

                                          Please remove the host first. Not physically, as you already did, but from XAPI's perspective 🙂
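
Removing a host from XAPI's point of view is usually done by looking up its UUID and then forgetting it. The thread does not name the exact command, so treat the following as an assumption: `host_uuid_by_label` is a made-up helper, and `xe host-forget` is the command normally used for a host that is gone for good.

```shell
# Hypothetical helper (not part of xe): reads `xe host-list` output on stdin
# and prints the UUID of the host whose name-label equals the first argument.
host_uuid_by_label() {
    awk -F': ' -v want="$1" '
        /^ *uuid /                     { uuid = $2 }
        /^ *name-label / && $2 == want { print uuid }
    '
}

# Usage on the remaining host:
#   xe host-list | host_uuid_by_label xcp-ng-01
#   xe host-forget uuid=<host UUID printed above>
```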

                                            prensel @olivierlambert

                                            @olivierlambert said in XCP host rebooted: VM's wont start anymore 😞:

                                            And finally, please consider getting pro support if you are running XCP-ng in production. You would have gotten faster assistance and, more importantly, you would be contributing to the project (helping us get more people involved).

                                            I certainly will consider it, but I'm still in the process of deciding whether XCP-ng is something that suits me. I had some other quirky issues before that didn't convince me to put it in production (yet).
