XCP-ng

    Every virtual machine I restart doesn't boot.

    • ohthisis

      Hello,
      Every virtual machine I restart fails to boot. It just shows VM.start: 33%. I tried:
      # xe-toolstack-restart
      # service xapi restart

      But the problem persists. All my servers went down.

      What should I do?

      Thank you.

      • Pilow @ohthisis

        @ohthisis hi, I need more context...
        this seems to be a storage issue... do you have anything in the LOGS tab of the VM and/or the SR?

        do you see any issue reported in DASHBOARD/HEALTH?

        what version of XCP-ng and XO or XOA are you running?

        • ohthisis @Pilow

          @Pilow
          I see:

          Headers Timeout Error

          Nothing in HEALTH. I use Xen Orchestra, commit 83a69, and:

          [09:04 xcp-ng ~]# cat /etc/redhat-release 
          XCP-ng release 8.2.1 (xenenterprise)
          
          • Pilow @ohthisis

            @ohthisis how many hosts are in the pool?
            is your VM storage shared or local?

            on the host CLI, try

            # xl list
            

            and

            # xenops-cli list
            

            report back the results

            • ohthisis @Pilow

              @Pilow
              I just have one host and I use local storage.

              [09:27 xcp-ng ~]# xl list
              Name                                        ID   Mem VCPUs	State	Time(s)
              Domain-0                                     0  5936    16     r-----  18024910.4
              XO                                          72  4096     2     -b----  8899359.6
              Grafana                                    114 16383     1     --psr-  2983554.9
              Tor                                        156  2039     1     --p---       0.0
              [09:28 xcp-ng ~]# xenops-cli list
              Name                                         ID   Mem   VCPUs     State   Time(s)
              Tor                                          -    2048  4         Paused  
              Grafana                                      -    16384 4         Paused  
              XO                                           72   4096  2         Running 
              
              
              • Pilow @ohthisis

                @ohthisis can you confirm that your 2 VMs Grafana and Tor show as paused in the web UI?
                if not, try

                xe vm-reset-powerstate uuid=<uuid_of_vm> force=true
                

                and restart the toolstack afterward
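
A minimal sketch of how the UUID lookup and the reset above could be combined, assuming the VM name is unique on the host. The function name `reset_powerstate_by_name` is hypothetical; the `xe vm-list ... --minimal` call is expected to print matching UUIDs as a comma-separated list. Keep in mind that `vm-reset-powerstate` only changes the state XAPI has on record and does not stop a domain that is still present in `xl list`.

```shell
# Hypothetical helper: reset the recorded power state of a VM found by name.
# Refuses to act when the name matches zero or several VMs.
reset_powerstate_by_name() {
    name=$1
    # --minimal prints only the values, comma-separated when several match
    uuid=$(xe vm-list name-label="$name" params=uuid --minimal)
    case $uuid in
        '')  echo "no VM named '$name'" >&2; return 1 ;;
        *,*) echo "several VMs named '$name': $uuid" >&2; return 1 ;;
    esac
    xe vm-reset-powerstate uuid="$uuid" force=true
}
```

Usage would be `reset_powerstate_by_name Grafana`, followed by `xe-toolstack-restart` as suggested above.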

                • ohthisis @Pilow

                  @Pilow
                  It doesn't work. It just shows VM.start: 33%
                  I'm also afraid to restart the server, because then XO might not start again either.

                  • ohthisis

                    I don't know what happened.

                    • Pilow @ohthisis

                      @ohthisis the --psr- state of Grafana is suspicious...
                      0.0 in Time column for Tor too
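
To make the state column easier to read, here is a small decoding sketch. It assumes the layout used by the `xl` source as far as I understand it: positions 1 to 4 and 6 are the flags r (running), b (blocked), p (paused), s (shutdown) and d (dying), and position 5 is a shutdown-reason letter (r reboot, s suspend, c crash, w watchdog, S soft reset). That layout is an assumption and should be checked against `man xl` on the host; the function name is hypothetical.

```shell
# Hypothetical helper: decode the six-character State column of `xl list`.
decode_xl_state() {
    state=$1
    out=""
    i=1
    for name in running blocked paused shutdown reason dying; do
        ch=$(printf '%s' "$state" | cut -c"$i")
        if [ "$name" = reason ]; then
            # Position 5 is assumed to carry the shutdown reason, not a flag
            case $ch in
                r) out="$out reason=reboot" ;;
                s) out="$out reason=suspend" ;;
                c) out="$out reason=crash" ;;
                w) out="$out reason=watchdog" ;;
                S) out="$out reason=soft-reset" ;;
            esac
        elif [ -n "$ch" ] && [ "$ch" != "-" ]; then
            out="$out $name"
        fi
        i=$((i + 1))
    done
    printf '%s\n' "${out# }"
}
```

Under that assumption, `decode_xl_state --psr-` would read as a paused domain that has shut down with a reboot request still pending, which would fit a restart that never completed.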

                      did you have a hard reboot / power loss on this server?
                      is your storage OK?

                      there are some commands to forcibly kill VMs stuck in hybrid states like that, but they could corrupt data inside (like a forced hard shutdown would)

                      do you have backups of your VMs?

                      # xl destroy 114
                      

                      This command would kill the domain runtime (hard shutdown) of the Grafana VM
                      it would then disappear from xl list
                      a toolstack restart afterward should then show a halted VM in the web UI, which could potentially be started normally if there is no other storage issue

                      but:

                      • I do not recommend this action if your storage currently has issues
                      • I do not recommend this action if you do not have backups
                      • I do not recommend this action, it could corrupt data inside the VM (it is paused.. but is it really ?!)
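
The sequence described above can be laid out as a dry run. This is only a sketch: the hypothetical function `print_recovery_plan` prints the commands already mentioned in this thread in order and runs nothing, so each step can be reviewed (and backups checked) before typing it by hand.

```shell
# Hypothetical helper: print, but do not run, the steps for one stuck domain.
print_recovery_plan() {
    domid=$1
    case $domid in
        ''|*[!0-9]*)
            echo "usage: print_recovery_plan <numeric domain id>" >&2
            return 1 ;;
    esac
    # Domain-0 is the control domain and must never be destroyed
    if [ "$domid" -eq 0 ]; then
        echo "refusing to target Domain-0" >&2
        return 1
    fi
    echo "xl list"               # confirm the ID still belongs to the stuck VM
    echo "xl destroy $domid"     # hard-kill the domain runtime
    echo "xe-toolstack-restart"  # let XAPI pick up the new state
    echo "xl list"               # the domain should no longer be listed
}
```

For the Grafana domain shown earlier this would be `print_recovery_plan 114`.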

                      I just noticed that the two commands do not report the same vCPU count for your VMs.
                      there is a real desync in your system 😕
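
A quick way to spot this kind of mismatch, as a sketch: save the output of both commands to files and compare the VCPUs column per VM name. The function name `compare_vcpus` and the file names are hypothetical, and it assumes VM names contain no spaces and that VCPUs is the fourth column in both listings, as in the output posted above.

```shell
# Hypothetical helper: report VMs whose vCPU count differs between
# a saved `xl list` output ($1) and a saved `xenops-cli list` output ($2).
compare_vcpus() {
    awk '
        FNR == 1  { next }                 # skip the header row of each file
        NR == FNR { xl[$1] = $4; next }    # first file: name -> VCPUs
        ($1 in xl) && xl[$1] != $4 {
            printf "%s: xl=%s xenops=%s\n", $1, xl[$1], $4
        }
    ' "$1" "$2"
}
```

Typical use would be `xl list > xl.txt; xenops-cli list > xenops.txt; compare_vcpus xl.txt xenops.txt`. VMs that agree in both listings produce no output.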

                      • ohthisis @Pilow

                        @Pilow
                        Please take a look at these:

                        [09:08 xcp-ng ~]# df -h
                        Filesystem      Size  Used Avail Use% Mounted on
                        devtmpfs        2.8G   20K  2.8G   1% /dev
                        tmpfs           2.8G  160K  2.8G   1% /dev/shm
                        tmpfs           2.8G   11M  2.8G   1% /run
                        tmpfs           2.8G     0  2.8G   0% /sys/fs/cgroup
                        /dev/sda1        18G  2.0G   15G  12% /
                        xenstore        2.8G     0  2.8G   0% /var/lib/xenstored
                        /dev/sda5       3.9G  836M  2.8G  23% /var/log
                        tmpfs           571M     0  571M   0% /run/user/0
                        [09:17 xcp-ng ~]# mount | grep -E "(\/var|\/opt)"
                        xenstore on /var/lib/xenstored type tmpfs (rw,relatime,mode=755)
                        /dev/sda5 on /var/log type ext3 (rw,relatime)
                        sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw,relatime)
                        [09:18 xcp-ng ~]# dmesg | grep -iE "(error|fail|timeout|scsi|sd|hba)" | tail -20
                        [81734324.210688] Status code returned 0xc000006d NT_STATUS_LOGON_FAILURE
                        [81734324.217399] Status code returned 0xc000006d NT_STATUS_LOGON_FAILURE
                        [81734324.223994] Status code returned 0xc000006d NT_STATUS_LOGON_FAILURE
                        [81734324.230647] Status code returned 0xc000006d NT_STATUS_LOGON_FAILURE
                        [81734324.237867] Status code returned 0xc000006d NT_STATUS_LOGON_FAILURE
                        [81734324.244529] Status code returned 0xc000006d NT_STATUS_LOGON_FAILURE
                        [81734324.251239] Status code returned 0xc000006d NT_STATUS_LOGON_FAILURE
                        [81734324.257781] Status code returned 0xc000006d NT_STATUS_LOGON_FAILURE
                        [81734324.264466] Status code returned 0xc000006d NT_STATUS_LOGON_FAILURE
                        [81734324.270941] Status code returned 0xc000006d NT_STATUS_LOGON_FAILURE
                        [81734324.277574] Status code returned 0xc000006d NT_STATUS_LOGON_FAILURE
                        [81734324.284149] Status code returned 0xc000006d NT_STATUS_LOGON_FAILURE
                        [81734324.290576] Status code returned 0xc000006d NT_STATUS_LOGON_FAILURE
                        [81734324.297118] Status code returned 0xc000006d NT_STATUS_LOGON_FAILURE
                        [81734324.306270] Status code returned 0xc000006d NT_STATUS_LOGON_FAILURE
                        [81734324.314061] Status code returned 0xc000006d NT_STATUS_LOGON_FAILURE
                        [81734324.322632] Status code returned 0xc000006d NT_STATUS_LOGON_FAILURE
                        [81734324.330717] Status code returned 0xc000006d NT_STATUS_LOGON_FAILURE
                        [81734324.337242] Status code returned 0xc000006d NT_STATUS_LOGON_FAILURE
                        [81734324.344686] Status code returned 0xc000006d NT_STATUS_LOGON_FAILURE
                        [09:18 xcp-ng ~]# pvs
                          PV         VG                                                 Fmt  Attr PSize  PFree  
                          /dev/sda3  VG_XenStorage-00f82a18-a9f6-f7bc-9ca1-f42698d46b5f lvm2 a--  95.18g <95.18g
                          /dev/sdb   VG_XenStorage-c5129868-a590-68ca-e587-db708ad61f38 lvm2 a--  <3.64t   3.09t
                        [09:18 xcp-ng ~]# vgs
                          VG                                                 #PV #LV #SN Attr   VSize  VFree  
                          VG_XenStorage-00f82a18-a9f6-f7bc-9ca1-f42698d46b5f   1   1   0 wz--n- 95.18g <95.18g
                          VG_XenStorage-c5129868-a590-68ca-e587-db708ad61f38   1   8   0 wz--n- <3.64t   3.09t
                        [09:18 xcp-ng ~]# lvs -o lv_name,vg_name,lv_size,lv_attr
                          LV                                       VG                                                 LSize    Attr      
                          MGT                                      VG_XenStorage-00f82a18-a9f6-f7bc-9ca1-f42698d46b5f    4.00m -wi-a-----
                          MGT                                      VG_XenStorage-c5129868-a590-68ca-e587-db708ad61f38    4.00m -wi-a-----
                          VHD-1461c885-89c6-4e0e-8ee1-7d5be059f3dc VG_XenStorage-c5129868-a590-68ca-e587-db708ad61f38  <30.07g -wi-------
                          VHD-2aaa4501-1c9b-48d6-8532-961ab8a3e627 VG_XenStorage-c5129868-a590-68ca-e587-db708ad61f38  <30.07g -wi-ao----
                          VHD-4de5831d-5a4d-4d2d-9f0a-ce4d1c2d8ef5 VG_XenStorage-c5129868-a590-68ca-e587-db708ad61f38  100.20g -wi-a-----
                          VHD-6b1ea821-d677-4426-99e0-43314ef3c536 VG_XenStorage-c5129868-a590-68ca-e587-db708ad61f38 <250.50g -wi-ao----
                          VHD-6c08ae7f-71a7-4f97-a553-3c067dbbe243 VG_XenStorage-c5129868-a590-68ca-e587-db708ad61f38  <50.11g -wi-a-----
                          VHD-bc8dd3e4-ea0e-4006-a918-817b18d65456 VG_XenStorage-c5129868-a590-68ca-e587-db708ad61f38  <50.11g -wi-ao----
                          VHD-ccaaabb0-b5ae-4e29-ab8d-c895af000550 VG_XenStorage-c5129868-a590-68ca-e587-db708ad61f38  <50.11g -wi-a-----
                        [09:18 xcp-ng ~]# lvdisplay /dev/VG_XenStorage-c5129868-a590-68ca-e587-db708ad61f38/VHD-ccaaabb0-b5ae-4e29-ab8d-c895af000550
                          --- Logical volume ---
                          LV Path                /dev/VG_XenStorage-c5129868-a590-68ca-e587-db708ad61f38/VHD-ccaaabb0-b5ae-4e29-ab8d-c895af000550
                          LV Name                VHD-ccaaabb0-b5ae-4e29-ab8d-c895af000550
                          VG Name                VG_XenStorage-c5129868-a590-68ca-e587-db708ad61f38
                          LV UUID                TggCle-7H7d-BN1o-KU5U-8oME-lckS-z0puvZ
                          LV Write Access        read/write
                          LV Creation host, time xcp-ng, 2023-07-11 14:19:40 +0330
                          LV Status              available
                          # open                 0
                          LV Size                <50.11 GiB
                          Current LE             12827
                          Segments               1
                          Allocation             inherit
                          Read ahead sectors     auto
                          - currently set to     256
                          Block device           253:2
                          
                        
                        • Pilow @ohthisis

                          @ohthisis I see nothing out of the ordinary
                          you have two SRs, thick provisioned:
                          one small 95 GB SR that is empty, probably created during install, and one big 3.64 TB SR on /dev/sdb

                          VMs are on the big SR, same sized VDIs could indicate existing snapshots.
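
As a rough sketch of that check, the `lvs` output can be grouped by size to list VHD volumes that share the same printed size. The function name `same_size_vdis` is hypothetical, and it assumes the column order of `lvs -o lv_name,vg_name,lv_size,lv_attr` used in this thread. Sizes are compared as printed (rounded) strings, so a match is only a hint and does not prove a snapshot relationship.

```shell
# Hypothetical helper: read `lvs -o lv_name,vg_name,lv_size,lv_attr` output
# on stdin and print each size shared by more than one VHD-* volume.
same_size_vdis() {
    awk '
        $1 ~ /^VHD-/ { count[$3]++; names[$3] = names[$3] " " $1 }
        END {
            for (s in count)
                if (count[s] > 1) print s ":" names[s]
        }
    ' | sort
}
```

Usage: `lvs -o lv_name,vg_name,lv_size,lv_attr | same_size_vdis`.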

                          is your /dev/sdb a RAID5 array or a standalone disk?

                          Can you create a new test VM on this SR and check whether it runs normally?

                          • ohthisis @Pilow

                            @Pilow
                            I think my server is using RAID5.
                            I created a VM with PXE boot, but it is stuck at VM.start: 50%.
