XCP-ng

    After installing updates: 0 bytes free, Control domain memory = 0B

    • olivierlambertO Offline
      olivierlambert Vates 🪐 Co-Founder CEO
      last edited by

      So /boot/xen.gz is pointing to xen-4.17.3-4.gz, which sounds correct. So why are you still running Xen 4.13? 🤔 It's as if you did not reboot, but since you showed me the GRUB menu, I assume you already did 🤔

      It would be interesting to compare the existing xen file and see if it's the right one from our repo. Something is fishy here 🤔

      • olivierlambertO Offline
        olivierlambert Vates 🪐 Co-Founder CEO
        last edited by

        Can you run md5sum on xen-4.17.3-4.gz?

        From the mirror & RPM, I have f011721be0c7b57563e29ed282558da3
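
A minimal sketch of that comparison, assuming the file sits in /boot next to the xen.gz symlink. The helper name verify_md5 is made up for illustration and is not part of XCP-ng; the expected digest is the one quoted above.

```shell
# verify_md5 is a hypothetical helper: it compares a file's MD5 digest
# with an expected value and reports the result.
verify_md5() {
    local file="$1"
    local expected="$2"
    local actual
    # md5sum prints "<digest>  <filename>"; keep only the digest.
    actual="$(md5sum "$file" | awk '{print $1}')"
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file matches"
    else
        echo "MISMATCH: $file has '$actual', expected '$expected'"
        return 1
    fi
}
```

On the host this could be called as `verify_md5 /boot/xen-4.17.3-4.gz f011721be0c7b57563e29ed282558da3`. A mismatch would suggest the file on disk is not the one shipped in the RPM.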

        • olivierlambertO Offline
          olivierlambert Vates 🪐 Co-Founder CEO
          last edited by

          Adding @yann in the loop in case I'm missing something obvious

          • yannY Offline
            yann Vates 🪐 XCP-ng Team @Dataslak
            last edited by

            @Dataslak what does lsblk -o name,mountpoint,label,size,uuid show?

            • D Offline
              Dataslak @olivierlambert
              last edited by

              @olivierlambert
              c08e011f-a210-471f-87c9-ed421595dac5-image.png

              • D Offline
                Dataslak @yann
                last edited by

                @yann
                Hello Yann, thank you for pitching in.
                7c30db11-59d4-4898-a19b-83a933e8f9ee-image.png

                • yannY Offline
                  yann Vates 🪐 XCP-ng Team @Dataslak
                  last edited by

                  @Dataslak can you please open a GRUB command line (press c at the boot menu) and run the following commands:

                  echo $root
                  search --label --set root root-eqjpzg
                  echo $root
                  
                  • olivierlambertO Offline
                    olivierlambert Vates 🪐 Co-Founder CEO
                    last edited by olivierlambert

                    Also a cat /proc/mdstat in the Dom0 would help.

                    • D Offline
                      Dataslak @yann
                      last edited by Dataslak

                      @yann
                      edbc1bca-dc7d-409a-85f4-9080ab4993df-image.png

                      Info: I am mirroring two M.2 SSDs! The software RAID was set up by the v8.3 installer.
                      Could the mirror be broken and somehow cause this?

                      • D Offline
                        Dataslak @olivierlambert
                        last edited by Dataslak

                        @olivierlambert said in After installing updates: 0 bytes free, Control domain memory = 0B:

                        Also a cat /proc/mdstat in the Dom0 would help.

                        Please forgive my ignorance: how do I execute this command in Dom0?

                        I've read https://wiki.xenproject.org/wiki/Dom0 and it helped a little. Do I run the command in the console within XOA?

                        • yannY Offline
                          yann Vates 🪐 XCP-ng Team @Dataslak
                          last edited by

                          @Dataslak so it is choosing to boot from the 1st disk of the RAID1. We could try telling it to boot from the 2nd one:

                          • on the grub menu hit e to edit the boot commands
                          • replace that search ... line with set root=hd1,gpt1
                          • then hit Ctrl-x to boot
                          • olivierlambertO Offline
                            olivierlambert Vates 🪐 Co-Founder CEO @Dataslak
                            last edited by

                            @Dataslak Dom0 is "the host" (strictly speaking it is not really the host, but never mind), i.e. the machine you are connected to and have been showing results from since the start 🙂

                            • D Offline
                              Dataslak @yann
                              last edited by Dataslak

                              @yann
                              Wohoo!!
                              All VMs came up!
                              Host is not in maintenance mode.
                              Control domain memory = 12GiB
                              Stats are back
                              Etc....

                              813787ec-7c49-41e1-8a59-af2fda8648bc-image.png

                              1867f534-16d5-4da6-8418-17d61d664857-image.png

                              As far as I can see (which is limited) everything looks good?

                              How can I check the status of the RAID1 and see whether the mirror is intact?

                              • D Offline
                                Dataslak @olivierlambert
                                last edited by Dataslak

                                @olivierlambert
                                Thank you for explaining to me. I will look more into details when (if) I find time 😄

                                fcb72f47-cdc2-46a6-b84e-b532b9089d14-image.png

                                Ah, I see you were ahead of me!

                                How can I interpret this? Raid1 OK? Synched? Ready to deal with a single drive failure?

                                How will XO inform me if one of the drives fails? Will I have to scour through logs, or will there be a clear visible notice in the interface?

                                • olivierlambertO Offline
                                  olivierlambert Vates 🪐 Co-Founder CEO
                                  last edited by olivierlambert

                                  That's the problem: your RAID1 lost sync. So the machine kept booting from the out-of-sync disk, loading the old Xen from it at boot while the rest (the root partition) was up to date.
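
For reading /proc/mdstat: each array's status line ends with a member map such as [UU], where every U is a working member and an underscore marks a missing one, so [U_] on a two-disk RAID1 means the mirror is degraded. The sketch below lists the arrays with a missing member. The helper name degraded_arrays is hypothetical.

```shell
# degraded_arrays is a hypothetical helper: it reads mdstat-formatted text
# (by default /proc/mdstat) and prints every array that has a missing member.
degraded_arrays() {
    awk '
        # An array block starts with a line like "md127 : active raid1 ...".
        /^md[0-9]+ :/ { name = $1 }
        # The member map, e.g. [UU] or [U_]; "_" marks a missing member.
        /\[[U_]*_[U_]*\]/ && name { print name; name = "" }
    ' "${1:-/proc/mdstat}"
}
```

Running `degraded_arrays` in Dom0 prints nothing when all mirrors are complete. Note that this only reads kernel state; it does not say why a member dropped out.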

                                  • D Offline
                                    Dataslak @olivierlambert
                                    last edited by Dataslak

                                    @olivierlambert
                                    Since this happened on six servers simultaneously when applying updates through XO, I guess we may have found a bug?

                                    If so, then all of this was not in vain, and I can be happy to have made a tiny tiny contribution to the development of 8.3?

                                    Will the GRUB boot loader modification be safe to apply to all 5 remaining servers? Or should I do some verification on each before applying it?

                                    Is modifying GRUB what I will have to do if a drive fails? Change that one line from set root=hd1,gpt1 to set root=hd0,gpt1 or something?

                                    • olivierlambertO Offline
                                      olivierlambert Vates 🪐 Co-Founder CEO
                                      last edited by olivierlambert

                                      I don't know yet, but you lost one drive. Can you run xe host-call-plugin host-uuid=<uuid> plugin=raid.py fn=check_raid_pool? (replace <uuid> with the UUID of the host)

                                      edit: check that on all your other hosts
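
To repeat that check without copying UUIDs by hand, a loop along these lines could work. It is a sketch that assumes the hosts belong to the pool you are logged in to; standalone hosts would need it run on each one. The wrapper name check_all_hosts is hypothetical, while the plugin call is the one quoted above.

```shell
# check_all_hosts is a hypothetical wrapper around the plugin call above.
check_all_hosts() {
    local uuid
    # "xe host-list --minimal" prints the host UUIDs as one comma-separated line.
    for uuid in $(xe host-list --minimal | tr ',' ' '); do
        echo "== host $uuid =="
        xe host-call-plugin host-uuid="$uuid" plugin=raid.py fn=check_raid_pool
    done
}
```

Calling `check_all_hosts` in Dom0 prints one RAID report per host, which makes a degraded state easier to spot.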

                                      • D Offline
                                        Dataslak @olivierlambert
                                        last edited by

                                        @olivierlambert

                                        XCP-ng-002:
                                        This runs 8.2.1 with only one drive. I was planning on upgrading it to 8.3 and inserting another drive for redundancy when time allowed:
                                        41e6be9f-d97e-49fb-a95f-b4fc6e2ab6ac-image.png

                                        XCP-ng-003:
                                        d8b36828-abda-45a3-916e-65a916546ed8-image.png

                                        XCP-ng-004:
                                        ab3aa27c-bfd3-4a62-b8c2-d16f4d409b0e-image.png

                                        XCP-ng-005:
                                        62787efd-5397-4d7d-9f70-9c4f36ed1db1-image.png

                                        XCP-ng-006:
                                        8f3ac816-8bda-4ad8-9e4d-3b9fd6f99a90-image.png

                                        XCP-ng-008:
                                        This server was clean-installed after the problems:
                                        3bbaf4cb-d89f-4825-9888-3e8a37ec15ac-image.png

                                        Can your trained eyes see anything I should be aware of?

                                        • olivierlambertO Offline
                                          olivierlambert Vates 🪐 Co-Founder CEO
                                          last edited by olivierlambert

                                          I can immediately see the state "clean, degraded" on XCP-ng-005. The rest are in the state "active", which is OK.

                                          So you don't have a similar issue on your other hosts; it's only this one, where you have a dead disk (it has not been syncing for a while). Try to check the dead disk and, if you can, force a RAID1 sync on it.
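
One possible way to force that sync is with mdadm, sketched below under stated assumptions: the array name (/dev/md127 is common on XCP-ng software RAID installs, but check /proc/mdstat) and the member device are placeholders, and the helper names run and resync_member are hypothetical. The helper only prints the commands unless DRY_RUN=0 is set, since adding the wrong device can destroy data.

```shell
# run is a hypothetical wrapper: it prints the command by default and only
# executes it when DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# resync_member is a hypothetical helper for re-attaching a dropped member.
resync_member() {
    array="$1"    # assumed array device, e.g. /dev/md127
    member="$2"   # the disk that dropped out of the mirror (placeholder)
    run mdadm --detail "$array"                   # show state and current members
    run mdadm --manage "$array" --add "$member"   # attach the disk; md then rebuilds onto it
    run cat /proc/mdstat                          # recovery progress appears here
}
```

For example, `resync_member /dev/md127 /dev/sdb` prints the three commands for review, and `DRY_RUN=0 resync_member /dev/md127 /dev/sdb` runs them. If the disk itself is failing, the rebuild will not hold and the disk needs replacing first.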

                                          • D Offline
                                            Dataslak @olivierlambert
                                            last edited by Dataslak

                                            @olivierlambert
                                            XCP-ng-005 is the only 8.3 host I have restarted so far. All the hosts showing "3" in the triangle are asking for a reboot and claim that the hardware does not support virtualization.
                                            65573855-a436-4552-b4b5-9e3614955c57-image.png
                                            2351705a-75a9-459b-9073-de0c52871481-image.png

                                            Shall I try to reboot one of them and see if the RAID1 breaks like it did on XCP-ng-005?

                                            If you plan on going home for the weekend soon, we can delay this until Monday? I hope the power does not fail in the meantime (it very rarely does; it is very reliable where I am).
                                            I do not wish to keep you at work. But if you, like me, plan to remain at work, then I am very happy to keep going.

                                            Please forgive my rudeness:
                                            THANK YOU for solving the problem so far! The 900+ USD I invested 3/4 year ago was money well spent. Not only is your product amazing; your skill and availability are also great!
