XCP-ng

    Host crash during backup

    • DanpD Offline
      Danp Pro Support Team
      last edited by

      What type of server hardware are you using? Have you performed a test on the RAM to check for issues?

      • MathieuM Offline
        Mathieu @Danp
        last edited by

        @Danp
        It's a barebone server from ASRock Rack 1U4LW-X570/2L2T RPSU
        CPU : AMD Ryzen 9 5950X
        RAM : 4 x 32 GB ECC DDR4

        I'm not on site right now; I'll run a memory test ASAP to see whether any errors show up.

        Except for these crashes, the host had been stable with the VMs for two weeks.

        • MathieuM Offline
          Mathieu @Mathieu
          last edited by

          So, after a night of memtest and a few hours of CPU stress testing, there were no crashes or errors with the RAM modules.

          In the meantime, I moved my XO VM to a different host (outside of the pool I'm backing up), and now everything seems OK: no more crashes.
          I still don't have an explanation for the host crash, but at least I got the backup working.

          • BrantleyHobbsB Offline
            BrantleyHobbs @Mathieu
            last edited by

            @Mathieu did this ever happen to you again?

            I recently started having this happen as well, on a host that has been running rock solid for 6-8 months now. Coincidentally also a Ryzen 9 (a 7900X). The crashes don't always happen, but they are pretty frequent, and in some cases it has caused data corruption.

            • MathieuM Offline
              Mathieu @BrantleyHobbs
              last edited by

              @BrantleyHobbs
              No more crashes during backups for a long time now.
              My pool has been expanded with a second host, and XO is running on one of them without any hassle.

              • DanpD Offline
                Danp Pro Support Team @BrantleyHobbs
                last edited by

                @BrantleyHobbs You may want to provide some additional details on your setup, e.g.:

                • Version of XCP-ng
                • Are the hosts fully patched?
                • Current XO version or commit
                • etc

                Have you checked the /var/crash directory on the crashed host to see if kernel crash logs were captured? https://docs.xcp-ng.org/troubleshooting/log-files/
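As a rough illustration of the check suggested above, here is a minimal Python sketch that lists whatever is present in a crash directory, newest first. The function name `list_crash_entries` is hypothetical, and the sketch assumes a Python 3 interpreter is available on the machine where it runs; it only lists entries and does not interpret their contents.

```python
from pathlib import Path


def list_crash_entries(crash_dir="/var/crash"):
    """Return the names of entries in crash_dir, newest first by mtime.

    Returns an empty list if the directory does not exist.
    """
    root = Path(crash_dir)
    if not root.is_dir():
        return []
    # Sort by modification time so the most recent crash data comes first.
    entries = sorted(root.iterdir(), key=lambda p: p.stat().st_mtime, reverse=True)
    return [p.name for p in entries]


if __name__ == "__main__":
    for name in list_crash_entries():
        print(name)
```

An empty result only means that nothing was captured in that directory, not that no crash occurred.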

                • BrantleyHobbsB Offline
                  BrantleyHobbs @Danp
                  last edited by

                  @Danp Fully patched 8.3 (through the most recent May patches). XO commit d810e (master commit e3a58). I usually update XO around the first of each month, so it's a little bit behind master.

                  There are crash logs in /var/crash. I can provide a log bundle if needed, or copy/paste some info here if I know what to provide.

                  • BrantleyHobbsB Offline
                    BrantleyHobbs @BrantleyHobbs
                    last edited by

                    Friends, I am once again here to make stupid mistakes so that you don't have to. It appears that my backups were causing the machine to crash/hang because:
                    A) I was making DR replicas to local storage (on the same device the host boots from, though not the same partition), and

                    B) I was filling it up.

                    Again, this is not a production environment, simply a home lab. I'm a bit resource-poor, and I was looking for something other than my main disk repository for quick recovery. Making use of unused disk space left over on the boot device seemed like a Good Idea at the time.

                    I added an additional physical disk specifically for DR replication and all my problems stopped.

                    Hope that helps someone in the future.
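For anyone who wants to catch this situation before it bites, here is a minimal Python sketch of a fullness check using the standard library's `shutil.disk_usage`. The helper names, the 90% threshold, and the `/` target path are all illustrative assumptions, not values from this thread. It only applies to storage that is mounted as a regular filesystem, so it would not measure an LVM-based storage repository.

```python
import shutil


def usage_percent(path):
    """Return how full the filesystem containing `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def is_nearly_full(path, threshold_percent=90.0):
    """Return True when usage is at or above threshold_percent (assumed value)."""
    return usage_percent(path) >= threshold_percent


if __name__ == "__main__":
    target = "/"  # placeholder: point this at the mount used for replicas
    percent = usage_percent(target)
    status = "WARNING: nearly full" if is_nearly_full(target) else "ok"
    print(f"{target}: {percent:.1f}% used ({status})")
```

Running something like this before a replication job would at least flag a target that is about to run out of space.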

                    • P Offline
                      Pilow @BrantleyHobbs
                      last edited by

                      @BrantleyHobbs [image]

                      Facepalm situation! Thanks for the feedback 😄

                      • BrantleyHobbsB Offline
                        BrantleyHobbs @Pilow
                        last edited by

                        @Pilow pretty much

                        • olivierlambert marked this topic as a question
                        • olivierlambert has marked this topic as solved
