XCP-ng

    Host crash during backup

    • Danp Pro Support Team

      https://docs.xcp-ng.org/troubleshooting/log-files/

      • julien-f Vates 🪐 Co-Founder XO Team @Mathieu

        @Mathieu Interrupted means that XO was stopped abruptly during the backup run. XO has no information about why, only that the backup run could not finish.

        As @Danp said, in this case you need to check out the pool's and host's logs to figure out why it crashed.

        • Mathieu @julien-f

          @julien-f

          Hello,

          I can definitely link the crash to the backup job: it occurs a few minutes after I start the job, and it has happened several times.
          The backup was a brand-new job, just a full backup of 8 VMs from a single host to an NFS remote, no delta involved.

          I've checked xensource.log, daemon.log, kernel.log and SMlog. Nothing obvious stands out to me, but maybe I'm missing something.
          If you want to have a look, the host crashed at about 09:20:40 and came back up at 09:24.
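
For anyone narrowing several logs down to the minutes before a crash like this one, here is a minimal Python 3 sketch. It is not an XCP-ng or XO tool. The helper names `lines_in_window` and `scan` are hypothetical, and it assumes each log line carries an HH:MM:SS timestamp near its start. Dates are ignored, so point it only at the log files covering the day of the crash.

```python
import re
from datetime import time
from typing import Iterable, Iterator

# Assumption: the first HH:MM:SS token on a line is that line's timestamp.
_TIME_RE = re.compile(r"(?<!\d)(\d{2}):(\d{2}):(\d{2})(?!\d)")


def lines_in_window(lines: Iterable[str], start: time, end: time) -> Iterator[str]:
    """Yield lines whose first HH:MM:SS timestamp falls within [start, end]."""
    for line in lines:
        match = _TIME_RE.search(line)
        if match is None:
            continue  # no timestamp found, e.g. a continuation line
        hour, minute, second = (int(part) for part in match.groups())
        if hour > 23 or minute > 59 or second > 59:
            continue  # looked like a time but is not a valid one
        if start <= time(hour, minute, second) <= end:
            yield line


def scan(paths: Iterable[str], start: time, end: time) -> None:
    """Print matching lines from each file, prefixed with the file path."""
    for path in paths:
        try:
            with open(path, errors="replace") as handle:
                for line in lines_in_window(handle, start, end):
                    print(f"{path}: {line.rstrip()}")
        except OSError as exc:
            print(f"{path}: could not read ({exc})")
```

With the times from this post, a call might look like `scan(["/var/log/xensource.log", "/var/log/SMlog"], time(9, 15), time(9, 21))`. The paths and the five-minute window are assumptions to adjust for your host. If Python 3 is not available on the host, the logs can be copied elsewhere first.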

          /var/crash is empty, nothing there.

          The pool is for now a single host, nothing fancy on that side.

          Should I check other log files?

          Many thanks,

          • Danp Pro Support Team

            What type of server hardware are you using? Have you performed a test on the RAM to check for issues?

            • Mathieu @Danp

              @Danp
              It's a barebone server from ASRock Rack 1U4LW-X570/2L2T RPSU
              CPU : AMD Ryzen 9 5950X
              RAM : 4 x 32 GB ECC DDR4

              I'm not on site right now. I will run a memory test as soon as possible to see whether any errors appear.

              Except for these crashes, the host had been stable with the VMs for two weeks.

              • Mathieu @Mathieu

                So, after a night of memtest and a few hours of cpu stress test, no crash or error with the RAM modules.

                In the meantime, I moved my XO VM to a different host (outside of the pool I'm backing up) and now everything seems OK, no more crashes.
                I still don't have an explanation for the host crash, but at least I got the backup working.

                • BrantleyHobbs @Mathieu

                  @Mathieu did this ever happen to you again?

                  I recently started having this happen as well, on a host that has been running rock solid for 6-8 months now. Coincidentally also a Ryzen 9 (a 7900X). The crashes don't always happen, but they are pretty frequent, and in some cases it has caused data corruption.

                  • Mathieu @BrantleyHobbs

                    @BrantleyHobbs
                    No more crashes during backups for a long time now.
                    My pool has been expanded with a second host, and XO is running on one of them without any hassle.

                    • Danp Pro Support Team @BrantleyHobbs

                      @BrantleyHobbs You may want to provide some additional details on your setup, such as:

                      • Version of XCP-ng
                      • Are the hosts fully patched?
                      • Current XO version or commit
                      • etc

                      Have you checked the /var/crash directory on the crashed host to see if kernel crash logs were captured? https://docs.xcp-ng.org/troubleshooting/log-files/
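
As a quick way to see whether anything was captured and when, a small Python 3 sketch along these lines could list what sits under /var/crash, oldest first. The function name `list_crash_entries` is hypothetical, and the sketch assumes nothing about how the crash data is laid out inside that directory. It only reports the top-level entries and their modification times, which can then be compared with the time of the crash.

```python
from datetime import datetime
from pathlib import Path
from typing import List, Tuple


def list_crash_entries(root: str = "/var/crash") -> List[Tuple[str, datetime, str]]:
    """Return (name, modified, kind) for each top-level entry under root, oldest first.

    Returns an empty list when root does not exist or is not a directory.
    """
    base = Path(root)
    if not base.is_dir():
        return []
    entries = []
    for entry in base.iterdir():
        modified = datetime.fromtimestamp(entry.stat().st_mtime)
        kind = "dir" if entry.is_dir() else "file"
        entries.append((entry.name, modified, kind))
    # Sort by modification time so the most recent capture appears last.
    return sorted(entries, key=lambda item: item[1])


def print_crash_report(root: str = "/var/crash") -> None:
    """Print one line per entry, or a note when nothing was captured."""
    found = list_crash_entries(root)
    if not found:
        print(f"nothing found under {root}")
    for name, modified, kind in found:
        print(f"{modified:%Y-%m-%d %H:%M:%S}  {kind:4}  {name}")
```

Calling `print_crash_report()` on the host (as a user allowed to read the directory) gives a list that is easy to paste into a reply.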

                      • BrantleyHobbs @Danp

                        @Danp Fully patched 8.3 (through the most recent May patches). XO commit d810e (master commit e3a58). I usually update XO around the first of each month, so it's a little behind master.

                        There are crash logs in /var/crash. I can provide a log bundle if needed, or copy/paste some info here if I know what to provide.
