XCP-ng

    Delta Backups not working anymore on a single host

    • AlexD2006

      Hi,

      I am seeing really strange behaviour on one of our XCP-ng hosts.
      We have several XCP-ng pools and two identical standalone hosts.

      We run nightly delta backups from a Xen Orchestra VM built from sources.

      A few weeks ago, delta backups suddenly stopped working on only one of the two standalone hosts, while they keep working without any problems on all other hosts and pools.

      The two identical standalone hosts are:

      • Lenovo SR655
      • AMD EPYC 7282 16-Core Processor
      • 512 GB RAM
      • Local ext4 SAS RAID (around 3.5 TB used of 17.3 TB on a 9-disk RAID 5)
      • 2x 10 Gbit as bond0

      Both standalone hosts are absolutely identical. Both have up-to-date firmware and the latest XCP-ng patch level, and both were rebooted in the last few days to see whether that would fix the problem.

      When the error first appeared, the backup logs reported "stream has ended with not enough data" at the transfer stage of the delta backups.

      I then cleaned up snapshots and old backups on some VMs.
      After that, the first full backup of those VMs worked fine, but the following delta backup showed the same error.

      To dig deeper, I installed a completely new Ubuntu 22 VM, installed Xen Orchestra from sources again, and connected the two standalone hosts to that new XO-from-sources (XOfs) VM with an NFS backup remote.
      Same result: the initial full backup works fine, and the first delta fails on that one host only, while working without problems on the other host.
      This time, however, the job stays in "transfer" forever. The status persists for days and the backup job never finishes, so the next day's job fails with "Error: the job (x) is already running".

      Today I restarted the XOfs VM, updated to commit "afadc", and tried to reproduce the problem with a new backup job containing just one VM.

      It seems to be XCP-ng related, because the other identical host works perfectly.
      On that one host I get the same result: the initial full backup works, but the delta never returns and stays stuck at the "transfer" stage.

      When I watch xe task-list while the backup is running, the export task for the delta seems to work fine and new data appears on the NFS remote. At 100% the task disappears, but the backup job stays in "transfer" and never returns.

      To rule out anything related to my "from sources" installation (even though the error occurs only on this one host and all others work fine), I deployed a XOA VM, but I can't start a free trial (you already consumed ...) and so I cannot test delta backups with it.

      Do you have any ideas, or have you seen a similar issue in the past?

      Kind regards
      Alex

      • olivierlambert (Vates 🪐 Co-Founder CEO)

        Just send me your email address in a private message so I can extend your trial 🙂

        • AlexD2006 @olivierlambert

          @olivierlambert
          Many thanks for your help.
          I sent you a PM with my e-mail address. 🙂

          • olivierlambert (Vates 🪐 Co-Founder CEO)

            Done!

            • AlexD2006 @AlexD2006

              Thanks @olivierlambert for extending the trial.

              I will run new tests with XOA and report my findings.

              Meanwhile I found out why the backup job hangs forever on my new XOfs installation.

              When I mounted the NFS remote, I checked the option "Store backup as multiple data".

              I removed the NFS remote and reconnected it without that option.
              Now the full backup works as expected and the first delta fails with "Error: stream has ended with not enough data (actual: 430, expected: 512)".

              So the error handling may differ when this option is active, with the exception not being handled correctly, which would explain the hang.
              Just for your information.
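
              To illustrate what the quoted message describes: a reader asks for a fixed-size block and the source closes before delivering all of it. The sketch below is in Python and is not Xen Orchestra's actual code; `read_exactly`, `ShortStreamError`, and the 430-byte sample stream are assumptions chosen to mirror the numbers in the error. 512 bytes happens to be the size of a VHD footer, so the reader was plausibly waiting for one, though nothing in this thread confirms that.

```python
import io


class ShortStreamError(Exception):
    """Raised when a stream ends before the requested number of bytes arrived."""


def read_exactly(stream, size):
    """Read exactly `size` bytes from `stream` or raise ShortStreamError."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:  # an empty read means the producer closed the stream
            actual = size - remaining
            raise ShortStreamError(
                f"stream has ended with not enough data "
                f"(actual: {actual}, expected: {size})"
            )
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


# Hypothetical stream that stops after 430 bytes while the reader wants 512.
truncated = io.BytesIO(b"\x00" * 430)
try:
    read_exactly(truncated, 512)
except ShortStreamError as err:
    message = str(err)
    print(message)
```

              The point of the sketch is that the consumer only sees the symptom. Whether the producer (here, the host's export) stopped early on purpose or because of its own error has to be found in the producer's logs.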

              I will report back once I have tested with XOA.

              • olivierlambert (Vates 🪐 Co-Founder CEO)

                What's your filesystem on this NFS machine?

                • AlexD2006 @olivierlambert

                  @olivierlambert

                  It's a 12-disk Synology NAS with btrfs.
                  Multiple folders on that NFS storage are exported and used as remotes in my XOfs installations.
                  As I said, all other hosts and XOfs installations work fine with that NFS storage.
                  Only this one specific XCP-ng host has these problems, so I don't think it is Xen Orchestra related. It seems to be a problem with XCP-ng on this specific host, yet the identical second host does not have it.

                  • AlexD2006 @olivierlambert

                    OK, so I tested with XOA.

                    A completely new (empty) NFS export is mounted as a remote in XOA.

                    • Initial full backup in the delta backup job: successful.
                    • Delta job: fails with "Error: Expected values to be strictly equal: 430 !== 1536"

                    I will now cross-test with a VM on the working host.

                    • AlexD2006 @AlexD2006

                      Cross-tested with XOA:

                      • same NFS remote
                      • identical delta backup job
                      • VM on the other, identical host

                      Everything works as expected.

                      So it seems the first host has a problem and does not provide usable data when exporting the first delta snapshot.

                      Unfortunately I have to switch to customer support right now and must stop testing for today.
                      I will continue tomorrow.

                      Thanks for your support so far.

                      • florent (Vates 🪐 XO Team) @AlexD2006

                        @AlexD2006 Hi, this error means the host cut the backup stream. We have a few PRs in the pipeline that will improve the behaviour (a little) and help diagnose the root cause (XO or XCP).

                        • fitcfitcfatc

                          Hi,

                          Sorry for the long break. I'll continue here on behalf of my colleague AlexD2006.

                          We have now also tried the current version (deeb3) of Xen Orchestra, and the problem remains the same. However, we have noticed additional error messages (also present with the old version) that appear in the host's xapi log during a delta backup ("base VDI is not a vhd; cannot compute differences"):

                          May 23 16:40:35 xcp-mono02 xapi: [error||7644621 HTTPS 10.32.1.159->:::80|[XO] Exporting content of VDI test-vm.test 1 R:2e892fda7776|vhd_tool_wrapper] base VDI is not a vhd; cannot compute differences
                          May 23 16:40:37 xcp-mono02 xapi: [error||7644621 :::80||backtrace] [XO] Exporting content of VDI test-vm.test 1 R:2e892fda7776 failed with exception (Failure "base VDI is not a vhd; cannot compute differences")
                          May 23 16:40:37 xcp-mono02 xapi: [error||7644621 :::80||backtrace] Raised (Failure "base VDI is not a vhd; cannot compute differences")
                          May 23 16:40:37 xcp-mono02 xapi: [error||7644621 :::80||backtrace] 1/14 xapi Raised at file stdlib.ml, line 29
                          May 23 16:40:37 xcp-mono02 xapi: [error||7644621 :::80||backtrace] 2/14 xapi Called from file ocaml/xapi/vhd_tool_wrapper.ml, line 206
                          May 23 16:40:37 xcp-mono02 xapi: [error||7644621 :::80||backtrace] 3/14 xapi Called from file ocaml/xapi/export_raw_vdi.ml, line 50
                          May 23 16:40:37 xcp-mono02 xapi: [error||7644621 :::80||backtrace] 4/14 xapi Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
                          May 23 16:40:37 xcp-mono02 xapi: [error||7644621 :::80||backtrace] 5/14 xapi Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 35
                          May 23 16:40:37 xcp-mono02 xapi: [error||7644621 :::80||backtrace] 6/14 xapi Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
                          May 23 16:40:37 xcp-mono02 xapi: [error||7644621 :::80||backtrace] 7/14 xapi Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 35
                          May 23 16:40:37 xcp-mono02 xapi: [error||7644621 :::80||backtrace] 8/14 xapi Called from file ocaml/xapi/export_raw_vdi.ml, line 90
                          May 23 16:40:37 xcp-mono02 xapi: [error||7644621 :::80||backtrace] 9/14 xapi Called from file ocaml/xapi/export_raw_vdi.ml, line 116
                          May 23 16:40:37 xcp-mono02 xapi: [error||7644621 :::80||backtrace] 10/14 xapi Called from file ocaml/xapi/server_helpers.ml, line 101
                          May 23 16:40:37 xcp-mono02 xapi: [error||7644621 :::80||backtrace] 11/14 xapi Called from file ocaml/xapi/server_helpers.ml, line 122
                          May 23 16:40:37 xcp-mono02 xapi: [error||7644621 :::80||backtrace] 12/14 xapi Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
                          May 23 16:40:37 xcp-mono02 xapi: [error||7644621 :::80||backtrace] 13/14 xapi Called from file string.ml, line 115
                          May 23 16:40:37 xcp-mono02 xapi: [error||7644621 :::80||backtrace] 14/14 xapi Called from file src/sexp.ml, line 113
                          May 23 16:40:37 xcp-mono02 xapi: [error||7644621 :::80||backtrace]
                          

                          Can anyone say anything about the cause of the problem, or suggest ideas for further analysis?
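
                          For further analysis, it can help to reduce a long xapi log to the distinct failures and the tasks they belong to. The following is a minimal Python sketch, assuming log lines shaped like the ones quoted above; the two regular expressions and the `summarize_failures` helper are illustrative assumptions, not part of any XCP-ng or Xen Orchestra tooling.

```python
import re
from collections import Counter

# Captures the quoted message inside `Failure "..."` as seen in the lines above.
FAILURE_RE = re.compile(r'Failure "([^"]+)"')
# Captures the task label between "[XO] " and the " R:<ref>" marker.
TASK_RE = re.compile(r"\[XO\] (.+?) R:[0-9a-f]+")


def summarize_failures(lines):
    """Count (task label, failure message) pairs found in xapi log lines."""
    counts = Counter()
    for line in lines:
        failure = FAILURE_RE.search(line)
        if failure is None:
            continue  # backtrace frames and unrelated lines are skipped
        task = TASK_RE.search(line)
        label = task.group(1) if task else "unknown task"
        counts[(label, failure.group(1))] += 1
    return counts


# Three lines copied from the excerpt above.
sample = [
    'May 23 16:40:37 xcp-mono02 xapi: [error||7644621 :::80||backtrace] '
    '[XO] Exporting content of VDI test-vm.test 1 R:2e892fda7776 failed with '
    'exception (Failure "base VDI is not a vhd; cannot compute differences")',
    'May 23 16:40:37 xcp-mono02 xapi: [error||7644621 :::80||backtrace] '
    'Raised (Failure "base VDI is not a vhd; cannot compute differences")',
    'May 23 16:40:37 xcp-mono02 xapi: [error||7644621 :::80||backtrace] '
    '1/14 xapi Raised at file stdlib.ml, line 29',
]
summary = summarize_failures(sample)
for (label, message), n in summary.items():
    print(f"{n}x {label}: {message}")
```

                          To run it against a real host, pass the lines of the xapi log (typically /var/log/xensource.log on XCP-ng) instead of `sample`. Comparing the summary from the failing host with the one from the working host would show whether this failure is specific to the failing host.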

                          • olivierlambert moved this topic from Something else