XCP-ng

    S3 backup fails without alert

    Xen Orchestra
    • Andrew (Top contributor)

      So.... S3 delta backups have been working well for the most part.

      They do have problems sometimes and should do a better job of retrying after network or AWS errors.

      Looking in the logs I see warnings about other S3 problems. These warnings cause me some concern that the backup for that VM is not working correctly. The backup job completes with a "Success" result even with the warnings in the logs. I ran the new "restore test" on the latest backup and it failed.

      Here's some of the logs (they continue...):

      May  8 01:12:55 xo1 xo-server[97773]: 2022-05-08T05:12:55.507Z xo:backups:MixinBackupWriter WARN the parent /xo-vm-backups/00371171-0a0b-f32f-916e-a79e8f8a98d4/vdis/f9405e51-da48-4eb1-980d-99da34359b3c/b6834642-beb3-49d5-aaf2-bcf09ba79eef/20220421T051121Z.alias.vhd of the child /xo-vm-backups/00371171-0a0b-f32f-916e-a79e8f8a98d4/vdis/f9405e51-da48-4eb1-980d-99da34359b3c/b6834642-beb3-49d5-aaf2-bcf09ba79eef/20220422T051113Z.alias.vhd is unused
      May  8 01:12:55 xo1 xo-server[97773]: 2022-05-08T05:12:55.509Z xo:backups:MixinBackupWriter WARN merging /xo-vm-backups/00371171-0a0b-f32f-916e-a79e8f8a98d4/vdis/f9405e51-da48-4eb1-980d-99da34359b3c/b6834642-beb3-49d5-aaf2-bcf09ba79eef/20220422T051113Z.alias.vhd into /xo-vm-backups/00371171-0a0b-f32f-916e-a79e8f8a98d4/vdis/f9405e51-da48-4eb1-980d-99da34359b3c/b6834642-beb3-49d5-aaf2-bcf09ba79eef/20220421T051121Z.alias.vhd
      May  8 01:13:07 xo1 xo-server[97773]: 2022-05-08T05:13:07.213Z xo:backups:MixinBackupWriter WARN the parent /xo-vm-backups/afd1ab6b-7cea-ba6c-5d32-44df289b11dc/vdis/f9405e51-da48-4eb1-980d-99da34359b3c/7fe8c5db-4c26-4430-ba0e-86878f498bbe/20220423T051206Z.alias.vhd of the child /xo-vm-backups/afd1ab6b-7cea-ba6c-5d32-44df289b11dc/vdis/f9405e51-da48-4eb1-980d-99da34359b3c/7fe8c5db-4c26-4430-ba0e-86878f498bbe/20220424T051112Z.alias.vhd is unused
      May  8 01:13:07 xo1 xo-server[97773]: 2022-05-08T05:13:07.213Z xo:backups:MixinBackupWriter WARN merging /xo-vm-backups/afd1ab6b-7cea-ba6c-5d32-44df289b11dc/vdis/f9405e51-da48-4eb1-980d-99da34359b3c/7fe8c5db-4c26-4430-ba0e-86878f498bbe/20220424T051112Z.alias.vhd into /xo-vm-backups/afd1ab6b-7cea-ba6c-5d32-44df289b11dc/vdis/f9405e51-da48-4eb1-980d-99da34359b3c/7fe8c5db-4c26-4430-ba0e-86878f498bbe/20220423T051206Z.alias.vhd
      May  8 01:13:17 xo1 xo-server[97773]: 2022-05-08T05:13:17.214Z xo:backups:MixinBackupWriter WARN merging /xo-vm-backups/afd1ab6b-7cea-ba6c-5d32-44df289b11dc/vdis/f9405e51-da48-4eb1-980d-99da34359b3c/7fe8c5db-4c26-4430-ba0e-86878f498bbe/20220424T051112Z.alias.vhd: 120/120
      May  8 01:17:16 xo1 xo-server[97773]: 2022-05-08T05:17:16.219Z xo:backups:MixinBackupWriter WARN the parent /xo-vm-backups/280d6605-e7a9-df07-1a53-53ad2fd361b0/vdis/f9405e51-da48-4eb1-980d-99da34359b3c/1169c517-da85-4dc1-8a8f-d9a9aa32f942/20220423T051555Z.alias.vhd of the child /xo-vm-backups/280d6605-e7a9-df07-1a53-53ad2fd361b0/vdis/f9405e51-da48-4eb1-980d-99da34359b3c/1169c517-da85-4dc1-8a8f-d9a9aa32f942/20220424T051334Z.alias.vhd is unused
      May  8 01:17:16 xo1 xo-server[97773]: 2022-05-08T05:17:16.221Z xo:backups:MixinBackupWriter WARN merging /xo-vm-backups/280d6605-e7a9-df07-1a53-53ad2fd361b0/vdis/f9405e51-da48-4eb1-980d-99da34359b3c/1169c517-da85-4dc1-8a8f-d9a9aa32f942/20220424T051334Z.alias.vhd into /xo-vm-backups/280d6605-e7a9-df07-1a53-53ad2fd361b0/vdis/f9405e51-da48-4eb1-980d-99da34359b3c/1169c517-da85-4dc1-8a8f-d9a9aa32f942/20220423T051555Z.alias.vhd
      May  8 01:19:56 xo1 xo-server[97773]: 2022-05-08T05:19:56.412Z xo:backups:MixinBackupWriter WARN the parent /xo-vm-backups/2ff4f95d-7e3a-8bcf-baf1-ef1d78d4b902/vdis/f9405e51-da48-4eb1-980d-99da34359b3c/0c2921b2-72f3-410c-a78b-c10cca1513dc/20220423T052142Z.alias.vhd of the child /xo-vm-backups/2ff4f95d-7e3a-8bcf-baf1-ef1d78d4b902/vdis/f9405e51-da48-4eb1-980d-99da34359b3c/0c2921b2-72f3-410c-a78b-c10cca1513dc/20220424T051830Z.alias.vhd is unused
      May  8 01:19:56 xo1 xo-server[97773]: 2022-05-08T05:19:56.413Z xo:backups:MixinBackupWriter WARN merging /xo-vm-backups/2ff4f95d-7e3a-8bcf-baf1-ef1d78d4b902/vdis/f9405e51-da48-4eb1-980d-99da34359b3c/0c2921b2-72f3-410c-a78b-c10cca1513dc/20220424T051830Z.alias.vhd into /xo-vm-backups/2ff4f95d-7e3a-8bcf-baf1-ef1d78d4b902/vdis/f9405e51-da48-4eb1-980d-99da34359b3c/0c2921b2-72f3-410c-a78b-c10cca1513dc/20220423T052142Z.alias.vhd
      May  8 01:21:25 xo1 xo-server[97773]: 2022-05-08T05:21:25.013Z xo:backups:MixinBackupWriter WARN the parent /xo-vm-backups/bde73955-ef9b-23b5-dd2f-776eb971a669/vdis/f9405e51-da48-4eb1-980d-99da34359b3c/68455f96-1bdb-4fa0-b1f7-69cf02714467/20220423T052923Z.alias.vhd of the child /xo-vm-backups/bde73955-ef9b-23b5-dd2f-776eb971a669/vdis/f9405e51-da48-4eb1-980d-99da34359b3c/68455f96-1bdb-4fa0-b1f7-69cf02714467/20220424T051859Z.alias.vhd is unused
      May  8 01:21:25 xo1 xo-server[97773]: 2022-05-08T05:21:25.014Z xo:backups:MixinBackupWriter WARN merging /xo-vm-backups/bde73955-ef9b-23b5-dd2f-776eb971a669/vdis/f9405e51-da48-4eb1-980d-99da34359b3c/68455f96-1bdb-4fa0-b1f7-69cf02714467/20220424T051859Z.alias.vhd into /xo-vm-backups/bde73955-ef9b-23b5-dd2f-776eb971a669/vdis/f9405e51-da48-4eb1-980d-99da34359b3c/68455f96-1bdb-4fa0-b1f7-69cf02714467/20220423T052923Z.alias.vhd
      ...
      

      But the normal backup continues to be a "Success". This is BAD because the backups are not valid (for that VM). XO should flag a warning for the job. Maybe a yellow "Warning" result that lets you know something is amiss. Or maybe it should try to fix the problem? Or maybe it should flag the job as a full backup so there is a good backup next time.
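Until the UI surfaces these warnings, one workaround is to scan the syslog yourself and treat any `MixinBackupWriter WARN` line as a reason to downgrade a reported "Success". The sketch below is only an illustration, not how XO computes job results: the regular expression is inferred from the log excerpt above, and the function names `warnings_per_vm` and `job_status` are hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from the log excerpt in this thread (an assumption,
# not a documented XO log format): capture the VM UUID that follows
# /xo-vm-backups/ on a MixinBackupWriter WARN line.
WARN_RE = re.compile(
    r"xo:backups:MixinBackupWriter WARN .*?/xo-vm-backups/(?P<vm>[0-9a-f-]{36})/"
)


def warnings_per_vm(lines):
    """Count MixinBackupWriter warnings per VM UUID."""
    counts = Counter()
    for line in lines:
        match = WARN_RE.search(line)
        if match:
            counts[match.group("vm")] += 1
    return counts


def job_status(reported_status, counts):
    """Hypothetical policy: turn 'Success' into 'Warning' if any VM logged warnings."""
    if reported_status == "Success" and counts:
        return "Warning"
    return reported_status
```

Feeding it the lines of the syslog file gives a per-VM warning count, which makes it easier to see which VM to run a restore test on first.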

      • olivierlambert (Vates 🪐 Co-Founder CEO)

        Ping @florent

        • florent (Vates 🪐 XO Team)

          Hi @Andrew, the backup listed in restore (and health check) should be complete. We create an alias to the real VHD directory, and this alias is created only after the transfer is finished.
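The "alias only after the transfer is finished" idea can be illustrated with a small sketch. It is not XO's code: it uses a local directory as a stand-in for the S3 remote and a single data file instead of a VHD directory, and the names `write_backup` and `list_backups` are hypothetical.

```python
import os


def write_backup(root, name, chunks):
    """Write the data first, then publish an alias that points at it."""
    data_path = os.path.join(root, "data", name + ".vhd")
    os.makedirs(os.path.dirname(data_path), exist_ok=True)
    with open(data_path, "wb") as data_file:
        # If the chunk source raises mid-transfer, we never reach the alias step.
        for chunk in chunks:
            data_file.write(chunk)
    alias_path = os.path.join(root, name + ".alias.vhd")
    with open(alias_path, "w") as alias_file:
        # The alias only stores the relative location of the real data.
        alias_file.write(os.path.relpath(data_path, root))
    return alias_path


def list_backups(root):
    """Only backups that have an alias are considered restorable."""
    return sorted(n for n in os.listdir(root) if n.endswith(".alias.vhd"))
```

With this ordering, an interrupted transfer leaves orphaned data behind but never a listed backup, which matches the claim that anything visible in restore should be complete.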

          We are reworking the logs to be more precise and visible in the UI, because some of these are not needed anymore, and some are really hard to understand (like the "unused" message, which can be reached by two code paths).

          Maybe it's more of a restore health check issue. Do you have more details?

          • Andrew (Top contributor), replying to @florent

            @florent It looks like the restore test failed because the connection stream failed (and did not retry). It's hard to test this particular one because it is rather large.
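For reference, the kind of retry being asked for here is usually a bounded loop with exponential backoff around the failing call. This is a generic sketch, not XO's implementation: `with_retries`, its parameters, and the choice of exceptions to retry on are all assumptions for illustration.

```python
import time


def with_retries(operation, attempts=3, base_delay=1.0,
                 retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call operation(); on a retryable error, wait and try again.

    The delay doubles after each failed attempt (base_delay, 2x, 4x, ...).
    The last failure is re-raised so the caller still sees the error.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

A caller would wrap the download of one block or chunk, for example `with_retries(lambda: fetch_block(key))` where `fetch_block` is whatever function reads from the remote. Retrying small units matters for large backups, since restarting the whole stream is expensive.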

            A good reminder to have more than one backup method (I do). I'll look for an XO update and see what changes. It is scheduled to do a full backup soon (as normal) to make sure issues like this are not a long-term problem.
