XCP-ng

    Automatic backup verification

    Xen Orchestra
    feature, in backlog
    • sheridancompute

      During a live vlog with lawrencesystems, the subject of automatic backup verification came up. For example, Veeam, StorageCraft, etc. can boot backups and take screenshots (usually with the network disabled) to confirm that the backups actually boot.

      It seemed worth asking whether this would be beneficial to XO, so I'm leaving this here, as suggested by olivierlambert, to open up the discussion.

      • olivierlambert Vates 🪐 Co-Founder CEO

        Sure, so the hard part is: how do we automatically test them?

        The easiest path, to me, is to detect that the guest tools have started.

        • lawrencesystems Ambassador @olivierlambert

          olivierlambert

          That would be a simple method. Boot the system with a host only / isolated network, confirm tools start, append that to the backup log, done.
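          The "confirm tools start, append that to the backup log" step could look roughly like the sketch below. It is only an illustration: `tools_running` stands in for whatever check XO would really use, and `wait_for_tools`, `append_verification`, the timeout values, and the log format are all hypothetical names and choices, not existing XO code.

```python
import time
from typing import Callable, List


def wait_for_tools(
    tools_running: Callable[[], bool],
    timeout_s: float = 300.0,
    poll_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the guest tools report in, or until the timeout expires.

    tools_running is a hypothetical probe supplied by the caller.
    """
    deadline = clock() + timeout_s
    while True:
        if tools_running():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)


def append_verification(log: List[str], vm_name: str, booted: bool) -> None:
    """Append a one-line result to an in-memory backup log (assumed format)."""
    status = "OK: guest tools detected" if booted else "FAILED: guest tools not detected"
    log.append(f"restore test for {vm_name}: {status}")
```

          The clock and sleep functions are parameters so the loop can be exercised without actually waiting.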

          • olivierlambert Vates 🪐 Co-Founder CEO

            Should it be integrated into the backup job? Or should we have a separate scheduler, so backups can be tested at a time other than the end of a backup run?

            So right now, trying to imagine this:

            • enabling backup restore on a backup job (or in a dedicated scheduler for it)
            • selecting the main network for the restore (ideally, a network where no conflict with production VMs can happen). Should we scrub the network interfaces automatically in all VMs to avoid any IP conflict?
            • we could be sure the system boots (if the tools answer, XO can detect this). However, we can't "validate" data, e.g. on an extra disk in the VM. I don't see any easy way to do that, except having an agent in the VM reporting that what you need is there.
            • then, when we get the result, we can put it in a dedicated report, or in the backup job report
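            As an illustration of the "scrub the network interfaces" question above, here is a minimal sketch of rewriting a VM's interface list before the restored copy is started. The `Vif` record and `scrub_vifs` are hypothetical and only model the idea; they are not XO's or XAPI's data structures.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Vif:
    device: str          # position of the interface in the VM, e.g. "0"
    network: str         # name or id of the attached network (assumed field)
    mac: Optional[str]   # None means "generate a new MAC on creation" (assumed convention)


def scrub_vifs(vifs: List[Vif], test_network: Optional[str]) -> List[Vif]:
    """Return the interfaces to create on the restored test VM.

    test_network=None drops every interface, so the VM boots fully offline.
    Otherwise each interface is moved to test_network and its MAC is cleared,
    which avoids clashes only if test_network is isolated from production.
    """
    if test_network is None:
        return []
    return [replace(v, network=test_network, mac=None) for v in vifs]
```

            Returning a new list instead of mutating the input keeps the production VM's description untouched.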
            • lawrencesystems Ambassador @olivierlambert

              olivierlambert I don't think it really has to do anything as advanced as checking any other part of the system. Tools such as Datto backup use "Screenshot Backup Verification" (https://www.datto.com/technologies/screenshot-verification), which shows that the VM boots but does nothing to verify application status within the VM. I think checking the tools and starting the system with an isolated network interface, to avoid IP conflicts or the server reaching out to anything, would be enough. I like the idea of this being part of a backup job. For example, I currently have delta jobs running during the week and a separate full backup job that does the offline snapshot on the weekends, and that is where I would also like the "Restore Test" type of feature.

              • olivierlambert Vates 🪐 Co-Founder CEO

                So, a kind of new option in the backup job, with a field like "Check backup every XX", opening new fields:

                • where to restore those test backups (SR and network)
                • checking if tools are up after booting them
                • then removing them
                • sending a dedicated report on the restore test

                Am I missing anything else?
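                To make the list above concrete, here is a rough sketch of how the option and its steps might fit together. Every name in it (`RestoreTestOptions`, `is_check_due`, `run_restore_test`, and the injected callables) is hypothetical, and "every XX" is assumed to mean "every N backup runs"; it could just as well be a time interval.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class RestoreTestOptions:
    check_every: int                # assumed unit: test on every N-th backup run
    target_sr: str                  # SR receiving the temporary restored VM
    target_network: Optional[str]   # isolated network, or None for no network


def is_check_due(run_number: int, options: RestoreTestOptions) -> bool:
    """True on every check_every-th run; run_number starts at 1."""
    return options.check_every > 0 and run_number % options.check_every == 0


def run_restore_test(
    options: RestoreTestOptions,
    restore: Callable[[str, Optional[str]], Any],
    boot_and_check_tools: Callable[[Any], bool],
    remove: Callable[[Any], None],
    report: Callable[[Any, bool], None],
) -> bool:
    """Restore, boot and check, remove, then report. All callables are placeholders."""
    vm = restore(options.target_sr, options.target_network)
    try:
        ok = boot_and_check_tools(vm)
    except Exception:
        ok = False          # a failed boot counts as a failed test
    finally:
        remove(vm)          # the test copy is always cleaned up
    report(vm, ok)
    return ok
```

                Removing the test VM in a `finally` block means a failed or crashing boot does not leave restored copies behind on the SR.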

                • lawrencesystems Ambassador @olivierlambert

                  olivierlambert That sounds like enough to me, maybe sheridancompute or tekwendell have a few other thoughts on it.

                  • olivierlambert Vates 🪐 Co-Founder CEO

                    Adding marcungeschikts to the loop so we can think about creating an issue/card with the detailed spec (with julien-f's opinion on how this could fit into our current backup code).

                    • sheridancompute

                      Sounds right to me. That's pretty much how StorageCraft does it: boot to the login prompt, take a screenshot and email it.

                      We've used StorageCraft for backups for years, and this was one of the original winning features for us.

                      https://support.arcserve.com/s/topic/0TO36000000Ln5gGAC/advanced-verification?language=en_US

                      • marcungeschikts Vates 🪐 Project mgmt @olivierlambert

                        olivierlambert 👌 on it

                        • julien-f Vates 🪐 Co-Founder XO Team @marcungeschikts

                          Yes, it should be safe enough to try importing the backup, then starting the VM without network (i.e. without VIFs) and seeing if it stays up for a few minutes.

                          I think we could start by adding a manual feature and then see in a second step how we could automate it.
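                          The "stays up for a few minutes" check differs from waiting for tools: the VM has to remain running for the whole observation window. A minimal sketch, assuming a caller-supplied `get_power_state` probe that returns a string such as "Running" (the function name, the window length, and the polling interval are all illustrative):

```python
import time
from typing import Callable


def stays_running(
    get_power_state: Callable[[], str],
    window_s: float = 180.0,
    poll_s: float = 10.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True only if the VM is seen running at every poll in the window.

    get_power_state is a hypothetical probe; "Running" is the assumed
    value it reports for a started VM.
    """
    deadline = clock() + window_s
    while True:
        if get_power_state() != "Running":
            return False    # crashed, halted, or rebooted into a bad state
        if clock() >= deadline:
            return True
        sleep(poll_s)
```

                          Polling can miss a VM that crashes and restarts between two checks, so a short interval or an additional uptime check may be worth considering.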
