XCP-ng

    What Are You All Doing for Disaster Recovery of XOA Itself?

      planedrop Top contributor

      This is something I've put a lot of time into recently, but I'm still not quite sure what the best way to go about it is.

      If XOA is hosted at one physical site, and you're planning for the possibility of that site collapsing completely, how do you go about managing other remote sites with XOA?

      Mostly posting because I am curious how others are architecting this.

      For me, I think I will mostly just have XOA itself replicated so I can power it on at another site if need be. The complication comes in with IP addressing and the like, though, since the other site might not have the same subnet layout.
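A minimal sketch of that addressing concern, using only Python's standard `ipaddress` module. The IP address and subnets below are hypothetical placeholders, not values from this thread; the function simply reports whether a replicated VM's static address would still be on-link at the other site.

```python
import ipaddress


def address_fits_site(vm_ip: str, site_subnets: list[str]) -> bool:
    """Return True if vm_ip falls inside any of the site's subnets."""
    ip = ipaddress.ip_address(vm_ip)
    return any(ip in ipaddress.ip_network(subnet) for subnet in site_subnets)


# Hypothetical values, for illustration only.
XOA_IP = "10.10.5.20"
PRIMARY_SITE = ["10.10.5.0/24"]
DR_SITE = ["10.20.5.0/24"]

if __name__ == "__main__":
    for name, subnets in (("primary", PRIMARY_SITE), ("DR", DR_SITE)):
        verdict = "keeps its address" if address_fits_site(XOA_IP, subnets) else "needs re-addressing"
        print(f"{name} site: XOA {verdict}")
```

If the check fails for the DR site, the replicated XOA would need a new address (or DHCP) before it can reach the hosts there.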

        Andrew Top contributor @planedrop

        @planedrop I don't think there is a true active/standby XO setup. You can always replicate to a different pool and start/recover it there if you have a failure. You can also just have XO run on a different pool and not have it do anything (i.e. backups) but still be able to access pools and restore backups.

        As a last-ditch total disaster recovery option, I have XO running on a laptop (Windows/VirtualBox/Linux). It is not a true duplicate/backup, but it does have connectivity (from the office or a VPN) to the needed hosts so I can fix/rebuild/restore. I can start over at a new location, on new hardware, and recover from off-site backups.

        You can always deploy new XCP and new XO in minutes then restore your config... You have backups, right?

          probain

          I treat XO as almost ephemeral, and I take regular scheduled backups of its config, both locally and off-site.
          For XO itself, I use the community edition, but I've built an entire Ansible role that sets it up within minutes on any configured computer, mainly because I don't like or trust scripts from random people on the internet. My approach is also as close to following the docs step by step as you can get.

          I've been thinking about sharing the ansible-role. But unsure how much interest there is for such things.

          But for XOA there isn't a need for such a thing. And given how extremely easy it is to restore the config from a backup file, I wouldn't really do something crazy and overly complicated either.
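Keeping config backups both locally and off-site only helps if both copies are actually fresh. A rough sketch of a freshness check follows; it assumes the backups land as files in plain directories (for example, an off-site share mounted locally). The directory layout, function names, and the 24-hour threshold are all hypothetical, and nothing here uses an XO or XOA API.

```python
import time
from pathlib import Path


def newest_backup_age_hours(directory, now=None):
    """Age in hours of the most recently modified file, or None if there are no files."""
    now = time.time() if now is None else now
    mtimes = [p.stat().st_mtime for p in Path(directory).iterdir() if p.is_file()]
    if not mtimes:
        return None
    return (now - max(mtimes)) / 3600.0


def stale_locations(directories, max_age_hours=24.0, now=None):
    """Return the locations that are missing, empty, or older than max_age_hours."""
    stale = []
    for directory in directories:
        age = newest_backup_age_hours(directory, now) if Path(directory).is_dir() else None
        if age is None or age > max_age_hours:
            stale.append(str(directory))
    return stale
```

Calling `stale_locations([...])` with the local and off-site backup directories from a scheduled job would flag whichever copy has silently stopped updating.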

            planedrop Top contributor @Andrew

            @Andrew Yeah so I hear you on this, and it all makes sense, I've recovered XO plenty of times in my lab.

            But it still feels more complicated than it needs to be, and I feel like there has to be a better solution. The idea with disaster recovery sites is relatively quick failover (even if manual). Having to restore XO, then reconnect it to all the servers, reconfigure its IP addresses, etc. feels like more work than a restore should be. Not to mention accessing the other cluster, assuming something like complete destruction of the primary site by a natural disaster.

            As for backups, yes, I manage a lot of infrastructure like this, not a noob by any means I promise lol, I'm just trying to improve the way I am going to do this in the future, if possible.

            Restoring XOA at another site and then reconfiguring it entirely seems like a lot of work.

              DustinB

              If you really require uptime of XOA, you could use a public cloud like AWS and set up an EKS cluster across 3 AZs, each having a site-to-site VPN and an EC2 instance that provides a connection so you can access the hosts at a given site.

              It's an expensive approach, but the only way I can think of that would ensure XOA was running at all times in the event of an individual outage.

                olivierlambert Vates 🪐 Co-Founder CEO

                XOA has a config you can save in your xen-orchestra.com account: redeploy a fresh one, it detects the config, asks to import it, and boom.

                  planedrop Top contributor @DustinB

                  @DustinB Yeah I thought about doing something like this, but then we are getting into overkill costs lol.

                  My main point behind this post was to get info on what other people are doing. I presumed most are just relying on config backups and will restore at the other site if needed, which is fine.

                  I was just also debating about whether or not I wanted the subnets to match between these 2 sites, since that can create some routing headaches, but should still be doable and would make recovery even faster.
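The routing headaches from matching subnets come from address ranges that exist at both sites. A small sketch that lists such overlaps, again with the standard `ipaddress` module; the function name and any subnet lists passed to it are hypothetical.

```python
import ipaddress
from itertools import product


def overlapping_subnets(site_a, site_b):
    """Return (a, b) pairs of subnets, one from each site, that share addresses."""
    nets_a = [ipaddress.ip_network(s) for s in site_a]
    nets_b = [ipaddress.ip_network(s) for s in site_b]
    return [
        (str(a), str(b))
        for a, b in product(nets_a, nets_b)
        # Only compare networks of the same IP version.
        if a.version == b.version and a.overlaps(b)
    ]
```

An empty result means the two sites can be routed to each other without NAT or stretched layer 2; any pair returned is a range that would need one of those, which is the trade-off against faster recovery.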

                  @olivierlambert yeah I do use that feature, so that's very helpful! Should make this all doable.
