XCP-ng

    What Are You All Doing for Disaster Recovery of XOA Itself?

    Xen Orchestra
    10 Posts 7 Posters 985 Views 5 Watching
    • planedrop (Top contributor)

      This is something I've put a lot of time into recently, but I'm still not sure what the best approach is.

      If XOA is hosted at one physical site, and you're planning for the possibility of that site collapsing completely, how do you go about managing other remote sites with XOA?

      Mostly posting because I am curious how others are architecting this.

      For my part, I will mostly just have XOA itself replicated so I can power it on at another site if need be. The complication is IP addressing and the like, since the other site might not have the same subnet layout.
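To make the subnet concern concrete, here is a minimal sketch that checks whether a replicated XOA could keep its static address at the other site. The subnets and the XOA address are hypothetical placeholders, not values from this thread, and the check only covers subnet membership (it says nothing about gateways, DNS, or routing).

```python
import ipaddress


def address_fits_site(xoa_ip: str, site_subnet: str) -> bool:
    """Return True if xoa_ip falls inside site_subnet (CIDR notation)."""
    ip = ipaddress.ip_address(xoa_ip)
    net = ipaddress.ip_network(site_subnet)
    return ip in net


# Hypothetical addressing plan, for illustration only.
SITES = {
    "primary": "10.10.0.0/24",
    "dr": "10.20.0.0/24",
}
XOA_IP = "10.10.0.50"

if __name__ == "__main__":
    for name, subnet in SITES.items():
        if address_fits_site(XOA_IP, subnet):
            print(f"{name}: XOA can keep {XOA_IP}")
        else:
            print(f"{name}: XOA must be re-addressed (not in {subnet})")
```

If the DR site reports that re-addressing is needed, the choice is between matching subnets across sites or accepting a manual IP change after powering on the replica.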

      • Andrew (Top contributor) @planedrop

        @planedrop I don't think there is a true active/standby XO setup. You can always replicate to a different pool and start/recover it there if you have a failure. You can also just have XO run on a different pool and not have it do anything (i.e., backups) but still be able to access pools and restore backups.

        As a last-ditch total disaster recovery option, I have XO running on a laptop (Windows/VirtualBox/Linux). It is not a true duplicate/backup, but it does have connectivity (from the office or a VPN) to the needed hosts so I can fix/rebuild/restore. I can start over with a new location and new hardware, and recover from off-site backups.

        You can always deploy new XCP and new XO in minutes then restore your config... You have backups, right?

        • probain

          I treat XO as almost ephemeral, and I take regular, scheduled backups of its config, both locally and off-site.
          For XO itself, I use the community edition, but I've built an entire Ansible role that sets it up within minutes on any configured computer, mainly because I don't like/trust scripts from random people on the internet. My approach is also as close to following the docs step by step as you can get.

          I've been thinking about sharing the Ansible role, but I'm unsure how much interest there is in such things.

          But for XOA there isn't a need for such a thing. And given how easy it is to restore the config from a backup file, I wouldn't do anything crazy or overly complicated either.
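As an illustration of keeping config backups both locally and off-site, here is a minimal sketch that copies an already exported config file into several destinations under a timestamped name and prunes old copies. The function name, file names, and directory layout are assumptions made for this example; it does not call any Xen Orchestra API, and the off-site destination is assumed to be a mounted path.

```python
import shutil
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def archive_config(exported: Path, destinations: list[Path], keep: int = 14) -> list[Path]:
    """Copy `exported` into every destination and keep the newest `keep` copies (keep >= 1)."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    written = []
    for dest in destinations:
        dest.mkdir(parents=True, exist_ok=True)
        target = dest / f"xo-config-{stamp}{exported.suffix}"
        shutil.copy2(exported, target)
        written.append(target)
        # Timestamped names sort chronologically, so the oldest copies come first.
        copies = sorted(dest.glob(f"xo-config-*{exported.suffix}"))
        for old in copies[:-keep]:
            old.unlink()
    return written


if __name__ == "__main__":
    # Demo with temporary directories standing in for local and off-site storage.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        exported = root / "export.json"  # hypothetical exported config file
        exported.write_text('{"demo": true}')
        for path in archive_config(exported, [root / "local", root / "offsite"]):
            print(path)
```

Run from a scheduler such as cron or a systemd timer, this gives each destination its own rolling history of config exports.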

          • planedrop (Top contributor) @Andrew

            @Andrew Yeah so I hear you on this, and it all makes sense, I've recovered XO plenty of times in my lab.

            But it still feels more complicated than it needs to be, and I feel like there has to be a better solution. The idea with disaster recovery sites is relatively quick failover (even if manual). Having to restore XO, then reconnect it to all the servers, reconfigure its IP addresses, etc. feels like more work than a restore should be. Not to mention accessing the other cluster, assuming something like complete destruction of the primary site in a natural disaster.

            As for backups, yes, I manage a lot of infrastructure like this, not a noob by any means I promise lol, I'm just trying to improve the way I am going to do this in the future, if possible.

            Restoring XOA at another site and then reconfiguring it entirely seems like a lot of work.

            • DustinB

              If you really require uptime for XOA, you could use a public cloud like AWS: set up an EKS cluster across 3 AZs, each having a site-to-site VPN and an EC2 instance that provides a connection so you can access the hosts at a given site.

              It's an expensive approach, but the only way I can think of that would ensure XOA was running at all times in the event of an individual outage.

              • olivierlambert (Vates 🪐 Co-Founder CEO)

                XOA has a config you can save in your xen-orchestra.com account. Redeploy a fresh one, it detects the config, asks whether to import it, and boom.

                • planedrop (Top contributor) @DustinB

                  @DustinB Yeah I thought about doing something like this, but then we are getting into overkill costs lol.

                  My main point behind this post was to get info on what other people are doing, I presumed most are just relying on config backups and will restore it at the other site if needed, which is fine.

                  I was just also debating about whether or not I wanted the subnets to match between these 2 sites, since that can create some routing headaches, but should still be doable and would make recovery even faster.

                  @olivierlambert yeah I do use that feature, so that's very helpful! Should make this all doable.

                  • wezke @probain

                    @probain

                    I would be interested in your ansible role for deploying XO from sources 🙂

                    • MajorP93 @wezke

                      @wezke said in What Are You All Doing for Disaster Recovery of XOA Itself?:

                      @probain

                      I would be interested in your ansible role for deploying XO from sources 🙂

                      I don't think there is a need to build an ansible role for this purpose when the whole process has been scripted in bash already by ronivay 😅

                      You can just download and execute the script via ansible and you're good to go:

                        tasks:
                        - name: Download ronivay XO from sources bash script
                          ansible.builtin.get_url:
                            url: "https://raw.githubusercontent.com/ronivay/XenOrchestraInstallerUpdater/refs/heads/master/xo-install.sh"
                            dest: /tmp/xo-install.sh
                            mode: "0755"  # executable, so no separate chmod task is needed
                            force: true   # always fetch the current version, as wget -O did

                        - name: Run ronivay XO from sources bash script
                          become: true
                          ansible.builtin.command: /tmp/xo-install.sh --install
                      
                      
                      • probain @MajorP93

                        @wezke
                        I'll try to find some time later on to publish a cloned, public version of my role. No promises about the timeframe, as I am absolutely swamped right now.

                        @MajorP93

                        Well yes and no.

                        A script is harder to verify for maliciousness. I'm not saying that this script is bad, rather that I start from a paranoid standard practice.

                        Ansible also has several benefits that I value above using specialized scripts.

                        The beauty is that everyone can use and do whatever solution they like the best.
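For anyone who shares the verification concern but still wants to use a downloaded script, one common mitigation is to review a specific version once, record its SHA-256 digest, and refuse to run anything that does not match. The sketch below illustrates that idea only; the function names are made up for this example, and the demo uses a stand-in file rather than any real installer.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash the file in chunks so large downloads are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_digest(path: Path, pinned_sha256: str) -> bool:
    """True only if the file is byte-for-byte the version that was reviewed."""
    return hmac.compare_digest(sha256_of(path), pinned_sha256.strip().lower())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "installer.sh"  # stand-in for a downloaded script
        script.write_bytes(b"#!/bin/sh\necho reviewed\n")
        pinned = sha256_of(script)  # digest recorded right after the manual review
        print("reviewed copy accepted:", matches_pinned_digest(script, pinned))
        script.write_bytes(b"#!/bin/sh\necho tampered\n")
        print("modified copy accepted:", matches_pinned_digest(script, pinned))
```

Pinning a digest does not make a script trustworthy by itself; it only guarantees that what runs is the exact version someone actually read.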
