    Slow Backups | XOA Performance Test – Upgrading from 2 vCPU to 4 vCPU / 8GB RAM

    LoTus111:

      XOA Performance Test – Upgrading from 2 vCPU to 4 vCPU / 8GB RAM

      I wanted to share my experience because I was seeing unusually high CPU usage on my XOA VM during backup operations.

      Environment:

      • XCP-ng pool with multiple hosts
      • XOA running as a VM
      • SSD RAID10 NFS storage
      • Daily delta backups (~17 production VMs)
      • Weekly backup jobs with Health Checks
      • Backup window previously around 7 hours

      Symptoms before optimization:

      • XOA CPU usage frequently close to 100%
      • During backups, multiple backupWorker.mjs processes saturated available CPUs
      • XOA felt sluggish while backups were running
      • Backup jobs took significantly longer than expected
      • htop showed worker processes fighting for CPU resources

      Original XOA VM configuration:

      • 2 vCPU
      • 4 GB RAM
      • 1 socket / 2 cores

      Observed htop behavior during backups:

      /usr/local/lib/node_modules/xo-server/node_modules/@xen-orchestra/backups/backupWorker.mjs

      Several workers continuously consumed nearly all available CPU.
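
      If you want to spot the same pattern without a full htop session, here is a quick sketch (run inside the XOA VM; GNU ps assumed) that lists the heaviest backup workers:

      # Show backup worker processes sorted by CPU usage, highest first
      ps -eo pcpu,pid,args --sort=-pcpu | grep -F 'backupWorker.mjs' | grep -v grep | head -n 5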

      IMPORTANT:

      Run the following commands via SSH directly on the XCP-ng Pool Master.

      Do NOT perform these changes through the Xen Orchestra GUI: you would be modifying the very VM that is managing your environment, so you could lock yourself out or interrupt your own management session mid-reconfiguration.

      Connect to the Pool Master:

      ssh root@<POOL-MASTER-IP>

      First identify the XOA VM UUID:

      xe vm-list | grep -i xoa -B1 -A2
      or:
      xe vm-list name-label="<YOUR-XOA-VM-NAME>"

      Example output:

      uuid ( RO): xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
      name-label ( RW): XOA-Production
      power-state ( RO): running

      Copy the UUID for the following steps.
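
      Optionally, as a small convenience (not part of the original steps), you can capture the UUID in a shell variable so you don't have to paste it into every command; xe's --minimal flag prints just the raw value:

      XOA_UUID=$(xe vm-list name-label="<YOUR-XOA-VM-NAME>" params=uuid --minimal)
      echo "$XOA_UUID"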

      Test procedure:
      I increased the XOA resources to the following new configuration:

      • 4 vCPU
      • 8 GB RAM
      • CPU topology: 1 socket / 4 cores

      Procedure used:

      1. Shut down the XOA VM

      xe vm-shutdown uuid=<XOA-VM-UUID>

      Wait a few seconds and verify status:

      xe vm-list uuid=<XOA-VM-UUID> params=name-label,power-state
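
      If you'd rather not guess the timing, a small polling sketch waits until the VM reports halted:

      while [ "$(xe vm-param-get uuid=<XOA-VM-UUID> param-name=power-state)" != "halted" ]; do
          sleep 2
      done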

      2. Set memory

      xe vm-memory-limits-set uuid=<XOA-VM-UUID> \
        static-min=8GiB dynamic-min=8GiB dynamic-max=8GiB static-max=8GiB
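
      To confirm the new limits, read one of them back; xe reports memory in bytes, so 8 GiB should appear as 8589934592:

      xe vm-param-get uuid=<XOA-VM-UUID> param-name=memory-static-max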

      3. Increase vCPU count

      xe vm-param-set uuid=<XOA-VM-UUID> VCPUs-max=4

      xe vm-param-set uuid=<XOA-VM-UUID> VCPUs-at-startup=4

      (Set VCPUs-max first: VCPUs-at-startup cannot exceed it.)

      4. Set CPU topology

      xe vm-param-set uuid=<XOA-VM-UUID> platform:cores-per-socket=4
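
      Note that cores-per-socket has to divide the vCPU count evenly (1 socket x 4 cores matches VCPUs-max=4 here). You can read back just this key from the platform map:

      xe vm-param-get uuid=<XOA-VM-UUID> param-name=platform param-key=cores-per-socket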

      5. Start XOA again

      xe vm-start uuid=<XOA-VM-UUID>

      Verification:

      xe vm-param-get uuid=<XOA-VM-UUID> param-name=VCPUs-max

      xe vm-param-get uuid=<XOA-VM-UUID> param-name=VCPUs-at-startup

      xe vm-param-get uuid=<XOA-VM-UUID> param-name=platform

      xe vm-list uuid=<XOA-VM-UUID> params=name-label,power-state
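
      Alternatively, the same information is available in one command:

      xe vm-list uuid=<XOA-VM-UUID> params=name-label,power-state,VCPUs-max,VCPUs-at-startup,memory-static-max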

      Results from the same backup job:

      Before:

      • Backup size: 10.58 GiB
      • Transfer speed: 150.21 MiB/s
      • Total duration: 3 minutes
      • Health check transfer: 2 minutes

      After:

      • Backup size: 10.71 GiB
      • Transfer speed: 229.12 MiB/s
      • Total duration: 2 minutes
      • Health check transfer: 1 minute

      Measured improvement:

      • Transfer throughput: +52%
      • Backup duration: -33%
      • Health check transfer: -50%
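
      (For reference: 229.12 / 150.21 ≈ 1.53, hence roughly +52%; the durations above are rounded to whole minutes, so -33% and -50% are approximate.)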

      After the upgrade, htop showed backup workers distributed across all available CPUs instead of saturating only two cores.

      Conclusion:

      For my environment the bottleneck was not:

      • NFS
      • Storage
      • SSD RAID10
      • Host performance

      The bottleneck appears to have been XOA itself being underprovisioned.

      If you are running larger backup jobs, health checks, multiple workers, or backup-heavy environments, increasing XOA resources may provide a noticeable improvement.

      I still need to test the full nightly production run (~17 VMs), but initial results are very promising.

      Hope this helps someone.

      LoTus111:

        Before: [screenshot: 2vCPU.png]

        After: [screenshot: 4vCPU.png]
