Backup Health Check Procedure
-
Apologies if this has been addressed elsewhere; a quick keyword search didn't turn up anything relevant.
I've created a delta backup job that's been running fine for some time. When I tick "health check" in the schedule (and select the destination SR), the health check portion of the job ultimately fails. It performs the backup as it should, performs the restore, and boots the VM to the point that the management agent is recognized, but then it just pauses. After ~10 minutes it powers down the VMs and reports a job failure. I know there is supposedly a way to execute a script (https://xen-orchestra.com/docs/backups.html#backup-health-check), but 1) I'm happy enough with just the boot check, and 2) I'm not seeing a way to tag, or in this case untag, the restored VMs to turn off the script method. The restored VM objects are given a "restored from backup" tag and an "xo:no-bak=Health Check" tag.
I see there is a way to restore based on tag, so I can see adding an "xo-backup-healthcheck-xenstore" tag to an existing VM to force the script-method restore, but in this case I don't want the script method.
What am I missing?
-
If it's waiting, maybe it's because the tools aren't installed, or haven't started, in the restored/tested VM?
-
@olivierlambert From what I can see, the management agent appears to be operational. I'm open to other things to check, if you have any ideas.
-
@olivierlambert And a Windows machine...this one is admittedly installed with the agent I pulled from Citrix (https://www.xenserver.com/downloads). Same backup job/schedule though.
-
What XO version are you using exactly? Are you on stable or latest?
-
@olivierlambert -- stable channel
Current version: 5.92.1 - XOA build: 20240311
- node: 18.19.1
- npm: 10.2.4
- xen-orchestra-upload-ova: 0.1.6
- xo-cli-premium: 0.26.0
- xo-server: 5.138.1
- xo-server-audit-premium: 0.10.6
- xo-server-auth-github-premium: 0.3.1
- xo-server-auth-google-premium: 0.3.1
- xo-server-auth-ldap-premium: 0.10.8
- xo-server-auth-oidc-premium: 0.3.0
- xo-server-auth-saml-premium: 0.11.0
- xo-server-backup-reports-premium: 0.18.0
- xo-server-load-balancer-premium: 0.8.1
- xo-server-netbox-premium: 1.4.0
- xo-server-netdata-premium: 0.2.0
- xo-server-perf-alert-premium: 0.3.6
- xo-server-sdn-controller-premium: 1.0.8
- xo-server-telemetry: 0.5.0
- xo-server-transport-email-premium: 1.0.0
- xo-server-transport-icinga2-premium: 0.1.2
- xo-server-transport-nagios-premium: 1.0.1
- xo-server-transport-slack-premium: 0.0.1
- xo-server-transport-xmpp-premium: 0.1.3
- xo-server-usage-report-premium: 0.10.5
- xo-server-web-hooks-premium: 0.3.3
- xo-server-xoa: 0.27.0
- xo-web-premium: 5.140.0
- xoa-cli: 0.38.1
- xoa-updater: 0.48.1
-
Can you try on latest and see if it works?
-
@olivierlambert Same behavior. Should I open a case with support?
I have noticed that if I do a manual health check outside of the backup job, it behaves as I'd expect: restores VM >> power-on >> management tools seen >> power-off >> destroy.
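To make the sequence I'm expecting concrete, here is a rough Python sketch of it. This is not XO's actual code: every function name and parameter is hypothetical, the callbacks stand in for whatever XO really does, and the 600-second default is only my guess based on the ~10 minute pause I observed.

```python
import time
from typing import Callable


def wait_for_guest_tools(
    tools_detected: Callable[[], bool],
    timeout_s: float = 600.0,  # assumption: matches the ~10 min pause observed
    poll_s: float = 5.0,       # assumption: arbitrary polling interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the management agent is reported, or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        if tools_detected():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)


def health_check(
    restore: Callable[[], str],
    start: Callable[[str], None],
    tools_detected: Callable[[str], bool],
    shutdown: Callable[[str], None],
    destroy: Callable[[str], None],
    timeout_s: float = 600.0,
    poll_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Hypothetical boot check: restore, power on, wait for tools, power off, destroy."""
    vm = restore()
    try:
        start(vm)
        ok = wait_for_guest_tools(
            lambda: tools_detected(vm), timeout_s, poll_s, clock, sleep
        )
        shutdown(vm)
        return ok
    finally:
        # The restored copy is always removed, whether the check passed or not.
        destroy(vm)
```

In the manual run, the wait step returns as soon as the tools are seen. In the scheduled job, it looks as though the equivalent wait never completes even though the agent is shown as detected, so the job runs to the timeout and fails.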
-
Yes, it would be better to open a ticket at this point.