Posts
-
RE: Restore only showing 1 VM
Thanks @ph7, I'll try to have a look to understand what's going on.
-
RE: found reproductible BUG with FLR
Hi @Pilow,
Thanks for the report.
We are aware that there are many problems with the FLR. We would like to fix them, but they are not easy to fix, and we can't give an estimated date for a fix. I've linked this topic to our investigation ticket.
For the moment, when FLR fails, we recommend manually restoring your files by following this documentation: https://github.com/vatesfr/xen-orchestra/blob/master/%40vates/fuse-vhd/README.md#restore-a-file-from-a-vhd-using-fuse-vhd-cli
-
RE: Backup retention policy and key backup interval
@Pilow The backups kept by LTR are just regular backups with a specific tag, which doesn't change how we treat them.
If you want to avoid each of your LTR backups depending on one another, we recommend setting a full backup interval value on your backup job, which will regularly force a full backup. (Even without LTR, an infinite chain of backups can cause problems in the long term, especially if no health checks are made.)
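To illustrate the idea (a hypothetical sketch, not XO's actual code; the names are illustrative): with a full backup interval of N, every Nth backup is a full, so no restore ever has to walk back through more than N backups.

```python
# Hypothetical sketch (not XO's actual code) of how a full backup interval
# bounds the length of a delta chain.

def backup_kinds(num_backups, full_interval):
    """Every `full_interval`-th backup is a full backup; the others are deltas."""
    return ['full' if i % full_interval == 0 else 'delta' for i in range(num_backups)]

def chain_length(kinds, index):
    """Number of backups needed to restore backup `index` (walk back to the last full)."""
    length = 1
    while kinds[index] != 'full':
        index -= 1
        length += 1
    return length

kinds = backup_kinds(10, 4)  # full, delta, delta, delta, full, ...
# No restore ever depends on more than `full_interval` backups:
assert max(chain_length(kinds, i) for i in range(10)) == 4
```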
-
RE: Backup retention policy and key backup interval
Thanks @Pilow for the explanation.
It may be configurable in the future, but for now LTR picks the first backup of the day, week, month and year. Depending on the timezone of your XOA, the first day of the week may be either Monday or Sunday.
There was initially a bug that made LTR pick the last backup of a time period instead of the first, but this was fixed a couple of months ago.
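To illustrate the selection rule (a hypothetical sketch, not XO's actual code), keeping the first backup of each period boils down to grouping timestamps by a period key and keeping the earliest one per group:

```python
# Hypothetical sketch (not XO's actual code) of LTR-style selection:
# keep the first backup of each day/week/month/year.
from datetime import datetime

def first_per_period(timestamps, key):
    """Keep the earliest timestamp of each period defined by `key`."""
    kept = {}
    for ts in sorted(timestamps):
        kept.setdefault(key(ts), ts)  # only the first one per period is stored
    return sorted(kept.values())

backups = [datetime(2025, 1, d, h) for d in (1, 2) for h in (6, 18)]
daily = first_per_period(backups, lambda ts: ts.date())
assert daily == [datetime(2025, 1, 1, 6), datetime(2025, 1, 2, 6)]
```

For weekly retention the key could be `ts.isocalendar()[:2]` (ISO weeks start on Monday), which is where the timezone and first-day-of-week convention comes into play.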
-
RE: Restore only showing 1 VM
@ph7 Ok, thanks.
But the most important thing will be to run this test while some VMs are missing from the backup restore page (if it happens again).
-
RE: Restore only showing 1 VM
@ph7 Ok, let's do this.
If it happens again, can you check that XO can still access the remote on which the missing VM backups are stored? (by using the "test your remote" button on the settings > remote page)
It may just be a network issue.
-
RE: Migrations after updates
Hi @acebmxer,
I've made some tests with a small infrastructure, which helped me understand the behaviour you encounter.
With the performance plan, the load balancer can trigger migrations in the following cases:
- to better satisfy affinity or anti-affinity constraints
- if a host's memory or CPU usage exceeds a threshold (85% of the CPU critical threshold, or 1.2 times the free memory critical threshold)
- with vCPU balancing behaviour, if the vCPU/CPU ratio differs too much from one host to another AND at least one host has more vCPUs than CPUs
- with preventive behaviour, if CPU usage differs too much from one host to another AND at least one host has more than 25% CPU usage
After a host restart, your VMs will be unevenly distributed, but this will not trigger a migration if there are no anti-affinity constraints to satisfy, no memory or CPU usage thresholds are exceeded, and no host has more vCPUs than CPUs.
If you want migrations to happen after a host restart, you should probably try using the "preventive" behaviour, which can trigger migrations even if thresholds are not reached. However, it's based on CPU usage, so if your VMs use a lot of memory but don't use much CPU, this might not be ideal either.
We've received very little feedback about the "preventive" behaviour, so we'd be happy to have yours.

As we said before, lowering the critical thresholds might also be a solution, but I think it will make the load balancer less effective if you encounter heavy load at some point.
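As a rough illustration of the threshold conditions listed above (a sketch under assumed numbers; the function name, default values, and exact formulas are assumptions, not XO's code):

```python
# Illustrative sketch of the performance plan trigger conditions described
# above. Names, defaults, and formulas are assumptions, not XO's actual code.

def should_consider_migration(cpu_usage, free_memory_ratio,
                              cpu_critical=0.9, free_memory_critical=0.1):
    """True if CPU usage exceeds 85% of the CPU critical threshold,
    or free memory drops below 1.2x the free memory critical threshold."""
    return (cpu_usage > 0.85 * cpu_critical
            or free_memory_ratio < 1.2 * free_memory_critical)

assert should_consider_migration(cpu_usage=0.80, free_memory_ratio=0.5)      # high CPU
assert should_consider_migration(cpu_usage=0.10, free_memory_ratio=0.11)     # low free memory
assert not should_consider_migration(cpu_usage=0.10, free_memory_ratio=0.5)  # idle host
```

This is also why an uneven distribution alone doesn't trigger migrations: as long as no host crosses a threshold, the plan has nothing to act on.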
-
RE: Migrations after updates
@Greg_E The RPU is supposed to disable the load balancer, but it's possible that when the load balancer restarts at the end of the RPU, it takes into account the host stats during the RPU, which may create some unexpected migrations.
We'll have to investigate that. Thanks for the feedback.
-
RE: Migrations after updates
@acebmxer at the moment I don't know what could cause this behaviour. I'll try to reproduce it in the coming days.
I think setting the memory limit to half of the host RAM is fine if you don't expect too much load, but if you're getting a lot of RAM use on your hosts at some point, I'm not sure the load balancer will migrate VMs from a host at 90% RAM use to a host at 60% RAM use, as both exceed the limit.
Also, could you try to reproduce the bug again after changing the "performance plan behaviour" setting to conservative, to see if it changes anything? The "vCPU balancing" mode is quite recent, so maybe there's a bug with it that we haven't discovered yet.
-
RE: Migrations after updates
Hi @acebmxer,
I think the reason for this is a feature we recently added that prevents VMs from moving back and forth between hosts: VMs now have a cooldown (30 minutes by default) between 2 load-balancer-triggered migrations.
Can you try to set the migration cooldown to 0 and tell us if it fixes this behaviour? (in the "Advanced" section of the load balancer configuration)
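The cooldown check can be sketched like this (illustrative only, not XO's actual implementation; setting the cooldown to 0 effectively disables it):

```python
# Hypothetical sketch of a per-VM migration cooldown (default 30 min), as
# described above. Illustrative only, not XO's actual implementation.
from datetime import datetime, timedelta

COOLDOWN = timedelta(minutes=30)
last_migration = {}  # VM id -> time of its last load-balancer migration

def may_migrate(vm_id, now, cooldown=COOLDOWN):
    """A VM may migrate again only once its cooldown has elapsed."""
    last = last_migration.get(vm_id)
    return last is None or now - last >= cooldown

now = datetime(2025, 1, 1, 12, 0)
last_migration['vm1'] = now - timedelta(minutes=10)
assert not may_migrate('vm1', now)             # still cooling down
assert may_migrate('vm1', now, timedelta(0))   # cooldown of 0 disables the check
assert may_migrate('vm2', now)                 # never migrated before
```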
-
RE: backup mail report says INTERRUPTED but it's not ?
Hi @MajorP93,
This PR is only about changing the way we delete old logs (linked to a bigger work of making backups use XO tasks instead of their own task system), it won't fix the issue discussed in this topic.
-
RE: Unkown PCI device attached to VM
Hi @champagnecharly ,
On the XO side, it seems that this PCI device has an empty string as its ID, which prevents us from deleting it.
We'll have to do some tests to find out how to prevent that. We might have trouble reproducing the issue, so would you mind helping us with the tests?
You would need to add this piece of code in the file `xo-server/dist/xapi-object-to-xo.mjs`, before the line that starts with `if (isHvm) {` (that should be near line 475):

```js
if ((_vm$attachedPcis = vm.attachedPcis) !== null && _vm$attachedPcis !== void 0 && _vm$attachedPcis.includes('')) {
  warn('Empty string PCI id:', otherConfig.pci)
}
```

Then restart xo-server and look at the output of `journalctl`; there should be some lines looking like:

```
2026-01-30T09:26:17.763Z xo:server:xapi-objects-to-xo WARN Empty string PCI id:
```
-
RE: backup mail report says INTERRUPTED but it's not ?
We just merged the delay: https://github.com/vatesfr/xen-orchestra/pull/9400
We increased it to 5s to have a safety margin, as the optimal delay may not be the same on different configurations.
-
RE: backup mail report says INTERRUPTED but it's not ?
We're still doing a bit more investigation to see if we can find the cause of the problem, but if we don't find it, we'll add this delay.
Thanks @Pilow for the tests once again

-
RE: backup mail report says INTERRUPTED but it's not ?
Ok, so 1s is not quite enough, thanks for the update.
-
RE: backup mail report says INTERRUPTED but it's not ?
Thanks again @Pilow
I don't think the remotes being S3 changes anything here.
-
RE: There are any commands that allow me to verify the integrity of the backup files?
@cbaguzman for information, I made some changes on `vhd-cli`, so in the future we'll get a more explanatory error message when a command fails because we passed an incorrect argument: https://github.com/vatesfr/xen-orchestra/pull/9386
-
RE: backup mail report says INTERRUPTED but it's not ?
Hi @Pilow,
I've done some more testing and looked at the code, and I wasn't able to reproduce this behaviour even once. It's also unclear to me why it can happen.
We may just add the delay as you did, but 10s is probably too long. Could you try replacing it with a 1s delay instead, and tell us if it's enough?
-
RE: There are any commands that allow me to verify the integrity of the backup files?
Hi @cbaguzman,
I tested on my own and I got the same result as you, but then I realized the AI you used tricked us both into thinking that `--chain` was a valid option for the `info` command (it's not). I removed this option and the command worked properly.
Can you try the same command without this option?
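On the original question of verifying backup files: as a complement to `vhd-cli`, a very basic sanity check can be done by validating the VHD footer (the `conectix` cookie and the one's-complement checksum defined in the VHD specification). This is a hypothetical sketch, not a vhd-cli command:

```python
# Hypothetical sketch of a basic VHD integrity check (footer cookie + checksum),
# based on the published VHD specification; not a vhd-cli command.
import struct

def check_vhd_footer(footer: bytes) -> bool:
    """Verify a 512-byte VHD footer: 'conectix' cookie and one's-complement checksum."""
    if len(footer) != 512 or footer[:8] != b'conectix':
        return False
    stored = struct.unpack('>I', footer[64:68])[0]
    # The checksum is the one's complement of the byte sum, computed with the
    # 4-byte checksum field (offset 64) zeroed out.
    computed = (~sum(footer[:64] + b'\x00\x00\x00\x00' + footer[68:])) & 0xFFFFFFFF
    return stored == computed

# Build a synthetic footer to demonstrate:
body = bytearray(b'conectix' + bytes(504))
body[64:68] = struct.pack('>I', (~sum(bytes(body[:64]) + b'\0\0\0\0' + bytes(body[68:]))) & 0xFFFFFFFF)
assert check_vhd_footer(bytes(body))
assert not check_vhd_footer(bytes(512))  # no cookie -> rejected
```

In practice you would read the last 512 bytes of the .vhd file and pass them to this function; it only catches gross corruption, not damaged data blocks.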