Hi @KPS,
Thank you for reporting this behavior. We haven't been able to reproduce the bug yet, but we'll look into it with @MathieuRA. We're a bit busy at the moment, so we probably won't be able to fix this issue before the November release.
@jshiells this value is the average load across all cores on a host. To be more precise, it is a weighted average of that load over the last 30 minutes. Migrations are triggered when this average exceeds 85% of the critical threshold defined in the plugin configuration; for example, if you set the critical threshold to 75%, migrations are triggered at roughly 64%.
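To make the arithmetic explicit:

```
migration threshold = 0.85 × critical threshold
                    = 0.85 × 75%
                    ≈ 64%
```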
Other circumstances can trigger migrations:
Hi @McHenry,
If you still have the problem, you can increase the healthCheckTimeout value as Olivier recommended (e.g. healthCheckTimeout = '30m'). However, this value should go in the [backups.defaultSettings] section of the configuration file (or in the [backups.vm.defaultSettings] section) rather than in the [backups] section.
We've expanded the documentation a bit to make this clearer: https://github.com/vatesfr/xen-orchestra/blob/a945888e63dd37227262eddc8a14850295fa303f/packages/xo-server/config.toml#L78-L79
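For reference, a minimal sketch of what this could look like in your xo-server configuration override (the '30m' value is just the example above; adjust it to your needs):

```toml
# in your xo-server config override file
[backups.defaultSettings]
healthCheckTimeout = '30m'

# or, to set it only for VM backups, per the documentation linked above:
# [backups.vm.defaultSettings]
# healthCheckTimeout = '30m'
```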
As Mathieu answered in this topic, the bug has been reproduced on our side but isn't trivial to fix. We'll discuss it with the team to schedule this task, and we'll keep you informed when it's fixed.
You are right, the documentation isn't up to date: this isn't configurable at the moment.
We are currently working on the load balancer, so this may come in future versions.
It appears that the value of healthCheckTimeout from the config file is indeed not taken into account. We've created a card on our side to fix this issue, and we'll plan it with the team soon.
If you can't wait for the fix to be released, you can modify the default value of healthCheckTimeout in the code, in the files @xen-orchestra/backups/_runners/VmsRemote.mjs and @xen-orchestra/backups/_runners/VmsXapi.mjs, then restart XO server. This should fix it until the next update.
Hi @nicols,
As Dan said, we are indeed investigating this issue, and we will try to provide a fix in the coming weeks. We will keep you informed.
Regards
There is indeed a bug in the perf-alert plugin with removable storages.
This will be fixed in an upcoming XO version by removing these SRs from the perf-alert monitoring.
Some of the errors you encountered are intended. We don't allow values in the "Virtual Machines" field if "Exclude VMs" is disabled and "All running VMs" is enabled, because it would make the plugin configuration confusing.
However, you're right: there seems to be an issue when the VMs are selected and then removed. The value becomes an empty list instead of being undefined, which causes the validation to fail when we try to turn off the "Exclude VMs" option.
I'm going to create a task on our side so that we can plan to resolve this problem.
In the meantime you can work around the problem by deleting the monitor and recreating a new one with the same parameters.
Hi @KPS,
The difference between these two settings is that sessionCookieValidity determines how long a user stays logged in if they did not check the "Remember me" option, while permanentCookieValidity determines this when the option was checked.
If you want to force users to be disconnected after 12 hours regardless of how they connected, I think you need to set both sessionCookieValidity = '12 hours' and permanentCookieValidity = '12 hours'.
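For example, assuming both keys live in the [authentication] section of your xo-server config override (as in xo-server's sample config.toml; please double-check against your version), a sketch:

```toml
[authentication]
sessionCookieValidity = '12 hours'   # used when "Remember me" is unchecked
permanentCookieValidity = '12 hours' # used when "Remember me" is checked
```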
However, the memory increase you're experiencing is intriguing; it is not intended behaviour.
We have just merged a fix for this spam issue to master. Could you test these changes and confirm that the problem has been solved for you?
Hi @kagbasi-ngc,
We have just merged changes to the perf-alert plugin on the master branch, which should resolve this spam problem that appeared some time ago. This fix will be available in the next XO version (5.105).
Please let us know if you still encounter frequent alerts after upgrading to this version.