Xen Orchestra Load Balancer - turning on hosts
-
Yes, but it was written before we realized it's not greatly supported when using a shared storage. We'll reword the doc
-
@olivierlambert said in Xen Orchestra Load Balancer - turning on hosts:
Yes, but it was written before we realized it's not greatly supported when using a shared storage. We'll reword the doc
So what is current logic of "density profile" of load balancer?
IMHO it is a great feature, like k8s for VMs. With energy crisis having an option to shut down hosts completely and then spin them back up when necessary, auto-magically? Sounds like an awesome feature that in a big enterprise would save so much on electricity that Xen Orchestra subscription basically paid for itself.
-
Yes, that was exactly the initial plan
However, the storage stack (SMAPIv1) isn't made to work correctly for shared SR when some hosts are down in the pool. Because it's not possible to know if the host is truly shutdown or just offline for the management but maybe still connected on the SR side, so to avoid data corruption, it will block some SR operations on it as long as a host is down.
If you don't use a shared SR, it's fine though.
-
@olivierlambert thanks!
I, unfortunately, use shared NFS storage to store all VMs across the pool, as it is simpler and makes migration between hosts way faster.
So to work correctly, that would require dividing current pool into chunks for each host? And if I do so, under which condition will LB spin-up host back online?
Also, any plans to switch from SMAPIv1 to something else, so that use-case is doable?
-
Yes, we want to get rid of SMAPIv1 to go to SMAPIv3, however, it's not an easy task, but we started a while ago, and are making some progress. Clearly, we'll try to get a shared NFS working while some hosts are down, but this needs some thinking.
I agree having a shared SR is great (that's what we do in production), but sadly, for now, there's no simple option to enjoy this + having hosts shutdown.
-
Well, the hosts are shutting down with no issues.
What is happening is that usually LB will migrate all the VMs to the pool master and then shut down all other hosts. My issue is with LB sending (or in my case not sending) WoL packets when I try to start/deploy new VM that doesn't fit on pool master anymore.
-
Oh yes, your hosts will shutdown OK
But IIRC, when you'll try to make a SR operation (like a VM snapshot) while one host is down, you'll have an error. Feel free to test it so we can be 100% sure
-
@olivierlambert I see what you did there
- My mistake - currently I use SMB shared storage, not NFS.
- Booted up the pool, waited for LB to shut down everything except master. Then decided to snapshot the XOA itself using XCP-ng center.
No issues whatsoever. Just disks, with memory, everything is fine. The XOA is on SMB SR.
- Snapshoted the XOA using... XOA.
Also no issues.
-
I would have asked to live migrate on the left host, but IDK how many hosts you have in your pool. Also, please don't provide XCP-ng Center screenshot, I'm allergic to this tool, that I will hope will die at some point (when XO Lite will replace it)
-
Live migrate VM from one host to another? That was working yesterday, as long as I exclude the host from LB. I can do that if you come up with a test-case
You may be allergic to XCP-ng, but sometimes it is the single reason to have windows installed There are some things that are easier to find (like WoL settings, I found it by mistake in XCP-ng, no idea where it is in Orchestra), but most important thing is that Center is outside the pool, so in case of screw-up I can control the pool and start Orchestra
-
As it will be possible with XO Lite
Anyway, back to the topic, I wonder if the behavior is different between NFS and SMB shared SR I do remember seeing those limitation with "half powered on" pool, but I don't remember which operations exactly.
-
@olivierlambert said in Xen Orchestra Load Balancer - turning on hosts:
Anyway, back to the topic, I wonder if the behavior is different between NFS and SMB shared SR I do remember seeing those limitation with "half powered on" pool, but I don't remember which operations exactly.
State right now is that SR is "Broken" because it is Unplugged from hosts... because, you know... the hosts are down.
Snapshotting works, migration works as well (when hosts in pool are down). Also, IIRC there are no issues when hosts in a pool are down and VMs are hosted on the SR storage. That is something that happened quite often in many previous projects.
So my issue with LB plugin is deterministic way of knowing under which conditions will it spin hosts back up.
@olivierlambert said in Xen Orchestra Load Balancer - turning on hosts:
As it will be possible with XO Lite
That is awesome, no more Windows for me
-
This post is deleted! -
By looking into the source of the plugin, it looks like all the currently implemented behavior is observed.
It calculates/simulates the best plan, moves VMs and tries to shut down the host. Nothing about waking the hosts up.
And that makes sense, as the feature of spinning hosts back up should be implemented elsewhere (in context of density plan) - when starting VM, instead of providing the error message "No hosts that are able to run this VM are online", there should/could/may be:
- Evaluation of which host is the smallest one which would be able to host the desired VM (partially implemented in density-plan) and WoL/iDrac/iLO packet send to it(implemented in performance plan)
- Some timeout/waiting for boot pop-up/dialog/having this thing in queue
- After set time (let's say 5 min), call the currently implemented logic of starting VM in a pool.
P.S. Would be nice to be able to set, how often the plan is evaluated (currently 1 minute), so until "spinning hosts back up" functionality is implemented, at least host started manually won't be shut down by plan immediately.
-
Adding @ronan-a in the loop so we don't forget this
-
@olivierlambert Thank you!
-
@berish-lohith Just FYI I created a card in our backlog, I don't see too many blocking points to implement it correctly.