Short VM freeze when migrating to another host
-
We're seeing this issue when trying to migrate a Debian VM with 16GB of RAM.
It is a worker node in a Kubernetes cluster so it is likely that the RAM changes a fair bit. It is not uncommon for the migration to fail due to the freeze hitting a 30 second time limit.
A Windows 10 Pro VM with 16GB of RAM migrates fine, presumably because not much of its RAM is changing.
Following along for recommendations! Our hosts sound very similar to @arc1 except our network speed is slower, which is one thing we are working on.
-
And you guys aren't using any kind of dynamic memory?
Can you post a screen dump of the Advanced tab where it shows the memory configuration?
We have VMs with 128 GB of RAM that migrate just fine. When migrating one between hosts, the network shows peaks at 7.6 Gbit/s and the migration takes about 20 seconds.
Smaller VMs with 8, 16 or even 32 GB of RAM are migrated almost instantly.
-
@nikade
Inside the VM (Linux), running free shows 94 GB of total memory.
-
@nikade thanks for the ideas of where to look!
In my case we're testing and just have a 1Gb link between these hosts, which is what I was putting it down to.
This particular VM is a freshly migrated PV from Debian Xen with:
Memory limits (min/max)
Static: 16 MiB/16 GiB
Dynamic: 8 GiB/16 GiB
Could that Dynamic setting be the problem? As I recall, it reduces the VM to 8 GiB on migrate, so perhaps 8 GiB isn't enough for the VM during the migration?
I will try changing it to 16/16 and see if that has any noticeable impact. Thanks!
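For a rough sense of how much the 1Gb link matters by itself, here is a back-of-the-envelope sketch in Python. It only computes a lower bound: the time to push the full RAM over the wire once, ignoring protocol overhead, compression, and pages that get dirtied and re-sent. The function name and the example figures are just for illustration.

```python
def min_transfer_seconds(ram_gib: float, link_gbit_per_s: float) -> float:
    """Lower bound on the time to copy a VM's RAM once over the migration link.

    Assumes the link is fully available to the migration and that every
    page is sent exactly once (no re-sends of dirtied pages).
    """
    ram_bits = ram_gib * 2**30 * 8              # GiB -> bits
    link_bits_per_s = link_gbit_per_s * 1e9     # Gbit/s -> bit/s
    return ram_bits / link_bits_per_s


if __name__ == "__main__":
    for link in (1.0, 10.0):
        secs = min_transfer_seconds(16, link)
        print(f"16 GiB over {link:g} Gbit/s: at least {secs:.0f} s")
```

With these assumptions, a single pass over 16 GiB on a 1 Gbit/s link already takes a bit over two minutes, so anything the VM dirties during that time has to be sent again.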
-
Yes that's very likely.
-
@andrewperry yeah, try setting it to 16/16 GiB instead, it will probably do some magic
-
@olivierlambert hi, today I upgraded my host.
The big VM froze for ~7 minutes. It is a big VM (96 GB RAM and 32 CPUs), but 7 minutes is a very long time (for the customer!).
I've set 96/96 in dynamic: is that a normal time?
-
Can you provide more details? 96/96 in dynamic? Just doing live migration or suspending the VM?
-
@olivierlambert live migration. The VM is very important (today, during the Christmas holiday, I received some phone calls about the 7 minutes of freeze..)
-
The topology looks insane. Also, a live migration shouldn't make the VM inaccessible for more than a few seconds, unless there are a LOT of memory page changes, at a pace that is close to the transfer speed.
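To illustrate why the page-change rate matters so much, here is a toy model of iterative pre-copy migration in Python. It is only a sketch under simplifying assumptions (constant dirty rate, constant link speed, a made-up pause target and round limit) and is not how the actual migration code decides when to stop. All names and default values are hypothetical.

```python
def precopy_rounds(mem_bytes, link_bytes_per_s, dirty_bytes_per_s,
                   max_pause_s=0.3, max_rounds=30):
    """Toy pre-copy model: returns (rounds, copy_seconds, final_pause_seconds).

    Each round copies the outstanding memory while the VM keeps running.
    Whatever was dirtied during that round must be sent in the next one.
    The loop stops once the remainder can be sent within max_pause_s
    (the final stop-and-copy with the VM paused), or after max_rounds.
    """
    to_send = float(mem_bytes)
    elapsed = 0.0
    pause = 0.0
    rounds = 0
    for rounds in range(1, max_rounds + 1):
        round_time = to_send / link_bytes_per_s
        elapsed += round_time
        # Memory dirtied during this round, capped at the VM's total RAM.
        to_send = min(float(mem_bytes), dirty_bytes_per_s * round_time)
        # Time the VM would be frozen if we stopped and sent the rest now.
        pause = to_send / link_bytes_per_s
        if pause <= max_pause_s:
            break
    return rounds, elapsed, pause


if __name__ == "__main__":
    GIB = 2**30
    LINK = 125e6  # 1 Gbit/s expressed in bytes per second
    for dirty in (10e6, 120e6):  # hypothetical dirty rates in bytes/s
        r, t, p = precopy_rounds(16 * GIB, LINK, dirty)
        print(f"dirty={dirty / 1e6:.0f} MB/s: {r} rounds, "
              f"{t:.0f} s copying, final freeze {p:.2f} s")
```

In this model, a VM dirtying memory well below the link speed converges in a few rounds with a sub-second freeze, while one dirtying memory at close to the link speed never converges within the round limit and ends with a freeze of tens of seconds.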
-
@olivierlambert oops.. so what is the best topology?
-
IIRC, just remove it (small cross) so that it uses the default.