XCP-ng 7.6 RC1 available

conradical

@rizaemet-0 said in XCP-ng 7.6 RC1 available:

migrate issue with Linux or Windows, PV driver installed or not, vm or storage

I can reproduce this on my end almost every time. The migration hangs more often than not.

olivierlambert

Anything in logs that could help us to track down the issue?

conradical

I haven't seen much, but I'm curious about my earlier post on xcp-emu-manager. I see emu-manager referenced in the logs and wonder whether xcp-emu-manager should be referenced instead.

rizaemet 0

I did a test. I tried a storage live migration on a VM that has one snapshot, and it hit this migration problem: stuck at 99%.
Then I deleted the VM's snapshot, waited for the coalesce process to finish, and tried the storage live migration again. It succeeded.
I ran both tests on the same VM, so for now the snapshot looks like the source of the problem. I will do more tests.

olivierlambert

On 7.6, right?

rizaemet 0

@olivierlambert Sorry, this may not be the right topic for this issue, but my XCP-ng is 7.5.

olivierlambert

Okay so it's not the right topic indeed, but I really want to know if you still have these issues on 7.6 RC

rizaemet 0

@olivierlambert I will try it and post the result.

olivierlambert

Thanks a lot! It's important to test between 2x 7.6 hosts, to be sure it's still an issue and not a problem from another version in the middle

conradical

@olivierlambert said in XCP-ng 7.6 RC1 available:

It's important to test between 2x 7.6 hosts, to be sure it's still an issue and not a problem from another version in the middle

Mine is between multiple 7.6 hosts. PV tools, no PV tools, Linux, Windows: they all do it.

rizaemet 0

@conradical Does the VM you are trying to migrate have any snapshots? Is this a storage live migration or a VM live migration?

conradical

@rizaemet-0
No snapshots. It happens on both storage migrate and live migrate.

olivierlambert

Any activity in the VM, or is it idle?

Are they in the same pool?

conradical

@olivierlambert
Same Pool. It does not seem to matter if it is idle or not.

conradical

Interesting update: I installed the latest PV tools by enabling Windows Update, and this fixed 4 out of my 5 test VMs running Windows. More testing is underway with Linux PV and HVM.

olivierlambert

That's maybe the reason why I can't reproduce. It's like old tools will lead to migration problems but not more recent ones

conradical

@olivierlambert said in XCP-ng 7.6 RC1 available:

That's maybe the reason why I can't reproduce. It's like old tools will lead to migration problems but not more recent ones

HVM also have migration problems.

olivierlambert

I'm testing with PVHVM guests, and I can't reproduce it since I have the latest tools installed.

KFC-Netearth

I don't know if this helps, but I have also been doing some tests on 7.5, and it looks like VMs that are using a large percentage of their memory have problems migrating.
E.g. a CentOS 7 HVM guest with Atlassian Bitbucket Java processes using about 80% of memory.
From top:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 1874 atlbitb+  20   0 3695600   1.3g   6440 S   0.3 42.8   0:52.40 java
 1891 atlbitb+  20   0 4154320   1.1g   8124 S   1.3 36.5   6:40.68 java
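
For anyone who wants to check the same thing on their own guests, here is a rough Python sketch that adds up the %MEM column for one command name. It assumes the default 12-column top layout shown above; the function name total_mem_percent is made up for this example.

```python
def total_mem_percent(top_output: str, command: str) -> float:
    """Sum the %MEM column for every process row whose COMMAND matches."""
    total = 0.0
    for line in top_output.splitlines():
        fields = line.split()
        # Assumption: default top layout with 12 columns per process row.
        # The header row and blank lines are skipped because they do not
        # start with a numeric PID.
        if len(fields) < 12 or not fields[0].isdigit():
            continue
        if fields[11] == command:
            total += float(fields[9])  # %MEM is the 10th column
    return total


# The rows quoted in the post above.
sample = """\
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 1874 atlbitb+  20   0 3695600   1.3g   6440 S   0.3 42.8   0:52.40 java
 1891 atlbitb+  20   0 4154320   1.1g   8124 S   1.3 36.5   6:40.68 java
"""

print(round(total_mem_percent(sample, "java"), 1))  # 42.8 + 36.5
```

For the two java rows this gives 79.3, which matches the "about 80%" figure.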

Live migration stops at 99% and stays there until you cancel the migration, and I have also seen it go to 100% and the VM crash.
You then need to restart the toolstack on the receiving server, or /var/log/xcp-rrdd-plugins.log fills up with these messages:

 xcp-rrdd-squeezed: [ warn|xen1-3|0 ||rrdd-plugins] Couldn't find cached dynamic-max value for domain 66, using 0
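
To see how badly the log is filling up and for which domains, something like the following Python sketch could help. The regular expression is built only from the single warning line quoted above, so treat it as an assumption that other builds word the message the same way; count_warnings is a made-up helper name.

```python
import re
from collections import Counter

# Assumption: the warning always has the wording quoted in this post.
WARNING = re.compile(r"Couldn't find cached dynamic-max value for domain (\d+)")


def count_warnings(lines):
    """Return a Counter mapping domain id -> number of matching warnings."""
    counts = Counter()
    for line in lines:
        match = WARNING.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Two copies of the quoted warning plus a placeholder line that should be ignored.
sample = [
    " xcp-rrdd-squeezed: [ warn|xen1-3|0 ||rrdd-plugins] Couldn't find cached dynamic-max value for domain 66, using 0",
    " xcp-rrdd-squeezed: [ warn|xen1-3|0 ||rrdd-plugins] Couldn't find cached dynamic-max value for domain 66, using 0",
    "placeholder for any unrelated log line",
]

for domain, hits in count_warnings(sample).items():
    print(f"domain {domain}: {hits} warnings")

# On a host, the same function can read the file directly:
#   with open("/var/log/xcp-rrdd-plugins.log") as log:
#       print(count_warnings(log))
```

A count that keeps growing for a domain that no longer exists on the receiving host would fit the behaviour described here.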

Shut down the 2 java processes and it migrates OK.

I have also seen the same problem when migrating virtual disks between iSCSI SRs.

borzel

@KFC-Netearth I edited your post so the top output is easier to read.