XCP-ng 7.5.0 final is here
@borzel I'll try that. Thanks.
My most recent test was to have all the VMs running on server1, put it into maintenance mode, and watch all the VMs migrate to server2. It behaves the same way with the same error regardless of direction.
@borzel If I power off the VM gracefully I can start it on either server in the pool. After the error during a live migration, I have to restart the toolstack on the target server; then I can start the VM on either machine from what appears to be a forcibly powered-off state (Start Windows normally?).
It only happens during a live migration and so far only on this VM which happens to be Windows 2008 R2 and guest tools v7.1.
@dsiminiuk Can you please create a full copy of this VM and live migrate that? Just to exclude some crude problems with the VDIs.
@borzel Running full clone copy to new storage... (not fast clone). Standby...
@borzel The VM was cloned successfully and booted up fine.
I tried to live migrate it from the pool master (currently server1) to server2 and it's been "Migrating" for 20 minutes now, with no sign of progress. Waiting for it to succeed or fail either way. I deleted the logs before starting the migration so I'll have smaller files.
@borzel After 36 minutes I restarted the toolstack on the pool master.
The GUI gave an exception.
From the log...
2018-08-12 21:22:35,248 FATAL XenAdmin.Program [Connection to xs1] - Uncaught exception System.NullReferenceException: Object reference not set to an instance of an object.
Closed it, opened it. The Copied VM was still running on the pool master. I'm assuming the migration task was killed so that makes sense.
Now that the toolstack has been restarted, I'm trying to live migrate it again.
I'm going to leave it overnight now and I'll report the outcome tomorrow (Monday) morning.
Here you are!
Maybe you'll get a warning, but it's a safe server; I still haven't installed an SSL certificate.
@borzel 12 hours 18 minutes later, the migration task still appears to be running with no progress indicated.
I think restarting the toolstack will return it to "running" status and kill the migration task. I have not done so yet, awaiting your next advice.
- the PV tools from Citrix were in a somewhat strange state
- we tried to remove them and install the XCP-ng beta testsigned drivers, but without luck
-> I will build myself a test VM and play around to get a better understanding of what this Windows Server 2008 R2 is doing with our drivers
@dsiminiuk Bad news: my freshly installed test VM (Server 2008 R2 with SP1) installed the (beta pre-alpha testsigned non-production do-not-use wear-gloves make-backups) drivers just fine:
I'm going to build a new Windows VM, anticipating the windows tools release. Thanks
I had a similar problem with my setup: 2 hosts with FreeNAS iSCSI. Some VMs didn't migrate as they should and got stuck in a migrate/shutdown state. I shut everything down and started them again. That solved a few problems. One VM didn't find its HDD, and when I tried to restore from a snapshot it didn't work 100%, although it booted. But it's just a "lab" and the VM was a Pi-hole, so I just reinstalled it. After that I installed one more for redundancy.
If you can't reinstall it, try shutting it down through the CLI.
Edit: I then installed the guest tools extracted from the Xenserver ISO.
Initiated live migration from server1 where I built it, to server2 (pool master)...
2018-08-14 18:20:29,259 ERROR XenAdmin.Actions.AsyncAction  - Internal error: Xenops_interface.Does_not_exist(_)
2018-08-14 18:20:29,259 ERROR XenAdmin.Actions.AsyncAction  -
   at XenAdmin.Network.TaskPoller.poll()
   at XenAdmin.Network.TaskPoller.PollToCompletion()
   at XenAdmin.Actions.VMActions.VMMigrateAction.Run()
   at XenAdmin.Actions.AsyncAction.RunWorkerThread(Object o)
2018-08-14 18:20:29,261 WARN Audit  - Operation failure: VMMigrateAction: pool1: VM 2a80d18b-b1d6-d2c0-6874-e664b9ebdb67 (SWEET): Pool 86708278-0d7d-2bb1-5024-b04c39690e15 (pool1): Host c3970498-bc7f-4a49-8fe2-23e1e6dcb6bb (xs2): Migrating
The new VM appears powered off attached to the pool.
Initiated a power on of the VM on server2. Status stays Yellow, console of VM remains blank.
I initiated a restart of the toolstack of server2 (pool master).
The XCP-ng UI crashes with two popups to check the log.
2018-08-14 18:55:54,296 ERROR XenAdmin.Program [Main program thread] - Exception in Invoke
System.NullReferenceException: Object reference not set to an instance of an object.
<snip>
2018-08-14 18:55:54,297 ERROR XenAdmin.Program  - Exception in Invoke (ControlType=XenAdmin.MainWindow, MethodName=<RefreshTreeView>b__63_0)
System.NullReferenceException: Object reference not set to an instance of an object.
   at System.Windows.Forms.Control.MarshaledInvoke(Control caller, Delegate method, Object args, Boolean synchronous)
<snip>
2018-08-14 18:55:54,304 FATAL XenAdmin.Program  - Uncaught exception
After closing and reopening the UI, the VM I tried to start appeared powered off still.
I initiated a startup of the new VM on server2 (pool master) and it started.
It seems to me that for whatever reason the live migration fails, it leaves the target machine (server2 in this case) in an invalid state and you cannot start a VM on it again until the toolstack is restarted.
@borzel Thank you for the hours of hands on assistance Alex.
I've decided to go back to 7.4. Even with XCP-ng Center 7.5 it behaves as expected with these same VMs.
I hope this information is useful for somebody in the future.
Maybe this is something someone could test with XS 7.5.
It looks to me as if the PV drivers from XS 7.4 or earlier (we have never shipped Windows drivers ourselves) are causing problems on the newer Xen version in XCP-ng 7.5, so maybe the same issue happens in XS 7.5 when the drivers are not upgraded. And as I recall from Citrix's upgrade instructions, not upgrading the drivers is not recommended.
...it's just a guess based on a feeling.
By the way, I just found the smiley bar:
I exported all my VMs to XVA. Wiped disks, installed anew from 7.4, updated to latest with yum, imported VMs.
Everything works as it should; live migrations, everything.
Yeah but 7.4 won't be maintained, so it would be better to try to see if it's a XS issue or not.
I had a live migration issue with Windows too.
Secondary host running XCP-ng 7.5, primary XS7.1. I live-migrated the Win 2K12 VM from the primary host to the secondary fine; it took 14 mins. I then updated XS7.1 to XCP-ng 7.5 and installed a new Dell BIOS on the primary host. The migration back hung at 99%. I left it about 45 mins (3 times as long as the original migration took) and noted it was still sending data between the two. I wonder if it was stuck in some form of loop.
The logs on the source host had this a lot:
Aug 16 13:55:05 xen52 xenopsd-xc: [debug|xen52|37 |Async.VM.migrate_send R:c6ab32165e47|xenops_server] TASK.signal 1305 = ["Pending",0.99]
Aug 16 13:55:05 xen52 xapi: [debug|xen52|389 |org.xen.xapi.xenops.classic events D:333a26fc4942|xenops] Processing event: ["Task","1305"]
Aug 16 13:55:05 xen52 xapi: [debug|xen52|389 |org.xen.xapi.xenops.classic events D:333a26fc4942|xenops] xenops event on Task 1305
Aug 16 13:55:05 xen52 xenopsd-xc: [debug|xen52|37 |Async.VM.migrate_send R:c6ab32165e47|xenops] VM = 66ebe30b-2bd2-ae59-c225-54a625655d52; domid = 2; progress = 99 / 100
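For anyone who wants to check their own logs for the same symptom, here is a minimal Python sketch that pulls the `progress = N / 100` samples out of lines like the ones above and flags a run of identical values. The regular expression is written against the excerpt shown here only; the function names and the `min_repeats` threshold are made up for illustration, and other xenopsd versions may format these lines differently.

```python
import re
from typing import Iterable, List, Tuple

# Matches xenopsd progress lines of the form shown in the excerpt, e.g.
# "Aug 16 13:55:05 xen52 xenopsd-xc: [...] VM = <uuid>; domid = 2; progress = 99 / 100"
PROGRESS_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) .*"
    r"VM = (?P<vm>[0-9a-f-]{36}); domid = \d+; "
    r"progress = (?P<done>\d+) / (?P<total>\d+)"
)


def progress_samples(lines: Iterable[str]) -> List[Tuple[str, str, int]]:
    """Return (timestamp, vm_uuid, progress) for every matching log line."""
    samples = []
    for line in lines:
        match = PROGRESS_RE.search(line)
        if match:
            samples.append(
                (match.group("stamp"), match.group("vm"), int(match.group("done")))
            )
    return samples


def looks_stalled(samples: List[Tuple[str, str, int]], min_repeats: int = 20) -> bool:
    """True if the last `min_repeats` samples all report the same progress value.

    `min_repeats` is an arbitrary illustrative threshold, not a xenopsd setting.
    """
    tail = [done for _, _, done in samples[-min_repeats:]]
    return len(tail) >= min_repeats and len(set(tail)) == 1
```

To use it, pass the open log file (whichever file on the source host contains these xenopsd lines) to `progress_samples` and the result to `looks_stalled`. A long run of `99` would match what was seen here, though it only shows that the reported progress stopped changing, not why.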
I read a bit on the Citrix forums and decided to shut down the VM. Doing so from XCP-ng Center didn't work, so I did it from within the VM.
Upon shutdown it seemed to sort itself out, but the VM came up as 'paused' on the primary node and couldn't be resumed. A force restart from XCP-ng Center just seemed to hang. I tried to cancel the hard_reboot task with xe task-cancel and it didn't work, so I ran xe-toolstack-restart, which reset the VM state back to shut down. It then booted normally...
The secondary host was still running an old BIOS with earlier microcode, so I'm not sure if it was related to that in any way.
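The recovery sequence described above (look at the task list, try `xe task-cancel`, then fall back to `xe-toolstack-restart`) can be sketched as a small Python wrapper. This is only an illustration of the order of steps: `recovery_commands` and `run_all` are hypothetical helper names, the task UUID is a placeholder to be taken from the `xe task-list` output, and the script defaults to a dry run that only prints the commands.

```python
import subprocess
from typing import List


def recovery_commands(task_uuid: str) -> List[List[str]]:
    """Build the commands in the order they were tried in the post above."""
    return [
        ["xe", "task-list"],                         # find the stuck task and its UUID
        ["xe", "task-cancel", f"uuid={task_uuid}"],  # try to cancel it first
        ["xe-toolstack-restart"],                    # fallback used when cancel did not help
    ]


def run_all(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            # check=False: a failed cancel should not stop the fallback restart
            subprocess.run(argv, check=False)


if __name__ == "__main__":
    # "<task-uuid>" is a placeholder, not a real task identifier
    run_all(recovery_commands("<task-uuid>"))
```

In practice the toolstack restart would only be run if the cancel has no effect, so the commands are better run one at a time on the affected host than blindly in sequence.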
@john205 If you have these problems with Windows, can you please test your case with our testsigned drivers? Only if it's not production:
@borzel Unfortunately this is a production client server, so I can't do that easily. The two Linux VMs moved over fine. Most of the VMs we have are Linux based too. I might be able to do it with an internal one on a different host, but I need to see if it can be moved easily, as it has multiple network interfaces.
I think the one with the issue I mentioned before had XS7.1 drivers installed.