XCP-ng 7.6 RC1 available
-
@KFC-Netearth That's really interesting. Do you have the latest tools version installed in your VM?
edit: I'll try to use 80%+ RAM on a test VM and see if it triggers the problem
-
I have this version of the guest-tools on the VM:
# rpm -aq | grep xen
xe-guest-utilities-xenstore-7.10.0-1.x86_64
-
Seems to be latest tools.
Do you have anything during the failed migration in /var/log/xensource.log? Or /var/log/SMlog?
-
iSCSI possible problem:
I installed XCP-ng 7.6-RC1 on a 2-node test pool and I get an error when I try to attach an iSCSI SR:
"The SR could not be connected because the driver gfs2 was not recognized.
Check your settings and try again."
[Screenshot: XCP-NG_7.6-iSCSI error.jpg]
I created a THICK volume:
[Screenshot: XCP-NG_7.6-iSCSI create.png]
With Citrix XEN 7.6 it works fine on the same QNAP.
Any idea what I am doing wrong, or did I hit a bug in XCP-ng 7.6?
-
@olivierlambert first test results:
Situation
- test pool at work with 3 nodes
- XCP-ng 7.5 with latest regular updates
- VM with debian 8.9, 2 vCPU, 2GB RAM, latest updates, xe-guest-utilities 7.4.0-1, x64, PVHVM
Test 1 - VM is idle
- VM migrates just fine
Test 2 - VM stressed with stress --cpu 1
- VM migration crashed with error, VM is stopped
Oct 28 16:17:52 xen5 xenopsd-xc: [debug|xen5|32 ||xenops_server] TASK.signal 192582 = ["Failed",["Internal_error","End_of_file"]]
...
Oct 28 16:17:52 xen5 xapi: [debug|xen5|78975 INET :::80|Async.VM.pool_migrate R:e90b0ed1692e|xapi] xenops: will retry migration: caught Xenops_interface.Internal_error("End_of_file") from sync_with_task in attempt 1 of 3.
...
Oct 28 16:17:52 xen5 xenopsd-xc: [ info|xen5|28 |Async.VM.pool_migrate R:e90b0ed1692e|xenops_server] Caught Xenops_interface.Does_not_exist(_) executing ["VM_migrate",["0c19e8aa-24d2-a6a4-5677-8ef4af6dd592",{},{},{},"http://xxx.xxx.xxx.xxx/services/xenops?session_id=OpaqueRef:89ad861c-f0fd-4d70-bb0d-4bf65741f13a"]]: triggering cleanup actions
(full log: https://schulzalex.de/nextcloud/index.php/s/pT456wEZkBoJrTw)
I had to execute xe-toolstack-restart on the whole pool to get migration working again.
Test 3 - VM is idle
- migrates just fine, like in Test 1
Test 4 - VM is executing apt upgrade
- migration gets stuck at
... VM = 0c19e8aa-24d2-a6a4-5677-8ef4af6dd592; domid = 26; progress = 99 / 100...
until the upgrade inside the VM is finished and the VM is idle
Partial output of tail -f /var/log/xensource.log | grep -i migrate | grep -i progress
...
Oct 28 16:57:56 xen5 xenopsd-xc: [debug|xen5|14 |Async.VM.pool_migrate R:ccb8ffcf5ce5|xenops] VM = 0c19e8aa-24d2-a6a4-5677-8ef4af6dd592; domid = 26; progress = 99 / 100
...
Oct 28 16:57:57 xen5 xenopsd-xc: [debug|xen5|14 |Async.VM.pool_migrate R:ccb8ffcf5ce5|xenops] VM = 0c19e8aa-24d2-a6a4-5677-8ef4af6dd592; domid = 26; progress = 99 / 100
...
Oct 28 16:57:58 xen5 xenopsd-xc: [debug|xen5|14 |Async.VM.pool_migrate R:ccb8ffcf5ce5|xenops] VM = 0c19e8aa-24d2-a6a4-5677-8ef4af6dd592; domid = 26; progress = 99 / 100
...
Oct 28 16:57:59 xen5 xenopsd-xc: [debug|xen5|14 |Async.VM.pool_migrate R:ccb8ffcf5ce5|xenops] VM = 0c19e8aa-24d2-a6a4-5677-8ef4af6dd592; domid = 26; progress = 99 / 100
...
Oct 28 16:58:00 xen5 xenopsd-xc: [debug|xen5|14 |Async.VM.pool_migrate R:ccb8ffcf5ce5|xenops] VM = 0c19e8aa-24d2-a6a4-5677-8ef4af6dd592; domid = 26; progress = 99 / 100
...
- in the end the VM is migrated just fine
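Since the progress line repeats many times per second, it can help to print it only when the value changes. A minimal sketch, assuming the log lines contain "progress = N / 100" exactly as in the output above; the function name progress_changes is made up for illustration:

```shell
#!/usr/bin/env bash
# Print a migration progress value only when it changes, so a stall at
# 99 / 100 shows up as a single line instead of hundreds.
# Reads the files given as arguments, or stdin when none are given.
progress_changes() {
  awk '
    match($0, /progress = [0-9]+ \/ [0-9]+/) {
      p = substr($0, RSTART, RLENGTH)
      if (p != last) {
        # Fields 1-3 are the syslog month, day and time.
        print $1, $2, $3, p
        last = p
      }
    }
  ' "$@"
}
```

On a host it could be run against a saved copy of the log, for example progress_changes /var/log/xensource.log.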
-
Test 5 - VM idle (with latest guest tools xe-guest-utilities 7.10.0-1)
- migrated just fine
Test 6 - VM stressed with stress --cpu 1 (with latest guest tools xe-guest-utilities 7.10.0-1)
- same error as in Test 2
- logfile from receiving side: https://schulzalex.de/nextcloud/index.php/s/XCPPzN5qcda59wL
-
Just spotted a difference in my logfiles
I migrated my test VM in idle and in stressed state and recorded the logfiles on both the sender and the receiver host.
On the sender side there was a difference, see here:
https://schulzalex.de/nextcloud/index.php/s/aeoEFAD7TNWjfqk
Here is the part that is different:
Oct 28 18:08:04 xen5 xenopsd-xc: [error|xen5|26 |Async.VM.pool_migrate R:8d24f91b7cb1|emu-manager] Memory F 24499440 KiB S 0 KiB T 32766 MiB
Oct 28 18:08:04 xen5 xenopsd-xc: [ info|xen5|26 |Async.VM.pool_migrate R:8d24f91b7cb1|xenops_server] Caught End_of_file executing ["VM_migrate",["0c19e8aa-24d2-a6a4-5677-8ef4af6dd592",{},{},{},"http://xxx.xxx.xxx.xxx/services/xenops?session_id=OpaqueRef:efd310cd-3d07-4d0b-92f5-531fe4a26614"]]: triggering cleanup actions
Oct 28 18:08:04 xen5 xenopsd-xc: [debug|xen5|26 |Async.VM.pool_migrate R:8d24f91b7cb1|xenops_server] Task 157 reference Async.VM.pool_migrate R:8d24f91b7cb1: ["VM_check_state","0c19e8aa-24d2-a6a4-5677-8ef4af6dd592"]
Oct 28 18:08:04 xen5 xenopsd-xc: [ warn|xen5|26 |Async.VM.pool_migrate R:8d24f91b7cb1|xenops_server] VM 0c19e8aa-24d2-a6a4-5677-8ef4af6dd592 has unexpectedly suspended
Oct 28 18:08:04 xen5 xenopsd-xc: [debug|xen5|26 |Async.VM.pool_migrate R:8d24f91b7cb1|xenops_server] VM.shutdown 0c19e8aa-24d2-a6a4-5677-8ef4af6dd592
Oct 28 18:08:04 xen5 xenopsd-xc: [debug|xen5|26 |Async.VM.pool_migrate R:8d24f91b7cb1|xenops_server] Performing: ["VM_destroy_device_model","0c19e8aa-24d2-a6a4-5677-8ef4af6dd592"]
Oct 28 18:08:04 xen5 xenopsd-xc: [debug|xen5|26 |Async.VM.pool_migrate R:8d24f91b7cb1|xenops_server] VM.destroy_device_model 0c19e8aa-24d2-a6a4-5677-8ef4af6dd592
Oct 28 18:08:04 xen5 xenopsd-xc: [debug|xen5|26 |Async.VM.pool_migrate R:8d24f91b7cb1|xenops] qemu-dm: stopping qemu-dm with SIGTERM (domid = 31)
Oct 28 18:08:04 xen5 xenopsd-xc: [ info|xen5|26 |Async.VM.pool_migrate R:8d24f91b7cb1|xenops] removing core files from /var/xen/qemu: ignoring exception (Unix.Unix_error "No such file or directory" rmdir /var/xen/qemu/21226)
Oct 28 18:08:04 xen5 xenopsd-xc: [debug|xen5|26 |Async.VM.pool_migrate R:8d24f91b7cb1|xenops_server] TASK.signal 157 (object deleted)
Oct 28 18:08:04 xen5 xenopsd-xc: [debug|xen5|26 |Async.VM.pool_migrate R:8d24f91b7cb1|xenops_server] Performing: ["Parallel","0c19e8aa-24d2-a6a4-5677-8ef4af6dd592","VBD.unplug vm=0c19e8aa-24d2-a6a4-5677-8ef4af6dd592",[["VBD_unplug",["0c19e8aa-24d2-a6a4-5677-8ef4af6dd592","xvda"],true],["VBD_unplug",["0c19e8aa-24d2-a6a4-5677-8ef4af6dd592","xvdd"],true]]]
Oct 28 18:08:04 xen5 xenopsd-xc: [debug|xen5|26 |Async.VM.pool_migrate R:8d24f91b7cb1|xenops_server] begin_Parallel:task=157.atoms=2.(VBD.unplug vm=0c19e8aa-24d2-a6a4-5677-8ef4af6dd592)
Oct 28 18:08:04 xen5 xenopsd-xc: [debug|xen5|26 |Async.VM.pool_migrate R:8d24f91b7cb1|xenops_server] queue_atomics_and_wait: Parallel:task=157.atoms=2.(VBD.unplug vm=0c19e8aa-24d2-a6a4-5677-8ef4af6dd592): chunk of 2 atoms
There seems to be an issue in emu-manager if the VM is under load.
I want to test this again with XCP-ng 7.6, but I don't have any other hardware lying around and I can't update the test pool at work for various non-technical reasons.
Edit:
Just built a test pool (XCP-ng 7.6) with the same VM to test migration:
- VM idle: no problems
- VM running stress --cpu 1: same behaviour
Oct 28 21:40:30 xen4 xenopsd-xc: [error|xen4|33 |Async.VM.pool_migrate R:27b04736b62f|emu-manager] Memory F 32865556 KiB S 0 KiB T 40958 MiB
VM is killed and off.
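To spot lines like the emu-manager error without reading the whole log, the warn and error entries of migration tasks can be filtered out. A minimal sketch, assuming the "[level|host|id |task|module]" layout seen in the excerpts above; the function name migrate_problems is made up for illustration:

```shell
#!/usr/bin/env bash
# Show only warn/error lines that belong to Async.VM.pool_migrate tasks.
# Reads the files given as arguments, or stdin when none are given.
migrate_problems() {
  grep -F 'Async.VM.pool_migrate' "$@" | grep -E '\[ *(warn|error)\|'
}
```

Run on the sender host, for example as migrate_problems /var/log/xensource.log, this should leave only lines such as the emu-manager error and the "has unexpectedly suspended" warning.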
-
Pinging @johnelse regarding your last post, I think it's very interesting
-
@borzel said in XCP-ng 7.6 RC1 available:
stress --cpu 1
I can indeed reproduce the bug with this command. Thanks @borzel, now we have a way to reproduce it, which will make it easier to test fixes.
-
7.6 has been released: https://xcp-ng.org/blog/2018/10/31/xcp-ng-7-6/
The migration issue that is being discussed here will be addressed by an update as soon as we have a fix.
-
Congratulations #TeamXCPNG
-
@olivierlambert how in the bloody hell could it have happened that you released XCP-ng 7.6 with this bug? Also, why not say a word about it here: https://xcp-ng.org/blog/2018/10/31/xcp-ng-7-6/?
-
That's because the bug is likely already in 7.5, and we plan to patch it ASAP (next week?)
-
just FYI, it is in 7.5 as well. Looking forward to the patch.
-
Yeah, that's why we didn't stop the release anyway. We got some traces and we are continuing to investigate right now. I hope we can have something this week.
-
I just upgraded from version 7.5 to 7.6 using yum upgrade, all seemed to go without error.
When I started to migrate our VMs from one of the 7.5 hosts to the new 7.6 host, the VMs were hung and unresponsive on the new host when the migration completed.
A forced stop and a fresh start of the VM brought the VMs back up.
After finding this post and seeing the notes about "xe-guest-utilities", I noticed that when I install the guest-tools the version remains the same; in fact, it remains at 7.4 when I look at it in XCP-ng Center:
Optimized (version 7.4 installed)
This is what I see when I install guest-tools after migrating the VM to the 7.6 host.
root@# /mnt/Linux/install.sh
Detected `Debian GNU/Linux 9.5 (stretch)' (debian version 9).
The following changes will be made to this Virtual Machine:
- packages to be installed/upgraded:
- xe-guest-utilities_7.10.0-1_amd64.deb
Continue? [y/n] y
(Reading database ... 42324 files and directories currently installed.)
Preparing to unpack .../xe-guest-utilities_7.10.0-1_amd64.deb ...
Unpacking xe-guest-utilities (7.10.0-1) over (7.10.0-1) ...
Setting up xe-guest-utilities (7.10.0-1) ...
Processing triggers for systemd (232-25+deb9u4) ...
You should now reboot this Virtual Machine.
root@#
What is the correct version of guest-tools?
Are guest-tools not being updated because I have used yum to upgrade for the last 2 versions?
I should note that 95% of these are Linux VMs.
Thanks, Will
-
@WBA said in XCP-ng 7.6 RC1 available:
This is what I see when I install guest-tools after migrating the VM to the 7.6 host.
root@# /mnt/Linux/install.sh
Detected `Debian GNU/Linux 9.5 (stretch)' (debian version 9).
The following changes will be made to this Virtual Machine:
- packages to be installed/upgraded:
- xe-guest-utilities_7.10.0-1_amd64.deb
This is the correct version of the tools in XCP-ng 7.6 and is the same as in XCP-ng 7.5. It was 7.9.0 in 7.4.
Using yum to upgrade does give you the right version of the guest tools ISO.
So we need to find out why XCP-ng Center reports version 7.4.
Does anyone know how to check with xe on the CLI whether the wrong version number is reported by xapi directly or comes from a bug in XCP-ng Center?
-
This is what I see, not sure how that correlates to the correct tools build for 7.6
xe vm-param-get param-name=PV-drivers-version uuid=3a8a717f-c998-f05f-d096-f486a4990d1
major: 7; minor: 4; micro: 50; build: 1
Thx, Will
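To compare this across several VMs, the xapi output can be collapsed into one version string. A minimal sketch, assuming the output always has the "major: N; minor: N; micro: N; build: N" shape shown above; the function name pv_version and the "major.minor.micro-build" layout are choices made here for illustration, not something xe provides:

```shell
#!/usr/bin/env bash
# Turn "major: 7; minor: 4; micro: 50; build: 1" into "7.4.50-1".
# Reads the PV-drivers-version output on stdin; lines that do not match
# the expected shape are passed through unchanged.
pv_version() {
  sed -E 's/^ *major: *([0-9]+); *minor: *([0-9]+); *micro: *([0-9]+); *build: *([0-9]+) *$/\1.\2.\3-\4/'
}
```

It would be used by piping the xe vm-param-get param-name=PV-drivers-version command shown above into pv_version.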
-
I think the tools were built by XS before 7.5 was released and still carry the 7.4 version number, but they are the latest.
-
@olivierlambert honestly, my problem is that I was using 7.4 and I upgraded to 7.6, which made a perfectly stable pool pretty unstable.
It would have been really helpful to read a warning beforehand about this nasty bug, which still exists in 7.6, since you were aware of it at that point.