XCP-ng 7.6 RC1 available
-
Test 5 - VM idle (with latest guest tools xe-guest-utilities 7.10.0-1) - migrated just fine
Test 6 - VM stressed with stress --cpu 1 (with latest guest tools xe-guest-utilities 7.10.0-1) - same error as in Test 2
- logfile from receiving side: https://schulzalex.de/nextcloud/index.php/s/XCPPzN5qcda59wL
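For anyone who wants to reproduce this, the steps boil down to something like the following sketch (the UUID and host name are placeholders; stress needs to be installed in the guest first, e.g. apt-get install stress on Debian):
# Inside the guest: load one CPU core.
stress --cpu 1
# From dom0 on the sending host: live-migrate the loaded VM to another pool member.
xe vm-migrate vm=<vm-uuid> host=<target-host> live=true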
-
Just spotted a difference in my logfiles
I migrated my test-VM in idle and in stressed state and recorded the logfiles on both the sender and the receiver host. On the sender side there was a difference, see here:
https://schulzalex.de/nextcloud/index.php/s/aeoEFAD7TNWjfqk
Here is the part that is different:
Oct 28 18:08:04 xen5 xenopsd-xc: [error|xen5|26 |Async.VM.pool_migrate R:8d24f91b7cb1|emu-manager] Memory F 24499440 KiB S 0 KiB T 32766 MiB
Oct 28 18:08:04 xen5 xenopsd-xc: [ info|xen5|26 |Async.VM.pool_migrate R:8d24f91b7cb1|xenops_server] Caught End_of_file executing ["VM_migrate",["0c19e8aa-24d2-a6a4-5677-8ef4af6dd592",{},{},{},"http://xxx.xxx.xxx.xxx/services/xenops?session_id=OpaqueRef:efd310cd-3d07-4d0b-92f5-531fe4a26614"]]: triggering cleanup actions
Oct 28 18:08:04 xen5 xenopsd-xc: [debug|xen5|26 |Async.VM.pool_migrate R:8d24f91b7cb1|xenops_server] Task 157 reference Async.VM.pool_migrate R:8d24f91b7cb1: ["VM_check_state","0c19e8aa-24d2-a6a4-5677-8ef4af6dd592"]
Oct 28 18:08:04 xen5 xenopsd-xc: [ warn|xen5|26 |Async.VM.pool_migrate R:8d24f91b7cb1|xenops_server] VM 0c19e8aa-24d2-a6a4-5677-8ef4af6dd592 has unexpectedly suspended
Oct 28 18:08:04 xen5 xenopsd-xc: [debug|xen5|26 |Async.VM.pool_migrate R:8d24f91b7cb1|xenops_server] VM.shutdown 0c19e8aa-24d2-a6a4-5677-8ef4af6dd592
Oct 28 18:08:04 xen5 xenopsd-xc: [debug|xen5|26 |Async.VM.pool_migrate R:8d24f91b7cb1|xenops_server] Performing: ["VM_destroy_device_model","0c19e8aa-24d2-a6a4-5677-8ef4af6dd592"]
Oct 28 18:08:04 xen5 xenopsd-xc: [debug|xen5|26 |Async.VM.pool_migrate R:8d24f91b7cb1|xenops_server] VM.destroy_device_model 0c19e8aa-24d2-a6a4-5677-8ef4af6dd592
Oct 28 18:08:04 xen5 xenopsd-xc: [debug|xen5|26 |Async.VM.pool_migrate R:8d24f91b7cb1|xenops] qemu-dm: stopping qemu-dm with SIGTERM (domid = 31)
Oct 28 18:08:04 xen5 xenopsd-xc: [ info|xen5|26 |Async.VM.pool_migrate R:8d24f91b7cb1|xenops] removing core files from /var/xen/qemu: ignoring exception (Unix.Unix_error "No such file or directory" rmdir /var/xen/qemu/21226)
Oct 28 18:08:04 xen5 xenopsd-xc: [debug|xen5|26 |Async.VM.pool_migrate R:8d24f91b7cb1|xenops_server] TASK.signal 157 (object deleted)
Oct 28 18:08:04 xen5 xenopsd-xc: [debug|xen5|26 |Async.VM.pool_migrate R:8d24f91b7cb1|xenops_server] Performing: ["Parallel","0c19e8aa-24d2-a6a4-5677-8ef4af6dd592","VBD.unplug vm=0c19e8aa-24d2-a6a4-5677-8ef4af6dd592",[["VBD_unplug",["0c19e8aa-24d2-a6a4-5677-8ef4af6dd592","xvda"],true],["VBD_unplug",["0c19e8aa-24d2-a6a4-5677-8ef4af6dd592","xvdd"],true]]]
Oct 28 18:08:04 xen5 xenopsd-xc: [debug|xen5|26 |Async.VM.pool_migrate R:8d24f91b7cb1|xenops_server] begin_Parallel:task=157.atoms=2.(VBD.unplug vm=0c19e8aa-24d2-a6a4-5677-8ef4af6dd592)
Oct 28 18:08:04 xen5 xenopsd-xc: [debug|xen5|26 |Async.VM.pool_migrate R:8d24f91b7cb1|xenops_server] queue_atomics_and_wait: Parallel:task=157.atoms=2.(VBD.unplug vm=0c19e8aa-24d2-a6a4-5677-8ef4af6dd592): chunk of 2 atoms
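(Something like this should pull the relevant lines out of the sender's log, assuming xenopsd logs to /var/log/xensource.log as on a stock install:)
# Run in dom0 on the sending host; the pattern is just a convenient filter.
grep -E 'emu-manager|unexpectedly suspended' /var/log/xensource.log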
There seems to be an issue in emu-manager if the VM is under load. I want to test this again with XCP-ng 7.6, but I don't have any other hardware lying around and I can't update the test pool at work for various non-technical reasons.
Edit:
Just built a test pool (XCP-ng 7.6) with the same VM to test migration:
- VM idle: no problems
- VM stressed with stress --cpu 1: same behaviour
Oct 28 21:40:30 xen4 xenopsd-xc: [error|xen4|33 |Async.VM.pool_migrate R:27b04736b62f|emu-manager] Memory F 32865556 KiB S 0 KiB T 40958 MiB
VM is killed and off.
-
Pinging @johnelse regarding your last post, I think it's very interesting
-
@borzel said in XCP-ng 7.6 RC1 available:
stress --cpu 1
I can reproduce the bug with this command indeed. Thanks @borzel now we have something to reproduce, hence it will be easier to test some fixes.
-
7.6 has been released: https://xcp-ng.org/blog/2018/10/31/xcp-ng-7-6/
The migration issue that is being discussed here will be addressed by an update as soon as we have a fix.
-
Congratulations #TeamXCPNG
-
@olivierlambert how in the bloody hell could it have happened that you released XCP-ng 7.6 with this bug? Also, why not say a word about it here: https://xcp-ng.org/blog/2018/10/31/xcp-ng-7-6/?
-
That's because the bug is likely already in 7.5, and we plan to patch it ASAP (next week?)
-
just FYI, it is in 7.5 as well. Looking forward to the patch.
-
Yeah, that's why we didn't stop the release anyway. We got some traces and we're continuing to investigate right now. I hope we can have something this week.
-
I just upgraded from version 7.5 to 7.6 using yum upgrade; all seemed to go without error.
When I started to migrate our VMs from one of the 7.5 hosts to the new 7.6 host, the VMs would be hung and unresponsive on the new host when the migration completed.
A forced stop and a fresh start brought the VMs back up.
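(From the CLI, that recovery would be something like this sketch; <vm-uuid> is a placeholder:)
xe vm-shutdown uuid=<vm-uuid> force=true   # force-stop the hung VM
xe vm-start uuid=<vm-uuid>                 # fresh start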
After finding this post and seeing the notes about "xe-guest-utilities", I noticed that when I install the guest tools the version remains the same; in fact the version remains at 7.4 when looking at it in XCP-ng Center:
Optimized (version 7.4 installed)
This is what I see when I install guest-tools after migrating the VM to the 7.6 host:
root@# /mnt/Linux/install.sh
Detected `Debian GNU/Linux 9.5 (stretch)' (debian version 9).
The following changes will be made to this Virtual Machine:
- packages to be installed/upgraded:
- xe-guest-utilities_7.10.0-1_amd64.deb
Continue? [y/n] y
(Reading database ... 42324 files and directories currently installed.)
Preparing to unpack .../xe-guest-utilities_7.10.0-1_amd64.deb ...
Unpacking xe-guest-utilities (7.10.0-1) over (7.10.0-1) ...
Setting up xe-guest-utilities (7.10.0-1) ...
Processing triggers for systemd (232-25+deb9u4) ...
You should now reboot this Virtual Machine.
root@#
What is the correct version of guest-tools?
Are the guest-tools not being updated because I have used yum to upgrade for the last 2 versions?
I should note these are 95% Linux VMs.
Thanks, Will
-
@WBA said in XCP-ng 7.6 RC1 available:
This is what I see when I install guest-tools after migrating the VM to the 7.6 host.
root@# /mnt/Linux/install.sh
Detected `Debian GNU/Linux 9.5 (stretch)' (debian version 9).
The following changes will be made to this Virtual Machine:
- packages to be installed/upgraded:
- xe-guest-utilities_7.10.0-1_amd64.deb
This is the correct version of the tools in XCP-ng 7.6 and is the same as in XCP-ng 7.5. It was 7.9.0 in 7.4.
Using yum to upgrade does give you the right version of the guest tools ISO.
So we need to find out why XCP-ng Center reports version 7.4.
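To rule out the guest side, a quick check from inside a Debian guest (a sketch; dpkg only reports what the package manager actually installed):
dpkg -s xe-guest-utilities | grep '^Version'   # should show 7.10.0-1 after the install above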
Does anyone know how to check using xe in the CLI whether the wrong version number is reported by xapi directly, or is coming from a bug in XCP-ng Center?
-
This is what I see; not sure how that correlates to the correct tools build for 7.6:
xe vm-param-get param-name=PV-drivers-version uuid=3a8a717f-c998-f05f-d096-f486a4990d1
major: 7; minor: 4; micro: 50; build: 1
Thx, Will
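If I'm not mistaken, xapi just reports what the guest tools write into XenStore, so reading the raw keys from inside the guest should show where that number comes from. A sketch, assuming the usual attr/PVAddons key names (from memory, so treat them as an assumption):
# Inside the guest; xenstore-read ships with xe-guest-utilities.
# The attr/PVAddons key names are an assumption, not verified here.
xenstore-read attr/PVAddons/MajorVersion
xenstore-read attr/PVAddons/MinorVersion
xenstore-read attr/PVAddons/MicroVersion
xenstore-read attr/PVAddons/BuildVersion
If those match the "major: 7; minor: 4" output above, the value comes from the tools themselves rather than from a bug in XCP-ng Center.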
-
I think the tools were built by XS before 7.5 was released and still carry the 7.4 version number, but they are the latest.
-
@olivierlambert honestly, my problem is that I was using 7.4 and I upgraded to 7.6, which made a perfectly stable pool pretty unstable.
It would have been really helpful to read a warning beforehand about this nasty bug, which still exists in 7.6 and which you were aware of at that point.
-
@wayne
https://github.com/xcp-ng/xcp/wiki/Upgrade-Howto#from-command-line
'know you're doing it at your own risk and be prepared to fix any issues that would arise, especially unforeseen upgrade issues'
//Adde
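For reference, the command-line upgrade from that page boils down to pointing the yum repos at the new version and updating; a rough sketch (the repo file name and the sed pattern are assumptions here, check the wiki for the exact steps):
sed -i 's/7\.5/7.6/g' /etc/yum.repos.d/xcp-ng.repo   # point the repos at 7.6
yum clean metadata
yum update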
-
@Adde Wayne's issues are not related to the update via yum; they're related to a bug in our replacement of the proprietary emu-manager (for which we'll release update candidates today).
-
@stormi my problem isn't the bug, there will always be bugs.
My problem is that there is still no proper communication about it. I think it is a serious issue, and you knew about it before the 7.6 release, as it was apparently found during the RC phase.
It should have been mentioned when 7.6 was released, so anybody would be aware of what they're risking by upgrading.
Anyhow, I'm glad that the fix is coming.
-
@wayne while it was found during the testing phase for 7.6, it was already present in 7.5 and hadn't been detected until late, which led us to assume that only few users were affected and that it was not worse to release 7.6 with it (and fix it fast) than it had been to release 7.5 with it. We did not foresee that people still on 7.4 would skip 7.5 and discover the bug with 7.6.