This is great news.
Looking forward to testing this functionality on an XCP-ng server in the future :)
Posts
-
RE: machine type
-
RE: best performing filesystem
@olivierlambert
Yeah, I did not mean to post this as a negative thing.
Currently Xen VMs are limited by single-threaded tapdisk, and will lose this fight if you compare servers with only a couple of VMs. But with a multitude of VMs, the picture evens out, and probably even ends up in Xen's favour. That last bench was just added for show. My main point was that I can already see a difference in Xen's performance. I used to run passthrough, but as you see from my benching of XCP-ng 8.2, having Xen control the raid gave my VM better performance than my old bench of 7.6, where I was passing the controller to a dedicated NFS VM serving as an SR for all other VMs.
That picture was the reverse a couple of years ago -
RE: machine type
@olivierlambert
No rush, I am not even expecting it to ever happen.
Xen works well as-is for server usage. And you know, maybe that's great too.
KVM/QEMU for desktops, and Xen for servers.
Better the devs focus on keeping Xen secure and updated than spread themselves thin adding features most of Xen's userbase doesn't even need and the devs can't keep updated. On the other hand, GPGPU workloads are quickly gaining ground, so I actually believe this feature could be very important for Xen in the future.
Consumer grade GPUs are super cheap, and good support within Xen could potentially make it a lot easier for startups and small companies to use GPUs on a larger scale.
I am afraid KVM will "win" users if Xen keeps lagging behind on features.
I guess we have already seen it happen (Amazon AWS comes to mind...).
I will still be choosing Xen wherever applicable. Keep up the great and important work you all do, XCP-ng has my love.
-
RE: machine type
I took some time to test Q35 on qemu+kvm on ubuntu.
Q35 is amazing. Not only can I now use the newest AMD driver (20.9.2), as opposed to 18.4.1, which was the highest driver I could make work with Xen/i440, but Q35 also seems to handle FLR perfectly. Using i440 I would often have to shut down dom0 to make the GPU reset and be usable again in a VM after a VM reboot. With Q35 I have not had a single reset issue, and I have tried hard. I have rebooted the VM repeatedly, both soft and hard, and not once has it locked up on reset.
I also did a test with i440 on KVM, to see if my driver and reset-bug issues were related to KVM/Xen or to the emulated chipset, i440 vs Q35. The issues are the same with i440 on KVM: it would not accept a driver above 18.4.1, and a hard reset of the VM would cause the GPU reset issue, only fixable by power cycling the host (a reboot is not enough; a complete shutdown is needed).
I therefore conclude the chipset is the key factor.
I do have to wonder why Xen is having a hard time making Q35 work though, considering Xen basically is/was using a modified QEMU. I am thinking making Q35 work for Xen has not been a priority. But it really should be!
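For anyone wanting to reproduce the KVM side: selecting the chipset is just the -machine argument when launching the guest. A minimal sketch (memory/CPU values and the PCI address are made up):

```shell
# Q35 guest with a GPU passed through via VFIO (illustrative values only)
qemu-system-x86_64 -enable-kvm \
    -machine q35,accel=kvm \
    -cpu host -smp 4 -m 8192 \
    -device vfio-pci,host=0000:03:00.0
```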
This works so well, I've decided to build a new desktop to use this. -
RE: best performing filesystem
I ran a test with 8.2 and apparently things changed. I am now getting better performance with dom0 handling the raid controller.
Instead of destroying the raid and going ZFS, I kept it equal and did one test before and one after I switched control of the LSI MegaRAID from guest to dom0.
However, after doing the bench, I destroyed the raid and made it raid0, and installed Ubuntu 20.10 and KVM/QEMU, mostly because I wanted to try out the emulated Q35 chipset in an HVM setting, as discussed in a previous thread. I will post some details in that thread about my experience with emulated Q35.
But to finish this post, here is a bench with raid0, but on Ubuntu using QEMU+KVM. All drivers are VirtIO.
Yeah, I know, the more VMs running, the more Xen vs KVM would balance out.
But if you're only running one or two IO-intensive VMs, and the rest are idling a lot, it sure means better performance with KVM for those users.
-
RE: best performing filesystem
I don't remember the exact numbers, it was quite some time ago, when XCP-ng 7.6 was fresh. But I believe it was in the ballpark of a 200-300 MB/s difference for both read and write, measuring with CrystalDiskMark. But I will run the benchmark again before and after I do the mentioned reinstall, for comparison.
I was planning to do it this weekend, but haven't received the new SSD yet.
After the benchmark, I might reconfigure to raid 0 and do daily backups to cloud. Another question though. A typical bottleneck can often be tapdisk being limited to one core/thread. But what if I give a VM in need of maximum throughput two VHDs, and then have Windows software-raid0 those drives?
Would dom0 handle each VHD with its own tapdisk process, or would it still be one process handling both VHDs connected to the VM? -
best performing filesystem
What would be the recommended way to get decent IOPS and high throughput?
I have an LSI MegaRAID with 8 SATA drives connected currently: 3 TB drives, set to raid 10.
It is formatted as ext4, and XCP-ng uses it to store .vhd files. Originally I let dom0 handle it, but performance was a bit on the low end. So I ended up passing the controller to an Ubuntu guest, and exporting it back to dom0 as NFS, and to other VMs as Samba for files that are shared/available on multiple VMs.
XCP-ng is running from a small 120 GB SSD, containing the Ubuntu VM only. During the weekend I plan on reinstalling XCP in UEFI mode on a new 1 TB SSD, and everything from the raid is temporarily backed up to cloud. So this will be a good time to make any changes.
I read good things about ZFS, but from what I can tell, it's kind of useless with so few disks. Also, it prefers not to run on top of hardware raid(?).
So going the ZFS path would probably mean destroying the raid and letting ZFS handle the disks directly, giving dom0 12-16 GB ram, and maybe(?) the small 120 GB SSD for L2ARC. Any advice?
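For context on the layout I mean: my understanding is the ZFS equivalent of my current raid 10 would be striped mirrors, something like this (pool name and device paths are just placeholders):

```shell
# 4 mirrored pairs striped together -- roughly what the controller's
# raid 10 does today (device names are illustrative)
zpool create tank \
    mirror /dev/sdb /dev/sdc \
    mirror /dev/sdd /dev/sde \
    mirror /dev/sdf /dev/sdg \
    mirror /dev/sdh /dev/sdi
# and maybe the old 120 GB SSD as L2ARC read cache
zpool add tank cache /dev/sda
```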
-
choice of vbios for pci passthrough
Hi.
I just saw that with KVM, when booting a UEFI guest with GPU passthrough, it's possible to choose a vBIOS rom to be loaded.
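For reference, on plain QEMU/KVM this is the romfile property of the passthrough device (the PCI address and rom path here are made up):

```shell
qemu-system-x86_64 -enable-kvm -machine q35 \
    -device vfio-pci,host=0000:03:00.0,romfile=/root/vbios/gpu.rom
```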
Is this something that could become a possibility for XCP-ng/Xen further down the road, after Windows UEFI guests are working? -
RE: XCP-ng 8.2.0 beta now available!
@stormi
I'm sorry, that wasn't clear. What I meant is XCP-ng Center: has that code been updated, or is it just a question of compiling it? -
RE: XCP-ng 8.2.0 beta now available!
I don't know if this is expected behaviour after upgrading all the way from 7.6 to the 8.2 beta, but on a Windows guest where I have an AMD GPU in full PCI passthrough, the device didn't show up after boot. Instead there was another device showing a fault:
an Intel 82371SB PCI to USB universal host controller was throwing an error 43.
And under Display adapters, only the Microsoft remote emulated display and a "Microsoft Basic" device were showing.
I reinstalled the AMD driver and rebooted, and it came back to life. Also, I must say the VMs feel snappier. Faster boots, and less delay/latency. But I have not measured pre/post, so it's only a feeling! :D
-
RE: XCP-ng 8.2.0 beta now available!
@stormi
Is it only lacking a build/compile? If so, I could do it sometime during the weekend. Or has the code not been updated to reflect changes made in XCP-ng 8.2? If the latter, I am afraid I can't help. -
RE: XCP-ng 8.2.0 beta now available!
And be ready for testing UEFI VMs soon! I'll post here when the updated uefistored package is available.
Does the host have to be booted in UEFI mode for UEFI VMs to function?
If it does, is there an easy way to make the change from legacy to UEFI?
All I know is that the install docs clearly state you should never change mode when upgrading.
Is the only way to export all the VMs, reinstall, then import the VMs back? -
RE: XCP-ng 8.2.0 beta now available!
Is there an XCP-ng Center compatible version compiled yet?
I had a few problems upgrading to the 8.2 beta.
The main tty showed a normal bootup until a point where it showed a bunch of ACPI errors;
it worked for 10-15 min on starting previously disabled devices, and then just stopped.
I let it work another 10-15 min, but all that came was this error, over and over again:
[ 639.270804] EDAC sbridge: Seeking for: PCI ID 8086:3cf5
[ 639.270804] EDAC sbridge: CPU SrcID #0, Ha #0, Channel #0 has DIMMs, but ECC is disabled
[ 639.270804] EDAC sbridge: Couldn't find mci handler
[ 639.270804] EDAC sbridge: Failed to register device with error -19
I switched to tty2 and launched the init script from /opt/xensources and managed to upgrade from there. It wasn't too easy, as the screen kept getting garbled by debug info that I assume defaults to tty2 output.
After the upgrade and reboot, it boots. But it takes a long, long time.
Here is a small snippet from the first boot dmesg:
[ 29.618412] usb 1-1.1.1: Product: G19 Gaming Keyboard
[ 29.623569] input: G19 Gaming Keyboard as /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.1/1-1.1.1/1-1.1.1:1.0/0003:046D:C228.0003/input/input7
[ 30.290247] hid-generic 0003:046D:C228.0003: input,hidraw2: USB HID v1.10 Keyboard [G19 Gaming Keyboard] on usb-0000:00:1a.0-1.1.1/input0
[ 30.293954] input: G19 Gaming Keyboard Consumer Control as /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.1/1-1.1.1/1-1.1.1:1.1/0003:046D:C228.0004/input/input8
[ 30.958193] hid-generic 0003:046D:C228.0004: input,hiddev97,hidraw3: USB HID v1.10 Device [G19 Gaming Keyboard] on usb-0000:00:1a.0-1.1.1/input1
[ 129.058404] EXT4-fs (sdb1): mounting ext3 file system using the ext4 subsystem
[ 129.063909] EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: (null)
[ 169.228203] e1000e 0000:00:19.0 0000:00:19.0 (uninitialized): registered PHC clock
[ 169.356510] e1000e 0000:00:19.0 eth0: (PCI Express:2.5GT/s:Width x1) 8c:89:a5:a3:b6:f3
[ 169.356512] e1000e 0000:00:19.0 eth0: Intel(R) PRO/1000 Network Connection
And it's not done and ready until it tries to initialize the samba SR:
[ 453.242190] xenbr1: port 1(eth1) entered blocking state
[ 453.242190] xenbr1: port 1(eth1) entered forwarding state
[ 476.236967] FS-Cache: Loaded
[ 476.254307] FS-Cache: Netfs 'cifs' registered for caching
[ 476.254307] Key type cifs.spnego registered
[ 476.254307] Key type cifs.idmap registered
All in all this is much better than 8.1 and 8.0, where I had similar problems with ACPI, but it never got any further and just locked up, no tty2 or such.
I actually had to use acpi=off and/or acpi=strict, and either way, the whole experience after upgrading was super slow VMs, but with no apparent error messages, so I returned to 7.6.
We'll see how 8.2 fares, but so far so good.
I have attached both the dmesg and xl-dmesg from the installer, in case you want to review what takes so long during boot.
xl-dmesg-log.txt
dmesg-log.txt
Thanks for all your hard work
-
RE: machine type
A couple of ideas to try later. One is to just modify a template.json and reinstall the templates, modifying only this part:
"platform": {
"viridian": "true",
"device-model": "qemu-upstream-compat"
to use q35.
However, that alone will not be enough. Next would be to
modify this script to make it use the qemu-system-i386 binary: /usr/libexec/xenopsd/qemu-dm-wrapper
Maybe make it look for the -machine switch, and run the i386 binary only in case of a q35 machine type, to keep everything else running compatible with how it is now. That would only call the i386 binary if the q35 switch is detected. But before I start wasting any time on this, is there any chance this could work?
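To make the idea concrete, here is a sketch of the selection logic (purely hypothetical; I have not looked at how qemu-dm-wrapper is actually structured, and the binary paths are just the ones found on my host):

```shell
# Hypothetical sketch of the check I have in mind for qemu-dm-wrapper.
# (The real wrapper may work completely differently; this only
# illustrates the decision, not the actual script.)
pick_qemu_binary() {
    case "$*" in
        *"-machine q35"*|*"-machine pc-q35"*)
            # q35 requested: use the upstream binary that claims to support it
            echo /usr/lib64/xen/bin/qemu-system-i386 ;;
        *)
            # anything else: keep the current qemu-dm, so existing VMs
            # stay exactly as they are
            echo /usr/libexec/xen/bin/qemu-dm ;;
    esac
}

pick_qemu_binary -machine pc-q35-2.10,accel=xen
# prints: /usr/lib64/xen/bin/qemu-system-i386
```

The point is that nothing changes unless an explicit q35 machine type is requested, so current VMs would be untouched.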
Or would I first need to enable anything virtualization-wise in the dom0 kernel or maybe even in the xen kernel? -
RE: machine type
I haven't spent very much time on this yet.
And I have never looked at the Xen sources or dived into how it works. So please bear with me if my poking around takes some time.
But for now, I don't really see any reason why XCP-ng shouldn't be able to run the normal QEMU instead of the Xen-modified QEMU, and connect to XCP-ng Center's console with VNC, as it does with the Xen-modified QEMU. Correct me if I'm completely wrong, please :)
-
RE: machine type
@olivierlambert
I haven't found any official documentation, but I've been poking around a bit, and so far I have found some information that seems interesting. It looks like Xen always starts VMs using QEMU,
like this:
65540 23960 1.5 0.2 261356 14740 ? SLl Sep11 11:09 qemu-dm-5 -machine pc-0.10,accel=xen,max-ram-below-4g=4026531840,allow-unassigned=true,trad_compat=true -vnc unix:/var/run/xen/vnc-5,lock-key-sync=off -monitor null -xen-domid 5 -m size=4096 -boot order=cdn -usb -device usb-tablet,port=2 -smp 4,maxcpus=4 -serial pty -display none -nodefaults -trace enable=xen_platform_log -sandbox on,obsolete=deny,elevateprivileges=allow,spawn=deny,resourcecontrol=deny -S -global PIIX4_PM.revision_id=0x1 -global ide-hd.ver=0.10.2 -global piix3-ide-xen.subvendor_id=0x5853 -global piix3-ide-xen.subsystem_id=0x0001 -global piix3-usb-uhci.subvendor_id=0x5853 -global piix3-usb-uhci.subsystem_id=0x0001 -global rtl8139.subvendor_id=0x5853 -global rtl8139.subsystem_id=0x0001 -parallel null -qmp unix:/var/run/xen/qmp-libxl-5,server,nowait -qmp unix:/var/run/xen/qmp-event-5,server,nowait -device xen-platform,addr=3,device-id=0x0001,revision=0x2,class-id=0x0100,subvendor_id=0x5853,subsystem_id=0x0001 -drive file=,if=ide,index=3,media=cdrom,force-lba=off -drive file=/dev/sm/backend/2c466dab-0424-67c4-3a1b-3257cab0cf54/695c548c-5600-4c9b-b785-5de42747b0a5,if=ide,index=0,media=disk,force-lba=on,format=raw -device rtl8139,netdev=tapnet0,mac=0a:c2:0e:56:dd:ba,addr=4 -netdev tap,id=tapnet0,fd=7 -device VGA,vgamem_mb=8,rombar=1,romfile=,subvendor_id=0x5853,subsystem_id=0x0001,addr=2,qemu-extended-regs=false -vnc-clipboard-socket-fd 4 -xen-domid-restrict -chroot /var/xen/qemu/root-5 -runas 65540.998
I then looked around inside XCP-ng for different QEMU binaries and config files, and what struck me first is this:
[root@localhost ~]# /usr/lib64/xen/bin/qemu-system-i386 -machine help
Supported machines are:
pc-i440fx-2.9 Standard PC (i440FX + PIIX, 1996)
pc-i440fx-2.8 Standard PC (i440FX + PIIX, 1996)
pc-i440fx-2.7 Standard PC (i440FX + PIIX, 1996)
pc-i440fx-2.6 Standard PC (i440FX + PIIX, 1996)
pc-i440fx-2.5 Standard PC (i440FX + PIIX, 1996)
pc-i440fx-2.4 Standard PC (i440FX + PIIX, 1996)
pc-i440fx-2.3 Standard PC (i440FX + PIIX, 1996)
pc-i440fx-2.2 Standard PC (i440FX + PIIX, 1996)
pc Standard PC (i440FX + PIIX, 1996) (alias of pc-i440fx-2.10)
pc-i440fx-2.10 Standard PC (i440FX + PIIX, 1996) (default)
pc-i440fx-2.1 Standard PC (i440FX + PIIX, 1996)
pc-i440fx-2.0 Standard PC (i440FX + PIIX, 1996)
pc-i440fx-1.7 Standard PC (i440FX + PIIX, 1996)
pc-i440fx-1.6 Standard PC (i440FX + PIIX, 1996)
pc-i440fx-1.5 Standard PC (i440FX + PIIX, 1996)
pc-i440fx-1.4 Standard PC (i440FX + PIIX, 1996)
pc-1.3 Standard PC (i440FX + PIIX, 1996)
pc-1.2 Standard PC (i440FX + PIIX, 1996)
pc-1.1 Standard PC (i440FX + PIIX, 1996)
pc-1.0 Standard PC (i440FX + PIIX, 1996)
pc-0.15 Standard PC (i440FX + PIIX, 1996)
pc-0.14 Standard PC (i440FX + PIIX, 1996)
pc-0.13 Standard PC (i440FX + PIIX, 1996)
pc-0.12 Standard PC (i440FX + PIIX, 1996)
pc-0.11 Standard PC (i440FX + PIIX, 1996)
pc-0.10 Standard PC (i440FX + PIIX, 1996)
pc-q35-2.9 Standard PC (Q35 + ICH9, 2009)
pc-q35-2.8 Standard PC (Q35 + ICH9, 2009)
pc-q35-2.7 Standard PC (Q35 + ICH9, 2009)
pc-q35-2.6 Standard PC (Q35 + ICH9, 2009)
pc-q35-2.5 Standard PC (Q35 + ICH9, 2009)
pc-q35-2.4 Standard PC (Q35 + ICH9, 2009)
q35 Standard PC (Q35 + ICH9, 2009) (alias of pc-q35-2.10)
pc-q35-2.10 Standard PC (Q35 + ICH9, 2009)
isapc ISA-only PC
none empty machine
xenfv Xen Fully-virtualized PC
xenpv Xen Para-virtualized PC
This is the extensive list of -machine arguments that Xen's QEMU binary claims to support. And as I suspected, i440 is the default. It is used as -machine pc-0.10 when called by Xen.
I haven't found any way yet to make Xen switch that -machine parameter. But I tried the most obvious thing, which was to add the machine type as a key to the platform parameter inside a template, like this:
xe template-param-set uuid=552bce37-51b2-445d-84f2-5f33fa112d7e platform:machine=pc-q35-2.10
and verified that it was added:
[root@localhost ~]# xe template-param-list uuid=552bce37-51b2-445d-84f2-5f33fa112d7e | grep platform
platform (MRW): machine: pc-q35-2.10; hpet: true; nx: true; device-model: qemu-upstream-compat; pae: true; apic: true; viridian: true; acpi: 1
Then I made a VM using that template. It didn't do anything; QEMU was still spawned using -machine pc-0.10.
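For anyone repeating this, a quick way to confirm what -machine the device model actually got is to grep it out of the process list. Demonstrated here against a shortened copy of the command line captured above:

```shell
# On a live host the line would come from: ps -eo args | grep '[q]emu-dm'
line='qemu-dm-5 -machine pc-0.10,accel=xen,max-ram-below-4g=4026531840 -vnc unix:/var/run/xen/vnc-5'
echo "$line" | grep -oE -- '-machine [^ ]+'
# prints: -machine pc-0.10,accel=xen,max-ram-below-4g=4026531840
```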
There is another parameter in the template that could be
interesting:
hardware-platform-version ( RO): 0
But it's read-only, so it can't be changed using xe template-param-set. Anyway, I don't know if the machine parameter used is hardcoded or not. There are also different QEMU binaries available:
/var/xen/qemu
/etc/systemd/system/qemuback.service.d
/usr/libexec/xenopsd/qemu-dm-wrapper
/usr/libexec/xenopsd/qemu-vif-script
/usr/libexec/qemu-bridge-helper
/usr/libexec/xen/bin/qemu-dm
/usr/share/qemu
/usr/share/qemu/qemu_vga.ndrv
/usr/share/qemu/qemu_logo_no_text.svg
/usr/share/qemu/qmp/qemu-ga-client
/usr/share/qemu/qemu-icon.bmp
/usr/share/xen/qemu
/usr/lib64/xen/bin/qemu-system-i386
/usr/lib64/xen/bin/qemu-io
/usr/lib64/xen/bin/qemu-dm
/usr/lib64/xen/bin/qemu-img
/usr/lib64/xen/bin/qemu-wrapper
/usr/lib64/xen/bin/qemu-nbd
/usr/lib64/xen/bin/qemu_trad_image.pyc
/usr/lib64/xen/bin/qemu_trad_image.py
The one spawned by Xen when launching a VM is qemu-dm, or maybe the wrapper. And that binary does not have the same machine support list:
[root@localhost ~]# /usr/libexec/xen/bin/qemu-dm -M help
Supported machines are:
xenfv Xen Fully-virtualized PC (default)
xenpv Xen Para-virtualized PC
Anyway, given that there is a QEMU binary claiming to support q35, I am getting my hopes up that it can in fact be done. We already know that the normal QEMU has the support. And I believe it may just be a matter of making a template that can choose which binary and parameters to use.
Maybe even add kernel support; I haven't looked at the XCP-ng kernel source to see what virtualization is enabled or not.
If I have time later this weekend, I will download a build-env docker and start to poke around the XCP-ng kernel, and build one with any missing virtualization options enabled.
I was planning to do that anyway, to see if I can figure out what changes were made in regard to my problem using any XCP-ng version above 7.6 with success. Maybe I will look at some of the other sources as well, to see if I can find out whether the QEMU binary used is hardcoded, and whether some of the parameters passed to it, like -machine, are hardcoded as well.
That's it for now. Sorry for the long and messy post.
-
RE: machine type
So there is a huge difference in our scenarios: AMD vs Intel,
IOMMU vs VT-d. It may be that my issue lies with the VT-d implementation on my MSI Big Bang XPower II mainboard. It's quite old, and so is the CPU.
It's basically from the time when VT-d functionality was just starting to make its way to consumer grade hardware, and not only 6k USD Xeons with equally expensive Supermicro motherboards and the like.
Not many needed this on consumer hardware at the time, and the mobo vendors cut many corners. The list of consumer mobos with broken VT-d implementations from then is quite extensive.
Maybe I should just get a used old Supermicro mobo from eBay instead. Or maybe it's the VT-d implementation of the old i7-3930k CPU, and just picking up an old Xeon from eBay would work better. I just like the idea of repurposing this now very old gaming rig into something useful. It's still quite beefy tbh. -
RE: machine type
@l1c May I ask what AMD GPU you are using, and maybe also what brand and make of motherboard/CPU?
-
RE: machine type
I will continue to look up information, and will happily test different things.
But if it comes down to someone needing to write the code to emulate q35, then I'm sorry to say that is not within my abilities.
Will update here if I make any progress on either q35 or the AMD driver issue.
-
RE: machine type
Awesome!
I'm sure your investigation into this will be far superior to mine! Thanks for looking it up :)