Hello,
I'm a new developer on XCP-ng, and I'll be working on the Xen side to improve performance.
I recently graduated from the University of Versailles Saint-Quentin with a specialty in parallel computing and HPC, and I have a strong interest in operating systems.
Hello,
As some of you may know, disks with a 4KiB block size currently cannot be used as SR disks.
This is caused by an issue in the vhd-util utilities that is not easy to fix.
As a stopgap, we quickly developed an SMAPI driver that uses losetup's ability to emulate another sector size to work around the problem for the moment.
The real solution will involve SMAPIv3, whose first driver is available for testing: https://xcp-ng.org/blog/2024/04/19/first-smapiv3-driver-is-available-in-preview/
Coming back to the LargeBlock driver, it is available in 8.3 in sm 3.0.12-12.2.
Setting it up is as simple as creating an EXT SR with the xe CLI, but with type=largeblock:
xe sr-create host-uuid=<host UUID> type=largeblock name-label="LargeBlock SR" device-config:device=/dev/nvme0n1
It does not support using multiple devices because of quirks with LVM and the EXT SR driver.
It automatically creates a loop device with a sector size of 512 bytes on top of the 4KiB device, then creates an EXT SR on top of this emulated device.
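Before creating the SR, it can help to confirm that the disk really exposes 4KiB logical sectors. Here is a minimal sketch, assuming the standard Linux sysfs attribute queue/logical_block_size. The function names and the optional sysfs-root argument are hypothetical helpers for illustration, not part of sm:

```shell
#!/bin/sh
# Print the logical sector size (in bytes) of a block device, read from sysfs.
#   $1: device name as listed under /sys/block, e.g. nvme0n1
#   $2: sysfs root (hypothetical override for testing, defaults to /sys)
logical_sector_size() {
    cat "${2:-/sys}/block/$1/queue/logical_block_size"
}

# Succeed when the given sector size is 4096 bytes, i.e. the case where
# type=largeblock is the workaround described above.
needs_largeblock() {
    [ "$1" -eq 4096 ]
}
```

For example, needs_largeblock "$(logical_sector_size nvme0n1)" succeeds only when the device reports 4096-byte logical sectors. A disk reporting 512 does not need this driver.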
This driver is a workaround. We have automated tests, but they can't catch everything.
If you have any feedback or problems, don't hesitate to share them here.
Hello again,
It is now available in 8.2.1 with the testing packages. You can install them by enabling the testing repository and updating.
Available in sm 2.30.8-10.2.
yum update --enablerepo=xcp-ng-testing sm xapi-core xapi-xe xapi-doc
You then need to restart the toolstack.
Afterwards, you can create the SR with the command from the post above.
@olivierlambert @S-Pam Indeed, this is normal: Dom0 doesn't see the NUMA information, and the hypervisor handles the compute and memory allocation. You can read the wiki about manipulating VM allocation with the NUMA architecture if you want. But in normal use cases, it's not worth the effort.
Hello
So, you can of course do some configuration by hand to alleviate some of the cost the architecture imposes on virtualization.
But as you can imagine, the scheduler will move vCPUs around and sometimes break L3 locality if it moves one to a remote core.
I asked someone better informed than me about this, and he said that running a vCPU is always better than trying to make it run locally, so it's only useful under specific conditions (having enough resources).
You can use the cpupool functionality to isolate a VM on a specific NUMA node.
But it's only worthwhile if you really want more performance, since it's a manual process and can be cumbersome.
You can also pin a vCPU to a specific physical core to keep L3 locality, but that only works if few VMs are running on that particular core. So yes, it might be a small gain (or even a loss).
There are multiple ways to pin cores, most of them with xl, but if you want the pinning to persist across VM reboots you need to use xe. This matters especially if you want to pin a VM to a node and need its memory to be allocated on that node, since that can only be done at boot time. Pinning vCPUs after boot using xl can create problems if you pin them to one node while the VM's memory is allocated on another node.
You can see the VM NUMA memory information with the command xl debug-keys u; xl dmesg.
With xl:
Pin a CPU:
xl vcpu-pin <Domain> <vcpu id> <cpu id>
e.g. xl vcpu-pin 1 all 2-5 to pin all the vCPUs of VM 1 to cores 2 to 5.
With CPUPool:
xl cpupool-numa-split # Will create a cpupool by NUMA node
xl cpupool-migrate <VM> <Pool>
(CPUPool only works for guests, not dom0)
And with xe:
xe vm-param-set uuid=<UUID> VCPUs-params:mask=<mask> #To add a pinning
xe vm-param-remove uuid=<UUID> param-name=VCPUs-params param-key=mask #To remove pinning
The mask above is a comma-separated list of CPU ids, e.g. 0,1,2,3
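Since xl accepts a range such as 2-5 while the xe mask is a comma-separated list, a small helper can build the list from a range. This is only a sketch. cpu_range_to_mask is a hypothetical function name, not an existing tool:

```shell
#!/bin/sh
# Expand an inclusive CPU range into the comma-separated list used as
# the VCPUs-params:mask value.
#   $1: first CPU id, $2: last CPU id
# Hypothetical helper; the body runs in a subshell so its variables stay local.
cpu_range_to_mask() (
    i=$1
    mask=""
    while [ "$i" -le "$2" ]; do
        # Prefix a comma only when the mask already has an entry.
        mask="${mask:+$mask,}$i"
        i=$((i + 1))
    done
    printf '%s\n' "$mask"
)
```

It could then be used as xe vm-param-set uuid=<UUID> VCPUs-params:mask="$(cpu_range_to_mask 2 5)".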
I hope this is useful. I will add it to the XCP-ng documentation soon.
@lotusdew Your second PCI address is wrong. You have a dot instead of a colon: 0000:03.00.1 -> 0000:03:00.1.
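This kind of typo is easy to catch by checking the address shape before using it. Here is a minimal sketch that only validates the domain:bus:device.function layout (4 hex digits, 2 hex digits, 2 hex digits, then a function number from 0 to 7). is_valid_bdf is a hypothetical helper name, and it does not check that the device actually exists:

```shell
#!/bin/sh
# Succeed if $1 looks like a full PCI address: DDDD:BB:DD.F
# Shape check only: colons between domain, bus and device, and a dot
# before the function number.
is_valid_bdf() {
    printf '%s\n' "$1" |
        grep -Eq '^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$'
}
```

With this, is_valid_bdf 0000:03:00.1 succeeds, while is_valid_bdf 0000:03.00.1 fails because of the dot after the bus number.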
@s-pam
I can't look at the dmesg today as I'm home with a cold...
I hope you get well soon
I did experiment with xl cpupool-numa-split but this did not generate good results for multithreaded workloads. I believe this is because VMs get locked to use only as many cores as there are in each NUMA domain.
Indeed, a VM in a pool is restricted to the cores of that pool, and its maximum number of vCPUs is the number of cores in the pool. This is useful if you need to isolate the VM completely.
You need to be careful when benchmarking these things, because the memory allocation of a running VM is not moved, while the vCPUs will still run on the pinned node. I don't remember exactly whether cpupools behave differently from simple pinning in that case, though. I do remember that hard-pinning a guest vCPU definitely did not move its memory. You could only change this before booting.
No, there is none. You should have no problem using the Debian 10 template for any VM.
The Ryzen 7 2700X is not equipped with a GPU. IIRC only G-suffixed Ryzen are equipped with one.
@dthenot Sorry, you were asking about an ISO SR.
In this case, it's in /opt/xensource/sm/ISOSR.py:appendCIFSMountOptions.
@stormi There is no way to do this currently.
You could add options manually in /opt/xensource/sm/SMBSR.py:getMountOptions by appending the option there to try it.
This simple approach would add the parameter to all SMB SRs, though.
And it would be overwritten by an sm update.
@sluflyer06 In your example, only your first BDF is correct.
In the other one, you put a dot instead of a colon.
@James9103 Hello, sorry for the delayed answer.
scsi_id would not work on /dev/xvdX devices since those are not SCSI-based.
They are xen-blkfront devices using the blkif protocol.
It took me a bit of time to answer since I am not familiar with Oracle ASM.
From what I can read in the ASM documentation, you only need a stable identifier for the disks.
Could you use some other kind of unique identifier?
Another point is that there is currently no way to obtain a unique identifier for xen-blkfront devices, but that is something we could try to address.
I will look into it a bit more.
@s-pam Damn, computers really are magic. I'm very surprised by these results.
Does NONUMA really mean that no NUMA info is given by the firmware?
I have no idea how the Xen scheduler uses this information. I know that the memory allocator stripes the VM's memory across all the nodes the VM is configured to be allocated on. That would mean the scheduler is doing good work scheduling the vCPUs on nodes, without even knowing about the memory placement of the process currently running inside the guest.
Did you touch anything in the guest's config? It's an interesting result nonetheless. Can you share the memory allocation of the VM? You can obtain it with xl debug-keys u; xl dmesg from the Dom0.