XCP 8.2 VCPUs-max settings

jeff

Hello!
We have a Dell PowerEdge R840 server with four sockets: 4 x Intel Xeon Gold 6254
xl info on this server:
nr_cpus : 144
max_cpu_id : 223
nr_nodes : 4
cores_per_socket : 18
threads_per_core : 2
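
As a sanity check on the figures above, the socket count can be derived from nr_cpus, cores_per_socket and threads_per_core. This is only a minimal sketch: the function name sockets_from_xl_info is made up for illustration, and it assumes the "key : value" layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: derive the socket count from `xl info` text on stdin.
# Assumes each line looks like "key : value", as in the listing above.
sockets_from_xl_info() {
    awk -F':' '
        {
            key = $1; val = $2
            gsub(/[ \t]/, "", key)   # strip padding around the key
            gsub(/[ \t]/, "", val)   # strip padding around the value
            info[key] = val
        }
        END {
            per_socket = info["cores_per_socket"] * info["threads_per_core"]
            if (per_socket > 0)
                printf "%d\n", info["nr_cpus"] / per_socket
        }
    '
}
```

On the host this would be run as `xl info | sockets_from_xl_info`. With the values above it prints 4, since 144 / (18 x 2) = 4.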

We assigned 96 vCPUs to a CentOS Server VM, but when we run lscpu it only shows 64 vCPUs.

We set VCPUs-max to 96 from the command line.

Is there any other parameter to pass "Threads per core" through to the guest, or to remove the limit?

tuxen

@jeff In order to create a virtual NUMA topology and expose it to the guest, the vNUMA feature needs to be implemented at hypervisor level and accessible through XAPI. I'm not sure if that feature is fully supported at the moment. Maybe @olivierlambert can confirm this?

You could try adding the cores-per-socket attribute following the physical NUMA topology (96 / 4 nodes = 24):

xe vm-param-set platform:cores-per-socket=24 uuid=<VM UUID>
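
The arithmetic behind that value can be sketched as below. It assumes the vCPU count has to be a whole multiple of cores-per-socket, and the helper name cores_per_socket is hypothetical, not part of xe.

```shell
#!/bin/sh
# Hypothetical helper: cores-per-socket for a given vCPU count and socket count.
# Prints the value, or fails if the vCPUs do not divide evenly across sockets.
cores_per_socket() {
    vcpus=$1
    sockets=$2
    if [ "$sockets" -le 0 ] || [ $((vcpus % sockets)) -ne 0 ]; then
        echo "cannot split $vcpus vCPUs evenly over $sockets sockets" >&2
        return 1
    fi
    echo $((vcpus / sockets))
}
```

For example, `cores_per_socket 96 4` prints 24, which is the value passed to platform:cores-per-socket above.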

Let me know if it works.

jeff

@tuxen
Hi tuxen, cores-per-socket does not work on my server. Do you have any other ideas that could help me lift the vCPU limit?

xe vm-param-set platform:cores-per-socket=24 uuid=<VM UUID>

sapcode

@jeff Hi jeff, we have the same issue with other Xeon CPUs.

This thread should be linked with:
https://xcp-ng.org/forum/topic/4604/xcp-8-2-vcpus-max-80-but-vm-shows-64-cpu-only-numa-nodes-and-threads-per-core-not-matching/12

Can you run the following command while all your VMs are started: xl vcpu-list

In our case we see all vCPUs numbered 64 and above in the state "--p", which is bad:

Name                                ID  VCPU   CPU State   Time(s) Affinity (Hard / Soft)
Domain-0                             0     0   79   r--     257.2  all / all
0__Confluence & Jira & Gitlab        1     0   40   -b-     142.2  all / all
0__SAP S4H S4HANA 2.0 1709 SP4       2    64    -   --p       0.0  all / all
0__SAP S4H S4HANA 2.0 1709 SP4       2    65    -   --p       0.0  all / all
0__SAP S4H S4HANA 2.0 1709 SP4       2    66    -   --p       0.0  all / all
0__SAP S4H S4HANA 2.0 1709 SP4       2    67    -   --p       0.0  all / all
0__SAP S4H S4HANA 2.0 1709 SP4       2    68    -   --p       0.0  all / all
0__SAP S4H S4HANA 2.0 1709 SP4       2    69    -   --p       0.0  all / all
0__SAP S4H S4HANA 2.0 1709 SP4       2    70    -   --p       0.0  all / all
0__SAP S4H S4HANA 2.0 1709 SP4       2    71    -   --p       0.0  all / all
0__SAP S4H S4HANA 2.0 1709 SP4       2    72    -   --p       0.0  all / all
0__SAP S4H S4HANA 2.0 1709 SP4       2    73    -   --p       0.0  all / all
0__SAP S4H S4HANA 2.0 1709 SP4       2    74    -   --p       0.0  all / all
0__SAP S4H S4HANA 2.0 1709 SP4       2    75    -   --p       0.0  all / all
0__SAP S4H S4HANA 2.0 1709 SP4       2    76    -   --p       0.0  all / all
0__SAP S4H S4HANA 2.0 1709 SP4       2    77    -   --p       0.0  all / all
0__SAP S4H S4HANA 2.0 1709 SP4       2    78    -   --p       0.0  all / all
0__SAP S4H S4HANA 2.0 1709 SP4       2    79    -   --p       0.0  all / all

It looks like Xen passes only 64 vCPUs to the guest and the rest stay in the state "--p", i.e. paused. God only knows why and where this limit has been implemented...
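
To check quickly how many vCPUs are stuck in that state, the listing can be filtered on the State column. A minimal sketch, assuming the column layout shown above; count_paused_vcpus is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: count vCPUs whose State column ends in "p" (paused)
# in `xl vcpu-list` output read from stdin.
# Domain names may contain spaces, so fields are counted from the right:
#   ... VCPU CPU State Time(s) Hard / Soft   ->   State is field NF-4.
# Assumes the affinity values themselves contain no spaces.
count_paused_vcpus() {
    awk 'NF >= 8 && $(NF-4) ~ /^..p$/ { n++ } END { print n + 0 }'
}
```

On the host this would be run as `xl vcpu-list | count_paused_vcpus`. For the excerpt above it would report 16 (vCPUs 64 to 79 of the SAP domain).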

Source: https://linux.die.net/man/1/xm

STATES
The State field lists 6 states for a Xen Domain, and which ones the current Domain is in.

r - running The domain is currently running on a CPU

b - blocked The domain is blocked, and not running or runnable. This can be caused because the domain is waiting on IO (a traditional wait state) or has gone to sleep because there was nothing else for it to do.

p - paused The domain has been paused, usually occurring through the administrator running xm pause. When in a paused state the domain will still consume allocated resources like memory, but will not be eligible for scheduling by the Xen hypervisor.

s - shutdown The guest has requested to be shutdown, rebooted or suspended, and the domain is in the process of being destroyed in response.

c - crashed The domain has crashed, which is always a violent ending. Usually this state can only occur if the domain has been configured not to restart on crash. See xmdomain.cfg for more info.

d - dying The domain is in process of dying, but hasn't completely shutdown or crashed.

olivierlambert

You should probably ask on Xen mailing list or in #xen on OFTC IRC

Justin Goldberg

@olivierlambert said in XCP 8.2 VCPUs-max settings:

You should probably ask on Xen mailing list or in #xen on OFTC IRC

Since the mass exodus from freenode, is the channel on oftc or libera.chat?

I get this message when connecting to OFTC:

Looking up irc.oftc.net
* Connecting to irc.geo.oftc.net (64.86.243.183:6667)
* Connected. Now logging in.
* *** Looking up your hostname...
* *** Checking Ident
* *** Couldn't look up your hostname
* *** No Ident response
* Capabilities supported: multi-prefix
* Capabilities requested: multi-prefix 
* Capabilities acknowledged: multi-prefix
* Closing Link: xx.xx.xx.xx (Invalid username [~_JustinGo])
* Disconnected (Remote host closed socket)

olivierlambert

Check https://www.oftc.net/

We are also on Discord https://discord.gg/f8FcuBDq

tomg

@sapcode Hi there --

I know this was quite a while ago, but I thought I should post here since I ran into the exact same issue when trying to add more than 64 vCPUs. It ended up being an ACPI issue on the VM itself. Things apparently get wonky past 64 CPUs with ACPI. Since VMs don't really need ACPI, you can disable it and voila, more than 64 vCPUs :]

Just modify your VM's platform param, no need to pass it as a kernel option inside the VM.

xe vm-param-set platform:acpi=0 uuid=<VM UUID>
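
For anyone who wants the whole sequence written out, here is a rough sketch. It only prints the xe commands rather than running them, and it assumes the platform change is picked up on the next full VM start, so a shutdown and a start are included. The helper name acpi_off_plan is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the xe commands for turning ACPI
# off on one VM. Assumption: the platform setting takes effect on the next
# full start of the VM, hence the shutdown/start around the change.
acpi_off_plan() {
    uuid=$1
    if [ -z "$uuid" ]; then
        echo "usage: acpi_off_plan <VM UUID>" >&2
        return 1
    fi
    printf 'xe vm-shutdown uuid=%s\n' "$uuid"
    printf 'xe vm-param-set uuid=%s platform:acpi=0\n' "$uuid"
    printf 'xe vm-start uuid=%s\n' "$uuid"
}
```

Review the printed commands before running them on the host. Afterwards, lscpu inside the guest and xl vcpu-list on the host should show whether the vCPUs above 64 are now online.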

Hope this helps anyone else trying to run more than 64 vCPUs.

cc @jeff

olivierlambert

Hey, thanks for the feedback!