Max allowable deviation in top "st" parameter

  • Hi everyone.
    As far as I understand, the st parameter, or steal time, is the % of CPU the hypervisor (dom0) steals from the VM because it can't supply enough time slots on an overcommitted host. When the host is empty and has plenty of resources, it may give as much CPU time as is asked for.
    I don't know whether dom0 works in strict or best-effort mode, giving just the resources assigned or even more when plenty are available.

    My question is: keeping in mind that hypervisors are not ideal, what % of st is allowable when the host is relaxed? I ask because I observe st values below 1% in a VM on an empty host.

    I'm going to populate the host in the next few days, and I'm a bit worried about overcommitting it, so I'll be watching this parameter.

    As always, if I'm wrong on any point, please correct me.
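
For reference, the st figure that top shows inside a Linux guest can be derived from the aggregate `cpu` line of `/proc/stat`. Below is a minimal sketch, assuming a Linux guest whose kernel exposes the steal field (the eighth counter on that line). The function names are hypothetical helpers, not part of any tool mentioned in this thread.

```python
def parse_cpu_line(line):
    """Return the integer counters from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def steal_percent(before, after):
    """Percentage of CPU time reported as steal between two samples.

    Counters are: user, nice, system, idle, iowait, irq, softirq, steal,
    guest, guest_nice. Guest time is already included in user/nice, so
    only the first eight counters are summed for the total.
    """
    b = parse_cpu_line(before)
    a = parse_cpu_line(after)
    total = sum(a[:8]) - sum(b[:8])
    if total <= 0:
        return 0.0
    return 100.0 * (a[7] - b[7]) / total


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat (assumption: running on Linux)."""
    with open(path) as handle:
        return handle.readline()
```

To use it, call `read_cpu_line()` twice a few seconds apart and pass both lines to `steal_percent`. A longer interval smooths out short spikes.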

  • XCP-ng Team

    If you look in top, you are just seeing what's happening in the dom0, not what's happening at Xen level.

    Keep in mind Xen is the hypervisor, it's loaded before the dom0, which is just a VM with all permissions on the hardware.

    xentop makes more sense.

  • Either I have another concept wrong here, or I explained myself badly. Xen is the hypervisor. Isn't dom0 a hidden VM that you connect to when you SSH to the host, where xe commands are run, and which controls the rest of the VMs running on the same host? Like XenCenter, XCP-ng Center or XO, dom0 is a way to give orders to, and get state from, the Xen hypervisor via XenAPI.
    For instance, starting or shutting down a VM is an order that dom0 gives to the Xen hypervisor, or at least that's what I've always understood.

    When I said that I was looking in top, I meant the top command in the VM, not in dom0.

  • XCP-ng Team

    1. Xen isn't a VM, it's the hypervisor scheduling all the CPU load from all VMs. You can't "SSH to Xen".
    2. When you run any XAPI commands, XAPI is running on the dom0 and then talking to Xen with another lower level API (using xenops). Dom0 is your control center for all operations.
    3. Load average is a useful way to see whether your VM is starved of resources, whether CPU or storage. It's a good way to measure the broad responsiveness of the VM.
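
To illustrate point 3, here is a minimal sketch of reading load average relative to the number of vCPUs in the guest. On Linux, load average counts tasks waiting on uninterruptible I/O as well as runnable ones, which is why it reflects both CPU and storage pressure. The threshold of 1.0 per CPU is a common rule of thumb and an assumption here, not a value given in this thread. The function names are hypothetical.

```python
import os


def load_per_cpu(load_avg, cpu_count):
    """Normalize a load average by the number of CPUs visible to the VM."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_avg / cpu_count


def looks_starved(load_avg, cpu_count, threshold=1.0):
    """True when the normalized load exceeds the (assumed) threshold."""
    return load_per_cpu(load_avg, cpu_count) > threshold


def current_load_per_cpu():
    """1-minute load average per CPU for the machine this runs on."""
    one_minute, _, _ = os.getloadavg()
    return load_per_cpu(one_minute, os.cpu_count() or 1)
```

Running `current_load_per_cpu()` inside the VM gives a quick figure to track over time. A value that stays well above the threshold suggests work is queuing for CPU or storage.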

  • As always, reading you expands my knowledge. Point 1: perfect, that's what I thought. Point 2: I thought dom0 talked to the Xen hypervisor through XAPI, so thanks for the correction. I suppose dom0 is the only one that can communicate with Xen, and that XenCenter, XCP-ng Center or XO use dom0 via XAPI, with dom0 finally passing the order to Xen. Point 3: that's what I want, to observe parameters like wait and st from the top command to detect bottlenecks or overloaded resources, like CPU or SR access.

    And going back to my original question, what st values should be allowed? For instance, 20% must indicate a serious lack of resources in the VM.

  • XCP-ng Team

    For the last question, IDK and I don't think it's the best metric.

  • Maybe somebody could point to some metrics for predicting future CPU/network overcommit.

  • XCP-ng Team

    @Martín-Lorente said in Max allowable deviation in top "st" parameter:

    what % of st is allowable when the host is relaxed?

    I'd say 1-5%. The only way to avoid it completely is to pin CPUs to VMs. Generally, the Xen scheduler keeps switching vCPU:pCPU pairs to balance things.

  • 5% sounds like too much to me, but I'll check it in some populated pools (with host CPU below 80%) to get a better idea.
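
For checking populated pools, a minimal sketch of comparing a series of st samples against the 1-5% band suggested above. The band limits come from this thread and are a rough guide, not an official figure. Averaging over several samples is an assumption made here to avoid reacting to single spikes, and the function name is hypothetical.

```python
def classify_steal(samples, low=1.0, high=5.0):
    """Classify the mean of several st readings (in percent) against a band."""
    if not samples:
        raise ValueError("need at least one sample")
    mean = sum(samples) / len(samples)
    if mean < low:
        return "below band"
    if mean <= high:
        return "within band"
    return "above band"
```

For example, readings taken every few seconds over a few minutes in a VM can be passed in as a list. A result of "above band" on a host that is not heavily loaded would be worth a closer look with xentop.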

XCP-ng Pro Support
