Question about hardware interrupts going exclusively to VCPU0
-
The Xen Project's 'Network Throughput and Performance Guide' (https://wiki.xen.org/wiki/Network_Throughput_and_Performance_Guide) mentions that Xen currently delivers all hardware interrupts for a guest to the guest's first VCPU, i.e. VCPU0.
We've seen exactly this in our current Xen deployment. In a network-intensive application we run, VCPU0 on what should have been a high-performance paravirtualized guest gets saturated by these interrupts, creating a crippling performance bottleneck.
In upstream Xen there is apparently no way to distribute these hardware interrupts across the other VCPUs. The only mitigations are partial ones, such as moving other system work onto cores other than VCPU0.
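The kind of workaround I mean is just shuffling the userspace load off VCPU0, for example pinning the heavy process onto the other cores with taskset. A rough sketch (the PID, CPU list, and program name are placeholders for whatever your guest actually runs):

# keep the busy userspace process on vCPUs 1-5 so VCPU0 has headroom
# left to service the event channel interrupts (placeholder PID / CPU list)
taskset -cp 1-5 <pid-of-network-app>

# or set the affinity at launch time instead (placeholder program name)
taskset -c 1-5 ./network-app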
So I'm wondering whether XCP-ng has been able to change this characteristic of Xen and instead distribute the interrupts for virtual network cards across more than one VCPU in the guest, more like a physical machine does.
Thanks!
-
Well, this is a pretty "advanced" question. Everyone is under heavy load right now, but I'd like to gather some material and give you as detailed an answer as I can. Ping me back in one week if there's nothing here.
-
Also, you should see different behavior in HVM guests, I suppose. Can you try with an HVM-enabled VM?
edit: because PV guests tend to disappear anyway…
-
@olivierlambert Thanks for taking a look into this.
One piece of documentation I found somewhere said it was HVM guests that behave this way, but when I looked at our PV guest it was plainly doing it.
We never ran this application on an HVM guest because PV is supposed to give the best performance for this kind of workload. But when I log into an HVM guest that is doing something else and look at /proc/interrupts, the interrupts look entirely different; from what I can see, the HVM guest doesn't do it.
The xen-dyn-event lines for the ethernet devices are where you can see it in /proc/interrupts. One of our PV VMs, with VCPU0 getting hammered by the networking, looks like this:
~# cat /proc/interrupts
      CPU0 CPU1 CPU2 CPU3 CPU4 CPU5
0: 2974500869 0 0 0 0 0 xen-percpu-virq timer0
1: 0 0 0 0 0 0 xen-percpu-ipi spinlock0
2: 490061501 0 0 0 0 0 xen-percpu-ipi resched0
3: 2732 0 0 0 0 0 xen-percpu-ipi callfunc0
4: 0 0 0 0 0 0 xen-percpu-virq debug0
5: 45686 0 0 0 0 0 xen-percpu-ipi callfuncsingle0
6: 5 0 0 0 0 0 xen-percpu-ipi irqwork0
7: 0 1657415058 0 0 0 0 xen-percpu-virq timer1
8: 0 0 0 0 0 0 xen-percpu-ipi spinlock1
9: 0 108980851 0 0 0 0 xen-percpu-ipi resched1
10: 0 2928 0 0 0 0 xen-percpu-ipi callfunc1
11: 0 0 0 0 0 0 xen-percpu-virq debug1
12: 0 96434 0 0 0 0 xen-percpu-ipi callfuncsingle1
13: 0 0 0 0 0 0 xen-percpu-ipi irqwork1
14: 0 0 1563637041 0 0 0 xen-percpu-virq timer2
15: 0 0 0 0 0 0 xen-percpu-ipi spinlock2
16: 0 0 39816489 0 0 0 xen-percpu-ipi resched2
17: 0 0 3325 0 0 0 xen-percpu-ipi callfunc2
18: 0 0 0 0 0 0 xen-percpu-virq debug2
19: 0 0 90287 0 0 0 xen-percpu-ipi callfuncsingle2
20: 0 0 0 0 0 0 xen-percpu-ipi irqwork2
21: 0 0 0 1523086391 0 0 xen-percpu-virq timer3
22: 0 0 0 0 0 0 xen-percpu-ipi spinlock3
23: 0 0 0 99427908 0 0 xen-percpu-ipi resched3
24: 0 0 0 3281 0 0 xen-percpu-ipi callfunc3
25: 0 0 0 0 0 0 xen-percpu-virq debug3
26: 0 0 0 100709 0 0 xen-percpu-ipi callfuncsingle3
27: 0 0 0 0 0 0 xen-percpu-ipi irqwork3
28: 0 0 0 0 1465441862 0 xen-percpu-virq timer4
29: 0 0 0 0 0 0 xen-percpu-ipi spinlock4
30: 0 0 0 0 106093581 0 xen-percpu-ipi resched4
31: 0 0 0 0 3074 0 xen-percpu-ipi callfunc4
32: 0 0 0 0 0 0 xen-percpu-virq debug4
33: 0 0 0 0 90140 0 xen-percpu-ipi callfuncsingle4
34: 0 0 0 0 0 0 xen-percpu-ipi irqwork4
35: 0 0 0 0 0 1437046349 xen-percpu-virq timer5
36: 0 0 0 0 0 0 xen-percpu-ipi spinlock5
37: 0 0 0 0 0 79967461 xen-percpu-ipi resched5
38: 0 0 0 0 0 2902 xen-percpu-ipi callfunc5
39: 0 0 0 0 0 0 xen-percpu-virq debug5
40: 0 0 0 0 0 88717 xen-percpu-ipi callfuncsingle5
41: 0 0 0 0 0 0 xen-percpu-ipi irqwork5
42: 1731 0 0 0 0 0 xen-dyn-event xenbus
43: 4768 0 0 0 0 0 xen-dyn-event vfb
44: 9 0 0 0 0 0 xen-dyn-event hvc_console
45: 0 0 0 0 0 0 xen-dyn-event vkbd
46: 20621951 0 0 0 0 0 xen-dyn-event blkif
47: 0 0 0 0 0 0 xen-dyn-event blkif
48: 2783307200 0 0 0 0 0 xen-dyn-event eth0-q0-tx
49: 532932867 0 0 0 0 0 xen-dyn-event eth0-q0-rx
50: 1713842681 0 0 0 0 0 xen-dyn-event eth0-q1-tx
51: 2858888631 0 0 0 0 0 xen-dyn-event eth0-q1-rx
52: 852010 0 0 0 0 0 xen-dyn-event eth1-q0-tx
53: 19124356 0 0 0 0 0 xen-dyn-event eth1-q0-rx
54: 290765 0 0 0 0 0 xen-dyn-event eth1-q1-tx
55: 2453475 0 0 0 0 0 xen-dyn-event eth1-q1-rx
56: 3502207263 0 0 0 0 0 xen-dyn-event eth2-q0-tx
57: 15 0 0 0 0 0 xen-dyn-event eth2-q0-rx
58: 3455508607 0 0 0 0 0 xen-dyn-event eth2-q1-tx
59: 4443512 0 0 0 0 0 xen-dyn-event eth2-q1-rx
60: 91275228 0 0 0 0 0 xen-dyn-event eth3-q0-tx
61: 1664573 0 0 0 0 0 xen-dyn-event eth3-q0-rx
62: 98380230 0 0 0 0 0 xen-dyn-event eth3-q1-tx
63: 4 0 0 0 0 0 xen-dyn-event eth3-q1-rx
NMI: 0 0 0 0 0 0 Non-maskable interrupts
LOC: 0 0 0 0 0 0 Local timer interrupts
SPU: 0 0 0 0 0 0 Spurious interrupts
PMI: 0 0 0 0 0 0 Performance monitoring interrupts
IWI: 5 0 0 0 0 0 IRQ work interrupts
RTR: 0 0 0 0 0 0 APIC ICR read retries
RES: 490061501 108980852 39816491 99427909 106093581 79967461 Rescheduling interrupts
CAL: 48409 99344 93599 103979 93199 91617 Function call interrupts
TLB: 9 18 13 11 15 2 TLB shootdowns
TRM: 0 0 0 0 0 0 Thermal event interrupts
THR: 0 0 0 0 0 0 Threshold APIC interrupts
DFR: 0 0 0 0 0 0 Deferred Error APIC interrupts
MCE: 0 0 0 0 0 0 Machine check exceptions
MCP: 18543 18543 18543 18543 18543 18543 Machine check polls
ERR: 0
MIS: 0
PIN: 0 0 0 0 0 0 Posted-interrupt notification event
PIW: 0 0 0 0 0 0 Posted-interrupt wakeup event
An HVM guest that isn't running a network-intensive application, but is doing some network work, doesn't look at all like that:
]# cat /proc/interrupts
      CPU0 CPU1
0: 122 0 IO-APIC-edge timer
1: 9 0 xen-pirq-ioapic-edge i8042
6: 3 0 xen-pirq-ioapic-edge floppy
8: 2 0 xen-pirq-ioapic-edge rtc0
9: 0 0 IO-APIC-fasteoi acpi
12: 144 0 xen-pirq-ioapic-edge i8042
14: 0 0 IO-APIC-edge ata_piix
15: 0 0 IO-APIC-edge ata_piix
23: 36 5 xen-pirq-ioapic-level uhci_hcd:usb1
48: 897396388 0 xen-percpu-virq timer0
49: 89501724 0 xen-percpu-ipi resched0
50: 0 0 xen-percpu-ipi callfunc0
51: 0 0 xen-percpu-virq debug0
52: 3261267 0 xen-percpu-ipi callfuncsingle0
53: 0 0 xen-percpu-ipi spinlock0
54: 0 974483568 xen-percpu-virq timer1
55: 0 60787726 xen-percpu-ipi resched1
56: 0 0 xen-percpu-ipi callfunc1
57: 0 0 xen-percpu-virq debug1
58: 0 2097853 xen-percpu-ipi callfuncsingle1
59: 0 0 xen-percpu-ipi spinlock1
60: 924 23 xen-dyn-event xenbus
61: 11 0 xen-dyn-event hvc_console
62: 35477 376475 xen-dyn-event blkif
63: 183145 5703441 xen-dyn-event blkif
64: 334273 23335271 xen-dyn-event blkif
65: 44619 627423 xen-dyn-event blkif
66: 1253815 61072372 xen-dyn-event eth0-q0-tx
67: 223593 168647134 xen-dyn-event eth0-q0-rx
68: 781991 48340043 xen-dyn-event eth0-q1-tx
69: 1 183 xen-dyn-event eth0-q1-rx
70: 18052 693241 xen-dyn-event eth1-q0-tx
71: 18426 730555 xen-dyn-event eth1-q0-rx
72: 12219 400548 xen-dyn-event eth1-q1-tx
73: 82841 5999147 xen-dyn-event eth1-q1-rx
74: 0 0 xen-dyn-event vkbd
NMI: 0 0 Non-maskable interrupts
LOC: 0 0 Local timer interrupts
SPU: 0 0 Spurious interrupts
PMI: 0 0 Performance monitoring interrupts
IWI: 38281899 62816271 IRQ work interrupts
RTR: 0 0 APIC ICR read retries
RES: 89501724 60787726 Rescheduling interrupts
CAL: 2776 2893 Function call interrupts
TLB: 3258491 2094960 TLB shootdowns
TRM: 0 0 Thermal event interrupts
THR: 0 0 Threshold APIC interrupts
DFR: 0 0 Deferred Error APIC interrupts
MCE: 0 0 Machine check exceptions
MCP: 38640 38640 Machine check polls
HYP: 989541284 1320711409 Hypervisor callback interrupts
ERR: 0
MIS: 0
PIN: 0 0 Posted-interrupt notification event
NPI: 0 0 Nested posted-interrupt event
PIW: 0 0 Posted-interrupt wakeup event
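In case it helps anyone compare their own guests, this is the rough one-liner I used to total the per-CPU counts for the network event channels (it just assumes the ethN-qM naming shown above and that the device name contains no spaces):

# sum the per-CPU interrupt counts for all ethN-qM event channels
awk '/xen-dyn-event eth/ { for (i = 2; i <= NF - 2; i++) sum[i] += $i; if (NF - 2 > max) max = NF - 2 }
     END { for (i = 2; i <= max; i++) printf "CPU%d: %d\n", i - 2, sum[i] }' /proc/interrupts

On the PV guest above it puts essentially everything on CPU0; on the HVM guest the counts are spread across both CPUs.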
-
PV is only "better" in some marginal cases now, thanks to hardware virtualization instructions. Ideally, you should benchmark your app with PV and with (PV)HVM and compare the results. That might be really interesting for you and for the community.
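Even a simple synthetic test would give you a usable number to compare, e.g. iperf3 between the guest under test and another machine (assuming iperf3 is installed on both ends; the IP and stream count below are just examples):

# on the receiving machine
iperf3 -s

# on the guest under test: several parallel streams for 60 seconds
iperf3 -c <receiver-ip> -P 8 -t 60

Run the same test on the PV guest and on a (PV)HVM copy of it, and compare the throughput and the CPU0 load during the run.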
-
Also, you might use http://irqbalance.github.io/irqbalance/ to help spread the load across the various CPUs.
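Something like this inside the guest, on a systemd-based distro (whether an affinity change actually sticks for xen-dyn-event channels depends on the guest kernel, so treat it as something to experiment with rather than a guaranteed fix):

# let irqbalance spread the interrupt load automatically
apt-get install irqbalance        # or yum/dnf install irqbalance
systemctl enable --now irqbalance

# or move a single IRQ by hand, e.g. IRQ 49 (eth0-q0-rx in the PV output above) to CPU1
echo 2 > /proc/irq/49/smp_affinity    # hex CPU mask: 2 = CPU1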