Epyc VM to VM networking slow
-
Posting my results:
XCP-ng 8.2 stable, up to date.
Epyc 7402 (1 socket)
512GB RAM, 3200 MHz
Supermicro H12SSL-i
No CPU pinning
VMs: Ubuntu 22.04, kernel 6.5.0-41-generic
v2m 1 thread: 3.5Gb/s - Dom0 140%, vm1 60%, vm2 55%
v2m 4 threads: 9.22Gb/s - Dom0 555%, vm1 320%, vm2 380%
h2m 1 thread: 10.4Gb/s - Dom0 183%, vm1 180%, vm2 0%
h2m 4 threads: 18.0Gb/s - Dom0 510%, vm1 490%, vm2 0%
host : xcp-ng-7402
release : 4.19.0+1
version : #1 SMP Tue Jan 23 14:12:55 CET 2024
machine : x86_64
nr_cpus : 48
max_cpu_id : 47
nr_nodes : 1
cores_per_socket : 24
threads_per_core : 2
cpu_mhz : 2800.047
hw_caps : 178bf3ff:7ed8320b:2e500800:244037ff:0000000f:219c91a9:00400004:00000500
virt_caps : pv hvm hvm_directio pv_directio hap shadow
total_memory : 524149
free_memory : 39528
sharing_freed_memory : 0
sharing_used_memory : 0
outstanding_claims : 0
free_cpus : 0
cpu_topology :
cpu:  core  socket  node
  0:     0       0     0
  1:     0       0     0
  2:     1       0     0
  3:     1       0     0
  4:     2       0     0
  5:     2       0     0
  6:     4       0     0
  7:     4       0     0
  8:     5       0     0
  9:     5       0     0
 10:     6       0     0
 11:     6       0     0
 12:     8       0     0
 13:     8       0     0
 14:     9       0     0
 15:     9       0     0
 16:    10       0     0
 17:    10       0     0
 18:    12       0     0
 19:    12       0     0
 20:    13       0     0
 21:    13       0     0
 22:    14       0     0
 23:    14       0     0
 24:    16       0     0
 25:    16       0     0
 26:    17       0     0
 27:    17       0     0
 28:    18       0     0
 29:    18       0     0
 30:    20       0     0
 31:    20       0     0
 32:    21       0     0
 33:    21       0     0
 34:    22       0     0
 35:    22       0     0
 36:    24       0     0
 37:    24       0     0
 38:    25       0     0
 39:    25       0     0
 40:    26       0     0
 41:    26       0     0
 42:    28       0     0
 43:    28       0     0
 44:    29       0     0
 45:    29       0     0
 46:    30       0     0
 47:    30       0     0
device topology :
device  node
No device topology data available
numa_info :
node:  memsize  memfree  distances
   0:   525554    39528  10
xen_major : 4
xen_minor : 13
xen_extra : .5-9.40
xen_version : 4.13.5-9.40
xen_caps : xen-3.0-x86_64 hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler : credit
xen_pagesize : 4096
platform_params : virt_start=0xffff800000000000
xen_changeset : 708e83f0e7d1, pq 9a787e7255bc
xen_commandline : dom0_mem=8192M,max:8192M watchdog ucode=scan dom0_max_vcpus=1-16 crashkernel=256M,below=4G console=vga vga=mode-0x0311
cc_compiler : gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28)
cc_compile_by : mockbuild
cc_compile_domain : [unknown]
cc_compile_date : Thu Apr 11 18:03:32 CEST 2024
build_id : fae5f46d8ff74a86c439a8b222c4c8d50d11eb0a
xend_config_format : 4
-
What do you mean by v2m and h2m?
-
Hi @olivierlambert! It's the nomenclature @bleader used in the report table, sorry for the confusion:
https://xcp-ng.org/forum/post/67750
v2m 1 thread: throughput / CPU usage from xentop³
v2m 4 threads: throughput / CPU usage from xentop³
h2m 1 thread: throughput / CPU usage from xentop³
h2m 4 threads: throughput / CPU usage from xentop³
v2m is VM to VM and h2m is host (dom0) to VM.
By the way, I'm happy to run any further tests that could help: different kernels, OSes, XCP-ng versions, whatever you need.
PS: VM to host failed with the host reported as unreachable, even though I could ping from the VM to the host just fine. I checked, and iptables in dom0 blocks the iperf port while leaving ping open, but I didn't want to mess with dom0.
-
Just for completeness: I have the same issue with older AMD Opteron 6380 CPUs. The same hardware ran at full speed under the ESXi hypervisor, and also does with Harvester / Rancher. I've been using XCP-ng for more than two years and have had this issue since day one.
-----------------------------------------------------------
Server listening on 5201 (test #1 - dom0 to workstation)
-----------------------------------------------------------
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec  10.8 GBytes  9.29 Gbits/sec    receiver
-----------------------------------------------------------
Server listening on 5201 (test #2 - vm to workstation)
-----------------------------------------------------------
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec  3.18 GBytes  2.73 Gbits/sec    receiver
-----------------------------------------------------------
Server listening on 5201 (test #3 - dom0 to vm)
-----------------------------------------------------------
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-10.00  sec  2.23 GBytes  1.91 Gbits/sec    receiver
-----------------------------------------------------------
Server listening on 5201 (test #4 - vm to vm)
-----------------------------------------------------------
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec  1.97 GBytes  1.69 Gbits/sec    receiver
-----------------------------------------------------------
-
I ran these tests now that newer updates have been released for 8.3-beta.
Results are as below:
- iperf-sender -> iperf-receiver: 5.06Gbit/s
- iperf-sender -> iperf-receiver -P4: 7.53Gbit/s
- host -> iperf-receiver: 7.83Gbit/s
- host -> iperf-receiver -P4: 13.0Gbit/s
Host (dom0):
- CPU: AMD EPYC 7302P
- Sockets: 1
- RAM: 6.59GB (dom0) / 112GB for VMs
- MotherBoard: H12SSL-i
- NIC: X540-AT2 (rev 01)
xl info -n
host : xcp
release : 4.19.0+1
version : #1 SMP Mon Jun 24 17:20:04 CEST 2024
machine : x86_64
nr_cpus : 32
max_cpu_id : 31
nr_nodes : 1
cores_per_socket : 16
threads_per_core : 2
cpu_mhz : 2999.997
hw_caps : 178bf3ff:7ed8320b:2e500800:244037ff:0000000f:219c91a9:00400004:00000780
virt_caps : pv hvm hvm_directio pv_directio hap gnttab-v1 gnttab-v2
total_memory : 114549
free_memory : 62685
sharing_freed_memory : 0
sharing_used_memory : 0
outstanding_claims : 0
free_cpus : 0
cpu_topology :
cpu:  core  socket  node
  0:     0       0     0
  1:     0       0     0
  2:     1       0     0
  3:     1       0     0
  4:     4       0     0
  5:     4       0     0
  6:     5       0     0
  7:     5       0     0
  8:     8       0     0
  9:     8       0     0
 10:     9       0     0
 11:     9       0     0
 12:    12       0     0
 13:    12       0     0
 14:    13       0     0
 15:    13       0     0
 16:    16       0     0
 17:    16       0     0
 18:    17       0     0
 19:    17       0     0
 20:    20       0     0
 21:    20       0     0
 22:    21       0     0
 23:    21       0     0
 24:    24       0     0
 25:    24       0     0
 26:    25       0     0
 27:    25       0     0
 28:    28       0     0
 29:    28       0     0
 30:    29       0     0
 31:    29       0     0
device topology :
device  node
No device topology data available
numa_info :
node:  memsize  memfree  distances
   0:   115955    62685  10
xen_major : 4
xen_minor : 17
xen_extra : .4-3
xen_version : 4.17.4-3
xen_caps : xen-3.0-x86_64 hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler : credit
xen_pagesize : 4096
platform_params : virt_start=0xffff800000000000
xen_changeset : d530627aaa9b, pq 7587628e7d91
xen_commandline : dom0_mem=6752M,max:6752M watchdog ucode=scan dom0_max_vcpus=1-16 crashkernel=256M,below=4G console=vga vga=mode-0x0311
cc_compiler : gcc (GCC) 11.2.1 20210728 (Red Hat 11.2.1-1)
cc_compile_by : mockbuild
cc_compile_domain : [unknown]
cc_compile_date : Thu Jun 20 18:17:10 CEST 2024
build_id : 9497a1ec7ec99f5075421732b0ec37781ba739a9
xend_config_format : 4
VMs - Sender and Receiver
- Distro: Ubuntu 24.04
- Kernel: 6.8.0-36-generic #36-Ubuntu SMP PREEMPT_DYNAMIC
- vCPUs: 32
- RAM: 4GB
-
@probain Have you tested without the 8.3 updates? The results still seem low. Any improvement?
-
@bullerwins Unfortunately I didn't. In hindsight, I wish I had.
-
Even with the latest 8.3 updates, these speeds are still slower than on a 13-year-old Xeon E3-1230.
-
Unfortunately, I can share from the ongoing ticket investigations that this is far more deeply rooted than something that moving from one major kernel version to another will "just fix". There are multiple leads being investigated and multiple vendors involved.
-
I'd like to check something to see if it's coherent with our tests, by using 2x similar VMs (4vCPUs/4G RAM):
- iperf monothread speed on a "fresh" Debian 10 install (4.19 kernel)
- the same bench with 5.10.0 kernel from backports (add
deb http://deb.debian.org/debian buster-backports main contrib non-free
in your source list and then apt install linux-image-5.10, don't forget to reboot to be on that kernel)
Do you see a performance diff between those?
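For anyone who prefers to script the backports step, here is a minimal sketch, not a verified procedure. `write_backports_list` is a hypothetical helper name, and the separate file under /etc/apt/sources.list.d is an assumption (the line can just as well go into /etc/apt/sources.list). It only writes the repository line quoted above; the apt steps are left as comments because the exact 5.10 package name should be checked on the guest first.

```shell
#!/bin/sh
# Hypothetical helper: write the buster-backports entry from the post into a list file.
# The default path is an assumption; pass another path as the first argument.
write_backports_list() {
    list_file="${1:-/etc/apt/sources.list.d/buster-backports.list}"
    printf '%s\n' 'deb http://deb.debian.org/debian buster-backports main contrib non-free' > "$list_file"
}

# As root inside the Debian 10 guest:
#   write_backports_list
#   apt update
#   apt-cache search linux-image-5.10   # list the 5.10 kernel packages actually on offer
#   apt install <package-from-search>   # placeholder, pick one from the search output
#   reboot
#   uname -r                            # confirm the guest is running the 5.10 kernel
```

Running `uname -r` on both guests before each iperf run makes it easy to record which kernel each result belongs to.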
-
@olivierlambert said in Epyc VM to VM networking slow:
I'd like to check something to see if it's coherent with our tests, by using 2x similar VMs (4vCPUs/4G RAM):
- iperf monothread speed on a "fresh" Debian 10 install (4.19 kernel)
- the same bench with 5.10.0 kernel from backports (add
deb http://deb.debian.org/debian buster-backports main contrib non-free
in your source list and then apt install linux-image-5.10, don't forget to reboot to be on that kernel)
Do you see a performance diff between those?
FYI, getting Debian 10 packages, backports or not, is now going to be extremely difficult. Debian 10 LTS has reached EOL. It is now in ELTS, from the beginning of this month until 30/06/2029, and that covers only a subset of the packages.
-
I had no issue testing it quickly. This is for the sake of testing and trying to identify a potential regression, not for production usage or anything like that.
-
I identified a specific regression in a Debian kernel build since 5.10, we are investigating the "why" (starting from this exact build: https://snapshot.debian.org/package/linux/5.10.92-1/)
-
@olivierlambert
Would it be possible for you to either offer an ISO to download, or maybe seed one? I really want to help test this, but I'm getting lost in how Debian provides its legacy images and this jig-boo (intentionally misspelled).
-
Maybe someone could graph their VM.
Comparing a slow VM with a full-speed one could bring light into the darkness.
https://www.brendangregg.com/Articles/Linux_Kernel_Performance_Flame_Graphs.pdf
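A rough sketch of how such a graph could be captured, assuming `perf` is installed in the guest and Brendan Gregg's FlameGraph scripts (`stackcollapse-perf.pl`, `flamegraph.pl`) are checked out locally. `capture_flamegraph`, the `FLAMEGRAPH_DIR` variable and the default file names are hypothetical; nobody in this thread has posted their exact procedure.

```shell
#!/bin/sh
# Hypothetical helper: sample all CPUs for N seconds and render a flame graph SVG.
# Assumes FLAMEGRAPH_DIR points at a checkout of the FlameGraph scripts.
capture_flamegraph() {
    seconds="${1:-30}"
    out="${2:-perf-iperf.svg}"
    fg="${FLAMEGRAPH_DIR:-$HOME/FlameGraph}"
    # Sample stacks system-wide at 99 Hz, then fold them and draw the graph.
    perf record -F 99 -a -g -o perf.data -- sleep "$seconds" &&
    perf script -i perf.data |
        "$fg/stackcollapse-perf.pl" |
        "$fg/flamegraph.pl" > "$out"
}

# Run as root in the guest while iperf is transferring, for example:
#   capture_flamegraph 30 slow-vm.svg
```

Capturing one SVG on a slow VM and one on a full-speed VM with the same duration keeps the two graphs comparable.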
-
@probain Debian 10 is available in the XOA Hub.
-
@olivierlambert
I wasn't aware. Thanks! Downloading it to run a test right away.
Test done:
Sender                  Receiver                Run1    Run2    Run3
Debian10 kernel 4.19    Debian10 kernel 4.19    4.81Gb  4.81Gb  4.83Gb
Debian10 kernel 5.10    Debian10 kernel 4.19    5.13Gb  5.02Gb  5.12Gb
Debian10 kernel 5.10    Debian10 kernel 5.10    4.98Gb  5.02Gb  4.97Gb
The sender runs 'iperf -c <IP-to-receiver> -t 60'.
Kernel 4.19 = 4.19.0-6-amd64
Kernel 5.10 = 5.10.0-0.deb10.24-amd64
CPU: 4 cores (AMD EPYC 7302P)
RAM: 4GB
Created from XOA Hub
-
Thanks @probain, now can you try
iperf -s
in the Dom0 and
iperf -c <IP dom0>
in the Debian guest?
-
@olivierlambert
vm -> dom0 results in "no route to host": firewall?
Results are shown for dom0 -> vm, listed by the kernel installed on the VM.
As before, the VM was installed via XOA Hub with 4 CPUs and 4GB RAM. The host CPU is an AMD EPYC 7302P.
VM kernel ver.    Run1    Run2    Run3
kernel 4.19.0     8.47Gb  8.82Gb  8.43Gb
kernel 5.10.0     7.12Gb  7.07Gb  7.11Gb
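To put a single number on the dom0 -> vm difference, the three runs per kernel can be averaged and compared. This is only an arithmetic sketch over the figures posted above; `mean3` and `pct_change` are hypothetical helper names.

```shell
#!/bin/sh
# mean3 A B C: print the mean of three throughput readings, two decimals.
mean3() {
    awk -v a="$1" -v b="$2" -v c="$3" 'BEGIN { printf "%.2f\n", (a + b + c) / 3 }'
}

# pct_change OLD NEW: print the relative change from OLD to NEW in percent.
pct_change() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (new - old) / old * 100 }'
}

# dom0 -> VM figures from the table above (Gb/s).
k419=$(mean3 8.47 8.82 8.43)
k510=$(mean3 7.12 7.07 7.11)
echo "kernel 4.19: ${k419} Gb/s, kernel 5.10: ${k510} Gb/s, change: $(pct_change "$k419" "$k510")%"
```

On these numbers the 5.10 guest averages about 7.10Gb against about 8.57Gb for 4.19, roughly 17% lower. Three runs per kernel is a small sample, so this is an indication rather than a measurement of the regression.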
-
Yes, disable the firewall first (only in a testing lab, obviously) with
iptables -F
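Since the flush removes every rule in dom0, it may be worth saving the current ruleset first so it can be put back after the test. A minimal sketch, assuming `iptables-save` and `iptables-restore` are available in dom0; the helper names and the backup path are hypothetical.

```shell
#!/bin/sh
# Hypothetical backup location for the dom0 ruleset.
RULES_BACKUP="${RULES_BACKUP:-/root/iptables.before-iperf}"

# Snapshot the current rules, then flush them for the duration of the test.
open_for_test() {
    iptables-save > "$RULES_BACKUP" &&
    iptables -F
}

# Put the saved rules back once the iperf runs are finished.
restore_after_test() {
    iptables-restore < "$RULES_BACKUP"
}

# As root in dom0 (test lab only):
#   open_for_test
#   iperf -s            # run the vm -> dom0 test against this server
#   restore_after_test
```

A narrower option, untested here, would be to insert a single accept rule for the iperf port instead of flushing everything, for example `iptables -I INPUT -p tcp --dport 5001 -j ACCEPT` (5001 is the iperf default port; iperf3 uses 5201).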