Differences in CPU instruction sets
-
Two Dell R630 servers with the same CPU model (16 CPUs x Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz) run two different virtualization platforms: one server has xcp-ng installed, the other has VMware. The result is that the two expose different CPU instruction sets. The xcp-ng platform shows far fewer instruction sets, which makes virtual machines run rather slowly, while they run fast on the VMware platform. Can anyone tell me what the problem is? Can it be solved?
This is the instruction set on the VMware platform:
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm invpcid_single ssbd rsb_ctxsw ibrs ibpb stibp fsgsbase tsc_adjust bmi1 avx2 smep bmi2 invpcid xsaveopt arat md_clear spec_ctrl intel_stibp flush_l1d arch_capabilities
This is the instruction set on the xcp-ng platform:
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush acpi mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl eagerfpu pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm ssbd rsb_ctxsw ibrs ibpb stibp fsgsbase smep erms xsaveopt md_clear spec_ctrl intel_stibp flush_l1d
-
@709885674 It would be good to post your question in English.
For those curious, here is the translated text (thanks to Google Translate):
Two Dell R630 servers with the same CPU model: 16 CPUs x Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz, two virtualization platforms. The virtualization platform installed on one server is xcp-ng, the one installed on the other server is VMware, and the result is that the CPU instruction sets of the two are different. The xcp-ng platform has a much smaller instruction set, causing the virtual machine to run slower and the VMware platform to run faster. This is the instruction set of the VMware platform: This is the instruction set of the xcp-ng platform:
I assume the question is whether the different instruction sets result in a relevant performance difference between VMware and XCP-ng.
-
It would be nice to have a diff on each missing instruction
-
VMware instruction sets that XCP-ng does not show:
dts arch_perfmon pebs bts xtopology tsc_reliable nonstop_tsc fma movbe abm invpcid_single tsc_adjust bmi1 avx2 bmi2 invpcid arat arch_capabilities
XCP-ng instruction sets that VMware does not show:
acpi ht rep_good erms
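The two lists above can be reproduced mechanically with a set difference. This is only a minimal sketch: the helper name `parse_flags` is made up for illustration, and the two strings are the `flags` lines copied from the first post.

```python
def parse_flags(line):
    """Return the set of flag names from one 'flags : ...' line."""
    _, _, names = line.partition(":")
    return set(names.split())

# 'flags' line from the VMware guest, as posted at the top of the thread.
vmware = parse_flags("""flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr
pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp
lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc
eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt
tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm
invpcid_single ssbd rsb_ctxsw ibrs ibpb stibp fsgsbase tsc_adjust bmi1 avx2 smep
bmi2 invpcid xsaveopt arat md_clear spec_ctrl intel_stibp flush_l1d
arch_capabilities""")

# 'flags' line from the XCP-ng guest, as posted at the top of the thread.
xcp_ng = parse_flags("""flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr
pge mca cmov pat pse36 clflush acpi mmx fxsr sse sse2 ss ht syscall nx pdpe1gb
rdtscp lm constant_tsc rep_good nopl eagerfpu pni pclmulqdq ssse3 cx16 pcid
sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand
hypervisor lahf_lm ssbd rsb_ctxsw ibrs ibpb stibp fsgsbase smep erms xsaveopt
md_clear spec_ctrl intel_stibp flush_l1d""")

# Set difference: flags reported by one guest but not by the other.
only_vmware = sorted(vmware - xcp_ng)
only_xcp_ng = sorted(xcp_ng - vmware)

print("VMware only:", " ".join(only_vmware))
print("XCP-ng only:", " ".join(only_xcp_ng))
```

Sorting changes the order compared with the hand-made lists, but the members should be the same.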
Hope I got that right - I have not used diff for a very loooong time.
Edit: some cut & paste corrections
-
arch_capabilities has been available for a while, so I think the XCP-ng version might not be fully up to date.
-
Also, to answer the original question: on Haswell (the CPU used here), it won't change overall performance much.
-
@olivierlambert Thank you for your answer. The latest version on the official website is xcp-ng-8.2.1, and the version I have installed is also the latest: XCP-ng release 8.2.1 (xenenterprise).
-
@gskger Thank you very much for your answer. What is the problem with the two different CPU instruction sets I sent above? Is it caused by different virtualization platforms? Is the VMware platform better? Looking forward to your reply, thank you
-
There's no problem. Yes, VMware exposes more features, but what matters is the real-world scenario, so you should try it and see on actual systems.
-
@709885674 The different instruction sets usually have no real effect on the actual performance (unless you have a very specific use case). The exposed CPU instruction sets do not decide whether VMware is better than XCP-ng or not.
Check the comparison chart between VMware and XCP-ng for an overview of platform features. Also, Tom from Lawrence Systems has a good video about XCP-ng as a replacement for VMware.
Is there a reason you expect virtual machines to run slower with a smaller instruction set?
-
@gskger Yes, I have personally tested it and found that virtual machines on the VMware platform run faster, while virtual machines on the xcp-ng platform run slower. That's why I am posting here, to find the problem and a solution.
-
It's a bit vague. What's slower exactly?
-
@olivierlambert Sorry, my description may be a bit vague. Let me provide a more detailed description of the testing process:
Using two Dell R630 servers with the same CPU model (16 CPUs x Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz), testing was conducted on two virtualization platforms. One server was installed with the xcp-ng virtualization platform, and the other with the VMware ESXi 6.5.0 virtualization platform. The same program was deployed on both platforms, and the test results showed that the program ran very fast on the VMware platform and very slowly on the xcp-ng platform. After various investigations, it was found that the CPU instruction sets of the two platforms are different: the VMware platform exposes more instruction sets than the xcp-ng platform. What do you conclude from this? The instruction set information is as follows:
This is the instruction set for the VMware platform:
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm invpcid_single ssbd rsb_ctxsw ibrs ibpb stibp fsgsbase tsc_adjust bmi1 avx2 smep bmi2 invpcid xsaveopt arat md_clear spec_ctrl intel_stibp flush_l1d arch_capabilities
This is the instruction set for the xcp-ng platform:
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush acpi mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl eagerfpu pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm ssbd rsb_ctxsw ibrs ibpb stibp fsgsbase smep erms xsaveopt md_clear spec_ctrl intel_stibp flush_l1d
-
Don't jump to any conclusion: CPU and memory are probably where the gap with VMware is the smallest.
I have no idea what your app is doing, or whether it needs storage, network and such. Also, "very slowly" is vague. Do you have actual numbers, like a % in terms of time to execute the same request? Or a benchmark? What is the application and what is it doing? How is the VM configured? Do you have all the tools correctly installed?
As you can see, it's not trivial to answer and CPU instructions are just a small part of the equation.
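To turn "very slowly" into a number, one option is to time the same workload inside a VM on each platform and report the difference as a percentage. The sketch below is only an illustration: `median_runtime`, `percent_slower` and `sample_workload` are hypothetical names, and the placeholder workload stands in for the real application request, which is not described in this thread.

```python
import statistics
import time

def median_runtime(workload, repeats=5):
    """Run `workload` several times and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

def percent_slower(baseline_s, candidate_s):
    """How much slower the candidate is than the baseline, in percent."""
    return (candidate_s - baseline_s) / baseline_s * 100.0

def sample_workload():
    # Placeholder CPU-bound loop; replace with the actual request or job.
    sum(i * i for i in range(200_000))

if __name__ == "__main__":
    # Run this inside a VM on each platform and note the result.
    print(f"median runtime: {median_runtime(sample_workload):.4f} s")
```

With the two medians in hand, `percent_slower(vmware_seconds, xcp_ng_seconds)` gives a figure that can be posted here. A CPU-bound loop like this says nothing about storage or network, so it only covers one part of the picture.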
-
@olivierlambert I have been testing again these days, this time with another virtualization platform, KVM. The same program was deployed and ran very smoothly without any lag. So I am very curious why there is no AVX2 in the CPU instruction set on the xcp-ng platform. The CPU instruction set on the KVM platform is as follows:
[root@localhost ~]# cat /proc/cpuinfo | grep flags
flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx rdtscp lm constant_tsc rep_good nopl eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx hypervisor lahf_lm 3dnowprefetch invpcid_single rsb_ctxsw fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt
flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx rdtscp lm constant_tsc rep_good nopl eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx hypervisor lahf_lm 3dnowprefetch invpcid_single rsb_ctxsw fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt
flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx rdtscp lm constant_tsc rep_good nopl eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx hypervisor lahf_lm 3dnowprefetch invpcid_single rsb_ctxsw fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt
flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx rdtscp lm constant_tsc rep_good nopl eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx hypervisor lahf_lm 3dnowprefetch invpcid_single rsb_ctxsw fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt
[root@localhost ~]#
It was found that KVM exposes more instruction sets than the xcp-ng platform; in particular, the AVX2 (Advanced Vector Extensions 2) instruction set is not available on the xcp-ng platform. AVX2 is an advanced vector extension instruction set used in modern CPUs. It is an extension of the AVX instruction set, providing more advanced features and performance improvements. AVX2 is mainly used to enhance the performance of processors in handling floating-point and integer operations, especially when dealing with large amounts of data. It is widely used in fields such as high-performance computing tasks, graphics processing, scientific computing, and machine learning.
Can anyone solve this problem? Or is there an official solution to this problem?
-
@709885674 Linux Debian 11 guest VM shows AVX2 on my XCP 8.2 host (11th Gen i7):
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush acpi mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves umip pku ospke gfni vaes vpclmulqdq rdpid md_clear flush_l1d arch_capabilities
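To check inside a Linux guest whether a given flag such as avx2 is reported on every vCPU, something like the following sketch could be used. The function name `cpus_missing_flag` and the `SAMPLE` text are made up for illustration; on a real guest the text would come from reading /proc/cpuinfo.

```python
def cpus_missing_flag(cpuinfo_text, flag):
    """Return the indices of 'flags' lines (one per logical CPU) that lack `flag`."""
    missing = []
    index = 0
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            # Compare whole tokens, so that 'avx' does not match 'avx2'.
            if flag not in value.split():
                missing.append(index)
            index += 1
    return missing

# Made-up two-CPU sample; on a guest use: open("/proc/cpuinfo").read()
SAMPLE = """processor : 0
flags : fpu sse avx avx2
processor : 1
flags : fpu sse avx
"""

print(cpus_missing_flag(SAMPLE, "avx2"))
```

An empty list means every vCPU reports the flag. Matching whole tokens matters here, because a plain substring search for avx would also hit avx2.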
-
@Andrew Thank you for your valuable experience and suggestions. I will personally test it and provide feedback on the results after testing.