@mdavico said in Debian 9 virtual machine does not start in xcp-ng 8.3:
Update: If I change the vCPU configuration to 1 socket with 4 cores per socket the VM starts correctly
Interesting. That's the first time I've heard of it having any effect at all on a VM.
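For reference, the same 1-socket/4-cores topology can also be set from the CLI; a minimal sketch, assuming the VM already has 4 vCPUs, <vm-uuid> is its UUID, and it is shut down before the change:

xe vm-param-set uuid=<vm-uuid> platform:cores-per-socket=4   # topology applies on next boot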
Tested the new updates on my prod EPYC 7402P pool with iperf3. Seems like quite a good uplift.
Ubuntu 24.04 VM (6 cores) -> bare metal server (6 cores) over a 2x25Gbit LACP link.
Ubuntu 24.04 VM (6 cores) -> Ubuntu 24.04 VM (6 cores) on the same host
Forgot to test this...
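For anyone wanting to reproduce the numbers, a minimal sketch of the kind of iperf3 run used for such tests (the receiver address, stream count, and duration are assumptions, not the exact commands from this post):

iperf3 -s                              # on the receiving side (bare metal server or VM)
iperf3 -c <receiver-ip> -P 4 -t 30     # on the sending VM: 4 parallel streams for 30 seconds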
Our servers have "Last-Level Cache (LLC) as NUMA Node" enabled, as most of our VMs do not have a huge number of vCPUs assigned. This means that for the EPYC 7402P (24c/48t) we have 8 NUMA nodes. We do not, however, use xl cpupool-numa-split.
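As a hedged sketch of how to inspect this from dom0 (standard xl tooling, nothing XCP-ng-specific):

xl info -n              # NUMA topology Xen detected: node count, memory and CPUs per node
xl cpupool-list         # with the default setup all CPUs sit in a single pool, Pool-0
xl cpupool-numa-split   # would create one CPU pool per NUMA node (we do not run this)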
It is a good suggestion. I have asked for something similar in the past too; I think having retention periods like that is very good practice.
I have tried to do something similar by creating multiple backup jobs and assigning tags to them.
It is not perfect, but it is easy to determine what jobs specific VMs belong to. I would love to be able to create a progressive schedule like in your example, as well as being able to create one-off snapshots and backups that stay outside the normal schedules.
@olivierlambert said in NVMe SSD not found when installing:
@DustinB Last time I checked, VMD is a shitting half-baked soft/hard RAID.
Indeed. The firmware hides PCI devices behind this VMD thing. It is absolutely unstable and unfixable.
We have been using software RAID1 for years on Intel hardware for industrial-focused computers. However, since Intel switched to VMD, we started to get very odd problems: blue screens (Windows), spontaneous reboots, and hard lockups where the RAID volume wouldn't come back unless we did a full power cycle. After some months we found a reproducer, which we sent to our vendor, who in turn was able to reproduce it on motherboards from different manufacturers with different chipsets supporting VMD. To this day we have not found a fix.
That is odd. Have you tried another bootable Linux distro, like Fedora, that has newer kernels and tools?
❯ nvme list
Node          Generic     SN            Model                      Namespace  Usage              Format       FW Rev
------------- ----------- ------------- -------------------------- ---------- ------------------ ------------ --------
/dev/nvme0n1  /dev/ng0n1  123456789012  WD_BLACK SN850X HS 2000GB  0x1        2.00 TB / 2.00 TB  4 KiB + 0 B  620331WD
I'd guess that most Samsung SSDs are 512e unless the user changes it specifically. But the fact that it is not listed when you boot a live USB is a problem. Not sure why that would be. Perhaps some other BIOS setting is available? On my servers I can opt to use the UEFI firmware from the NVMe device or use the generic built-in firmware.
EDIT: Are you using too many PCIe devices? If not enough lanes are available, it could be that the NVMe device is not found. Another issue can be with PCIe-to-NVMe adapters that require bifurcation.
@Kennet You need to boot in UEFI mode. Secondly, you must make sure your NVMe drive is using a 512e instead of a 4Kn sector size. You can use the nvme-cli command to check the sector size.
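A minimal sketch with nvme-cli, assuming the drive shows up as /dev/nvme0n1 (the device name and the LBA format index are assumptions; check the list output before formatting):

nvme id-ns -H /dev/nvme0n1 | grep "LBA Format"   # supported LBA formats; the active one is marked "in use"
nvme format /dev/nvme0n1 --lbaf=0                # switch to the 512-byte format -- THIS ERASES THE DRIVE; verify the index first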
@jangar don't they have a statically linked version of storcli that can be downloaded?
Hi,
It is not clear to me whether the old XCP-ng PV drivers (8.2.2.200-RC1) are affected or not. How should we proceed if they are? AFAIK it is no easy task to migrate to the Windows Update drivers, and it usually ends up with some issue.
@gudge25 Those are old mirrors. Alpine 3.10 has also been EOL for some time now.
You should be able to use setup-apkrepos to get new mirrors, though it might be a catch-22 situation...
The current mirrors are:
http://dl-cdn.alpinelinux.org/alpine/v3.21/main
http://dl-cdn.alpinelinux.org/alpine/v3.21/community
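If setup-apkrepos cannot reach its mirror list, a minimal sketch of doing it by hand (assuming you go straight to v3.21; in practice an EOL 3.10 install may need to be upgraded one release at a time):

cat > /etc/apk/repositories <<'EOF'
http://dl-cdn.alpinelinux.org/alpine/v3.21/main
http://dl-cdn.alpinelinux.org/alpine/v3.21/community
EOF
apk update && apk upgrade --available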
I create my own templates (as normal VMs that can be cloned) for the versions of Alpine we use, as it was too much trouble to use the old version.
OK, thanks for the update. It would be interesting to hear what AMD said about this issue.
@TeddyAstie Unfortunately not. This is a production pool on 8.2.1, so I do not want to try anything too experimental.
Do we know if the issue happens on plain Xen on a modern (6.12-15) dom0 kernel?
@olivierlambert said in Epyc VM to VM networking slow:
If we become partners officially, we'll be able to have more advanced accesses with their teams. I still have hope, it's just that the pace isn't on me.
Hi, is there anything new to report on this? We have very powerful machines, but unfortunately limited by this stubborn issue.
@etlweather Isn't Netdata just monitoring the dom0 CPU, not the Xen hypervisor itself?
@Bastien-Nollet Thanks for checking. At least I know it is set to 20000 entries at the moment. Thank you.
@olivierlambert thanks! Logrotate seems easy enough to understand and adjust if needed. It's the audit and XOA logs that I need help understanding.
Is there a way to define the maximum retention on audit logs (the audit plugin in XOA) and the general XOA and XCP-ng application logs?
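For anyone digging into the same question, a hedged sketch of the usual places these logs are controlled (standard logrotate and systemd locations; the audit plugin's entry count looks like a separate, application-level setting):

cat /etc/logrotate.conf && ls /etc/logrotate.d/   # dom0: rotation rules for the system logs
journalctl -u xo-server -n 100                    # XOA appliance: xo-server logs live in the systemd journal
grep SystemMaxUse /etc/systemd/journald.conf      # generic journal size cap, not an XOA-specific knob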
Would be very interesting to see the performance on EPYC systems.
@olivierlambert Aha, I misunderstood. Should I open another topic, or perhaps a support ticket?