@mdavico said in Debian 9 virtual machine does not start in xcp-ng 8.3:
Update: If I change the vCPU configuration to 1 socket with 4 cores per socket the VM starts correctly
Interesting. That's the first time I've heard of it having any effect at all on a VM.
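For reference, the same 1-socket/4-cores topology can also be set from the CLI; a minimal sketch, assuming the VM already has 4 vCPUs, <vm-uuid> is its UUID, and it is shut down before the change:

xe vm-param-set uuid=<vm-uuid> platform:cores-per-socket=4   # topology applies on next boot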
Tested the new updates on my prod EPYC 7402P pool with iperf3. Seems like quite a good uplift.
Ubuntu 24.04 VM (6 cores) -> bare metal server (6 cores) over a 2x25Gbit LACP link.
Ubuntu 24.04 VM (6 cores) -> Ubuntu 24.04 VM (6 cores) on the same host
Forgot to test this...
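For anyone wanting to reproduce the numbers, a minimal sketch of the kind of iperf3 run used for such tests (the receiver address, stream count, and duration are assumptions, not the exact commands from this post):

iperf3 -s                              # on the receiving side (bare metal server or VM)
iperf3 -c <receiver-ip> -P 4 -t 30     # on the sending VM: 4 parallel streams for 30 seconds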
Our servers have "Last-Level Cache (LLC) as NUMA Node" enabled, as most of our VMs do not have a huge number of vCPUs assigned. This means that for the EPYC 7402P (24c/48t) we have 8 NUMA nodes. We do not, however, use xl cpupool-numa-split.
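As a hedged sketch of how to inspect this from dom0 (standard xl tooling, nothing XCP-ng-specific):

xl info -n              # NUMA topology Xen detected: node count, memory and CPUs per node
xl cpupool-list         # with the default setup all CPUs sit in a single pool, Pool-0
xl cpupool-numa-split   # would create one CPU pool per NUMA node (we do not run this)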
It is a good suggestion. I have asked for something similar in the past too; I think having retention periods like that is very good practice.
I have tried to do something similar by creating multiple backup jobs and assigning tags to them.
It is not perfect, but it is easy to determine what jobs specific VMs belong to. I would love to be able to create a progressive schedule like in your example, as well as being able to create one-off snapshots and backups that stay outside the normal schedules.
@olivierlambert said in NVMe SSD not found when installing:
@DustinB Last time I checked, VMD is a shitting half-baked soft/hard RAID.
Indeed. The firmware hides PCI devices behind this VMD thing. It is absolutely unstable and unfixable.
We have been using software RAID1 for years on Intel hardware for industrial-focused computers. However, since Intel switched to VMD, we started to get very odd problems: blue screens (Windows), spontaneous reboots, and hard lockups where the RAID volume wouldn't come back unless we did a full power cycle. After some months we found a reproducer, which we sent to our vendor, who in turn was able to reproduce it on motherboards from different manufacturers with different chipsets supporting VMD. To this day we have not found a fix.
That is odd. Have you tried another bootable Linux distro, like Fedora, that has newer kernels and tools?
❯ nvme list
Node          Generic     SN            Model                      Namespace  Usage              Format       FW Rev
------------- ----------- ------------- -------------------------- ---------- ------------------ ------------ --------
/dev/nvme0n1  /dev/ng0n1  123456789012  WD_BLACK SN850X HS 2000GB  0x1        2.00 TB / 2.00 TB  4 KiB + 0 B  620331WD
I'd guess that most Samsung SSDs are 512e unless the user changes it specifically. But the fact that it is not listed when you boot a live USB is a problem. Not sure why that would be. Perhaps some other BIOS setting is available? On my servers I can opt to use the UEFI firmware from the NVMe device or use the generic built-in firmware.
EDIT: Are you using too many PCIe devices? If not enough lanes are available, it could be that the NVMe device is not found. Another issue can be with PCIe-to-NVMe adapters that require bifurcation.
@Kennet You need to boot in UEFI mode. Secondly, you must make sure your NVMe drive is using a 512e instead of a 4Kn sector size. You can use the nvme-cli command to check the sector size.
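A minimal sketch with nvme-cli, assuming the drive shows up as /dev/nvme0n1 (the device name and the LBA format index are assumptions; check the list output before formatting):

nvme id-ns -H /dev/nvme0n1 | grep "LBA Format"   # supported LBA formats; the active one is marked "in use"
nvme format /dev/nvme0n1 --lbaf=0                # switch to the 512-byte format -- THIS ERASES THE DRIVE; verify the index first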
@jangar don't they have a statically linked version of storcli that can be downloaded?
Hi,
It is not clear to me whether the old XCP-ng PV drivers (8.2.2.200-RC1) are affected or not. How should we proceed if they are? AFAIK it is no easy task to migrate to the Windows Update drivers, and it usually ends up with some issue.
@gudge25 Those are old mirrors. Alpine 3.10 has also been EOL for some time now.
You should be able to use setup-apkrepos to get new mirrors, though it might be a catch-22 situation...
The current mirrors are:
http://dl-cdn.alpinelinux.org/alpine/v3.21/main
http://dl-cdn.alpinelinux.org/alpine/v3.21/community
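If setup-apkrepos cannot reach its mirror list, a minimal sketch of doing it by hand (assuming you go straight to v3.21; in practice an EOL 3.10 install may need to be upgraded one release at a time):

cat > /etc/apk/repositories <<'EOF'
http://dl-cdn.alpinelinux.org/alpine/v3.21/main
http://dl-cdn.alpinelinux.org/alpine/v3.21/community
EOF
apk update && apk upgrade --available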
I create my own templates (as normal VMs that can be cloned) for the versions of Alpine we use, as it was too much trouble to use the old version.
OK, thanks for the update. It would be interesting to hear what AMD said about this issue.
@TeddyAstie Unfortunately not. This is a production pool on 8.2.1, so I do not want to try anything too experimental.
Do we know if the issue happens on plain Xen on a modern (6.12-15) dom0 kernel?
@olivierlambert said in Epyc VM to VM networking slow:
If we become partners officially, we'll be able to have more advanced accesses with their teams. I still have hope, it's just that the pace isn't on me.
Hi, is there anything new to report on this? We have very powerful machines, but unfortunately limited by this stubborn issue.
@etlweather Isn't Netdata just monitoring the dom0 CPU, not the Xen hypervisor itself?
@Bastien-Nollet Thanks for checking. At least I know it is set to 20000 entries at the moment. Thank you.
@olivierlambert thanks! Logrotate seems easy enough to understand and adjust if needed. It's the audit and XOA logs that I need help understanding.
Is there a way to define the maximum retention on audit logs (the audit plugin in XOA) and the general XOA and XCP-ng application logs?
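For anyone digging into the same question, a hedged sketch of the usual places these logs are controlled (standard logrotate and systemd locations; the audit plugin's entry count looks like a separate, application-level setting):

cat /etc/logrotate.conf && ls /etc/logrotate.d/   # dom0: rotation rules for the system logs
journalctl -u xo-server -n 100                    # XOA appliance: xo-server logs live in the systemd journal
grep SystemMaxUse /etc/systemd/journald.conf      # generic journal size cap, not an XOA-specific knob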
Would be very interesting to see the performance on EPYC systems.
@olivierlambert Aha, I misunderstood. Should I open another topic, or perhaps a support ticket?