@dnikola said in [HELP] XCP-ng 4.17.5 dom0 kernel panic — page fault in TCP stack, crashdump attached:
Has anyone experienced similar page faults in the dom0 TCP stack on 4.19 kernels or XCP-ng 4.17.5?
Not that I know of.
Are there any known issues with network drivers on this kernel/hypervisor combo?
No. There can be issues with some drivers, though, so you should have specified which NICs and drivers you are using.
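If it helps, here is a minimal sketch for collecting the NIC and driver information, assuming Python 3 is available in dom0 and the standard Linux sysfs layout under /sys/class/net. The function name list_nic_drivers is only illustrative, not part of any XCP-ng tooling.

```python
import os


def list_nic_drivers(sysfs_root="/sys/class/net"):
    """Map each network interface to the kernel driver bound to it.

    Interfaces with no backing device (such as lo or bridges) map to None.
    """
    drivers = {}
    for iface in sorted(os.listdir(sysfs_root)):
        iface_dir = os.path.join(sysfs_root, iface)
        # Skip plain files such as bonding_masters; interfaces are directories.
        if not os.path.isdir(iface_dir):
            continue
        driver_link = os.path.join(iface_dir, "device", "driver")
        if os.path.islink(driver_link):
            # The last component of the symlink target is the driver name.
            drivers[iface] = os.path.basename(os.readlink(driver_link))
        else:
            drivers[iface] = None
    return drivers


if __name__ == "__main__":
    for iface, driver in list_nic_drivers().items():
        print("{}: {}".format(iface, driver or "(no backing device)"))
```

Running ethtool -i on an interface should report the same driver name, along with the driver and firmware versions, which are also worth including in the post.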
Would you recommend moving to a newer dom0 kernel or hypervisor build?
On XCP-ng, the latest version is 8.3. You didn't specify the version in your post, but since you're using the latest version of Xen, I assume it is an up-to-date 8.3, in which case there is no newer build.
Could a memory issue cause this specific kind of page table inconsistency during a kernel panic?
Yes. It can be a bug in the code, but it absolutely could be a hardware issue.
Any advice on additional debug steps or log files I should collect next time?
I would start by running a memtest on that host to make sure the memory is not having issues.
Do you know if a specific VM was doing something particular at that time? We had some issues in the past with FreeBSD VMs using WireGuard, but this does not look similar, and that should be fixed now.
What kind of guests were running on that host? Linux, Windows, some BSD-based?
If you are running Windows guests, please be sure to read this blog post and comply with the guidelines there.
From a quick look, I don't see anything obvious. Follow Olivier's suggestion first; if you still have issues after that, you can share an additional report using xen-bugtool -y. But please be sure to update your BIOS first, check your memory, and then do that.