@stormi Confirmed
Best posts made by Weppel
-
RE: Live migrate of Rocky Linux 8.8 VM crashes/reboots VM
@bleader Thank you very much for the quick discovery of this, impressive work! I'm glad I could help!
-
RE: Live migrate of Rocky Linux 8.8 VM crashes/reboots VM
FYI this is not fixed yet in the latest EL kernel 4.18.0-477.15.1.el8_8.x86_64
-
RE: Live migrate of Rocky Linux 8.8 VM crashes/reboots VM
I've created a bug report at Rocky Linux: https://bugs.rockylinux.org/view.php?id=3565
Feel free to add to this if I missed any relevant information.
Latest posts made by Weppel
-
RE: Live migrate of Rocky Linux 8.8 VM crashes/reboots VM
The Rocky Linux bugtracker indeed mentions it's mostly fixed, but there are still some kernel errors present: https://bugs.rockylinux.org/view.php?id=3565#c4293
-
RE: Live migrate of Rocky Linux 8.8 VM crashes/reboots VM
FYI this is not fixed yet in the latest EL kernel 4.18.0-477.15.1.el8_8.x86_64
-
RE: kswapd0: page allocation failure under high load
@olivierlambert said in kswapd0: page allocation failure under high load:
I know it's not an answer to your original problem, but FYI, you would have far less Dom0 load doing XO incremental backups (especially using NBD).
Thanks for the hint. I've been looking at switching to XO for these (and other) tasks, but the pricing is currently holding me back from switching.
-
RE: kswapd0: page allocation failure under high load
The export is creating heavy load. It's doing full VM dumps with compression on all 3 nodes in the cluster at the same time (one VM per node).
The export is done through the CLI (/usr/bin/xe vm-export vm=$VM filename="$FILENAME" compress=zstd) to a locally mounted nfs4 folder, not via XO. The original VM storage is indeed on a shared SR, a Ceph RBD.
The setup has not changed for ~1.5 years. This issue started popping up about 1-2 months ago (it could be more; I unfortunately do not have a specific date/time when it first happened).
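To make that concrete, here is a rough sketch of the per-node export step; the mount point, VM name and date-stamped filename are placeholders, not the exact script we run:
#!/bin/bash
# Sketch only: one full, zstd-compressed export of a single VM per host,
# written to a locally mounted nfs4 share. Paths and names are examples.
set -euo pipefail

EXPORT_DIR="/mnt/backup-nfs"                      # locally mounted nfs4 folder (example path)
VM="example-vm"                                   # VM to export (placeholder name-label/UUID)
FILENAME="${EXPORT_DIR}/${VM}-$(date +%F).xva"

# Same command as quoted above; compress=zstd is supported on XCP-ng 8.2
/usr/bin/xe vm-export vm="$VM" filename="$FILENAME" compress=zstd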
-
kswapd0: page allocation failure under high load
When running VM exports, the pool master occasionally logs multiple page allocation failures, like this one:
[1015432.935572] kswapd0: page allocation failure: order:0, mode:0x480020(GFP_ATOMIC), nodemask=(null)
[1015432.935572] kswapd0 cpuset=/ mems_allowed=0
[1015432.935573] CPU: 4 PID: 109 Comm: kswapd0 Tainted: G O 4.19.0+1 #1
[1015432.935573] Hardware name: Supermicro Super Server/H12SSL-CT, BIOS 2.3 10/20/2021
[1015432.935573] Call Trace:
[1015432.935574] <IRQ>
[1015432.935574] dump_stack+0x5a/0x73
[1015432.935575] warn_alloc+0xee/0x180
[1015432.935576] __alloc_pages_slowpath+0x84d/0xa09
[1015432.935577] ? get_page_from_freelist+0x14c/0xf00
[1015432.935578] ? ttwu_do_wakeup+0x19/0x140
[1015432.935579] ? _raw_spin_unlock_irqrestore+0x14/0x20
[1015432.935580] ? try_to_wake_up+0x54/0x450
[1015432.935581] __alloc_pages_nodemask+0x271/0x2b0
[1015432.935582] bnxt_rx_pages+0x194/0x4f0 [bnxt_en]
[1015432.935584] bnxt_rx_pkt+0xccd/0x1510 [bnxt_en]
[1015432.935586] __bnxt_poll_work+0x10e/0x2a0 [bnxt_en]
[1015432.935588] bnxt_poll+0x8d/0x640 [bnxt_en]
[1015432.935589] net_rx_action+0x2a5/0x3e0
[1015432.935590] __do_softirq+0xd1/0x28c
[1015432.935590] irq_exit+0xa8/0xc0
[1015432.935591] xen_evtchn_do_upcall+0x2c/0x50
[1015432.935592] xen_do_hypervisor_callback+0x29/0x40
[1015432.935592] </IRQ>
[1015432.935593] RIP: e030:xen_hypercall_xen_version+0xa/0x20
[1015432.935593] Code: 51 41 53 b8 10 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 11 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
[1015432.935594] RSP: e02b:ffffc900404af950 EFLAGS: 00000246
[1015432.935594] RAX: 000000000004000d RBX: 000000000000000f RCX: ffffffff8100122a
[1015432.935595] RDX: 000000000000000f RSI: 0000000000000000 RDI: 0000000000000000
[1015432.935595] RBP: ffffffffffffffff R08: 00000000ffffff31 R09: 000000000000000f
[1015432.935595] R10: 0002f632ef000006 R11: 0000000000000246 R12: 0000000000000000
[1015432.935596] R13: 0002f632ef040006 R14: ffffc900404afa50 R15: ffffc900404afac8
[1015432.935596] ? xen_hypercall_xen_version+0xa/0x20
[1015432.935597] ? xen_force_evtchn_callback+0x9/0x10
[1015432.935598] ? check_events+0x12/0x20
[1015432.935598] ? xen_irq_enable_direct+0x19/0x20
[1015432.935599] ? truncate_exceptional_pvec_entries.part.16+0x175/0x1d0
[1015432.935600] ? truncate_inode_pages_range+0x280/0x7d0
[1015432.935601] ? deactivate_slab.isra.74+0xef/0x400
[1015432.935602] ? __inode_wait_for_writeback+0x75/0xe0
[1015432.935603] ? init_wait_var_entry+0x40/0x40
[1015432.935605] ? nfs4_evict_inode+0x15/0x70 [nfsv4]
[1015432.935606] ? evict+0xc6/0x1a0
[1015432.935607] ? dispose_list+0x35/0x50
[1015432.935608] ? prune_icache_sb+0x52/0x70
[1015432.935608] ? super_cache_scan+0x13c/0x190
[1015432.935609] ? do_shrink_slab+0x166/0x300
[1015432.935610] ? shrink_slab+0xdd/0x2a0
[1015432.935611] ? shrink_node+0xf1/0x480
[1015432.935612] ? kswapd+0x2b7/0x730
[1015432.935613] ? kthread+0xf8/0x130
[1015432.935614] ? mem_cgroup_shrink_node+0x180/0x180
[1015432.935615] ? kthread_bind+0x10/0x10
[1015432.935616] ? ret_from_fork+0x22/0x40
I've done extensive memory tests and everything looks fine. The hosts are currently at 8.2.1 (with the latest updates). This looks to be a recent issue; I had not seen it prior to upgrading to XCP-ng 8.2.x.
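If anyone wants to check whether their dom0 is hitting the same thing, this is roughly what I look at; the commands are standard kernel/sysctl tooling, and the min_free_kbytes value is only an example of a mitigation commonly suggested for GFP_ATOMIC allocation failures, not something confirmed to solve this:
# Count page allocation failures logged since boot (run in dom0)
dmesg | grep -c "page allocation failure"

# Current reserve the kernel keeps free for atomic allocations
cat /proc/sys/vm/min_free_kbytes

# Commonly suggested mitigation for atomic allocation failures under heavy
# network/IO load: raise the reserve. The value here is only an example.
sysctl -w vm.min_free_kbytes=131072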
-
RE: Live migrate of Rocky Linux 8.8 VM crashes/reboots VM
@bleader Thank you very much for the quick discovery of this, impressive work! I'm glad I could help!
-
RE: Live migrate of Rocky Linux 8.8 VM crashes/reboots VM
@stormi Thanks for noticing, my bad
-
RE: Live migrate of Rocky Linux 8.8 VM crashes/reboots VM
I've created a bug report at Rocky Linux: https://bugs.rockylinux.org/view.php?id=3565
Feel free to add to this if I missed any relevant information.
-
RE: Live migrate of Rocky Linux 8.8 VM crashes/reboots VM
@olivierlambert Thank you very much for the quick follow-up.
I've done some testing with a colleague and it looks to be kernel-related. The stock Rocky Linux 8.8 kernel (4.18.0-477.13.1.el8_8.x86_64) causes the reboot. Upgrading to kernel-lt (5.4.245-1.1.el8.elrepo.x86_64) allows the VM to be live-migrated again without a reboot/crash.
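For anyone wanting to try the same workaround, below is roughly the standard ELRepo procedure for installing kernel-lt on Rocky Linux 8; it's a sketch rather than an exact transcript of what we ran, so double-check the versions before relying on it:
# Add the ELRepo repository (EL8)
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
dnf install https://www.elrepo.org/elrepo-release-8.el8.elrepo.noarch.rpm

# Install the long-term kernel from the elrepo-kernel repo
dnf --enablerepo=elrepo-kernel install kernel-lt

# Usually the new kernel becomes the default boot entry automatically;
# if not, set it explicitly (adjust to the version that was installed)
grubby --set-default /boot/vmlinuz-5.4.245-1.1.el8.elrepo.x86_64

# Reboot, then confirm the running kernel before retrying the live migration
uname -r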