kernel NULL pointer
-
I've seen a few of these.
Standard kernel, standard install of XCP-ng 8.2. The server does keep running. Could this be related to the SMB ISO share going offline?
Linux xcp4 4.19.0+1 #1 SMP Tue Mar 30 22:34:15 CEST 2021 x86_64 x86_64 x86_64 GNU/Linux
[ 3171.503116] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[ 3171.503132] PGD 24dcab067 P4D 24dcab067 PUD 24c49c067 PMD 0
[ 3171.503142] Oops: 0000 [#1] SMP NOPTI
[ 3171.503149] CPU: 12 PID: 1783 Comm: PLUGIN[diskspac Tainted: G O 4.19.0+1 #1
[ 3171.503157] Hardware name: HP ProLiant DL360p Gen8, BIOS P71 05/24/2019
[ 3171.503186] RIP: e030:SMB2_query_info_free+0x8/0x10 [cifs]
[ 3171.503193] Code: c0 31 f6 48 c7 c7 80 80 79 c0 31 c0 e8 05 d0 95 c0 eb d9 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 48 8b 07 <48> 8b 38 e9 60 22 fe ff 66 66 66 66 90 48 83 ec 30 4d 63 c0 48 8d
[ 3171.503210] RSP: e02b:ffffc9004101fbc8 EFLAGS: 00010246
[ 3171.503217] RAX: 0000000000000000 RBX: ffffc9004101fd50 RCX: 0000000000000000
[ 3171.503225] RDX: ffff88824aefa970 RSI: ffff88824d5a0200 RDI: ffffc9004101fd78
[ 3171.503233] RBP: ffffc9004101fe00 R08: 0000000000000000 R09: 0000000000000000
[ 3171.503241] R10: 0000000000007ff0 R11: 000002e511064fb5 R12: ffff88824aef9800
[ 3171.503249] R13: ffffc9004101fc30 R14: ffff88822f839600 R15: 0000000000000000
[ 3171.503271] FS: 00007f925a36b700(0000) GS:ffff888251500000(0000) knlGS:0000000000000000
[ 3171.503280] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3171.503287] CR2: 0000000000000000 CR3: 000000024c152000 CR4: 0000000000040660
[ 3171.503300] Call Trace:
[ 3171.503317] smb2_queryfs+0x13a/0x310 [cifs]
[ 3171.503329] ? lookup_fast+0xcb/0x2b0
[ 3171.503335] ? __follow_mount_rcu.isra.42+0x3c/0xf0
[ 3171.503342] ? walk_component+0x48/0x280
[ 3171.503347] ? legitimize_path.isra.44+0x28/0x50
[ 3171.503354] ? terminate_walk+0x55/0xb0
[ 3171.503367] cifs_statfs+0xb0/0x290 [cifs]
[ 3171.503376] statfs_by_dentry+0x99/0x120
[ 3171.503383] vfs_statfs+0x16/0xc0
[ 3171.503389] user_statfs+0x50/0x90
[ 3171.503395] __do_sys_statfs+0x20/0x50
[ 3171.503403] do_syscall_64+0x4e/0x100
[ 3171.503411] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 3171.503418] RIP: 0033:0x7f925c8b5787
[ 3171.503423] Code: 2d 00 64 c7 00 16 00 00 00 b8 ff ff ff ff c3 48 8b 15 fd 66 2d 00 f7 d8 64 89 02 48 83 c8 ff c3 0f 1f 00 b8 89 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d9 66 2d 00 f7 d8 64 89 01 48
[ 3171.503441] RSP: 002b:00007f925a369848 EFLAGS: 00000206 ORIG_RAX: 0000000000000089
[ 3171.503450] RAX: ffffffffffffffda RBX: 0000000002027a00 RCX: 00007f925c8b5787
[ 3171.503458] RDX: 0000000000000030 RSI: 00007f925a369850 RDI: 0000000002027a00
[ 3171.503467] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000010
[ 3171.503475] R10: 00007f925782018c R11: 0000000000000206 R12: 00000000000008a1
[ 3171.503483] R13: 00007f925a369a70 R14: 0000000001fe8d60 R15: 0044b82fa09b5a53
[ 3171.503492] Modules linked in: tun arc4 md4 sha512_ssse3 sha512_generic cmac nls_utf8 cifs ccm rpcsec_gss_krb5 auth_rpcgss oid_registry nfsv4 nfs lockd grace fscache bnx2fc(O) cnic(O) uio fcoe libfcoe libfc scsi_transport_fc openvswitch nsh nf_nat_ipv6 nf_nat_ipv4 nf_conncount nf_nat 8021q garp mrp stp llc dm_multipath ipt_REJECT nf_reject_ipv4 xt_tcpudp xt_multiport xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter sb_edac crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper sunrpc dm_mod psmouse sg lpc_ich hpilo ipmi_si ip_tables x_tables sr_mod cdrom ata_generic pata_acpi hid_generic usbhid hid sd_mod uhci_hcd serio_raw ata_piix ehci_pci libata ehci_hcd hpsa scsi_transport_sas ixgbe(O) scsi_dh_rdac scsi_dh_hp_sw
[ 3171.503587] scsi_dh_emc scsi_dh_alua scsi_mod ipmi_watchdog ipmi_devintf ipmi_msghandler ipv6 crc_ccitt
[ 3171.503600] CR2: 0000000000000000
[ 3171.503606] ---[ end trace 105dc5c6a051198c ]---
[ 3171.503623] RIP: e030:SMB2_query_info_free+0x8/0x10 [cifs]
[ 3171.503630] Code: c0 31 f6 48 c7 c7 80 80 79 c0 31 c0 e8 05 d0 95 c0 eb d9 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 48 8b 07 <48> 8b 38 e9 60 22 fe ff 66 66 66 66 90 48 83 ec 30 4d 63 c0 48 8d
[ 3171.503647] RSP: e02b:ffffc9004101fbc8 EFLAGS: 00010246
[ 3171.503653] RAX: 0000000000000000 RBX: ffffc9004101fd50 RCX: 0000000000000000
[ 3171.503662] RDX: ffff88824aefa970 RSI: ffff88824d5a0200 RDI: ffffc9004101fd78
[ 3171.503670] RBP: ffffc9004101fe00 R08: 0000000000000000 R09: 0000000000000000
[ 3171.503678] R10: 0000000000007ff0 R11: 000002e511064fb5 R12: ffff88824aef9800
[ 3171.503686] R13: ffffc9004101fc30 R14: ffff88822f839600 R15: 0000000000000000
[ 3171.503701] FS: 00007f925a36b700(0000) GS:ffff888251500000(0000) knlGS:0000000000000000
[ 3171.503710] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3171.503717] CR2: 0000000000000000 CR3: 000000024c152000 CR4: 0000000000040660
[2058948.013167] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[2058948.013186] PGD 8b3c7067 P4D 8b3c7067 PUD 88485067 PMD 0
[2058948.013196] Oops: 0000 [#2] SMP NOPTI
[2058948.013204] CPU: 9 PID: 28192 Comm: sadc Tainted: G D O 4.19.0+1 #1
[2058948.013214] Hardware name: HP ProLiant DL360p Gen8, BIOS P71 05/24/2019
[2058948.013244] RIP: e030:SMB2_query_info_free+0x8/0x10 [cifs]
[2058948.013254] Code: c0 31 f6 48 c7 c7 80 80 79 c0 31 c0 e8 05 d0 95 c0 eb d9 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 48 8b 07 <48> 8b 38 e9 60 22 fe ff 66 66 66 66 90 48 83 ec 30 4d 63 c0 48 8d
[2058948.013273] RSP: e02b:ffffc9004a477bc8 EFLAGS: 00010246
[2058948.013280] RAX: 0000000000000000 RBX: ffffc9004a477d50 RCX: 0000000000000000
[2058948.013290] RDX: ffff88824aefa970 RSI: ffff88824cb30200 RDI: ffffc9004a477d78
[2058948.013299] RBP: ffffc9004a477e00 R08: 0000000000000000 R09: 0000000000000000
[2058948.013308] R10: 0000000000007ff0 R11: 0007509cd5a53b48 R12: ffff88824aef9800
[2058948.013317] R13: ffffc9004a477c30 R14: ffff88822f839600 R15: 0000000000000000
[2058948.013337] FS: 00007f2b5a861740(0000) GS:ffff888251440000(0000) knlGS:0000000000000000
[2058948.013347] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[2058948.013356] CR2: 0000000000000000 CR3: 00000000365b0000 CR4: 0000000000040660
[2058948.013371] Call Trace:
[2058948.013390] smb2_queryfs+0x13a/0x310 [cifs]
[2058948.013401] ? lookup_fast+0xcb/0x2b0
[2058948.013408] ? __follow_mount_rcu.isra.42+0x3c/0xf0
[2058948.013416] ? walk_component+0x48/0x280
[2058948.013423] ? legitimize_path.isra.44+0x28/0x50
[2058948.013430] ? terminate_walk+0x55/0xb0
[2058948.013445] cifs_statfs+0xb0/0x290 [cifs]
[2058948.013455] statfs_by_dentry+0x99/0x120
[2058948.013462] vfs_statfs+0x16/0xc0
[2058948.013469] user_statfs+0x50/0x90
[2058948.013476] __do_sys_statfs+0x20/0x50
[2058948.013484] do_syscall_64+0x4e/0x100
[2058948.013492] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[2058948.013500] RIP: 0033:0x7f2b5a15f787
[2058948.013506] Code: 2d 00 64 c7 00 16 00 00 00 b8 ff ff ff ff c3 48 8b 15 fd 66 2d 00 f7 d8 64 89 02 48 83 c8 ff c3 0f 1f 00 b8 89 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d9 66 2d 00 f7 d8 64 89 01 48
[2058948.013525] RSP: 002b:00007ffcf7cc3d58 EFLAGS: 00000206 ORIG_RAX: 0000000000000089
[2058948.013535] RAX: ffffffffffffffda RBX: 00007ffcf7cc3f90 RCX: 00007f2b5a15f787
[2058948.013545] RDX: 000000000000002f RSI: 00007ffcf7cc3d60 RDI: 00007ffcf7cc3f90
[2058948.013554] RBP: 0000000000000001 R08: 00007f2b5a437060 R09: 00007f2b539a1a4c
[2058948.013563] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000d49590
[2058948.013573] R13: 00007ffcf7cc3ea0 R14: 0000000000000001 R15: 00007ffcf7cc4578
[2058948.013591] Modules linked in: tun arc4 md4 sha512_ssse3 sha512_generic cmac nls_utf8 cifs ccm rpcsec_gss_krb5 auth_rpcgss oid_registry nfsv4 nfs lockd grace fscache bnx2fc(O) cnic(O) uio fcoe libfcoe libfc scsi_transport_fc openvswitch nsh nf_nat_ipv6 nf_nat_ipv4 nf_conncount nf_nat 8021q garp mrp stp llc dm_multipath ipt_REJECT nf_reject_ipv4 xt_tcpudp xt_multiport xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter sb_edac crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper sunrpc dm_mod psmouse sg lpc_ich hpilo ipmi_si ip_tables x_tables sr_mod cdrom ata_generic pata_acpi hid_generic usbhid hid sd_mod uhci_hcd serio_raw ata_piix ehci_pci libata ehci_hcd hpsa scsi_transport_sas ixgbe(O) scsi_dh_rdac scsi_dh_hp_sw
[2058948.013694] scsi_dh_emc scsi_dh_alua scsi_mod ipmi_watchdog ipmi_devintf ipmi_msghandler ipv6 crc_ccitt
[2058948.013709] CR2: 0000000000000000
[2058948.013715] ---[ end trace 105dc5c6a051198d ]---
[2058948.013737] RIP: e030:SMB2_query_info_free+0x8/0x10 [cifs]
[2058948.013746] Code: c0 31 f6 48 c7 c7 80 80 79 c0 31 c0 e8 05 d0 95 c0 eb d9 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 48 8b 07 <48> 8b 38 e9 60 22 fe ff 66 66 66 66 90 48 83 ec 30 4d 63 c0 48 8d
[2058948.013765] RSP: e02b:ffffc9004101fbc8 EFLAGS: 00010246
[2058948.013772] RAX: 0000000000000000 RBX: ffffc9004101fd50 RCX: 0000000000000000
[2058948.013781] RDX: ffff88824aefa970 RSI: ffff88824d5a0200 RDI: ffffc9004101fd78
[2058948.013791] RBP: ffffc9004101fe00 R08: 0000000000000000 R09: 0000000000000000
[2058948.013800] R10: 0000000000007ff0 R11: 000002e511064fb5 R12: ffff88824aef9800
[2058948.013809] R13: ffffc9004101fc30 R14: ffff88822f839600 R15: 0000000000000000
[2058948.013826] FS: 00007f2b5a861740(0000) GS:ffff888251440000(0000) knlGS:0000000000000000
[2058948.013836] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[2058948.013844] CR2: 0000000000000000 CR3: 00000000365b0000 CR4: 0000000000040660
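For anyone comparing several of these traces, here is a minimal Python sketch that pulls the key fields (process name, faulting symbol, module, fault address) out of a dmesg or kern.log dump. The function name `summarize_oopses` and the `SAMPLE` string are made up for illustration, and the regular expressions are only an assumption based on the message layout shown in the two traces above; other kernel versions may word these lines differently.

```python
import re

# Patterns are assumptions derived from the 4.19 messages quoted in this thread.
OOPS_START = re.compile(
    r"BUG: unable to handle kernel NULL pointer dereference at ([0-9a-f]+)"
)
COMM = re.compile(r"CPU: (\d+) PID: (\d+) Comm: (.+?) (?:Tainted:|Not tainted)")
# Only matches the kernel-side RIP line, which has the symbol+offset/size form.
RIP = re.compile(
    r"RIP: [0-9a-f]{4}:([\w.]+)\+0x([0-9a-f]+)/0x([0-9a-f]+)(?: \[(\w+)\])?"
)


def summarize_oopses(log_text):
    """Return one dict per NULL-pointer Oops found in log_text."""
    starts = [m.start() for m in OOPS_START.finditer(log_text)]
    reports = []
    for i, begin in enumerate(starts):
        end = starts[i + 1] if i + 1 < len(starts) else len(log_text)
        chunk = log_text[begin:end]
        fault = OOPS_START.search(chunk)
        comm = COMM.search(chunk)
        rip = RIP.search(chunk)  # first hit is the faulting kernel RIP
        reports.append({
            "fault_address": int(fault.group(1), 16),
            "pid": int(comm.group(2)) if comm else None,
            "comm": comm.group(3) if comm else None,
            "symbol": rip.group(1) if rip else None,
            "offset": int(rip.group(2), 16) if rip else None,
            "module": rip.group(4) if rip else None,
        })
    return reports


# Hypothetical input: a few lines copied from the two traces above.
SAMPLE = (
    "[ 3171.503116] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000\n"
    "[ 3171.503149] CPU: 12 PID: 1783 Comm: PLUGIN[diskspac Tainted: G O 4.19.0+1 #1\n"
    "[ 3171.503186] RIP: e030:SMB2_query_info_free+0x8/0x10 [cifs]\n"
    "[ 3171.503418] RIP: 0033:0x7f925c8b5787\n"
    "[2058948.013167] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000\n"
    "[2058948.013204] CPU: 9 PID: 28192 Comm: sadc Tainted: G D O 4.19.0+1 #1\n"
    "[2058948.013244] RIP: e030:SMB2_query_info_free+0x8/0x10 [cifs]\n"
)

for report in summarize_oopses(SAMPLE):
    print(report)
```

Both traces here resolve to the same symbol in the `cifs` module, reached through a `statfs` call, which fits processes that poll disk usage on the mounted share.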
-
Yes, very likely.
-
In my setup I also see several kernel NULL pointer dereferences, followed by a system crash and a reboot after several days. In the log files (kern.log) there is a gap of several hours before the crash.
XCP-ng 8.2
CPU AMD EPYC Milan
SMB ISO share served from a self-hosted VM running Windows Server 2019.
Several errors relate to this storage repository even when it is not in use (no ISOs left in it, and so no virtual DVD drive linked to it).
I have now deleted this SMB ISO repo and will see whether that solves the problem. I'm quite surprised that a bug like this can crash a server.
-
Seems like it's also reported upstream:
https://bugs.xenserver.org/projects/XSO/issues/XSO-1021
-
@hoerup Yes, we reported it upstream after investigating and testing changes.
-
It means the kernel tried to dereference a NULL pointer. This generates a page fault that the kernel cannot handle. If the kernel was running on behalf of a user task (but in kernel space), it generally produces an "Oops", which uncleanly kills the current task and may leak kernel resources. If it happens in some other context, e.g. an interrupt handler, it generally causes a kernel panic.
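To make that concrete for the traces in this thread, here is a small Python sketch that splits the `Code:` line at the `<..>` marker (the byte at the faulting RIP) and hand-decodes the two instructions around it. The helper names are made up for illustration, and the decoder is deliberately tiny: it only understands the `48 8b /r` form (64-bit `mov` from a register-indirect address), which is an assumption that happens to cover these bytes. It is not a general disassembler.

```python
REGS64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def split_code_line(code_line):
    """Split an Oops 'Code:' byte dump at the <..> marker.

    Returns (bytes_before_fault, bytes_from_fault_onwards).
    """
    tokens = code_line.split()
    idx = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    values = [int(tok.strip("<>"), 16) for tok in tokens]
    return bytes(values[:idx]), bytes(values[idx:])


def decode_simple_load(insn):
    """Decode 'REX.W 8B /r' with a plain (%reg) source, in AT&T syntax."""
    if len(insn) != 3 or insn[0] != 0x48 or insn[1] != 0x8B:
        raise ValueError("only the 48 8b /r form is handled in this sketch")
    modrm = insn[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0 or rm in (4, 5):  # SIB and RIP-relative forms are out of scope
        raise ValueError("addressing mode not handled in this sketch")
    return f"mov (%{REGS64[rm]}),%{REGS64[reg]}"


# The Code: bytes from the kernel-side RIP in the traces above.
CODE = (
    "c0 31 f6 48 c7 c7 80 80 79 c0 31 c0 e8 05 d0 95 c0 eb d9 0f 1f 44 00 00 "
    "66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 48 8b 07 <48> 8b 38 e9 60 "
    "22 fe ff 66 66 66 66 90 48 83 ec 30 4d 63 c0 48 8d"
)

before, after = split_code_line(CODE)
print("previous instruction:", decode_simple_load(before[-3:]))
print("faulting instruction:", decode_simple_load(after[:3]))
```

Read together with the register dump, the previous instruction loaded RAX from the address in RDI, RAX came back as 0, and the faulting instruction then tried to read through RAX, which matches CR2 being 0. That suggests a pointer inside the structure passed to `SMB2_query_info_free` was never set (or already cleared) on this error path. That last step is an inference from the dump, not something confirmed in this thread.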
-
Follow-up on this subject: a fix is still queued for the next kernel update on XCP-ng 8.2, and is already present in XCP-ng 8.3 Alpha.
-
An update candidate has a fix for this and should be published tomorrow as an official update.
-