XCP-ng

    Kernel panic on fresh install

    • olivierlambert Vates 🪐 Co-Founder CEO

      So you had the same issue after all the last updates + a "manual" reboot?

      • sasha @olivierlambert

        @olivierlambert said in Kernel panic on fresh install:

        So you had the same issue after all the last updates + a "manual" reboot?

        No reboot after the updates. I'll just mention that, in my case, reducing the MTU on the heavily utilised WireGuard interface didn't help.

        Also, these reboots are completely unpredictable: sometimes during a busy day, but more often during night hours, when only backups are running.

        • olivierlambert Vates 🪐 Co-Founder CEO

          Okay so now you have the updates really installed, we'll see if it happens 🙂

          • sasha @olivierlambert

            @olivierlambert said in Kernel panic on fresh install:

            Okay so now you have the updates really installed, we'll see if it happens 🙂

            Just got a crash reboot again while trying to restart a VM from XCP-center running in another VM in the pool. This reboot should apply all the patches from the latest updates.

            • sasha @olivierlambert

              @olivierlambert

              Just had two consecutive crashes, 15 minutes apart.
              This is a comparison between the old crash (before the updates) and the new one (after the latest updates).
              f19d588d-7201-46f2-badb-b51945b13f4e-image.png

              • sasha @sasha

                New crash, same message...

                • olivierlambert Vates 🪐 Co-Founder CEO

                  It's weird, OVS is not involved. So it might be something else 🤔

                  Any chance you know how to trigger it artificially? That would be really helpful to pinpoint the issue.

                  • sasha @olivierlambert

                    @olivierlambert
                    I would love to! The only thing I can say is that when I was using the Windows XCP-console from a VM inside the pool, starting a VM once caused the whole system to crash with this error, and on another day changing a vApp config also crashed the server. That is why I try to avoid using XCP-console during business hours. I'll try to reproduce it again and report back.

                    • tuxen Top contributor @sasha

                      @sasha It's worth noting that the BIOS (from 2019) is relatively old/outdated. It's recommended to update the BIOS to a more recent version.

                      • sasha @tuxen

                        @tuxen said in Kernel panic on fresh install:

                        @sasha It's worth noting that the BIOS (from 2019) is relatively old/outdated. It's recommended to update the BIOS to a more recent version.

                        Thank you for pointing that out! I'll try to reach the support team about this.

                        • sasha

                          Got a similar crash today on new hardware:

                          WARN: Hardware name: Dell Inc. PowerEdge R640/0W23H8, BIOS 2.18.1 02/22/2023
                          ...
                          WARN: CR2: 0000000000000008 CR3: 000000000384a000 CR4: 0000000000040660
                          [ 484188.711152]   WARN: Call Trace:
                          [ 484188.711163]   WARN:  <IRQ>
                          [ 484188.711178]   WARN:  ? _raw_spin_unlock_irqrestore+0x14/0x20
                          [ 484188.711203]   WARN:  tun_net_xmit+0x3de/0x460 [tun]
                          [ 484188.711223]   WARN:  dev_hard_start_xmit+0xa4/0x210
                          [ 484188.711242]   WARN:  sch_direct_xmit+0x10d/0x350
                          [ 484188.711256]   WARN:  __qdisc_run+0x167/0x4e0
                          [ 484188.711269]   WARN:  ? pfifo_fast_enqueue+0x92/0xf0
                          [ 484188.711284]   WARN:  __dev_queue_xmit+0x511/0x900
                          [ 484188.711300]   WARN:  ? skb_copy_ubufs+0x5b0/0x5f0
                          [ 484188.711334]   WARN:  do_execute_actions+0x157f/0x1750 [openvswitch]
                          [ 484188.711369]   WARN:  ? __radix_tree_lookup+0x80/0xf0
                          [ 484188.711394]   WARN:  ? notify_remote_via_irq+0x4a/0x70
                          [ 484188.711417]   WARN:  ? check_preempt_curr+0x6b/0x90
                          [ 484188.711437]   WARN:  ? ttwu_do_wakeup+0x19/0x140
                          [ 484188.711456]   WARN:  ? _raw_spin_unlock_irqrestore+0x14/0x20
                          [ 484188.711478]   WARN:  ? try_to_wake_up+0x54/0x450
                          [ 484188.711503]   WARN:  ? __raw_callee_save_xen_vcpu_stolen+0x11/0x20
                          [ 484188.711530]   WARN:  ? trigger_load_balance+0x54/0x170
                          [ 484188.711549]   WARN:  ovs_execute_actions+0x47/0x120 [openvswitch]
                          [ 484188.711578]   WARN:  ovs_dp_process_packet+0x7d/0x110 [openvswitch]
                          [ 484188.711609]   WARN:  ? key_extract+0xa53/0xd60 [openvswitch]
                          [ 484188.711638]   WARN:  ovs_vport_receive+0x6e/0xd0 [openvswitch]
                          [ 484188.711654]   WARN:  ? __alloc_skb+0x4e/0x270
                          [ 484188.711667]   WARN:  ? __alloc_skb+0x76/0x270
                          [ 484188.711684]   WARN:  ? arch_local_irq_restore+0x5/0x10
                          [ 484188.711700]   WARN:  ? __slab_alloc.constprop.81+0x42/0x4e
                          [ 484188.711715]   WARN:  ? __alloc_skb+0x76/0x270
                          [ 484188.711727]   WARN:  ? __kmalloc_track_caller+0x58/0x200
                          [ 484188.711747]   WARN:  ? __kmalloc_reserve.isra.48+0x29/0x70
                          [ 484188.711768]   WARN:  netdev_frame_hook+0x105/0x180 [openvswitch]
                          [ 484188.711785]   WARN:  __netif_receive_skb_core+0x211/0xb30
                          [ 484188.711802]   WARN:  __netif_receive_skb_one_core+0x36/0x70
                          [ 484188.711818]   WARN:  netif_receive_skb_internal+0x34/0xe0
                          [ 484188.711838]   WARN:  xenvif_tx_action+0x55c/0x990
                          [ 484188.711853]   WARN:  xenvif_poll+0x27/0x70
                          [ 484188.711867]   WARN:  net_rx_action+0x2a5/0x3e0
                          [ 484188.711882]   WARN:  __do_softirq+0xd1/0x28c
                          [ 484188.711899]   WARN:  irq_exit+0xa8/0xc0
                          [ 484188.711913]   WARN:  xen_evtchn_do_upcall+0x2c/0x50
                          [ 484188.711930]   WARN:  xen_do_hypervisor_callback+0x29/0x40
                          [ 484188.711949]   WARN:  </IRQ>
                          [ 484188.711968]   WARN: RIP: e030:xen_hypercall_sched_op+0xa/0x20
                          [ 484188.711999]   WARN: Code: 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
                          [ 484188.712051]   WARN: RSP: e02b:ffffc900401e3eb0 EFLAGS: 00000246
                          [ 484188.712074]   WARN: RAX: 0000000000000000 RBX: ffff8882a5e43a00 RCX: ffffffff810013aa
                          [ 484188.712101]   WARN: RDX: ffffffff8203d250 RSI: 0000000000000000 RDI: 0000000000000001
                          [ 484188.712127]   WARN: RBP: 000000000000000a R08: 00000000e94460e8 R09: 0000000000000000
                          [ 484188.712155]   WARN: R10: 0000000000007ff0 R11: 0000000000000246 R12: 0000000000000000
                          [ 484188.712176]   WARN: R13: 0000000000000000 R14: ffff8882a5e43a00 R15: ffff8882a5e43a00
                          [ 484188.712200]   WARN:  ? xen_hypercall_sched_op+0xa/0x20
                          [ 484188.712218]   WARN:  ? xen_safe_halt+0xc/0x20
                          [ 484188.712232]   WARN:  ? default_idle+0x1a/0x140
                          [ 484188.712245]   WARN:  ? do_idle+0x1ea/0x260
                          [ 484188.712259]   WARN:  ? cpu_startup_entry+0x6f/0x80
                          [ 484188.712271]   WARN: Modules linked in: tun hid_generic usbhid hid bnx2fc(O) cnic(O) uio fcoe libfcoe libfc scsi_transport_fc openvswitch nsh nf_nat_ipv6 nf_nat_ipv4 nf_conncount nf_nat 8021q garp mrp stp llc dm_multipath ipt_REJECT nf_reject_ipv4 xt_tcpudp xt_multiport xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter skx_edac intel_powerclamp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel sunrpc pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper nls_iso8859_1 nls_cp437 vfat fat dcdbas i2c_i801 lpc_ich ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter ip_tables x_tables raid1 raid0 md_mod nvme ahci nvme_core libahci xhci_pci ixgbe(O) igb(O) libata xhci_hcd dm_mod scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh_alua scsi_mod efivarfs ipv6 crc_ccitt
                          [ 484188.712495]   WARN: CR2: 0000000000000008
                          [ 484188.712517]   WARN: ---[ end trace 0bd4f18732c111b7 ]---
                          [ 484188.752460]   WARN: RIP: e030:skb_copy_ubufs+0x19c/0x5f0
                          [ 484188.752485]   WARN: Code: 90 cc 00 00 00 48 03 90 d0 00 00 00 48 63 44 24 40 48 83 c0 03 48 c1 e0 04 48 01 d0 48 89 18 c7 40 08 00 00 00 00 44 89 78 0c <48> 8b 43 08 a8 01 0f 85 3f 04 00 00 48 8b 44 24 30 48 83 78 20 ff
                          [ 484188.752513]   WARN: RSP: e02b:ffff8882a7083668 EFLAGS: 00010282
                          [ 484188.752525]   WARN: RAX: ffff88822933c2e0 RBX: 0000000000000000 RCX: 00000000000000c0
                          [ 484188.752537]   WARN: RDX: ffff88822933c2c0 RSI: ffff88822933c2c0 RDI: ffffea000abec0c0
                          [ 484188.752548]   WARN: RBP: 0000000000000000 R08: ffff88822933c200 R09: 0000000000000001
                          [ 484188.752559]   WARN: R10: 0000000000000320 R11: ffff88829d3ded40 R12: ffff8881d9b6ef00
                          [ 484188.752571]   WARN: R13: 0000000000000000 R14: ffff8882395988c0 R15: 0000000000000000
                          [ 484188.752594]   WARN: FS:  0000000000000000(0000) GS:ffff8882a7080000(0000) knlGS:0000000000000000
                          [ 484188.752606]   WARN: CS:  e033 DS: 002b ES: 002b CR0: 0000000080050033
                          [ 484188.752614]   WARN: CR2: 0000000000000008 CR3: 000000000384a000 CR4: 0000000000040660
                          [ 484188.752631]  EMERG: Kernel panic - not syncing: Fatal exception in interrupt
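
For anyone triaging a similar trace: CR2 (the faulting address) is 0x8 and RBX is 0 at the faulting `skb_copy_ubufs` instruction, which is consistent with a read through a NULL pointer at a small offset. Below is a minimal Python sketch for pulling those fields out of a saved log. It is not an XCP-ng or Xen tool; the function name `summarize_oops` and the returned field names are made up for illustration, and the 4096-byte threshold is only a rule of thumb.

```python
import re

def summarize_oops(log_text):
    """Extract a few triage fields from a kernel oops log (illustrative helper)."""
    # CR2 holds the faulting virtual address on x86-64.
    cr2 = re.search(r"CR2:\s*([0-9a-fA-F]{16})", log_text)
    # RIP lines look like "RIP: e030:skb_copy_ubufs+0x19c/0x5f0".
    rips = re.findall(r"RIP:\s*[0-9a-fA-F]+:([A-Za-z0-9_.]+)\+0x", log_text)
    # Call-trace frames that belong to loadable modules end with "[module]".
    modules = re.findall(r"\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]", log_text)
    fault_addr = int(cr2.group(1), 16) if cr2 else None
    return {
        "fault_address": fault_addr,
        # Assumption: an address below one page usually means NULL plus a small offset.
        "looks_like_null_deref": fault_addr is not None and fault_addr < 4096,
        "rip_symbols": sorted(set(rips)),
        "modules_in_trace": sorted(set(modules)),
    }

# A few lines copied from the trace above.
sample = """
WARN:  tun_net_xmit+0x3de/0x460 [tun]
WARN:  ? skb_copy_ubufs+0x5b0/0x5f0
WARN:  do_execute_actions+0x157f/0x1750 [openvswitch]
WARN: RIP: e030:skb_copy_ubufs+0x19c/0x5f0
WARN: CR2: 0000000000000008 CR3: 000000000384a000 CR4: 0000000000040660
"""
summary = summarize_oops(sample)
print(summary)
```

Running it over the full log would also pick up the idle-loop RIP (`xen_hypercall_sched_op`), so the symbol reported right before the panic line is the one to look at.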
                          
                          
                          • Danp Pro Support Team @sasha

                            @sasha BIOS is still outdated on your new hardware.

                            • sasha @Danp

                              @Danp said in Kernel panic on fresh install:

                              @sasha BIOS is still outdated on your new hardware.

                              Agree. But I can't do anything about it 😞

                              • sasha

                                A little update on this. I deactivated the OpenVPN server on OPNsense in December, and since then there have been no reboots or kernel panics.

                                • olivierlambert Vates 🪐 Co-Founder CEO

                                  Hi!

                                  It's likely due to a tricky problem that was finally identified. A security patch should come soon 🙂 For some reason, FreeBSD is the most likely to trigger it, especially when you run some kind of VPN-related software.

                                  • olivierlambert Vates 🪐 Co-Founder CEO

                                    The security issue (XSA) is now publicly accessible: https://xenbits.xenproject.org/xsa/advisory-448.html

                                    Transmit requests in Xen's virtual network protocol can consist of
                                    multiple parts. While not really useful, except for the initial part
                                    any of them may be of zero length, i.e. carry no data at all. Besides a
                                    certain initial portion of the to be transferred data, these parts are
                                    directly translated into what Linux calls SKB fragments. Such converted
                                    request parts can, when for a particular SKB they are all of length
                                    zero, lead to a de-reference of NULL in core networking code.
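
As a rough illustration of the condition the advisory describes, here is a toy Python model. It is not the Xen or Linux code: the `TxPart` type and the `fragments_all_empty` function are hypothetical, and it simplifies by treating the first part as the linear data and every later part as one SKB fragment.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class TxPart:
    """One part of a multi-part transmit request (hypothetical toy type)."""
    size: int  # bytes of payload carried by this part

def fragments_all_empty(parts: List[TxPart]) -> bool:
    """Return True when the request has fragments and all of them are zero length.

    Simplification: the first part stands in for the linear data, and
    each later part maps to exactly one fragment.
    """
    if not parts or parts[0].size == 0:
        raise ValueError("the initial part must carry data")
    frags = parts[1:]
    return len(frags) > 0 and all(p.size == 0 for p in frags)

print(fragments_all_empty([TxPart(64), TxPart(0), TxPart(0)]))  # True
print(fragments_all_empty([TxPart(64), TxPart(1400)]))          # False
print(fragments_all_empty([TxPart(64)]))                        # False
```

Only the first case matches the pattern the advisory warns about: fragments exist, but none of them carries any data.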

                                    If you want, we can make you a test update first, so you can try it and see whether you are now protected even with your OPNsense and OpenVPN up and running again.

                                    • sasha @olivierlambert

                                      @olivierlambert These events helped me finally stop supporting OpenVPN, and I'm not going back 🙂 WireGuard works much better for my setup. Thank you!

                                      • olivierlambert Vates 🪐 Co-Founder CEO

                                        No problem 🙂 FYI the update is now available: https://xcp-ng.org/forum/post/70029
