XCP-ng

    XCP-ng 8.3 and Dell R660 - crash during boot, halts remainder of installer process (bnxt_en?)

    • umbradark

      I posted another report in the Install section of the forum, but I suspect this may be a hardware compatibility issue at its core. Apologies if this feels like a duplicate post; I'm not sure where best to pursue this issue.

      I attempted to install XCP-ng 8.3 on a Dell PowerEdge R660 and the installer crashes regardless of the boot option chosen (default, safe, alt kernel, etc). During boot, the system consistently hits a crash in the Broadcom bnxt_en network driver, and after that the installer hangs on different systemd unit jobs. Ultimately the boot process hangs and the installer never proceeds.

      Is this hardware expected to be compatible with XCP-ng 8.3?

      System:

      Model: Dell PowerEdge R660 (16G Monolithic)
      BIOS: 2.7.5 (released 2025-07-31)
      CPLD: 1.2.6
      Lifecycle Controller / iDRAC: 7.20.10.50
      Processors:
      2 × Intel Xeon Silver 4509Y (8 cores / 16 threads each, 2.6 GHz base, up to 4.0 GHz turbo)
      Memory:
      16 × 16 GB DDR5 RDIMM (Hynix HMCG78AGBRA190N, 5600 MHz rated, operating at 4400 MT/s)
      Total: 256 GB
      Storage / RAID:
      Controller: Dell PERC H355 Front (Broadcom/LSI SAS38xx)
      PCI IDs: Vendor 1000, Device 10E6, SubVendor 1028, SubDevice 2173
      Firmware: 52.26.0-5179
      Virtual Disk: RAID-1 of 2 × Micron MTFDDAK480TGB-1B 480 GB SATA SSDs
      Firmware: D4DK003
      
      Networking:
      Embedded NICs: 2 × Broadcom NetXtreme BCM5720 1GbE (Base-T)
      Firmware: BIOS 1.39, EFI 21.6.85, FamilyVersion 23.21.6, LAN driver 3.137
      
      Integrated NIC (OCP3.0): Broadcom BCM57504 Quad-Port 25GbE
      Firmware: BIOS 233.0.196.0, EFI 233.0.192.0, FamilyVersion 23.31.18.10, LAN driver 1.10.0
      
      Add-in NIC (Slot 2): Broadcom BCM57414 Dual-Port 25GbE RDMA
      Firmware: BIOS 233.0.195.0, EFI 233.0.192.0, FamilyVersion 23.31.18.10, LAN driver 1.10.0
      
      Optics: FS SFP-10GMSR-85 and FS SFP-10GSR-85 modules
      
      Other components:
      
      Embedded AHCI: Intel Sapphire Rapids (device IDs 1BD2 / 1BF2)
      Video: Matrox G200eW3 (PCI ID 0536)
      PSU: 2 × 800 W Dell (Model 0C8T2PA04, FW 00.17.31)
      

      Here's the output of the crash that occurs during boot:

      [   24.451551] BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
      [   24.451561] PGD 0 P4D 0
      [   24.451563] Oops: 0002 [#1] SMP NOPTI
      [   24.451571] CPU: 13 PID: 236 Comm: systemd-udevd Tainted: G        W         4.19.0+1 #1
      [   24.451573] Hardware name: Dell Inc. PowerEdge R660/09W9M4, BIOS 2.7.5 07/31/2025
      [   24.451574] RIP: e030:bnxt_hwrm_func_qcaps+0x53/0x130 [bnxt_en]
      [   24.451574] Code: d1 83 f8 00 75 09 48 8b 07 48 8b 00 48 85 c0 74 11 48 8b 4b 28 4c 89 e7 e8 c4 11 bb 01 00 00 45 31 e1 45 31 c9 48 31 c9 48 85 c0 74 05 <31> c9 48 89 0f eb 3c
      [   24.451575] RSP: 0018:ffff88002f5a7b00 EFLAGS: 00010246
      [   24.451576] RAX: 0000000000000000 RBX: ffff889724750280 RCX: 0000000000000000
      [   24.451577] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff889724750280
      [   24.451577] RBP: ffff88002f5a7b28 R08: 0000000000000000 R09: 0000000000000000
      [   24.451578] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
      [   24.451578] R13: ffff889724750280 R14: ffff889724750200 R15: ffff889724750280
      [   24.451579] FS: 0000000000000000(0000) GS:ffff88987f880000(0000) knlGS:0000000000000000
      [   24.451580] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   24.451580] CR2: 0000000000000060 CR3: 00000007e1422002 CR4: 0000000000771ef0
      [   24.451581] PKRU: 55555554
      [   24.451581] Call Trace:
      [   24.451582]  bnxt_init_one+0x902/0x1530 [bnxt_en]
      [   24.451583]  pci_user_read_config_dword+0x70/0xb0
      [   24.451584]  bnxt_init_one_p2+0x20/0x30 [bnxt_en]
      [   24.451584]  pci_device_probe+0xbf/0x140
      [   24.451585]  really_probe+0x1e5/0x3a0
      [   24.451586]  driver_probe_device+0xc9/0x130
      [   24.451587]  bus_for_each_dev+0x6c/0xc0
      [   24.451587]  bus_add_driver+0x1e2/0x230
      [   24.451588]  driver_register+0x8d/0x160
      [   24.451589]  __pci_register_driver+0x4b/0x50
      [   24.451590]  do_one_initcall+0x41/0x1a0
      [   24.451590]  __cond_resched+0x15/0x50
      [   24.451591]  kmem_cache_alloc_trace+0x115/0x1c0
      [   24.451592]  do_init_module+0x45/0x1f0
      [   24.451593]  load_module+0x1a2a/0x1f30
      [   24.451593]  do_sys_finit_module+0x65/0xb0
      [   24.451594]  __x64_sys_finit_module+0x19/0x20
      [   24.451595]  do_syscall_64+0x33/0xc0
      [   24.451595]  entry_SYSCALL_64_after_hwframe+0x49/0xb7
      [   24.451596] RIP: 0033:0x7f791f9a5ec9
      [   24.451597] Code: 48 8b 05 9f 97 f1 74 89 ca 14 89 c2 4d 89 cb 48 8b 4c 24 08 0f 05 <48> 34 01 f0
      [   24.451598] RSP: 002b:00007ffe70a46948 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
      [   24.451598] RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f791f9a5ec9
      [   24.451599] RDX: 0000000000000000 RSI: 00007ffe70a46a20 RDI: 0000000000000009
      [   24.451600] RBP: 00007ffe70a46a20 R08: 0000000000000000 R09: 0000000000000000
      [   24.451600] R10: 0000000000000000 R11: 0000000000000246 R12: 000055bc1b4a3970
      [   24.451601] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
      [   24.451602] Modules linked in: aesni_intel(-) xhci_pci(-) aes_x86_64 crypto_simd cryptd bnxt_en(+) ipmi_ssif megaraid_sas(0) dcdbas glue_helper tg3 devlink ipmi_devintf ipmi_si acpi_ipmi ipmi_msghandler acpi_power_meter iscsi_ibft iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi scsi_mod efi_pstore ip_tables ip6_tables
      [   24.451603] CR2: 0000000000000060
      [   24.451616] ---[ end trace 56ae6a9ca59b4d10 ]---
      [   24.517912] RIP: e030:bnxt_hwrm_func_qcaps+0x53/0x130 [bnxt_en]
      [   24.517912] Code: d1 83 f8 00 75 09 48 8b 07 48 8b 00 48 85 c0 74 11 48 8b 4b 28 4c 89 e7 e8 c4 11 bb 01 00 00 45 31 c9 48 31 c9 48 85 c0 74 05 <31> c9 48 89 0f eb 3c
      [   24.517912] RSP: 0018:ffb880010002fc50 EFLAGS: 00010282
      [   24.517912] RAX: 0000000000000000 RBX: ffff889724750280 RCX: 0000000000000000
      [   24.517912] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff889724750280
      [   24.517912] RBP: ffb880010002fc70 R08: 0000000000000000 R09: 0000000000000000
      [   24.517912] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
      [   24.517912] R13: ffff889724750280 R14: ffff889724750200 R15: ffff889724750280
      [   24.517912] FS: 0000000000000000(0000) GS:ffb8800100000000(0000) knlGS:0000000000000000
      [   24.517912] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   24.517912] CR2: 0000000000000060 CR3: 0000000732f0a004 CR4: 0000000000771ef0
      [   24.517912] PKRU: 55555554
      [   24.517912] Call Trace:
      [   24.517912]  bnxt_init_one+0x902/0x1530 [bnxt_en]
      [   24.517912]  local_pci_probe+0x44/0x90
      [   24.517912]  pci_device_probe+0xbf/0x140
      [   24.517912]  really_probe+0x1e5/0x3a0
      [   24.517912]  __driver_probe_device+0x99/0x160
      [   24.517912]  driver_probe_device+0x19/0x30
      [   24.517912]  __driver_attach+0x9d/0x150
      [   24.517912]  bus_for_each_dev+0x79/0xd0
      [   24.517912]  driver_attach+0x1e/0x20
      [   24.517912]  bus_add_driver+0x121/0x210
      [   24.517912]  driver_register+0x8d/0x160
      [   24.517912]  __pci_register_driver+0x42/0x50
      [   24.517912]  bnxt_init+0x3b/0x1000 [bnxt_en]
      [   24.517912]  do_one_initcall+0x41/0x1b0
      [   24.517912]  do_init_module+0x4e/0x1e0
      [   24.517912]  load_module+0x1bf6/0x2010
      [   24.517912]  __do_sys_finit_module+0xaa/0x120
      [   24.517912]  __x64_sys_finit_module+0x19/0x20
      [   24.517912]  do_syscall_64+0x3b/0xc0
      [   24.517912]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
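When triaging a trace like the one above, the three most useful facts are usually the faulting function, the module it belongs to, and the fault address in CR2. The following is a minimal sketch of pulling those out of a pasted oops. The helper name `summarize_oops` and the regular expressions are illustrative assumptions, not part of any XCP-ng or kernel tooling, and they only cover the line formats seen in this log.

```python
import re

# Hypothetical helper for illustration only; the patterns match the
# "RIP:" and "CR2:" line formats seen in the oops quoted in this thread.
RIP_RE = re.compile(
    r"RIP: \w+:(?P<func>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<module>\w+)\])?"
)
CR2_RE = re.compile(r"CR2: (?P<addr>[0-9a-f]{16})")


def summarize_oops(log_text):
    """Return the first faulting function, its module, and the fault address."""
    rip = RIP_RE.search(log_text)
    cr2 = CR2_RE.search(log_text)
    return {
        "function": rip.group("func") if rip else None,
        "module": rip.group("module") if rip else None,
        "fault_address": int(cr2.group("addr"), 16) if cr2 else None,
    }


# Three lines copied from the trace in this post.
sample = """\
[   24.451551] BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
[   24.451574] RIP: e030:bnxt_hwrm_func_qcaps+0x53/0x130 [bnxt_en]
[   24.451580] CR2: 0000000000000060 CR3: 00000007e1422002 CR4: 0000000000771ef0
"""

if __name__ == "__main__":
    print(summarize_oops(sample))
```

A small fault address such as 0x60 next to a NULL pointer message typically points to a structure field being accessed through a NULL pointer, which is consistent with a driver bug during probe rather than a general hardware fault, though it does not rule the latter out.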
      
      • olivierlambert Vates 🪐 Co-Founder CEO

        Hi,

        Yes, it's pretty standard hardware. My first reflex would be to run a memtest on this config.

        • dinhngtu Vates 🪐 XCP-ng Team @umbradark

          @umbradark Hello, there are new fixed drivers in the latest bnxt_en driver disk. Could you boot the installer with modprobe.blacklist=bnxt_en on the kernel command line, then apply the linked driver disk to see if it works?
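To confirm that the blacklist option actually reached the running system, the command line the kernel booted with can be inspected (on Linux it is exposed in /proc/cmdline). Below is a minimal sketch of that check. The function names and the example command line string are assumptions made for illustration and are not taken from the XCP-ng installer.

```python
def blacklisted_modules(cmdline):
    """Collect module names from every modprobe.blacklist= argument."""
    modules = set()
    for arg in cmdline.split():
        key, sep, value = arg.partition("=")
        if key == "modprobe.blacklist" and sep:
            for name in value.split(","):
                if name:
                    # modprobe treats '-' and '_' in module names as equivalent.
                    modules.add(name.replace("-", "_"))
    return modules


def is_blacklisted(module, cmdline):
    """True if the module appears in a modprobe.blacklist= argument."""
    return module.replace("-", "_") in blacklisted_modules(cmdline)


# Illustrative command line (an assumption); on a booted system read
# the real one with open("/proc/cmdline").read().
example = "console=tty0 modprobe.blacklist=bnxt_en quiet"

if __name__ == "__main__":
    print(is_blacklisted("bnxt_en", example))
```

If the check reports that bnxt_en is not blacklisted, the option was probably not added to the boot entry that was actually used.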

          • tjkreidl Ambassador @umbradark

            @umbradark Maybe too obvious, but is your boot configuration set up for BIOS or UEFI mode?

            • olivierlambert Vates 🪐 Co-Founder CEO

              Broadcom hardware never disappoints.
