      XCP-ng 8.3: Broadcom BCM57414 bnxt_en Driver Fails to Probe on HPE DL380a Gen12

      Summary

      We have been unable to install XCP-ng 8.3 (xcp-ng-8.3.0-20250606.2.iso) on an HPE ProLiant DL380a Gen12 equipped with a Broadcom BCM57414 dual-port 25GbE NIC. The bnxt_en driver fails to probe the NIC during boot, and as a result the installer finds no network interfaces and does not allow us to proceed.

      This appears to be the same class of issue reported in XCP-ng forum topic #11366 for Dell R660 systems with a Broadcom BCM57504. In that case the workaround was to disable the NIC in BIOS, install via another NIC, yum update to get the newer driver, then re-enable the NIC. We attempted a similar approach by using an answerfile to bypass the installer's network interface check (equivalent to what disabling the NIC achieves during install). However, even with 8.3 fully installed, the bnxt_en driver still fails to probe the NIC on boot and the NIC never comes up, so the problem appears to go beyond the installer.

      XCP-ng 8.2.1 installs without any issues on this same hardware using broadcom-bnxt-en-1.10.0_216.0.119.1-3.

      Hardware

      | Component | Detail |
      |-----------|--------|
      | Server | HPE ProLiant DL380a Gen12 |
      | Boot mode | UEFI only (HPE DL380a Gen12 has dropped legacy BIOS support) |
      | NIC | Broadcom BCM57414 Dual-Port 25GbE (PCI 0001:86:00.0 / 0001:86:00.1), firmware confirmed at latest version via HPE iLO firmware updater |
      | OS disk | /dev/nvme0n1 |

      Symptom

      During boot on XCP-ng 8.3, the bnxt_en driver attempts to probe the two BCM57414 ports and fails. Both ports fail to initialize, so no network interfaces are available. In the interactive installer, this means the NIC selection screen has no interfaces to choose from and does not allow the user to proceed. After a full install (via answerfile, see below), the same probe failure occurs on every boot, leaving the system with no network.

      The specific error messages vary by driver version (see test details below). A representative example, taken from a post-install boot with the stock v232 driver, is:

      [root@localhost ~]# dmesg | grep -i bnxt
      [   63.591964] Broadcom NetXtreme-C/E/S driver bnxt_en v1.10.3-232.0.155.5+
      [   67.525804] bnxt_en 0001:86:00.0 (unnamed net_device) (uninitialized): Firmware not responding, rc: -16 status: 0x8000
      [   67.547729] bnxt_en: probe of 0001:86:00.0 failed with error -16
      [   71.408224] bnxt_en 0001:86:00.1 (unnamed net_device) (uninitialized): Firmware not responding, rc: -16 status: 0x8000
      [   71.439436] bnxt_en: probe of 0001:86:00.1 failed with error -16
      [root@localhost ~]#
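
For anyone comparing logs across boots, the failure lines above can be pulled out mechanically. The following is a minimal Python sketch and was not part of the original testing. The regular expression is an assumption derived only from the two "probe of ... failed" lines quoted above, and the function name `failed_probes` is made up for illustration. It also maps the return code to its errno name (on Linux, -16 is -EBUSY).

```python
import errno
import re

# Pattern is an assumption based only on the sample lines in this post,
# not on the bnxt_en driver source.
PROBE_FAILED = re.compile(
    r"bnxt_en: probe of (?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) "
    r"failed with error (?P<rc>-?\d+)"
)


def failed_probes(dmesg_text):
    """Return {pci_address: (rc, errno_name)} for every failed bnxt_en probe."""
    results = {}
    for match in PROBE_FAILED.finditer(dmesg_text):
        rc = int(match.group("rc"))
        # Kernel return codes are negated errno values, so -16 maps to EBUSY.
        results[match.group("pci")] = (rc, errno.errorcode.get(-rc, "unknown"))
    return results


# Two lines copied from the dmesg output above.
SAMPLE = """\
[   67.547729] bnxt_en: probe of 0001:86:00.0 failed with error -16
[   71.439436] bnxt_en: probe of 0001:86:00.1 failed with error -16
"""

if __name__ == "__main__":
    for pci, (rc, name) in failed_probes(SAMPLE).items():
        print(f"{pci}: rc={rc} ({name})")
```

On a live host the input would be the text of `dmesg` rather than the embedded sample.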
      

      Variations Tested

      We tested five different driver/kernel combinations, none of which resolved the issue.


      Test 1: Stock XCP-ng 8.3 ISO

      ISO: xcp-ng-8.3.0-20250606.2.iso (unmodified)
      Driver: broadcom-bnxt-en-1.10.3_232.0.155.5-1 (ships with the ISO)
      Kernel: 4.19.0+1 (default)
      Result: FAIL — installer stops at NIC detection

      With the stock ISO and the interactive installer, the bnxt_en driver fails to probe the NIC during boot. The installer then finds no network interfaces and does not allow us to continue.

      Boot dmesg output:

      [   15.179392] bnxt_en 0001:86:00.0 (unnamed net_device) (uninitialized): Firmware not responding, rc: -16 status: 0x8000
      [   18.020473] bnxt_en 0001:86:00.1 (unnamed net_device) (uninitialized): Firmware not responding, rc: -16 status: 0x8000
      

      Answerfile approach used for Tests 2–5

      Since the interactive installer requires at least one working network interface to proceed, we could not test different driver or kernel combinations through the normal install flow. To work around this, we built custom ISOs with an upgrade answerfile baked into install.img and referenced from the boot entries. The answerfile uses mode="upgrade" with source type="local", which skips the NIC selection screen entirely because it reads packages from the ISO media rather than over the network.

      <?xml version="1.0"?>
      <installation mode="upgrade" repo-gpgcheck="false">
        <existing-installation>/dev/nvme0n1</existing-installation>
        <source type="local"/>
        <ui-confirmation-prompt>true</ui-confirmation-prompt>
      </installation>
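
As an illustration, the same answerfile could be generated per host instead of edited by hand. This sketch uses only Python's standard `xml.etree.ElementTree`. The function name `build_upgrade_answerfile` is hypothetical, and the element and attribute names are copied from the answerfile above rather than from the installer documentation.

```python
import xml.etree.ElementTree as ET


def build_upgrade_answerfile(target_disk):
    """Build the upgrade answerfile shown above for a given target disk."""
    root = ET.Element(
        "installation", {"mode": "upgrade", "repo-gpgcheck": "false"}
    )
    ET.SubElement(root, "existing-installation").text = target_disk
    # type="local" is what lets the installer read packages from the ISO media.
    ET.SubElement(root, "source", {"type": "local"})
    ET.SubElement(root, "ui-confirmation-prompt").text = "true"
    ET.indent(root, space="  ")  # requires Python 3.9 or newer
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0"?>\n' + body + "\n"


if __name__ == "__main__":
    print(build_upgrade_answerfile("/dev/nvme0n1"), end="")
```

Parsing the result back with `ET.fromstring` is a cheap way to confirm the file is well-formed before baking it into install.img.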
      

      We first installed XCP-ng 8.2.1 (which works on this hardware), then used each custom 8.3 ISO to perform an answerfile-based upgrade. This allowed the installation to complete in all cases, but after rebooting into 8.3 the bnxt_en driver still failed to probe the NIC, leaving the system with no network connectivity.

      The custom ISOs were built by extracting install.img, replacing the bnxt_en.ko driver module, repacking, and updating the on-media Packages/ repository.


      Test 2: Latest driver (v237) on default kernel

      Driver: broadcom-bnxt-en-1.10.3_237.1.20.0-8.1.xcpng8.3.x86_64.rpm
      Kernel: 4.19.0+1 (default)
      Result: FAIL — same firmware timeout error as stock ISO

      Driver version confirmed loaded:

      11:00 localhost /# modinfo bnxt_en | grep version
      version:        1.10.3-237.1.20.0
      srcversion:     533BB7E5866E52F63B9ACCB
      vermagic:       4.19.0+1 SMP mod_unload modversions
      

      Boot dmesg output:

      [   14.868610] bnxt_en 0001:86:00.0 (unnamed net_device) (uninitialized): Firmware not responding, rc: -16 status: 0x8000
      [   18.397942] bnxt_en 0001:86:00.1 (unnamed net_device) (uninitialized): Firmware not responding, rc: -16 status: 0x8000
      

      Test 3: 8.2 driver (v216) on default kernel

      Driver: broadcom-bnxt-en-1.10.0_216.0.119.1-3.xcpng8.3.x86_64.rpm (same driver that works on 8.2.1)
      Kernel: 4.19.0+1 (default)
      Result: FAIL — different error message but still no NIC enumeration

      This is the same driver version that works on XCP-ng 8.2.1 on this hardware (with the same kernel 4.19.0+1). On 8.3 it produces a different error than the v232/v237 drivers: Error (timeout: 500015) instead of Firmware not responding.

      Boot dmesg output:

      [   13.252206] bnxt_en 0001:86:00.0 (unnamed net_device) (uninitialized): Error (timeout: 500015) msg (0x0 0x0) len:0
      [   15.051022] bnxt_en 0001:86:00.1 (unnamed net_device) (uninitialized): Error (timeout: 500015) msg (0x0 0x0) len:0
      

      Post-boot driver version and kernel confirmed:

      11:53 sv26257 /# modinfo bnxt_en | grep version
      version:        1.10.0-216.0.119.1
      srcversion:     533BB7E5866E52F63B9ACCB
      vermagic:       4.19.0+1 SMP mod_unload modversions
      11:54 sv26257 /# uname -a
      Linux sv26257.dmdelf01.root.lan 4.19.0+1 #1 SMP Tue May 6 15:24:43 CEST 2025 x86_64 x86_64 x86_64 GNU/Linux
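
A quick consistency check on output like the above is that the module's vermagic names the running kernel release, which confirms the swapped-in module was built for the kernel that booted. This is a rough sketch that assumes `modinfo` output in the "key: value" form shown. The helper names are made up for illustration.

```python
def parse_modinfo(text):
    """Parse `modinfo` output lines of the form 'key:   value' into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            # Keep the first occurrence so repeated keys do not overwrite it.
            fields.setdefault(key.strip(), value.strip())
    return fields


def module_matches_kernel(modinfo_text, kernel_release):
    """True when the first vermagic token equals the given kernel release."""
    vermagic = parse_modinfo(modinfo_text).get("vermagic", "")
    return vermagic.split()[:1] == [kernel_release]


# Copied from the Test 3 output above.
MODINFO = """\
version:        1.10.0-216.0.119.1
srcversion:     533BB7E5866E52F63B9ACCB
vermagic:       4.19.0+1 SMP mod_unload modversions
"""

if __name__ == "__main__":
    print(parse_modinfo(MODINFO)["version"])
    print(module_matches_kernel(MODINFO, "4.19.0+1"))
```

On a live host the kernel release argument would come from `uname -r`.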
      

      Test 4: Alternate kernel with bundled driver (v1.9.2)

      Driver: bnxt_en v1.9.2 (bundled with the alternate kernel)
      Kernel: 4.19.322+1 (alternate)
      Result: FAIL — no error message, but NIC still not enumerated

      The alternate kernel does not produce the bnxt_en error messages visible with the default kernel, which initially looked promising. However, the NIC still does not appear in ip link show. No bnxt_en messages appear at all — the driver seems to load silently without bringing up any interfaces.

      Post-boot driver version:

      12:32 sv26259 /# uname -a
      Linux sv26259 4.19.322+1 #1 SMP Fri Oct 11 14:37:51 CEST 2024 x86_64 x86_64 x86_64 GNU/Linux
      12:32 sv26259 /# modinfo bnxt_en | grep version
      version:        1.9.2
      srcversion:     E9EB303EF3601778A241B46
      vermagic:       4.19.322+1 SMP mod_unload modversions
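
Because the alternate kernel logs nothing, one way to tell whether bnxt_en bound to the device at all might be to look at sysfs rather than dmesg. The sketch below relies on the standard Linux sysfs layout, where each PCI device directory has a `driver` symlink while a driver is bound and a `net/` directory once a network interface is registered. It was not run on this hardware, and the function name `pci_binding` is hypothetical.

```python
import os


def pci_binding(pci_addr, sysfs_root="/sys"):
    """Return (driver_name_or_None, [net interface names]) for a PCI function."""
    dev = os.path.join(sysfs_root, "bus", "pci", "devices", pci_addr)
    driver_link = os.path.join(dev, "driver")
    # The 'driver' symlink only exists while a driver is bound to the device.
    driver = None
    if os.path.islink(driver_link):
        driver = os.path.basename(os.readlink(driver_link))
    net_dir = os.path.join(dev, "net")
    # A NIC driver that registered a netdev exposes its name under 'net/'.
    interfaces = sorted(os.listdir(net_dir)) if os.path.isdir(net_dir) else []
    return driver, interfaces


if __name__ == "__main__":
    # PCI addresses taken from the hardware table in this post.
    for addr in ("0001:86:00.0", "0001:86:00.1"):
        driver, interfaces = pci_binding(addr)
        print(f"{addr}: driver={driver} interfaces={interfaces}")
```

A result of `driver=None` would suggest the probe never bound, while a bound driver with an empty interface list would point at a later stage of initialization.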
      

      Test 5: Alternate kernel with latest driver (v237)

      Driver: broadcom-bnxt-en-1.10.3_237.1.20.0-8.1.xcpng8.3.x86_64.rpm
      Kernel: 4.19.322+1 (alternate)
      Result: FAIL — NIC still not functional

      Combining the alternate kernel with the latest v237 driver did not resolve the issue either.


      Summary Table

      | # | Variation | Driver Version | Kernel | Error | Result |
      |---|-----------|----------------|--------|-------|--------|
      | 1 | Stock 8.3 ISO (interactive) | 1.10.3-232.0.155.5 | 4.19.0+1 | Firmware not responding, rc: -16 status: 0x8000 | FAIL |
      | 2 | Latest driver (v237) | 1.10.3-237.1.20.0 | 4.19.0+1 | Firmware not responding, rc: -16 status: 0x8000 | FAIL |
      | 3 | 8.2 driver (v216) | 1.10.0-216.0.119.1 | 4.19.0+1 | Error (timeout: 500015) msg (0x0 0x0) len:0 | FAIL |
      | 4 | Alternate kernel, bundled driver | 1.9.2 | 4.19.322+1 | No error output, NIC silently not enumerated | FAIL |
      | 5 | Alternate kernel, latest driver (v237) | 1.10.3-237.1.20.0 | 4.19.322+1 | NIC not functional | FAIL |

      Analysis

      What we observed

      • XCP-ng 8.2.1 works on this hardware with the v216 driver. No errors.
      • XCP-ng 8.3 fails on this hardware with every driver version we tested (v216, v232, v237, v1.9.2).
      • Both 8.2.1 and 8.3 use the same kernel version: 4.19.0+1.
      • Both the default kernel (4.19.0+1) and the alternate kernel (4.19.322+1) fail on 8.3.
      • The hardware is identical across all tests. The NIC firmware is confirmed at the latest version via HPE iLO.
      • The problem persists after installation — it does not appear to be limited to the installer environment. We confirmed this by using answerfile-based upgrades (see methodology above) to get 8.3 installed, and the driver still failed to probe the NIC on every boot.
      • The HPE DL380a Gen12 only supports UEFI boot. All testing was performed under UEFI.

      What we believe we can rule out

      • Driver version: We tested four different driver versions on 8.3 and all of them fail, so the driver version alone does not appear to be the cause.
      • Kernel version: Both 8.2.1 and 8.3 use 4.19.0+1. The alternate kernel (4.19.322+1) also fails. The kernel version does not appear to be the differentiator.
      • Hardware / NIC firmware: The hardware is unchanged between all tests, and the NIC firmware is at the latest version. We do not believe hardware is the cause.
      • Installer-only issue: The problem reproduces after a full install and reboot, not just during the installer.

      What remains

      Based on the above, the main variable that changes between the working 8.2.1 setup and the failing 8.3 setup appears to be the XCP-ng release itself — something other than the kernel or the bnxt_en driver may have changed between 8.2.1 and 8.3 that is causing the NIC probe to fail. We have not been able to identify what that is.
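
One way to narrow down what changed between the releases might be to diff the installed package sets of the working and failing installs. The sketch below assumes each listing was captured with `rpm -qa --qf '%{NAME} %{VERSION}-%{RELEASE}\n'`. The function names are hypothetical, and the only sample data is the bnxt_en package version already quoted in this post.

```python
def parse_packages(listing):
    """Parse lines of 'name version-release' into a {name: version} dict."""
    packages = {}
    for line in listing.splitlines():
        name, _, version = line.strip().partition(" ")
        if name and version:
            packages[name] = version
    return packages


def changed_packages(old_listing, new_listing):
    """Return {name: (old_version, new_version)} for packages that differ."""
    old = parse_packages(old_listing)
    new = parse_packages(new_listing)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(old.keys() | new.keys())
        if old.get(name) != new.get(name)
    }


# Only the bnxt_en package versions quoted in this post are used as sample data.
OLD = "broadcom-bnxt-en 1.10.0_216.0.119.1-3\n"  # XCP-ng 8.2.1
NEW = "broadcom-bnxt-en 1.10.3_232.0.155.5-1\n"  # XCP-ng 8.3 stock ISO

if __name__ == "__main__":
    for name, (before, after) in changed_packages(OLD, NEW).items():
        print(f"{name}: {before} -> {after}")
```

A package present on only one side shows up with `None` for the missing version, which makes added and removed packages easy to spot alongside version changes.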

      Additional notes

      • On 8.3, different driver versions produce different error messages (Firmware not responding, rc: -16 status: 0x8000 for v232/v237 vs Error (timeout: 500015) msg (0x0 0x0) len:0 for v216), but in all cases the NIC does not appear in ip link show.
      • The alternate kernel does not produce any bnxt_en error messages at all, but the NIC still does not appear in ip link show.
      • The Dell R660 forum thread mentions that the successful workaround was performed under BIOS boot and that they "did not attempt UEFI". Since our platform only supports UEFI, we cannot test whether BIOS boot would make a difference.

      Environment

      • Working: XCP-ng 8.2.1 with broadcom-bnxt-en-1.10.0_216.0.119.1-3, kernel 4.19.0+1
      • Failing: XCP-ng 8.3 (xcp-ng-8.3.0-20250606.2.iso) with all tested driver versions
      • Server: HPE ProLiant DL380a Gen12

      Related

      • XCP-ng Forum: XCP-ng 8.3 and Dell R660 - crash during boot (bnxt_en) — same bnxt_en probe failure on Dell R660 with BCM57504
      posted in Hardware
      maximsachs