Issue installing latest pfSense Plus (24.03 release)
-
I am coming across an issue where I am unable to update pfSense Plus 23.09.1 to pfSense Plus 24.03.
I have tested on several hypervisors: a Dell R620 and another server with an AMD Ryzen 5 3600 CPU.
On both servers the error is exactly the same. I plugged in the console and captured the output from boot to crash.
Also adding the backtrace.
Autoboot in 0 seconds. [Space] to pause
Loading kernel...
/boot/kernel/kernel text=0x19eec0 text=0xff4c38 text=0x17e3db4 data=0x180 data=0x22d718+0x3d18e8 0x8+0x1cb0f0+0x8+0x1da290
Loading configured modules...
/boot/entropy size=0x1000
/boot/kernel/zfs.ko size 0x5ea9a0 at 0x35a7000
/boot/kernel/opensolaris.ko size 0x1e2f0 at 0x3b92000
/boot/kernel/cryptodev.ko size 0x7718 at 0x3bb1000
can't find '/etc/hostid'
staging 0x73600000-0x779e3000 (not copying) tramp 0x779e3000 PT4 0x779e4000
Start @ 0xffffffff8039f000 ...
EFI framebuffer information:
addr, size 0xf0000000, 0x240000
dimensions 1024 x 768
stride 1024
masks 0x00ff0000, 0x0000ff00, 0x000000ff, 0x00000000
GDB: no debug ports present
KDB: debugger backends: ddb
KDB: current backend: ddb
---<<BOOT>>---
Copyright (c) 1992-2024 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 15.0-CURRENT #0 plus-RELENG_24_03-n256311-e71f834dd81: Fri Apr 19 00:28:14 UTC 2024
root@freebsd:/var/jenkins/workspace/pfSense-Plus-snapshots-24_03-main/obj/amd64/Y4MAEJ2R/var/jenkins/workspace/pfSense-Plus-snapshots-24_03-main/sources/FreeBSD-src-plus-RELENG_24_03/amd64.amd64/sys/pfSense amd64
FreeBSD clang version 17.0.6 (https://github.com/llvm/llvm-project.git llvmorg-17.0.6-0-g6009708b4367)
VT(efifb): resolution 1024x768
Hyper-V Version: 0.0.0 [SP0]
Features=0x870<APIC,HYPERCALL,VPINDEX,TMFREQ>
PM Features=0x0 [C0]
Features3=0x8<PCPUDPE>
CPU: AMD Ryzen 5 3600 6-Core Processor (3593.36-MHz K8-class CPU)
Origin="AuthenticAMD" Id=0x870f10 Family=0x17 Model=0x71 Stepping=0
Features=0x1783fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE,SSE2,HTT>
Features2=0xfed83203<SSE3,PCLMULQDQ,SSSE3,FMA,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND,HV>
AMD Features=0x2e500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM>
AMD Features2=0x40001f3<LAHF,CMP,CR8,ABM,SSE4A,MAS,Prefetch,DBE>
Structured Extended Features=0x219c01a9<FSGSBASE,BMI1,AVX2,SMEP,BMI2,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,SHA>
Structured Extended Features2=0x400004<UMIP,RDPID>
XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>
AMD Extended Feature Extensions ID EBX=0x1005<CLZERO,XSaveErPtr,IBPB>
Hypervisor: Origin = "Microsoft Hv"
real memory = 2143289344 (2044 MB)
avail memory = 2012315648 (1919 MB)
Event timer "LAPIC" quality 100
ACPI APIC Table: <Xen HVM>
random: registering fast source Intel Secure Key RNG
random: fast provider: "Intel Secure Key RNG"
random: unblocking device.
ioapic0: MADT APIC ID 1 != hw id 0
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0 <Version 1.1> irqs 0-47
TCP_ratelimit: Is now initialized
ipw_bss: You need to read the LICENSE file in /usr/share/doc/legal/intel_ipw.LICENSE.
ipw_bss: If you agree with the license, set legal.intel_ipw.license_ack=1 in /boot/loader.conf.
module_register_init: MOD_LOAD (ipw_bss_fw, 0xffffffff80750310, 0) error 1
ipw_ibss: You need to read the LICENSE file in /usr/share/doc/legal/intel_ipw.LICENSE.
ipw_ibss: If you agree with the license, set legal.intel_ipw.license_ack=1 in /boot/loader.conf.
module_register_init: MOD_LOAD (ipw_ibss_fw, 0xffffffff807503c0, 0) error 1
ipw_monitor: You need to read the LICENSE file in /usr/share/doc/legal/intel_ipw.LICENSE.
ipw_monitor: If you agree with the license, set legal.intel_ipw.license_ack=1 in /boot/loader.conf.
module_register_init: MOD_LOAD (ipw_monitor_fw, 0xffffffff80750470, 0) error 1
iwi_bss: You need to read the LICENSE file in /usr/share/doc/legal/intel_iwi.LICENSE.
iwi_bss: If you agree with the license, set legal.intel_iwi.license_ack=1 in /boot/loader.conf.
module_register_init: MOD_LOAD (iwi_bss_fw, 0xffffffff80770010, 0) error 1
iwi_ibss: You need to read the LICENSE file in /usr/share/doc/legal/intel_iwi.LICENSE.
iwi_ibss: If you agree with the license, set legal.intel_iwi.license_ack=1 in /boot/loader.conf.
module_register_init: MOD_LOAD (iwi_ibss_fw, 0xffffffff807700c0, 0) error 1
iwi_monitor: You need to read the LICENSE file in /usr/share/doc/legal/intel_iwi.LICENSE.
iwi_monitor: If you agree with the license, set legal.intel_iwi.license_ack=1 in /boot/loader.conf.
module_register_init: MOD_LOAD (iwi_monitor_fw, 0xffffffff80770170, 0) error 1
random: entropy device external interface
wlan: mac acl policy registered
kbd1 at kbdmux0
WARNING: Device "spkr" is Giant locked and may be deleted before FreeBSD 15.0.
efirtc0: <EFI Realtime Clock>
efirtc0: registered as a time-of-day clock, resolution 1.000000s
netgate0: <unknown hardware>
smbios0: <System Management BIOS> at iomem 0x7f3cc000-0x7f3cc01e
smbios0: Version: 2.8, BCD Revision: 2.8
acpi0: <Xen>
acpi0: Power Button (fixed)
acpi0: Sleep Button (fixed)
cpu0: <ACPI CPU> on acpi0
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 62500000 Hz quality 950
attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
atrtc0: registered as a time-of-day clock, resolution 1.000000s
Event timer "RTC" frequency 32768 Hz quality 0
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <32-bit timer at 3.579545MHz> port 0xb008-0xb00b on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
isab0: <PCI-ISA bridge> at device 1.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel PIIX3 WDMA2 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xc1a0-0xc1af at device 1.1 on pci0
ata0: <ATA channel> at channel 0 on atapci0
ata1: <ATA channel> at channel 1 on atapci0
uhci0: <Intel 82371SB (PIIX3) USB controller> port 0xc180-0xc19f irq 23 at device 1.2 on pci0
usbus0 on uhci0
pci0: <bridge> at device 1.3 (no driver attached)
vgapci0: <VGA-compatible display> mem 0xf0000000-0xf1ffffff,0xf3042000-0xf3042fff at device 2.0 on pci0
vgapci0: Boot video device
xenpci0: <Xen Platform Device> port 0xc000-0xc0ff mem 0xf2000000-0xf2ffffff irq 28 at device 3.0 on pci0
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x2dee022
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff8128c005
stack pointer = 0x28:0xffffffff83f0da88
frame pointer = 0x28:0xffffffff83f0dad0
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (swapper)
rdi: 0000000000000000 rsi: ffffffff83f0da98 rdx: 0000000000000009
rcx: 0000000000001800 r8: 0000000000000007 r9: 0000000000000002
rax: 0000000002dee022 rbx: fffff800016fc000 rbp: ffffffff83f0dad0
r10: 0000000000000000 r11: ffffffff83f0d8f4 r12: ffffffff82d5aee0
r13: fffff800017c0690 r14: fffff800016fc600 r15: 0000000000001800
trap number = 12
panic: page fault
cpuid = 0
time = 1
KDB: enter: panic
[ thread pid 0 tid 100000 ]
Stopped at kdb_enter+0x33: movq $0,0x235af42(%rip)
and the backtrace:
db> bt
Tracing pid 0 tid 100000 td 0xffffffff8303de40
kdb_enter() at kdb_enter+0x33/frame 0xffffffff83f0c890
panic() at panic+0x43/frame 0xffffffff83f0c8f0
trap_fatal() at trap_fatal+0x40f/frame 0xffffffff83f0c950
trap_pfault() at trap_pfault+0x4f/frame 0xffffffff83f0c9b0
calltrap() at calltrap+0x8/frame 0xffffffff83f0c9b0
--- trap 0xc, rip = 0xffffffff8128c005, rsp = 0xffffffff83f0ca88, rbp = 0xffffffff83f0cad0 ---
xen_start32() at xen_start32+0x5/frame 0xffffffff83f0cad0
xenpci_attach() at xenpci_attach+0x207/frame 0xffffffff83f0cb10
device_attach() at device_attach+0x3b5/frame 0xffffffff83f0cb60
bus_generic_attach() at bus_generic_attach+0x4b/frame 0xffffffff83f0cb90
pci_attach() at pci_attach+0xcb/frame 0xffffffff83f0cbd0
acpi_pci_attach() at acpi_pci_attach+0x17/frame 0xffffffff83f0cc10
device_attach() at device_attach+0x3b5/frame 0xffffffff83f0cc60
bus_generic_attach() at bus_generic_attach+0x4b/frame 0xffffffff83f0cc90
acpi_pcib_acpi_attach() at acpi_pcib_acpi_attach+0x42f/frame 0xffffffff83f0ccf0
device_attach() at device_attach+0x3b5/frame 0xffffffff83f0cd40
bus_generic_attach() at bus_generic_attach+0x4b/frame 0xffffffff83f0cd70
acpi_probe_children() at acpi_probe_children+0x237/frame 0xffffffff83f0cdd0
acpi_attach() at acpi_attach+0x972/frame 0xffffffff83f0ce60
device_attach() at device_attach+0x3b5/frame 0xffffffff83f0ceb0
bus_generic_attach() at bus_generic_attach+0x4b/frame 0xffffffff83f0cee0
device_attach() at device_attach+0x3b5/frame 0xffffffff83f0cf30
bus_generic_new_pass() at bus_generic_new_pass+0x127/frame 0xffffffff83f0cf60
root_bus_configure() at root_bus_configure+0x36/frame 0xffffffff83f0cf90
configure() at configure+0x9/frame 0xffffffff83f0cfa0
mi_startup() at mi_startup+0x1c8/frame 0xffffffff83f0cff0
db>
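For anyone skimming the trace: the frames printed right after the "--- trap" marker are the ones of interest, since the first is where the fault was taken and the second is its caller. Here is a minimal sketch, assuming a POSIX shell and awk, that pulls those two names out of a saved ddb backtrace. The function name frames_after_trap is made up for illustration, and the sample input is an excerpt of the trace above.

```shell
#!/bin/sh
# frames_after_trap (hypothetical helper): read a ddb backtrace on stdin and
# print the two function names that follow the "--- trap" marker.
# Works whether the trace has one frame per line or was pasted as one long line.
frames_after_trap() {
    awk '
        { buf = buf " " $0 }                       # join all input into one string
        END {
            i = index(buf, "--- trap")
            if (i == 0) exit 1                     # no trap marker found
            rest = substr(buf, i)
            n = 0
            # each frame is printed as "name() at name+0x.../frame 0x..."
            while (n < 2 && match(rest, /[A-Za-z_][A-Za-z0-9_]*\(\) at /)) {
                print substr(rest, RSTART, RLENGTH - 6)   # strip the trailing "() at "
                rest = substr(rest, RSTART + RLENGTH)
                n++
            }
        }
    '
}

# Excerpt from the backtrace in this post
frames_after_trap <<'EOF'
calltrap() at calltrap+0x8/frame 0xffffffff83f0c9b0
--- trap 0xc, rip = 0xffffffff8128c005, rsp = 0xffffffff83f0ca88, rbp = 0xffffffff83f0cad0 ---
xen_start32() at xen_start32+0x5/frame 0xffffffff83f0cad0
xenpci_attach() at xenpci_attach+0x207/frame 0xffffffff83f0cb10
device_attach() at device_attach+0x3b5/frame 0xffffffff83f0cb60
EOF
```

On this trace it prints xen_start32 and xenpci_attach, which at least shows the fault happens while the Xen platform PCI device (xenpci0, the last device in the boot log) is being attached.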
I also tried changing the NIC type in XO > VM > Advanced > NIC Type from Realtek to Intel e1000, but it had no effect.
-
I'm looking for some expert insight from the XCP-ng project developers...
Or will XCP-ng be incompatible with future FreeBSD versions?
-
@Affonso Hi!
Are you using XCP-ng 8.2.1 or 8.3?
I can't find the 24.03 ISO. Is this the paid version?
-
What happens if you create a new VM with the "other" template and attempt a clean installation? Do you get the same error?
Also, what kind of configuration have you given the machine? For example, NIC? RAM? Dynamic or static RAM?
-
@AtaxyaNetwork I'm using XCP-ng 8.2.1
There isn't a 24.03 ISO, as it is the paid (Plus) option.
Right now, to install it I have to install the 2.7.2 CE version first. Netgate's systems then identify the Netgate ID of the device and allow the upgrade to pfSense Plus 23.09.1, and after that you are offered the 24.03 Plus version. There isn't a direct update from 2.7.2 CE to 24.03 Plus.
I have installed the 24.03 version on physical devices. The virtualised devices, however, crash on the first boot after the update.
I'm not the guy to diagnose boot crashes. I captured what I could.
All I can affirm without doubt is that the error is exactly the same on two completely different machines.
-
@nikade They have always been created using the "other" template.
The original VMs had 1 vCPU and 2 GB preset by the template. Aside from the values changed initially, I didn't make changes to the memory/resources.
NIC: 2 NICs (one for WAN, another for LAN). Initially they always have the "Realtek" driver, so all I did in one test was change that to Intel e1000. But there was no change to the outcome.
-
Hi,
Can you go to the "Advanced" tab of the VM and show the memory settings?
I want to make sure you're not using dynamic memory, since it is really unstable in Xen.
-
@nikade It shows like this.
Should I make any changes?
-
Can you try adding a bit more RAM, just in case? Something like 4GiB/4GiB, to see if it's better.
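If you prefer checking from the host CLI rather than the XO "Advanced" tab, here is a rough sketch, assuming the standard xe tool on the XCP-ng host. "4GiB/4GiB" here means dynamic min and dynamic max are both 4 GiB, so the VM gets a fixed allocation. The helper names is_fixed_memory and report_vm_memory are hypothetical, and the VM UUID is something you supply yourself.

```shell
#!/bin/sh
# is_fixed_memory (hypothetical helper): succeeds when dynamic-min equals
# dynamic-max, i.e. the VM is given a fixed amount of RAM.
is_fixed_memory() {
    [ "$1" -eq "$2" ]
}

# report_vm_memory (hypothetical wrapper): run on the XCP-ng host with the
# VM UUID as its only argument. xe reports these values in bytes.
report_vm_memory() {
    dmin="$(xe vm-param-get uuid="$1" param-name=memory-dynamic-min)" || return 1
    dmax="$(xe vm-param-get uuid="$1" param-name=memory-dynamic-max)" || return 1
    if is_fixed_memory "$dmin" "$dmax"; then
        echo "fixed: $dmax bytes"
    else
        echo "dynamic: $dmin to $dmax bytes"
    fi
}

# 4 GiB expressed in bytes, for comparison with what xe prints
four_gib=$((4 * 1024 * 1024 * 1024))
is_fixed_memory "$four_gib" "$four_gib" && echo "4GiB/4GiB is a fixed allocation"
```

Usage would be something like `report_vm_memory <vm-uuid>`; if it reports "dynamic", setting min and max to the same value in XO gives the fixed allocation asked for here.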
-
@olivierlambert I changed the memory settings like this
I'm going to proceed with the upgrade and post the outcome
-
Perfect, keep us posted!
-
The same error
-
Is there a way we can test that ISO? It's hard to reproduce if we don't have any way to test it here or in the community
-
@olivierlambert let me talk to Netgate support and work some way out
-
Googling the error "supervisor read data, page not present" gives a lot of hints towards bad memory. Are you using ECC or non-ECC memory?
-
The Dell R620 (Intel):
# dmidecode -t memory | grep -i "ecc"
Error Correction Type: Multi-bit ECC
The lab AMD Ryzen:
# dmidecode -t memory | grep -i "ecc"
#
One server has ECC memory, the other doesn't. I believe it would be a very odd case of having "bad memory" on two completely different instances, and the fault being exactly the same on the different machines ... but as I said, this is not my area of expertise.
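A side note on the empty grep result: matching on "ecc" only prints something when the reported type itself contains that string. A small sketch that prints the "Error Correction Type" field whatever its value is shown here; ecc_types is a made-up helper name, and the "None" sample line is my assumption of what dmidecode typically reports for non-ECC memory, not output captured from the Ryzen box.

```shell
#!/bin/sh
# ecc_types (hypothetical helper): read `dmidecode -t memory` output on stdin
# and print each distinct "Error Correction Type" value.
ecc_types() {
    sed -n 's/^[[:space:]]*Error Correction Type:[[:space:]]*//p' | sort -u
}

# On a real host (as root): dmidecode -t memory | ecc_types
# Sample line taken from the R620 output above:
printf '\tError Correction Type: Multi-bit ECC\n' | ecc_types
# Assumed sample for a non-ECC machine:
printf '\tError Correction Type: None\n' | ecc_types
```

That way the non-ECC case shows an explicit value instead of an empty prompt.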
-
@Affonso I had a problem upgrading OPNsense to 24.7, which uses FreeBSD 14, on XCP-ng 8.2.1.
OPNsense had a kernel crash related to Xen when using some FreeBSD kernel options. OPNsense was able to update their kernel to resolve the crash.
It was an OPNsense/FreeBSD issue, not an XCP-ng issue. It has been resolved.
Here's the OPNsense GitHub issue. I don't know if it's related.
-
Thank you @Andrew, I will mention this to pfSense. Let's hope.
-
So just to give a quick update on how this went:
Since pfSense 24.03 is based on FreeBSD 15, I proceeded with a FreeBSD 15 installation on XCP-ng to see if the issue stemmed from there. FreeBSD 15 installed and booted correctly.
From there I ended up testing the development snapshot 24.08.
It also installed and booted correctly.
So whatever issue between pfSense 24.03 and XCP-ng was preventing it from booting was only present in version 24.03.
-
@Affonso Looks like it might have been Bug 15684 in 24.03 that was resolved for 24.11 (release notes).