Posts made by Maelstrom96 | XCP-ng and XO forum

Maelstrom96

@dinhngtu What is the exact kernel patch that is required for the xen-platform-pci-bar-uc=false fix to work on a Linux guest? We're looking at potentially compiling our own kernel with the xen-netfront.c patch, and we would like to see about adding the other part of the Kernel code needed for the Grant table fix.

Maelstrom96

@dinhngtu we couldn't delay any longer and had to deploy the cluster so we swapped the drives with some CD6-R instead. We'll most likely re-explore this later as we have 3 other nodes to deploy soon-ish, and I'll get some logs then.

Maelstrom96

@olivierlambert you basically replied just after that I noticed that and deleted my message...

Maelstrom96

@dinhngtu The only output in hypervisor.log file is what I sent earlier.

Here is daemon.log:

Oct 10 16:30:01 lab-pprod-xen03 systemd[1]: Started Session c18 of user root.
Oct 10 16:30:01 lab-pprod-xen03 systemd[1]: Starting Session c18 of user root.
Oct 10 16:30:01 lab-pprod-xen03 systemd[1]: Started Session c19 of user root.
Oct 10 16:30:01 lab-pprod-xen03 systemd[1]: Starting Session c19 of user root.
Oct 10 16:30:13 lab-pprod-xen03 tapdisk[20267]: received 'sring disconnect' message (uuid = 0)
Oct 10 16:30:13 lab-pprod-xen03 tapdisk[20267]: disconnecting domid=6, devid=768
Oct 10 16:30:13 lab-pprod-xen03 tapdisk[20267]: sending 'sring disconnect rsp' message (uuid = 0)
Oct 10 16:30:13 lab-pprod-xen03 systemd[1]: Stopping transient unit for varstored-6...
Oct 10 16:30:13 lab-pprod-xen03 systemd[1]: Stopped transient unit for varstored-6.
Oct 10 16:30:13 lab-pprod-xen03 qemu-dm-6[20398]: qemu-dm-6: terminating on signal 15 from pid 2169 (/usr/sbin/xenopsd-xc)
Oct 10 16:30:14 lab-pprod-xen03 /opt/xensource/libexec/xcp-clipboardd[20392]: poll failed because revents=0x11 (qemu socket)
Oct 10 16:30:14 lab-pprod-xen03 ovs-ofctl: ovs|00001|ofp_port|WARN|Negative value -1 is not a valid port number.
Oct 10 16:30:14 lab-pprod-xen03 tapdisk[20267]: received 'close' message (uuid = 0)
Oct 10 16:30:14 lab-pprod-xen03 tapdisk[20267]: nbd: NBD server pause(0x198d410)
Oct 10 16:30:14 lab-pprod-xen03 tapdisk[20267]: nbd: NBD server pause(0x198d610)
Oct 10 16:30:14 lab-pprod-xen03 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=30 -- --if-exists del-port vif6.0
Oct 10 16:30:14 lab-pprod-xen03 ovs-ofctl: ovs|00001|ofp_port|WARN|Negative value -1 is not a valid port number.
Oct 10 16:30:14 lab-pprod-xen03 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=30 -- --if-exists del-port vif6.0
Oct 10 16:30:14 lab-pprod-xen03 ovs-ofctl: ovs|00001|ofp_port|WARN|Negative value -1 is not a valid port number.
Oct 10 16:30:14 lab-pprod-xen03 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=30 -- --if-exists del-port vif6.1
Oct 10 16:30:14 lab-pprod-xen03 tapdisk[20267]: nbd: NBD server free(0x198d410)
Oct 10 16:30:14 lab-pprod-xen03 tapdisk[20267]: nbd: NBD server free(0x198d610)
Oct 10 16:30:14 lab-pprod-xen03 tapdisk[20267]: gaps written/skipped: 444/0
Oct 10 16:30:14 lab-pprod-xen03 tapdisk[20267]: /var/run/sr-mount/5bd2dee1-cfb7-be70-0326-3f9070c4ca2d/721646c6-7a3f-4909-bde8-70dac75f5361.vhd: b: 25600, a: 2686, f: 2658, n: 11023552
Oct 10 16:30:14 lab-pprod-xen03 tapdisk[20267]: closed image /var/run/sr-mount/5bd2dee1-cfb7-be70-0326-3f9070c4ca2d/721646c6-7a3f-4909-bde8-70dac75f5361.vhd (0 users, state: 0x00000000, ty$
Oct 10 16:30:14 lab-pprod-xen03 tapdisk[20267]: sending 'close response' message (uuid = 0)
Oct 10 16:30:14 lab-pprod-xen03 tapdisk[20267]: received 'detach' message (uuid = 0)
Oct 10 16:30:14 lab-pprod-xen03 tapdisk[20267]: sending 'detach response' message (uuid = 0)
Oct 10 16:30:14 lab-pprod-xen03 tapdisk[20267]: tapdisk-log: closing after 0 errors
Oct 10 16:30:14 lab-pprod-xen03 tapdisk[20267]: tapdisk-syslog: 32 messages, 2739 bytes, xmits: 33, failed: 0, dropped: 0
Oct 10 16:30:14 lab-pprod-xen03 tapdisk[20267]: tapdisk-control: draining 1 connections
Oct 10 16:30:14 lab-pprod-xen03 tapdisk[20267]: tapdisk-control: done
Oct 10 16:30:16 lab-pprod-xen03 tapback[20277]: backend.c:1246 domain removed, exit
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: Command line: -controloutfd 8 -controlinfd 9 -mode hvm_build -image /usr/libexec/xen/boot/hvmloader -domid 7 -store_port 5 -store_d$
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: Domain Properties: Type HVM, hap 1
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: Determined the following parameters from xenstore:
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: vcpu/number:4 vcpu/weight:256 vcpu/cap:0
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: nx: 1, pae 1, cores-per-socket 0, x86-fip-width 0, nested 0
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: apic: 1 acpi: 1 acpi_s4: 0 acpi_s3: 0 tsc_mode: 0 hpet: 1
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: nomigrate 0, timeoffset 0 mmio_hole_size 0
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: viridian: 0, time_ref_count: 0, reference_tsc: 0 hcall_remote_tlb_flush: 0 apic_assist: 0 crash_ctl: 0 stimer: 0 hcall_ipi: 0
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: vcpu/0/affinity:1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111$
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: vcpu/1/affinity:1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111$
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: vcpu/2/affinity:1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111$
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: vcpu/3/affinity:1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111$
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_allocate: cmdline="", features=""
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_kernel_file: filename="/usr/libexec/xen/boot/hvmloader"
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_malloc_filemap    : 631 kB
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_module_file: filename="/usr/share/ipxe/ipxe.bin"
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_malloc_filemap    : 132 kB
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_boot_xen_init: ver 4.17, caps xen-3.0-x86_64 hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_parse_image: called
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_find_loader: trying multiboot-binary loader ...
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: loader probe failed
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_find_loader: trying HVM-generic loader ...
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: loader probe OK
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: xc: detail: ELF: phdr: paddr=0x100000 memsz=0x57e24
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: xc: detail: ELF: memory: 0x100000 -> 0x157e24
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_compat_check: supported guest type: xen-3.0-x86_64
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_compat_check: supported guest type: hvm-3.0-x86_32 <= matches
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_compat_check: supported guest type: hvm-3.0-x86_32p
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_compat_check: supported guest type: hvm-3.0-x86_64
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: Getting RMRRs for device '0000:f1:00.0'
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: Getting total MMIO space occupied for device '0000:f1:00.0'
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: Getting RMRRs for device '0000:f3:00.0'
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: Getting total MMIO space occupied for device '0000:f3:00.0'
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: Getting RMRRs for device '0000:21:00.0'
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: Getting total MMIO space occupied for device '0000:f3:00.0'
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: Getting RMRRs for device '0000:21:00.0'
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: Getting total MMIO space occupied for device '0000:21:00.0'
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: Getting RMRRs for device '0000:64:00.0'
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: Getting total MMIO space occupied for device '0000:64:00.0'
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: Getting RMRRs for device '0000:63:00.0'
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: Getting total MMIO space occupied for device '0000:63:00.0'
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: Getting RMRRs for device '0000:23:00.0'
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: Getting total MMIO space occupied for device '0000:23:00.0'
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: Getting RMRRs for device '0000:22:00.0'
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: Getting total MMIO space occupied for device '0000:22:00.0'
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: Calculated provisional MMIO hole size as 0x20000000
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: Loaded OVMF from /usr/share/edk2/OVMF-release.fd
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_mem_init: mem 8184 MB, pages 0x1ff800 pages, 4k each
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_mem_init: 0x1ff800 pages
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_boot_mem_init: called
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: range: start=0x0 end=0xe0000000
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: range: start=0x100000000 end=0x21f800000
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: xc: detail: PHYSICAL MEMORY ALLOCATION:
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: xc: detail:   4KB PAGES: 0x0000000000000200
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: xc: detail:   2MB PAGES: 0x00000000000003fb
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: xc: detail:   1GB PAGES: 0x0000000000000006
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: Final lower MMIO hole size is 0x20000000
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_build_image: called
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_pfn_to_ptr_retcount: domU mapping: pfn 0x100+0x58 at 0x7f6d1170f000
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_alloc_segment:   kernel       : 0x100000 -> 0x157e24  (pfn 0x100 + 0x58 pages)
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: xc: detail: ELF: phdr 0 at 0x7f6d0fb1e000 -> 0x7f6d0fb6f200
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_pfn_to_ptr_retcount: domU mapping: pfn 0x158+0x200 at 0x7f6d0f976000
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_alloc_segment:   System Firmware module : 0x158000 -> 0x358000  (pfn 0x158 + 0x200 pages)
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_pfn_to_ptr_retcount: domU mapping: pfn 0x358+0x22 at 0x7f6d116ed000
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_alloc_segment:   module0      : 0x358000 -> 0x379200  (pfn 0x358 + 0x22 pages)
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_pfn_to_ptr_retcount: domU mapping: pfn 0x37a+0x1 at 0x7f6d118cd000
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_alloc_segment:   HVM start info : 0x37a000 -> 0x37a878  (pfn 0x37a + 0x1 pages)
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_build_image  : virt_alloc_end : 0x37b000
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_build_image  : virt_pgtab_end : 0x0
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_boot_image: called
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: domain builder memory footprint
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail:    allocated
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail:       malloc             : 18525 bytes
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail:       anon mmap          : 0 bytes
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail:    mapped
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail:       file mmap          : 764 kB
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail:       domU mmap          : 2540 kB
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: Adding module 0 guest_addr 358000 len 135680
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: vcpu_hvm: called
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_set_gnttab_entry: d7 gnt[0] -> d0 0xfefff
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_set_gnttab_entry: d7 gnt[1] -> d0 0xfeffc
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: Parsing '178bfbff-f6fa3203-2e500800-040001f3-0000000f-219c07a9-0040060c-00000000-311ed005-00000010-00000000-18000064-00000000-00000$
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_release: called
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: Writing to control: 'result:1044476 1044479#012'
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: domainbuilder: detail: xc_dom_release: called
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: Writing to control: 'result:1044476 1044479#012'
Oct 10 16:30:16 lab-pprod-xen03 xenguest-7-build[47808]: All done
Oct 10 16:30:17 lab-pprod-xen03 ovs-vsctl: ovs|00001|db_ctl_base|ERR|no row "vif7.0" in table Interface
Oct 10 16:30:17 lab-pprod-xen03 ovs-vsctl: ovs|00001|db_ctl_base|ERR|no row "vif7.1" in table Interface
Oct 10 16:30:17 lab-pprod-xen03 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=30 -- --if-exists del-port vif7.0
Oct 10 16:30:17 lab-pprod-xen03 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=30 -- --if-exists del-port vif7.1
Oct 10 16:30:17 lab-pprod-xen03 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=30 add-port xapi0 vif7.0 -- set interface vif7.0 "external-ids:\"xs-vm-uuid\"=\"ab6fa81f-59d2-$
Oct 10 16:30:17 lab-pprod-xen03 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=30 add-port xapi3 vif7.1 -- set interface vif7.1 "external-ids:\"xs-vm-uuid\"=\"ab6fa81f-59d2-$
Oct 10 16:30:18 lab-pprod-xen03 tapdisk[48097]: tapdisk-control: init, 10 x 4k buffers
Oct 10 16:30:18 lab-pprod-xen03 tapdisk[48097]: I/O queue driver: lio
Oct 10 16:30:18 lab-pprod-xen03 tapdisk[48097]: I/O queue driver: lio
Oct 10 16:30:18 lab-pprod-xen03 tapdisk[48097]: tapdisk-log: started, level 0
Oct 10 16:30:18 lab-pprod-xen03 tapdisk[48097]: Tapdisk running, control on /var/run/blktap-control/ctl48097
Oct 10 16:30:18 lab-pprod-xen03 tapdisk[48097]: nbd: Set up local unix domain socket on path '/var/run/blktap-control/nbdclient48097'
Oct 10 16:30:18 lab-pprod-xen03 tapdisk[48097]: received 'attach' message (uuid = 0)
Oct 10 16:30:18 lab-pprod-xen03 tapdisk[48097]: sending 'attach response' message (uuid = 0)
Oct 10 16:30:18 lab-pprod-xen03 tapdisk[48097]: received 'open' message (uuid = 0)
Oct 10 16:30:18 lab-pprod-xen03 tapdisk[48097]: /var/run/sr-mount/5bd2dee1-cfb7-be70-0326-3f9070c4ca2d/721646c6-7a3f-4909-bde8-70dac75f5361.vhd version: tap 0x00010003, b: 25600, a: 2686, $
Oct 10 16:30:18 lab-pprod-xen03 tapdisk[48097]: opened image /var/run/sr-mount/5bd2dee1-cfb7-be70-0326-3f9070c4ca2d/721646c6-7a3f-4909-bde8-70dac75f5361.vhd (1 users, state: 0x00000001, ty$
Oct 10 16:30:18 lab-pprod-xen03 tapdisk[48097]: VBD CHAIN:
Oct 10 16:30:18 lab-pprod-xen03 tapdisk[48097]: /var/run/sr-mount/5bd2dee1-cfb7-be70-0326-3f9070c4ca2d/721646c6-7a3f-4909-bde8-70dac75f5361.vhd: type:vhd(4) storage:ext(2)
Oct 10 16:30:18 lab-pprod-xen03 tapdisk[48097]: bdev: capacity=104857600 sector_size=512/512 flags=0
Oct 10 16:30:18 lab-pprod-xen03 tapdisk[48097]: nbd: Set up local unix domain socket on path '/var/run/blktap-control/nbdserver48097.0'
Oct 10 16:30:18 lab-pprod-xen03 tapdisk[48097]: nbd: registering for unix_listening_fd
Oct 10 16:30:18 lab-pprod-xen03 tapdisk[48097]: nbd: Successfully started NBD server on /var/run/blktap-control/nbd-old48097.0
Oct 10 16:30:18 lab-pprod-xen03 tapdisk[48097]: nbd: Set up local unix domain socket on path '/var/run/blktap-control/nbdserver-new48097.0'
Oct 10 16:30:18 lab-pprod-xen03 tapdisk[48097]: nbd: registering for unix_listening_fd
Oct 10 16:30:18 lab-pprod-xen03 tapdisk[48097]: nbd: Successfully started NBD server on /var/run/blktap-control/nbd48097.0
Oct 10 16:30:18 lab-pprod-xen03 tapdisk[48097]: sending 'open response' message (uuid = 0)
Oct 10 16:30:18 lab-pprod-xen03 tapback[48107]: tapback.c:445 slave tapback daemon started, only serving domain 7
Oct 10 16:30:18 lab-pprod-xen03 tapback[48107]: backend.c:406 768 physical_device_changed
Oct 10 16:30:18 lab-pprod-xen03 tapback[48107]: backend.c:406 768 physical_device_changed
Oct 10 16:30:18 lab-pprod-xen03 tapback[48107]: backend.c:492 768 found tapdisk[48097], for 254:0
Oct 10 16:30:18 lab-pprod-xen03 tapdisk[48097]: received 'disk info' message (uuid = 0)
Oct 10 16:30:18 lab-pprod-xen03 tapdisk[48097]: VBD 0 got disk info: sectors=104857600 sector size=512, info=0
Oct 10 16:30:18 lab-pprod-xen03 tapdisk[48097]: sending 'disk info rsp' message (uuid = 0)
Oct 10 16:30:18 lab-pprod-xen03 systemd[1]: Started transient unit for varstored-7.
Oct 10 16:30:18 lab-pprod-xen03 systemd[1]: Starting transient unit for varstored-7...
Oct 10 16:30:18 lab-pprod-xen03 varstored-7[48166]: main: --domain = '7'
Oct 10 16:30:18 lab-pprod-xen03 varstored-7[48166]: main: --chroot = '/var/run/xen/varstored-root-7'
Oct 10 16:30:18 lab-pprod-xen03 varstored-7[48166]: main: --depriv = '(null)'
Oct 10 16:30:18 lab-pprod-xen03 varstored-7[48166]: main: --uid = '65542'
Oct 10 16:30:18 lab-pprod-xen03 varstored-7[48166]: main: --gid = '998'
Oct 10 16:30:18 lab-pprod-xen03 varstored-7[48166]: main: --backend = 'xapidb'
Oct 10 16:30:18 lab-pprod-xen03 varstored-7[48166]: main: --arg = 'socket:/xapi-depriv-socket'
Oct 10 16:30:18 lab-pprod-xen03 varstored-7[48166]: main: --pidfile = '/var/run/xen/varstored-7.pid'
Oct 10 16:30:18 lab-pprod-xen03 varstored-7[48166]: main: --arg = 'uuid:ab6fa81f-59d2-8bb1-fdf8-35969838ec7a'
Oct 10 16:30:18 lab-pprod-xen03 varstored-7[48166]: main: --arg = 'save:/efi-vars-save.dat'
Oct 10 16:30:18 lab-pprod-xen03 varstored-7[48166]: varstored_initialize: 4 vCPU(s)
Oct 10 16:30:18 lab-pprod-xen03 varstored-7[48166]: main: --arg = 'save:/efi-vars-save.dat'
Oct 10 16:30:18 lab-pprod-xen03 varstored-7[48166]: varstored_initialize: 4 vCPU(s)
Oct 10 16:30:18 lab-pprod-xen03 varstored-7[48166]: varstored_initialize: ioservid = 0
Oct 10 16:30:18 lab-pprod-xen03 varstored-7[48166]: varstored_initialize: iopage = 0x7f5b175d1000
Oct 10 16:30:18 lab-pprod-xen03 varstored-7[48166]: varstored_initialize: VCPU0: 7 -> 356
Oct 10 16:30:18 lab-pprod-xen03 varstored-7[48166]: varstored_initialize: VCPU1: 8 -> 357
Oct 10 16:30:18 lab-pprod-xen03 varstored-7[48166]: varstored_initialize: VCPU2: 9 -> 358
Oct 10 16:30:18 lab-pprod-xen03 varstored-7[48166]: varstored_initialize: VCPU3: 10 -> 359
Oct 10 16:30:18 lab-pprod-xen03 varstored-7[48166]: load_one_auth_data: Auth file '/var/lib/varstored/dbx.auth' is missing!
Oct 10 16:30:18 lab-pprod-xen03 varstored-7[48166]: load_one_auth_data: Auth file '/var/lib/varstored/db.auth' is missing!
Oct 10 16:30:18 lab-pprod-xen03 varstored-7[48166]: load_one_auth_data: Auth file '/var/lib/varstored/KEK.auth' is missing!
Oct 10 16:30:18 lab-pprod-xen03 varstored-7[48166]: initialize_settings: Secure boot enable: false
Oct 10 16:30:18 lab-pprod-xen03 varstored-7[48166]: initialize_settings: Authenticated variables: enforcing
Oct 10 16:30:18 lab-pprod-xen03 varstored-7[48166]: IO request not ready
Oct 10 16:30:18 lab-pprod-xen03 varstored-7[48166]: message repeated 3 times: [ IO request not ready]
Oct 10 16:30:18 lab-pprod-xen03 forkexecd: [ info||0 ||forkexecd] qemu-dm-7[48182]: Arguments: 7 --syslog -std-vga -videoram 8 -vnc unix:/var/run/xen/vnc-7,lock-key-sync=off -acpi -priv -m$
Oct 10 16:30:18 lab-pprod-xen03 forkexecd: [ info||0 ||forkexecd] qemu-dm-7[48182]: Exec: /usr/lib64/xen/bin/qemu-system-i386 qemu-dm-7 -machine pc-i440fx-2.10,accel=xen,max-ram-below-4g=3$
Oct 10 16:30:18 lab-pprod-xen03 qemu-dm-7[48225]: Moving to cgroup slice 'vm.slice'
Oct 10 16:30:18 lab-pprod-xen03 qemu-dm-7[48225]: core dump limit: 67108864
Oct 10 16:30:18 lab-pprod-xen03 qemu-dm-7[48225]: char device redirected to /dev/pts/2 (label serial0)
Oct 10 16:30:18 lab-pprod-xen03 ovs-vsctl: ovs|00001|db_ctl_base|ERR|no row "tap7.0" in table Interface
Oct 10 16:30:18 lab-pprod-xen03 ovs-vsctl: ovs|00001|db_ctl_base|ERR|no row "tap7.1" in table Interface
Oct 10 16:30:18 lab-pprod-xen03 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=30 -- --if-exists del-port tap7.0
Oct 10 16:30:18 lab-pprod-xen03 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=30 -- --if-exists del-port tap7.1
Oct 10 16:30:18 lab-pprod-xen03 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=30 add-port xapi0 tap7.0 -- set interface tap7.0 "external-ids:\"xs-vm-uuid\"=\"ab6fa81f-59d2-$
Oct 10 16:30:18 lab-pprod-xen03 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=30 add-port xapi3 tap7.1 -- set interface tap7.1 "external-ids:\"xs-vm-uuid\"=\"ab6fa81f-59d2-$
Oct 10 16:30:19 lab-pprod-xen03 qemu-dm-7[48225]: [00:08.0] xen_pt_region_update: Error: create new mem mapping failed! (err: 1)
Oct 10 16:30:33 lab-pprod-xen03 qemu-dm-7[48225]: Detected Xen version 4.17
Oct 10 16:30:34 lab-pprod-xen03 qemu-dm-7[48225]: [00:08.0] xen_pt_region_update: Error: remove old mem mapping failed! (err: 1)
Oct 10 16:30:35 lab-pprod-xen03 qemu-dm-7[48225]: [00:08.0] xen_pt_region_update: Error: create new mem mapping failed! (err: 1)
Oct 10 16:30:35 lab-pprod-xen03 qemu-dm-7[48225]: [00:08.0] xen_pt_region_update: Error: remove old mem mapping failed! (err: 1)
Oct 10 16:30:36 lab-pprod-xen03 qemu-dm-7[48225]: [00:08.0] xen_pt_region_update: Error: create new mem mapping failed! (err: 1)
Oct 10 16:30:37 lab-pprod-xen03 tapdisk[48097]: received 'sring connect' message (uuid = 0)
Oct 10 16:30:37 lab-pprod-xen03 tapdisk[48097]: connecting VBD 0 domid=7, devid=768, pool (null), evt 16, poll duration 1000, poll idle threshold 50
Oct 10 16:30:37 lab-pprod-xen03 tapdisk[48097]: ring 0xbed010 connected
Oct 10 16:30:37 lab-pprod-xen03 tapdisk[48097]: sending 'sring connect rsp' message (uuid = 0)
Oct 10 16:30:37 lab-pprod-xen03 qemu-dm-7[48225]: XenPvBlk: New disk with 104857600 sectors of 512 bytes
Oct 10 16:30:38 lab-pprod-xen03 qemu-dm-7[48225]: About to call StartImage (0xDEC16D18)
Oct 10 16:30:40 lab-pprod-xen03 qemu-dm-7[48225]: ExitBootServices -> (0xDEC16D18, 0xD9D)
Oct 10 16:30:40 lab-pprod-xen03 tapdisk[48097]: received 'sring disconnect' message (uuid = 0)
Oct 10 16:30:40 lab-pprod-xen03 tapdisk[48097]: disconnecting domid=7, devid=768
Oct 10 16:30:40 lab-pprod-xen03 tapdisk[48097]: sending 'sring disconnect rsp' message (uuid = 0)
Oct 10 16:30:40 lab-pprod-xen03 qemu-dm-7[48225]: ExitBootServices <- (Success)
Oct 10 16:30:41 lab-pprod-xen03 qemu-dm-7[48225]: SetVirtualAddressMap -> (0x4B0, 0x30, 0x1)
Oct 10 16:30:41 lab-pprod-xen03 qemu-dm-7[48225]: SetVirtualAddressMap <- (Success)
Oct 10 16:30:41 lab-pprod-xen03 ovs-ofctl: ovs|00001|ofp_port|WARN|Negative value -1 is not a valid port number.
Oct 10 16:30:41 lab-pprod-xen03 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=30 -- --if-exists del-port tap7.1
Oct 10 16:30:41 lab-pprod-xen03 ovs-ofctl: ovs|00001|ofp_port|WARN|Negative value -1 is not a valid port number.
Oct 10 16:30:41 lab-pprod-xen03 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=30 -- --if-exists del-port tap7.0
Oct 10 16:30:41 lab-pprod-xen03 qemu-dm-7[48225]: [00:08.0] xen_pt_region_update: Error: remove old mem mapping failed! (err: 1)
Oct 10 16:30:41 lab-pprod-xen03 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=30 -- --if-exists del-port tap7.0
Oct 10 16:30:41 lab-pprod-xen03 qemu-dm-7[48225]: [00:08.0] xen_pt_region_update: Error: remove old mem mapping failed! (err: 1)
Oct 10 16:30:41 lab-pprod-xen03 qemu-dm-7[48225]: [00:08.0] xen_pt_region_update: Error: create new mem mapping failed! (err: 1)
Oct 10 16:30:43 lab-pprod-xen03 tapback[48107]: frontend.c:216 768 front-end supports persistent grants but we don't
Oct 10 16:30:43 lab-pprod-xen03 tapdisk[48097]: received 'sring connect' message (uuid = 0)
Oct 10 16:30:43 lab-pprod-xen03 tapdisk[48097]: connecting VBD 0 domid=7, devid=768, pool (null), evt 49, poll duration 1000, poll idle threshold 50
Oct 10 16:30:43 lab-pprod-xen03 tapdisk[48097]: ring 0xbee810 connected
Oct 10 16:30:43 lab-pprod-xen03 tapdisk[48097]: sending 'sring connect rsp' message (uuid = 0)
Oct 10 16:30:47 lab-pprod-xen03 systemd[1]: Stopping transient unit for varstored-7...
Oct 10 16:30:47 lab-pprod-xen03 systemd[1]: Stopped transient unit for varstored-7.
Oct 10 16:30:47 lab-pprod-xen03 qemu-dm-7[48225]: qemu-dm-7: terminating on signal 15 from pid 2169 (/usr/sbin/xenopsd-xc)
Oct 10 16:30:47 lab-pprod-xen03 /opt/xensource/libexec/xcp-clipboardd[48221]: poll failed because revents=0x11 (qemu socket)
Oct 10 16:30:47 lab-pprod-xen03 tapdisk[48097]: received 'sring disconnect' message (uuid = 0)
Oct 10 16:30:47 lab-pprod-xen03 tapdisk[48097]: disconnecting domid=7, devid=768
Oct 10 16:30:47 lab-pprod-xen03 tapdisk[48097]: sending 'sring disconnect rsp' message (uuid = 0)
Oct 10 16:30:47 lab-pprod-xen03 tapdisk[48097]: received 'close' message (uuid = 0)
Oct 10 16:30:47 lab-pprod-xen03 tapdisk[48097]: nbd: NBD server pause(0xbfe410)
Oct 10 16:30:47 lab-pprod-xen03 tapdisk[48097]: nbd: NBD server pause(0xbfe610)
Oct 10 16:30:48 lab-pprod-xen03 tapdisk[48097]: nbd: NBD server free(0xbfe410)
Oct 10 16:30:48 lab-pprod-xen03 tapdisk[48097]: nbd: NBD server free(0xbfe610)
Oct 10 16:30:48 lab-pprod-xen03 tapdisk[48097]: gaps written/skipped: 2/0
Oct 10 16:30:48 lab-pprod-xen03 tapdisk[48097]: /var/run/sr-mount/5bd2dee1-cfb7-be70-0326-3f9070c4ca2d/721646c6-7a3f-4909-bde8-70dac75f5361.vhd: b: 25600, a: 2686, f: 2658, n: 11023552
Oct 10 16:30:48 lab-pprod-xen03 tapdisk[48097]: closed image /var/run/sr-mount/5bd2dee1-cfb7-be70-0326-3f9070c4ca2d/721646c6-7a3f-4909-bde8-70dac75f5361.vhd (0 users, state: 0x00000000, ty$
Oct 10 16:30:48 lab-pprod-xen03 tapdisk[48097]: sending 'close response' message (uuid = 0)
Oct 10 16:30:48 lab-pprod-xen03 tapdisk[48097]: received 'detach' message (uuid = 0)
Oct 10 16:30:48 lab-pprod-xen03 tapdisk[48097]: sending 'detach response' message (uuid = 0)
Oct 10 16:30:48 lab-pprod-xen03 tapdisk[48097]: tapdisk-log: closing after 0 errors
Oct 10 16:30:48 lab-pprod-xen03 tapdisk[48097]: tapdisk-syslog: 32 messages, 2735 bytes, xmits: 33, failed: 0, dropped: 0
Oct 10 16:30:48 lab-pprod-xen03 ovs-ofctl: ovs|00001|ofp_port|WARN|Negative value -1 is not a valid port number.
Oct 10 16:30:48 lab-pprod-xen03 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=30 -- --if-exists del-port vif7.1
Oct 10 16:30:48 lab-pprod-xen03 tapdisk[48097]: tapdisk-control: draining 1 connections
Oct 10 16:30:48 lab-pprod-xen03 tapdisk[48097]: tapdisk-control: done
Oct 10 16:30:48 lab-pprod-xen03 ovs-ofctl: ovs|00001|ofp_port|WARN|Negative value -1 is not a valid port number.
Oct 10 16:30:48 lab-pprod-xen03 ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl --timeout=30 -- --if-exists del-port vif7.0
Oct 10 16:30:49 lab-pprod-xen03 tapback[48107]: backend.c:1246 domain removed, exit

@olivierlambert SKU is KCMYXRUG15T3

Maelstrom96

@TeddyAstie

I've checked an I do not have any patch available for the host:

[16:06 lab-pprod-xen03 ~]# yum update
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
Excluding mirror: updates.xcp-ng.org
 * xcp-ng-base: mirrors.xcp-ng.org
Excluding mirror: updates.xcp-ng.org
 * xcp-ng-updates: mirrors.xcp-ng.org
No packages marked for update
[16:06 lab-pprod-xen03 ~]# yum upgrade
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
Excluding mirror: updates.xcp-ng.org
 * xcp-ng-base: mirrors.xcp-ng.org
Excluding mirror: updates.xcp-ng.org
 * xcp-ng-updates: mirrors.xcp-ng.org
No packages marked for update
[16:06 lab-pprod-xen03 ~]#

I've run the commands, but there doesn't seem to be additionnal logs. Should I be looking elsewhere?

[2025-10-10 16:28:15] (XEN) [ 4959.062050] 'G' pressed -> guest log level adjustments enabled
[2025-10-10 16:28:16] (XEN) [ 4960.384372] '+' pressed -> guest log level: Errors (rate limited Errors and warnings)
[2025-10-10 16:30:47] (XEN) [ 5110.535350] domain_crash called from svm_vmexit_handler+0x129f/0x1480
[2025-10-10 16:30:47] (XEN) [ 5110.535351] Domain 7 (vcpu#3) crashed on cpu#3:
[2025-10-10 16:30:47] (XEN) [ 5110.535354] ----[ Xen-4.17.5-15  x86_64  debug=n  Not tainted ]----
[2025-10-10 16:30:47] (XEN) [ 5110.535354] CPU:    3
[2025-10-10 16:30:47] (XEN) [ 5110.535355] RIP:    0010:[<ffffffffa6d1b5d0>]
[2025-10-10 16:30:47] (XEN) [ 5110.535356] RFLAGS: 0000000000000286   CONTEXT: hvm guest (d7v3)
[2025-10-10 16:30:47] (XEN) [ 5110.535358] rax: ffff9e0280081200   rbx: ffff9e028007fba4   rcx: 0000000000000000
[2025-10-10 16:30:47] (XEN) [ 5110.535359] rdx: 00000000fee97000   rsi: 0000000000000000   rdi: 0000000000000000
[2025-10-10 16:30:47] (XEN) [ 5110.535359] rbp: ffff8dc094984780   rsp: ffff9e028007fb58   r8:  0000000000000000
[2025-10-10 16:30:47] (XEN) [ 5110.535360] r9:  0000000000000000   r10: ffff9e028007fb18   r11: 0000000000000000
[2025-10-10 16:30:47] (XEN) [ 5110.535361] r12: 0000000000000197   r13: ffff8dc040bd40c8   r14: 0000000000000011
[2025-10-10 16:30:47] (XEN) [ 5110.535361] r15: 0000000000000001   cr0: 0000000080050033   cr4: 0000000000770ef0
[2025-10-10 16:30:47] (XEN) [ 5110.535362] cr3: 000000010200c006   cr2: 0000000000000000
[2025-10-10 16:30:47] (XEN) [ 5110.535362] fsb: 0000000000000000   gsb: ffff8dc157580000   gss: 0000000000000000
[2025-10-10 16:30:47] (XEN) [ 5110.535363] ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0018   cs: 0010

Maelstrom96

@TeddyAstie It's a newly deployed Rocky Linux 9.6 with all the latest updates applied to it.

Nested virtualization is disabled.

Maelstrom96

@andSmv Any news on a test build with the patch? I'm wondering if this issue is related and would love to be able to test.

Maelstrom96

I'm having a weird issue with PCIe pass-through for our KIOXIA CM7 drives. We have a bunch of KIOXIA CX6 drives and those are being pass-through without any issue.

First thing I've noticed is that the device ID is not being properly recognized, and instead of showing the CM7 name, it's only displaying the device as Device 0013.

I've tried raising the IRQ limit as explained in the doc. /opt/xensource/libexec/xen-cmdline --set-xen "extra_guest_irqs=128"

Here are some logs from hypervisor.log when the VM crashes during boot.

[2025-10-08 19:15:52] (XEN) [   93.430177] d[IDLE]v14: Unsupported MSI delivery mode 7 for Dom1
[2025-10-08 19:15:52] (XEN) [   93.436563] d[IDLE]v14: Unsupported MSI delivery mode 7 for Dom1
[2025-10-08 19:15:52] (XEN) [   93.439733] d1v0: Unsupported MSI delivery mode 7 for Dom1
[2025-10-08 19:15:52] (XEN) [   93.448323] d1v0: Unsupported MSI delivery mode 7 for Dom1
[2025-10-08 19:15:52] (XEN) [   93.448801] d1v0: Unsupported MSI delivery mode 7 for Dom1
[2025-10-08 19:15:52] (XEN) [   93.457235] d1v0: Unsupported MSI delivery mode 7 for Dom1
[2025-10-08 19:27:23] (XEN) [  784.468669] domain_crash called from svm_vmexit_handler+0x129f/0x1480
[2025-10-08 19:27:23] (XEN) [  784.468671] Domain 3 (vcpu#1) crashed on cpu#1:
[2025-10-08 19:27:23] (XEN) [  784.468673] ----[ Xen-4.17.5-15  x86_64  debug=n  Not tainted ]----
[2025-10-08 19:27:23] (XEN) [  784.468674] CPU:    1
[2025-10-08 19:27:23] (XEN) [  784.468674] RIP:    0010:[<ffffffffa751b5d0>]
[2025-10-08 19:27:23] (XEN) [  784.468675] RFLAGS: 0000000000000286   CONTEXT: hvm guest (d3v1)
[2025-10-08 19:27:23] (XEN) [  784.468676] rax: ffffb90bc0071200   rbx: ffffb90bc007fba4   rcx: 0000000000000000
[2025-10-08 19:27:23] (XEN) [  784.468677] rdx: 00000000fee97000   rsi: 0000000000000000   rdi: 0000000000000000
[2025-10-08 19:27:23] (XEN) [  784.468678] rbp: ffff944cba5c9a80   rsp: ffffb90bc007fb58   r8:  0000000000000000
[2025-10-08 19:27:23] (XEN) [  784.468678] r9:  0000000000000000   r10: ffffb90bc007fb18   r11: 0000000000000000
[2025-10-08 19:27:23] (XEN) [  784.468679] r12: 0000000000000197   r13: ffff944c80c830c8   r14: 0000000000000011
[2025-10-08 19:27:23] (XEN) [  784.468679] r15: 0000000000000001   cr0: 0000000080050033   cr4: 0000000000770ef0
[2025-10-08 19:27:23] (XEN) [  784.468680] cr3: 00000001105dc006   cr2: 0000000000000000
[2025-10-08 19:27:23] (XEN) [  784.468680] fsb: 0000000000000000   gsb: ffff944d97480000   gss: 0000000000000000
[2025-10-08 19:27:23] (XEN) [  784.468681] ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0018   cs: 0010

The unsupported MSI delivery mode seems to be from our CX6 drives, but they seem to be working fine.

Here is a lspci output for one of the drive:

lspci -s e1:00.0 -vv
e1:00.0 Non-Volatile memory controller: KIOXIA Corporation Device 0013 (rev 01) (prog-if 02 [NVM Express])
        Subsystem: KIOXIA Corporation Device 0043
        Physical Slot: 0-2
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 182
        Region 0: Memory at f2810000 (64-bit, non-prefetchable) [size=64K]
        Expansion ROM at f2800000 [disabled] [size=64K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [70] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 1024 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 75.000W
                DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported-
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
                        MaxPayload 512 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed unknown, Width x4, ASPM not supported, Exit Latency L0s <2us, L1 <64us
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 16GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                LnkCtl2: Target Link Speed: Unknown, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+, EqualizationPhase1+
                         EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
        Capabilities: [b0] MSI-X: Enable- Count=129 Masked-
                Vector table: BAR=0 offset=00005200
                PBA: BAR=0 offset=0000d600
        Capabilities: [d0] Vital Product Data
                Product Name: KIOXIA ESSD
                Read-only fields:
                        [PN] Part number: KIOXIA KCMYXRUG15T3
                        [EC] Engineering changes: 0001
                        [SN] Serial number: 3DH0A00A0LP1
                        [MN] Manufacture ID: 31 45 30 46
                        [RV] Reserved: checksum good, 26 byte(s) reserved
                End
        Capabilities: [100 v2] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO+ CmpltAbrt- UnxCmplt+ RxOF+ MalfTLP+ ECRC+ UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                AERCap: First Error Pointer: 00, GenCap+ CGenEn+ ChkCap+ ChkEn+
        Capabilities: [148 v1] Device Serial Number 8c-e3-8e-e3-00-32-1f-01
        Capabilities: [168 v1] Alternative Routing-ID Interpretation (ARI)
                ARICap: MFVC- ACS-, Next Function: 0
                ARICtl: MFVC- ACS-, Function Group: 0
        Capabilities: [178 v1] #19
        Capabilities: [198 v1] #26
        Capabilities: [1c0 v1] #27
        Capabilities: [1e8 v1] #2a
        Capabilities: [210 v1] Single Root I/O Virtualization (SR-IOV)
                IOVCap: Migration-, Interrupt Message Number: 000
                IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy+
                IOVSta: Migration-
                Initial VFs: 32, Total VFs: 32, Number of VFs: 0, Function Dependency Link: 00
                VF offset: 1, stride: 1, Device ID: 0013
                Supported Page Size: 00000553, System Page Size: 00000001
                Region 0: Memory at 00000000f2600000 (64-bit, non-prefetchable)
                VF Migration: offset: 00000000, BIR: 0
        Kernel driver in use: pciback
        Kernel modules: nvme

The only thing I haven't tried to do yet is enable SR-IOV, but I don't think that would really change anything.

Thanks in advance for anyone chiming in!

Maelstrom96

@olivierlambert Thanks for the suggestion! The problem is I have no idea where I would start to build it for Slackware. I'll see if I can figure it out but with my research, I'm not quite sure I'll be able to.

Maelstrom96

@eurodrigolira Sorry to revive this topic, but do you have pointers on how to build a slackware package for xe-guest-utilities? I'm trying to add the VM guest tools to UnRAID and I'm not having much luck.

Maelstrom96

@T3CCH What you might be looking for: https://xcp-ng.org/docs/networking.html#full-mesh-network

Maelstrom96

@ronan-a Thanks a lot for that procedure.

Ended up needing to do a little bit more, since for some reason, "evacuate" failed. I deleted the node and then went and just manually recreated my resources using:

linstor resource create --auto-place +1 <resource_name>

Which didn't work at first because the new node didn't have a storage-pool configured, which required this command to work (NOTE - This is only valid if your SR was setup as thin):

linstor storage-pool create lvmthin <node_name> xcp-sr-linstor_group_thin_device linstor_group/thin_device

Also, worth nothing that before actually re-creating the resources, you might want to manually clean up the lingering Logical Volumes that weren't cleaned up if evacuate failed.

Find volumes with:

lvdisplay

and then delete them with:

lvremove <LV Path>

example:

lvremove /dev/linstor_group/xcp-persistent-database_00000

Maelstrom96

@ronan-a Do you know of a way to update a node name in Linstor? I've tried to look in their documentation and checked through CLI commands but couldn't find a way.

Maelstrom96

@ronan-a I will be testing my theory a little bit later today, but I believe it might be a hostname mismatch between the node name it expects in linstor and what it set to now on Dom0. We had the hostname of the node updated before the cluster was spinned up, but I think it still had the previous name active when the linstor SR was created.

This means that the node name doesn't match here:
https://github.com/xcp-ng/sm/blob/e951676098c80e6da6de4d4653f496b15f5a8cb9/drivers/linstorvolumemanager.py#L2641C21-L2641C41

I will try to revert the hostname and see if it fixes everything.

Edit: Just tested and reverted the hostname to the default one, which matches what's in linstor, and it works again. So seems like changing a hostname after the cluster is provisionned is a no-no.

Maelstrom96

@ronan-a said in XOSTOR hyperconvergence preview:

drbdsetup events2

Host1:

[09:49 xcp-ng-labs-host01 ~]# systemctl status linstor-controller
● linstor-controller.service - drbd-reactor controlled linstor-controller
   Loaded: loaded (/usr/lib/systemd/system/linstor-controller.service; disabled; vendor preset: disabled)
  Drop-In: /run/systemd/system/linstor-controller.service.d
           └─reactor.conf
   Active: active (running) since Thu 2024-05-02 13:24:32 PDT; 20h ago
 Main PID: 21340 (java)
   CGroup: /system.slice/linstor-controller.service
           └─21340 /usr/lib/jvm/jre-11/bin/java -Xms32M -classpath /usr/share/linstor-server/lib/conf:/usr/share/linstor-server/lib/* com.linbit.linstor.core.Controller --logs=/var/log/linstor-controller --config-directory=/etc/linstor
[09:49 xcp-ng-labs-host01 ~]# systemctl status linstor-satellite
● linstor-satellite.service - LINSTOR Satellite Service
   Loaded: loaded (/usr/lib/systemd/system/linstor-satellite.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/linstor-satellite.service.d
           └─override.conf
   Active: active (running) since Wed 2024-05-01 16:04:05 PDT; 1 day 17h ago
 Main PID: 1947 (java)
   CGroup: /system.slice/linstor-satellite.service
           ├─1947 /usr/lib/jvm/jre-11/bin/java -Xms32M -classpath /usr/share/linstor-server/lib/conf:/usr/share/linstor-server/lib/* com.linbit.linstor.core.Satellite --logs=/var/log/linstor-satellite --config-directory=/etc/linstor
           ├─2109 drbdsetup events2 all
           └─2347 /usr/sbin/dmeventd
[09:49 xcp-ng-labs-host01 ~]# systemctl status drbd-reactor
● drbd-reactor.service - DRBD-Reactor Service
   Loaded: loaded (/usr/lib/systemd/system/drbd-reactor.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/drbd-reactor.service.d
           └─override.conf
   Active: active (running) since Wed 2024-05-01 16:04:11 PDT; 1 day 17h ago
     Docs: man:drbd-reactor
           man:drbd-reactorctl
           man:drbd-reactor.toml
 Main PID: 1950 (drbd-reactor)
   CGroup: /system.slice/drbd-reactor.service
           ├─1950 /usr/sbin/drbd-reactor
           └─1976 drbdsetup events2 --full --poll
[09:49 xcp-ng-labs-host01 ~]# mountpoint /var/lib/linstor
/var/lib/linstor is a mountpoint
[09:49 xcp-ng-labs-host01 ~]# drbdsetup events2
exists resource name:xcp-persistent-database role:Primary suspended:no force-io-failures:no may_promote:no promotion_score:10103
exists connection name:xcp-persistent-database peer-node-id:1 conn-name:xcp-ng-labs-host03 connection:Connected role:Secondary
exists connection name:xcp-persistent-database peer-node-id:2 conn-name:xcp-ng-labs-host02 connection:Connected role:Secondary
exists device name:xcp-persistent-database volume:0 minor:1000 backing_dev:/dev/linstor_group/xcp-persistent-database_00000 disk:UpToDate client:no quorum:yes
exists peer-device name:xcp-persistent-database peer-node-id:1 conn-name:xcp-ng-labs-host03 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
exists path name:xcp-persistent-database peer-node-id:1 conn-name:xcp-ng-labs-host03 local:ipv4:10.100.0.200:7000 peer:ipv4:10.100.0.202:7000 established:yes
exists peer-device name:xcp-persistent-database peer-node-id:2 conn-name:xcp-ng-labs-host02 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
exists path name:xcp-persistent-database peer-node-id:2 conn-name:xcp-ng-labs-host02 local:ipv4:10.100.0.200:7000 peer:ipv4:10.100.0.201:7000 established:yes
exists resource name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 role:Secondary suspended:no force-io-failures:no may_promote:no promotion_score:10103
exists connection name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:1 conn-name:xcp-ng-labs-host03 connection:Connected role:Secondary
exists connection name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:0 conn-name:xcp-ng-labs-host02 connection:Connected role:Primary
exists device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 volume:0 minor:1001 backing_dev:/dev/linstor_group/xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0_00000 disk:UpToDate client:no quorum:yes
exists peer-device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:1 conn-name:xcp-ng-labs-host03 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
exists path name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:1 conn-name:xcp-ng-labs-host03 local:ipv4:10.100.0.200:7001 peer:ipv4:10.100.0.202:7001 established:yes
exists peer-device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:0 conn-name:xcp-ng-labs-host02 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
exists path name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:0 conn-name:xcp-ng-labs-host02 local:ipv4:10.100.0.200:7001 peer:ipv4:10.100.0.201:7001 established:yes
exists -

Host2:

[09:51 xcp-ng-labs-host02 ~]# systemctl status linstor-controller
● linstor-controller.service - drbd-reactor controlled linstor-controller
   Loaded: loaded (/usr/lib/systemd/system/linstor-controller.service; disabled; vendor preset: disabled)
  Drop-In: /run/systemd/system/linstor-controller.service.d
           └─reactor.conf
   Active: inactive (dead)
[09:51 xcp-ng-labs-host02 ~]# systemctl status linstor-satellite
● linstor-satellite.service - LINSTOR Satellite Service
   Loaded: loaded (/usr/lib/systemd/system/linstor-satellite.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/linstor-satellite.service.d
           └─override.conf
   Active: active (running) since Thu 2024-05-02 10:26:59 PDT; 23h ago
 Main PID: 1990 (java)
   CGroup: /system.slice/linstor-satellite.service
           ├─1990 /usr/lib/jvm/jre-11/bin/java -Xms32M -classpath /usr/share/linstor-server/lib/conf:/usr/share/linstor-server/lib/* com.linbit.linstor.core.Satellite --logs=/var/log/linstor-satellite --config-directory=/etc/linstor
           ├─2128 drbdsetup events2 all
           └─2552 /usr/sbin/dmeventd
[09:51 xcp-ng-labs-host02 ~]# systemctl status drbd-reactor
● drbd-reactor.service - DRBD-Reactor Service
   Loaded: loaded (/usr/lib/systemd/system/drbd-reactor.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/drbd-reactor.service.d
           └─override.conf
   Active: active (running) since Thu 2024-05-02 10:27:07 PDT; 23h ago
     Docs: man:drbd-reactor
           man:drbd-reactorctl
           man:drbd-reactor.toml
 Main PID: 1989 (drbd-reactor)
   CGroup: /system.slice/drbd-reactor.service
           ├─1989 /usr/sbin/drbd-reactor
           └─2035 drbdsetup events2 --full --poll
[09:51 xcp-ng-labs-host02 ~]# mountpoint /var/lib/linstor
/var/lib/linstor is not a mountpoint
[09:51 xcp-ng-labs-host02 ~]# drbdsetup events2
exists resource name:xcp-persistent-database role:Secondary suspended:no force-io-failures:no may_promote:no promotion_score:10103
exists connection name:xcp-persistent-database peer-node-id:0 conn-name:xcp-ng-labs-host01 connection:Connected role:Primary
exists connection name:xcp-persistent-database peer-node-id:1 conn-name:xcp-ng-labs-host03 connection:Connected role:Secondary
exists device name:xcp-persistent-database volume:0 minor:1000 backing_dev:/dev/linstor_group/xcp-persistent-database_00000 disk:UpToDate client:no quorum:yes
exists peer-device name:xcp-persistent-database peer-node-id:0 conn-name:xcp-ng-labs-host01 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
exists path name:xcp-persistent-database peer-node-id:0 conn-name:xcp-ng-labs-host01 local:ipv4:10.100.0.201:7000 peer:ipv4:10.100.0.200:7000 established:yes
exists peer-device name:xcp-persistent-database peer-node-id:1 conn-name:xcp-ng-labs-host03 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
exists path name:xcp-persistent-database peer-node-id:1 conn-name:xcp-ng-labs-host03 local:ipv4:10.100.0.201:7000 peer:ipv4:10.100.0.202:7000 established:yes
exists resource name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 role:Primary suspended:no force-io-failures:no may_promote:no promotion_score:10103
exists connection name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:2 conn-name:xcp-ng-labs-host01 connection:Connected role:Secondary
exists connection name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:1 conn-name:xcp-ng-labs-host03 connection:Connected role:Secondary
exists device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 volume:0 minor:1001 backing_dev:/dev/linstor_group/xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0_00000 disk:UpToDate client:no quorum:yes
exists peer-device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:2 conn-name:xcp-ng-labs-host01 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
exists path name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:2 conn-name:xcp-ng-labs-host01 local:ipv4:10.100.0.201:7001 peer:ipv4:10.100.0.200:7001 established:yes
exists peer-device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:1 conn-name:xcp-ng-labs-host03 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
exists path name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:1 conn-name:xcp-ng-labs-host03 local:ipv4:10.100.0.201:7001 peer:ipv4:10.100.0.202:7001 established:yes
exists -

Host3:

[09:51 xcp-ng-labs-host03 ~]# systemctl status linstor-controller
● linstor-controller.service - drbd-reactor controlled linstor-controller
   Loaded: loaded (/usr/lib/systemd/system/linstor-controller.service; disabled; vendor preset: disabled)
  Drop-In: /run/systemd/system/linstor-controller.service.d
           └─reactor.conf
   Active: inactive (dead)
[09:52 xcp-ng-labs-host03 ~]# systemctl status linstor-satellite
● linstor-satellite.service - LINSTOR Satellite Service
   Loaded: loaded (/usr/lib/systemd/system/linstor-satellite.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/linstor-satellite.service.d
           └─override.conf
   Active: active (running) since Thu 2024-05-02 10:10:16 PDT; 23h ago
 Main PID: 1937 (java)
   CGroup: /system.slice/linstor-satellite.service
           ├─1937 /usr/lib/jvm/jre-11/bin/java -Xms32M -classpath /usr/share/linstor-server/lib/conf:/usr/share/linstor-server/lib/* com.linbit.linstor.core.Satellite --logs=/var/log/linstor-satellite --config-directory=/etc/linstor
           ├─2151 drbdsetup events2 all
           └─2435 /usr/sbin/dmeventd
[09:52 xcp-ng-labs-host03 ~]# systemctl status drbd-reactor
● drbd-reactor.service - DRBD-Reactor Service
   Loaded: loaded (/usr/lib/systemd/system/drbd-reactor.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/drbd-reactor.service.d
           └─override.conf
   Active: active (running) since Thu 2024-05-02 10:10:26 PDT; 23h ago
     Docs: man:drbd-reactor
           man:drbd-reactorctl
           man:drbd-reactor.toml
 Main PID: 1939 (drbd-reactor)
   CGroup: /system.slice/drbd-reactor.service
           ├─1939 /usr/sbin/drbd-reactor
           └─1981 drbdsetup events2 --full --poll
[09:52 xcp-ng-labs-host03 ~]# mountpoint /var/lib/linstor
/var/lib/linstor is not a mountpoint
[09:52 xcp-ng-labs-host03 ~]# drbdsetup events2
exists resource name:xcp-persistent-database role:Secondary suspended:no force-io-failures:no may_promote:no promotion_score:10103
exists connection name:xcp-persistent-database peer-node-id:0 conn-name:xcp-ng-labs-host01 connection:Connected role:Primary
exists connection name:xcp-persistent-database peer-node-id:2 conn-name:xcp-ng-labs-host02 connection:Connected role:Secondary
exists device name:xcp-persistent-database volume:0 minor:1000 backing_dev:/dev/linstor_group/xcp-persistent-database_00000 disk:UpToDate client:no quorum:yes
exists peer-device name:xcp-persistent-database peer-node-id:0 conn-name:xcp-ng-labs-host01 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
exists path name:xcp-persistent-database peer-node-id:0 conn-name:xcp-ng-labs-host01 local:ipv4:10.100.0.202:7000 peer:ipv4:10.100.0.200:7000 established:yes
exists peer-device name:xcp-persistent-database peer-node-id:2 conn-name:xcp-ng-labs-host02 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
exists path name:xcp-persistent-database peer-node-id:2 conn-name:xcp-ng-labs-host02 local:ipv4:10.100.0.202:7000 peer:ipv4:10.100.0.201:7000 established:yes
exists resource name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 role:Secondary suspended:no force-io-failures:no may_promote:no promotion_score:10103
exists connection name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:2 conn-name:xcp-ng-labs-host01 connection:Connected role:Secondary
exists connection name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:0 conn-name:xcp-ng-labs-host02 connection:Connected role:Primary
exists device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 volume:0 minor:1001 backing_dev:/dev/linstor_group/xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0_00000 disk:UpToDate client:no quorum:yes
exists peer-device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:2 conn-name:xcp-ng-labs-host01 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
exists path name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:2 conn-name:xcp-ng-labs-host01 local:ipv4:10.100.0.202:7001 peer:ipv4:10.100.0.200:7001 established:yes
exists peer-device name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:0 conn-name:xcp-ng-labs-host02 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
exists path name:xcp-volume-ace70b43-4950-49f7-9de2-cf9c358dc2b0 peer-node-id:0 conn-name:xcp-ng-labs-host02 local:ipv4:10.100.0.202:7001 peer:ipv4:10.100.0.201:7001 established:yes
exists -

Will be sending the debug file as a DM.

Edit: Just as a sanity check, I tried to reboot the master instead of just restarting the toolstack, and the linstor SR seems to be working as expected again. The XOSTOR tab in XOA now populates (it just errored out before) and the SR scan now goes through.

Edit2: Was able to move a VDI, but then, the same exact error started to happen again. No idea why.

Maelstrom96

@ronan-a Since XOSTOR is supposed to be stable now, I figured I would try it out with a new setup of 3 newly installed 8.2 nodes.

I used the CLI to deploy it. It all went well, and the SR was quickly ready. I was even able to migrate a disk to the Linstor SR and boot the VM. However, after rebooting the master, it seems like the SR doesn't want to allow any disk migration, and manual Scan are failing. I've tried unmounting/remounting the SR fully, restarting the toolstack, but nothing seems to help. The disk that was on Linstor is still accessible and the VM is able to boot.

Here is the error I'm getting:

sr.scan
{
  "id": "e1a9bf4d-26ad-3ef6-b4a5-db98d012e0d9"
}
{
  "code": "SR_BACKEND_FAILURE_47",
  "params": [
    "",
    "The SR is not available [opterr=Database is not mounted]",
    ""
  ],
  "task": {
    "uuid": "a467bd90-8d47-09cc-b8ac-afa35056ff25",
    "name_label": "Async.SR.scan",
    "name_description": "",
    "allowed_operations": [],
    "current_operations": {},
    "created": "20240502T21:40:00Z",
    "finished": "20240502T21:40:01Z",
    "status": "failure",
    "resident_on": "OpaqueRef:b3e2f390-f45f-4614-a150-1eee53f204e1",
    "progress": 1,
    "type": "<none/>",
    "result": "",
    "error_info": [
      "SR_BACKEND_FAILURE_47",
      "",
      "The SR is not available [opterr=Database is not mounted]",
      ""
    ],
    "other_config": {},
    "subtask_of": "OpaqueRef:NULL",
    "subtasks": [],
    "backtrace": "(((process xapi)(filename lib/backtrace.ml)(line 210))((process xapi)(filename ocaml/xapi/storage_access.ml)(line 32))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 35))((process xapi)(filename ocaml/xapi/message_forwarding.ml)(line 131))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename ocaml/xapi/rbac.ml)(line 205))((process xapi)(filename ocaml/xapi/server_helpers.ml)(line 95)))"
  },
  "message": "SR_BACKEND_FAILURE_47(, The SR is not available [opterr=Database is not mounted], )",
  "name": "XapiError",
  "stack": "XapiError: SR_BACKEND_FAILURE_47(, The SR is not available [opterr=Database is not mounted], )
    at Function.wrap (file:///opt/xo/xo-builds/xen-orchestra-202404270302/packages/xen-api/_XapiError.mjs:16:12)
    at default (file:///opt/xo/xo-builds/xen-orchestra-202404270302/packages/xen-api/_getTaskResult.mjs:11:29)
    at Xapi._addRecordToCache (file:///opt/xo/xo-builds/xen-orchestra-202404270302/packages/xen-api/index.mjs:1029:24)
    at file:///opt/xo/xo-builds/xen-orchestra-202404270302/packages/xen-api/index.mjs:1063:14
    at Array.forEach (<anonymous>)
    at Xapi._processEvents (file:///opt/xo/xo-builds/xen-orchestra-202404270302/packages/xen-api/index.mjs:1053:12)
    at Xapi._watchEvents (file:///opt/xo/xo-builds/xen-orchestra-202404270302/packages/xen-api/index.mjs:1226:14)"
}

I quickly glanced over the source code and the SM logs to see if I could identify what was going on but it doesn't seem to be a simple thing.

Logs from SM:

May  2 13:22:02 xcp-ng-labs-host01 SM: [19242] LinstorSR.scan for e1a9bf4d-26ad-3ef6-b4a5-db98d012e0d9
May  2 13:22:02 xcp-ng-labs-host01 SM: [19242] Raising exception [47, The SR is not available [opterr=Database is not mounted]]
May  2 13:22:02 xcp-ng-labs-host01 SM: [19242] lock: released /var/lock/sm/e1a9bf4d-26ad-3ef6-b4a5-db98d012e0d9/sr
May  2 13:22:02 xcp-ng-labs-host01 SM: [19242] ***** generic exception: sr_scan: EXCEPTION <class 'SR.SROSError'>, The SR is not available [opterr=Database is not mounted]
May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]   File "/opt/xensource/sm/SRCommand.py", line 110, in run
May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]     return self._run_locked(sr)
May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]   File "/opt/xensource/sm/SRCommand.py", line 159, in _run_locked
May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]     rv = self._run(sr, target)
May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]   File "/opt/xensource/sm/SRCommand.py", line 364, in _run
May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]     return sr.scan(self.params['sr_uuid'])
May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]   File "/opt/xensource/sm/LinstorSR", line 536, in wrap
May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]     return load(self, *args, **kwargs)
May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]   File "/opt/xensource/sm/LinstorSR", line 521, in load
May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]     return wrapped_method(self, *args, **kwargs)
May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]   File "/opt/xensource/sm/LinstorSR", line 381, in wrapped_method
May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]     return method(self, *args, **kwargs)
May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]   File "/opt/xensource/sm/LinstorSR", line 777, in scan
May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]     opterr='Database is not mounted'
May  2 13:22:02 xcp-ng-labs-host01 SM: [19242]

Maelstrom96

@ronan-a said in XOSTOR hyperconvergence preview:

@Maelstrom96 We must update our documentation for that, This will probably require executing commands manually during an upgrade.

Any news on that? We're still pretty much blocked until that's figured out.

Also, any news on when it will be officially released?

Maelstrom96

@Maelstrom96 said in XOSTOR hyperconvergence preview:

Is there a procedure on how we can update our current 8.2 XCP-ng cluster to 8.3? My undertanding is that if I update the host using the ISO, it will effectively wipe all changes that were made to DOM0, including the linstor/sm-linstor packages.

Any input on this @ronan-a?

Maelstrom96

Is there a procedure on how we can update our current 8.2 XCP-ng cluster to 8.3? My undertanding is that if I update the host using the ISO, it will effectively wipe all changes that were made to DOM0, including the linstor/sm-linstor packages.