XCP-ng 8.3 betas and RCs feedback
-
Here are new updates for XCP-ng 8.3.0. In theory, the last ones before the final release:
- sm-3.2.3-1.7.xcpng8.3: beta support for XOSTOR (which will become available for production a bit after the release of XCP-ng 8.3.0).
- xapi-24.19.2-1.4.xcpng8.3 (and the numerous RPM packages that come from that build): a small IPv6-related fix, but I won't detail it here because another fix is needed so that the first fix is really useful. There will be more fixes related to IPv6 shortly after the release of XCP-ng 8.3.0.
- xcp-ng-pv-tools-8.3-12.xcpng8.3: fixed the package versions (no changes but the displayed version).
- xsconsole-11.0.6-1.2.xcpng8.3: add the ability to join a pool from XSConsole when the primary address type is IPv6.
We also made one change to the installer since the RC2 release: the installer will now refuse to upgrade if the host's certificate key is too small (which would prevent XAPI from being able to communicate with clients or other hosts!). The release notes will document how to change the key before doing an upgrade. Most users who initially installed XCP-ng with a version lower than 8.0 were likely to fall into this trap, so we made these two last-minute changes to prevent it:
- On the XCP-ng side: the installer check described above.
- On the Xen Orchestra side: a warning on pools which have "small" host keys.
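For anyone wanting to check a host before upgrading, here is a minimal sketch of how the key size could be read with openssl. The certificate path and the 2048-bit threshold are assumptions for illustration, not values taken from the installer; the release notes remain the reference.

```shell
#!/bin/sh
# Extract the public-key size (in bits) from `openssl x509 -text` output on stdin.
key_bits() {
    sed -n 's/.*Public-Key: (\([0-9][0-9]*\) bit).*/\1/p' | head -n 1
}

# ASSUMPTION: XAPI's host certificate is stored at this path.
CERT="${CERT:-/etc/xensource/xapi-ssl.pem}"
# ASSUMPTION: illustrative threshold only, not the installer's actual limit.
MIN_BITS="${MIN_BITS:-2048}"

if [ -r "$CERT" ]; then
    bits=$(openssl x509 -in "$CERT" -noout -text | key_bits)
    if [ -n "$bits" ] && [ "$bits" -lt "$MIN_BITS" ]; then
        echo "host key is $bits bits: smaller than $MIN_BITS"
    else
        echo "host key is ${bits:-unknown} bits"
    fi
fi
```

The script only reads the certificate; it does not change anything on the host.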
-
@stormi Updated without issues; however, I noticed it dumped some errors to the console on reboot.
Checking into it, I see the following in xensource.log:
/var/log/xensource.log-Sep 25 12:12:29 xcpng-prd-02 xcp-networkd: [ info||131 |server_init D:2740ee30f2da|network_utils] /usr/bin/ovs-vsctl --timeout=20 --bare -f table -- --columns=name find port fake_bridge=true tag=199
/var/log/xensource.log:Sep 25 12:12:29 xcpng-prd-02 xcp-networkd: [error||131 |server_init D:2740ee30f2da|network_utils] Error in read one line of file: /sys/class/net/eth3/features, exception Unix.Unix_error(Unix.ENOENT, "open", "/sys/class/net/eth3/features")\x0ARaised by primitive operation at Xapi_stdext_unix__Unixext.with_file in file "ocaml/libs/xapi-stdext/lib/xapi-stdext-unix/unixext.ml", line 92, characters 11-40\x0ACalled from Xapi_stdext_unix__Unixext.buffer_of_file in file "ocaml/libs/xapi-stdext/lib/xapi-stdext-unix/unixext.ml" (inlined), line 177, characters 2-52\x0ACalled from Xapi_stdext_unix__Unixext.string_of_file in file "ocaml/libs/xapi-stdext/lib/xapi-stdext-unix/unixext.ml", line 179, characters 47-73\x0ACalled from Network_utils.Sysfs.read_one_line in file "ocaml/networkd/lib/network_utils.ml", line 160, characters 6-33\x0A
**/var/log/xensource.log-Sep 25 12:12:29 xcpng-prd-02 xcp-networkd: [error||131 |server_init D:2740ee30f2da|network_utils] Caught unix error: No such file or directory [access, /usr/sbin/ovs-vlan-bug-workaround]**
/var/log/xensource.log-Sep 25 12:12:29 xcpng-prd-02 xcp-networkd: [error||131 |server_init D:2740ee30f2da|network_utils] Assuming script /usr/sbin/ovs-vlan-bug-workaround doesn't exist
Looking at hosts I haven't patched yet, I see this exists on them as well. It doesn't seem to affect anything. All hosts were ISO upgraded to RC2 last week.
-
Hello,
This is a somewhat strange request, but I figured I would post it anyway in case someone else has my use case.
I installed XCP-ng 8.3 RC2 on two hosts. On my newer host, I have 96G of RAM, a 1T drive and a 4T drive. I have already deployed a few VMs (Windows 11 and various flavors of Linux). Everything is running well. No complaints. I am also running Xen Orchestra from source on a separate machine as a container. All good so far.
Now for my use case: The newer host has a CPU/GPU chip combo powerful enough to run medium-complexity games on, but I don't believe the Windows VM has enough access to the GPU to run the games at their full rendition. So, since this is a lab/play installation anyway, I wanted to explore the potential of installing Windows 11 on the 4T drive to dual boot between XCP-ng and Windows 11. I have already installed Windows no problem on the 4T drive and it is bootable from the EFI boot menu. What I really want to accomplish is to get the Windows boot menu item into the XCP-ng grub menu so that I don't have to use a keyboard interrupt to bring up the EFI boot menu. I have tried several times to modify grub to add the Windows menu item, but every time I try it I get a corrupt grub menu. Any ideas?
I have modified the /etc/grub.d/40_custom file, adding the following to the end of the file:
menuentry "Windows 11" {
    set root=(hd0,2)
    chainloader +1
}
I then ran the following to modify the grub.cfg in /boot/efi/EFI/xenserver/:
grub-mkconfig -o /boot/efi/EFI/xenserver/grub.cfg
When I reboot, the menu appears as all special characters, and does boot up xcp-ng. I can then log in as root and revert to the original grub.cfg and all is well. It's just that the menu is corrupt.
Has anyone tried this and succeeded? I would appreciate it if you could reply to this post.
-
@letizido-0 I solved my own problem. It turns out grub-mkconfig reads every file in the /etc/grub.d/ directory and builds the grub.cfg trying to include everything. Much easier to just directly edit the grub.cfg and add the menu item with proper settings. Working fine now. Perhaps you may want to look at all those scripts in /etc/grub.d/ and remove what XCP-ng doesn't use/need.
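The post does not say what the "proper settings" were. As a rough sketch only: on a UEFI system, GRUB usually chainloads the Windows boot manager as an EFI file rather than with the BIOS-style `chainloader +1`. The helper name `windows_entry` and the `XXXX-XXXX` UUID below are placeholders, and the loader path is the usual Windows default, assumed rather than confirmed for this machine.

```shell
#!/bin/sh
# Print a GRUB menu entry that chainloads the Windows boot manager on UEFI.
# $1: filesystem UUID of the EFI System Partition that holds the Windows loader
#     (placeholder here; `blkid` shows the real value).
windows_entry() {
    cat <<EOF
menuentry "Windows 11" {
    insmod part_gpt
    insmod fat
    insmod chain
    search --no-floppy --fs-uuid --set=root $1
    chainloader /EFI/Microsoft/Boot/bootmgfw.efi
}
EOF
}

# Print the entry so it can be reviewed before pasting it into grub.cfg by hand.
windows_entry "XXXX-XXXX"
```

Keeping a copy of the original grub.cfg before editing makes it easy to revert, as described earlier in the thread.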
-
Regarding the fact that the upgrade asks for the network configuration for hosts which are not the pool master: we checked, and it has been the case in this installer code for more than ten years. At some point, the installer needs to contact the pool master during the upgrade.
-
@stormi Good to know, thx.
-
@stormi said in XCP-ng 8.3 betas and RCs feedback :
At some point, the installer needs to contact the pool master during the upgrade.
Good to know that - never had to reinstall yet. Thanks for that info stormi
-
Weird incident here: my server froze for no reason at all (none that I know of yet). I don't even know how long it has been frozen, since the VM I use every day is still working. I cannot ping the XCP-ng server; it is registered in Xen Orchestra but cannot be connected to. I did not run the 2 latest batches of updates on my XCP-ng 8.3 server. The weird thing is that most of my Linux VMs are still running and reachable!! The Windows VMs, on the other hand, are not reachable. Any recommendations for investigation before I reboot the host server? It is running on a Supermicro and I can see the frozen main screen via iKVM, but it does not respond to keyboard inputs, not even from the virtual KVM keyboard.
-
I suppose you cannot have any physical access to check on a screen?
-
@olivierlambert There is no physical screen attached to it. I am currently investigating the SSD that was booting XCP-ng. I am guessing it just died on me. Will update if I can read any data from it.
- Any specific log I should look at, if any are available?
- If I restart from scratch with a new disk, will I retrieve my VMs that are actually stored on other disks?
-
- kern.log, dmesg, the usual suspects.
- If you have a metadata backup, yes. Otherwise, you'll need to reintroduce the SR, then recreate your VMs and attach the existing disks to them (and without the metadata info, if your disks are all the same size, it could take some time to find who's who).
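As a sketch of what reintroducing an SR can look like for a local LVM SR, the helper below only prints the `xe` commands so they can be reviewed first. The helper name and its three arguments are placeholders; the assumption here is a local LVM SR, whose UUID normally appears in the volume group name (`VG_XenStorage-<uuid>`, visible with `vgs`).

```shell
#!/bin/sh
# Print (not run) the xe commands for reattaching an existing local LVM SR.
# $1: SR UUID, $2: host UUID (from `xe host-list`), $3: block device of the SR.
reintroduce_sr_cmds() {
    sr_uuid=$1
    host_uuid=$2
    device=$3
    echo "xe sr-introduce uuid=$sr_uuid type=lvm name-label=\"Recovered SR\" content-type=user"
    echo "xe pbd-create host-uuid=$host_uuid sr-uuid=$sr_uuid device-config:device=$device"
    echo "xe pbd-plug uuid=<pbd-uuid printed by pbd-create>"
    echo "xe sr-scan uuid=$sr_uuid"
}

# Example with placeholder values only.
reintroduce_sr_cmds "<sr-uuid>" "<host-uuid>" "/dev/sdb"
```

After the scan, the existing disks should show up as VDIs in the SR, ready to be attached to the recreated VMs.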
-
@olivierlambert I could access the SSD, filesystem was corrupted. I don't think I have a backup at hand just in case I would not be able to boot at all. What files could I transfer to get all my VMs back to normal after a clean install on a new disk?
-
@olivierlambert System restarted. I'm not sure how to back up the XCP-ng server side properly, though. I ran a backup of XO metadata and pool parameters.
-
XCP-ng pool metadata backup is exactly what you needed in that case
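Besides the Xen Orchestra backup job, the pool database can also be dumped from the host itself with `xe pool-dump-database`. This is a minimal sketch; the helper name, the file naming and the `/root` destination are assumptions chosen for illustration.

```shell
#!/bin/sh
# Build a dated file name for the dump inside the directory given as $1.
pool_dump_path() {
    echo "$1/pool-db-$(date +%Y%m%d).dump"
}

# ASSUMPTION: /root is only an example; the dump should be copied off the host.
dump_file=$(pool_dump_path "${DEST:-/root}")

if command -v xe >/dev/null 2>&1; then
    xe pool-dump-database file-name="$dump_file"
else
    echo "xe not found; on a host this would run:"
    echo "xe pool-dump-database file-name=$dump_file"
fi
```

A dump stored on the same disk as XCP-ng would not have helped in the situation above, so it belongs on another machine.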
-
@stormi
FWIW...one of the boxes where I installed the latest updates (from a few days ago) did not autostart any of the VMs. I had to manually start each of them.
-
Latest updates over ISO-installed 8.3 RC2 worked fine for me. I did experience one host in my three-host pool to which no VMs could be migrated. After looking at the networking from bash in DOM0, it showed that both 10G ports for the storage and migration networks were DOWN. These ports are on a genuine IBM-branded Intel X540-T2 card I bought used on eBay so it might have gone bad. Since the card has worked well for some time, I figured it couldn't hurt to re-seat it in the PCIe slot. Sure enough, that fixed it. Moral of the story: check the mundane stuff first; it's not always the fault of new updates.
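One way to do that kind of link check from a dom0 shell is to read each interface's `operstate` from sysfs. This is only a sketch: the helper name `list_down_nics` is made up here, and the directory argument exists mainly so the function can be tried against a test tree.

```shell
#!/bin/sh
# List network interfaces whose link state is reported as "down".
# $1: sysfs network directory (defaults to /sys/class/net).
list_down_nics() {
    root=${1:-/sys/class/net}
    for dir in "$root"/*; do
        # Skip entries that do not expose an operstate file.
        [ -r "$dir/operstate" ] || continue
        state=$(cat "$dir/operstate")
        if [ "$state" = "down" ]; then
            echo "${dir##*/}"
        fi
    done
    return 0
}

list_down_nics
```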
-
haha nice catch, PCI reseat is like black magic sometimes
-
@archw and the other hosts did?
-
@stormi
Yes
I've not had a chance to reboot the host since then to see if something else is going on. Will do so tonight.
-
Hi,
I'm currently testing out the RC2 with Ceph-backed RBD devices, which work perfectly for us on 8.2.1. After installation I tried to add an existing shared storage, without success. Then I tried to create a new one and ran into the following problem. As you can see, I can create a volume group manually without a problem.
xe sr-create fails:
xe sr-create name-label="RC2StorageTest" shared=true type=lvm device-config:device=/dev/rbd0
Error code: SR_BACKEND_FAILURE_77
Error parameters: , Logical Volume group creation failed,
vgcreate from commandline works:
vgcreate RC2StorageTest /dev/rbd0
Physical volume "/dev/rbd0" successfully created.
Volume group "RC2StorageTest" successfully created
If I repeat the xe sr-create after manually creating a VG, xe sr-create removes the VG but still fails with the same error.
Any idea where to look to solve this issue?