    XCP-ng 8.3 updates announcements and testing

    • F Online
      flakpyro @Greg_E
      last edited by

      Greg_E

      The VM is UEFI with no vTPM or Secure Boot enabled. The CPU is a "Xeon E-2336 CPU @ 2.90GHz" running in a Supermicro server. We use these servers at remote sites to run a number of VMs, including a "Blue Iris" server with an Nvidia T1000 GPU passed through to it; I have one such server as a test server as well. The second machine doing it is a Minisforum MS-01, also with a T1000 GPU passed through, in my home lab. The OS in both cases is Windows Server 2025. Over the weekend I cloned a fresh copy without GPU passthrough to see if it occurs with no GPU. We have about 15 of these VMs running at remote locations on the stable 8.3 patch branch that are not doing this.

      It should be noted though that these VMs did need to be customized to allow Blue Iris to run without BSODing the VM on Intel CPUs. Thread can be found here: https://xcp-ng.org/forum/topic/8873/windows-blue-iris-xcp-ng-8-3/35?_=1746455850378 but in the end the following needed to be applied to a VM to keep it from BSODing when running Blue Iris on new Intel CPUs:

      xe vm-param-add uuid=... param-name=platform msr-relaxed=true
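
To confirm the flag actually landed on a VM, something like the following could help. It is only a sketch: it assumes `xe vm-param-get ... param-name=platform` prints the map as `key: value; key: value`, and `platform_value` and `VM_UUID` are made-up names for illustration.

```shell
#!/bin/sh
# platform_value KEY
# Reads a platform string ("key: value; key: value") on stdin and prints
# the value stored under KEY, or nothing if the key is absent.
platform_value() {
    tr ';' '\n' \
        | sed 's/^ *//; s/ *$//' \
        | awk -F': ' -v k="$1" '$1 == k { print $2 }'
}

# Usage on a host (VM_UUID is a placeholder you have to fill in yourself):
#   xe vm-param-get uuid="$VM_UUID" param-name=platform | platform_value msr-relaxed
```

An empty result would mean the key is not set on that VM.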
      

      stormi It does not seem to do it every time; it seems the VM must run for some time and then be rebooted to cause it to happen. I have created a bug report file from the host as requested and will DM you a link to it! The last time I experienced this would have been at "May 2, 2025 at 4:18 PM (3 days ago)" according to our XOA appliance.

      • G Offline
        Greg_E @flakpyro
        last edited by

        flakpyro

        I'm guessing it is something to do with the T1000. I've had some flakiness with them in our workstations. You don't have an older K620 or something, do you? I'm also wondering if this might be a case for an Intel ARC GPU; they weren't out when I did my last workstation refresh or I'd probably be using them with Creative Cloud applications.

        Also, can you try changing the Management Agent service from auto-start to delayed auto-start? Server 2025 has been a bit odd for me for a while, and that was back when running on Intel hosts (though they are old CPUs).

        I'll see if I can get a VM up this afternoon, but I'm on AMD with the iGPU, so I'm not sure it even supports the acceleration that you need out of the T1000 for the camera recorders. I need to go through and configure each host to pass through the GPU, audio, and maybe some USB and then reboot. But if it works, it might make a nice Handbrake transcoding VM for me: dump files in, get back to work while it does its thing.

        • F Online
          flakpyro @Greg_E
          last edited by

          Greg_E I do not have anything older like a K620. I do have a GeForce GTX 1650 in another machine that is also doing this, but I believe that's the same generation as the T1000 (Turing). I agree, I think ARC could be a great replacement for this application in the future.

          Since this is occurring at boot, before the OS loads, I'm not sure this would be management agent related.

          • G Offline
            Greg_E @flakpyro
            last edited by

            flakpyro

            Not going to be able to test this, my little mini-lab is having a very hard time with passthrough, to the point where the host won't boot. Tried this on my backup host (for backup DR testing) and finally got it back after the last couple of hours fooling with turning passthrough on and then back off. Sorry.

            • stormiS Offline
              stormi Vates 🪐 XCP-ng Team
              last edited by

              New update candidates for you to test!

              As we move closer to making XCP-ng 8.3 the new LTS release, taking over from XCP-ng 8.2.1, a new batch of update candidates is now available for user testing ahead of a future collective release. Details are provided below.

              • biosdevname: Update as a dependency for another component.
              • blktap: Fix: enable NBD client only after completing handshake
              • cyrus-sasl: Fix for CVE-2022-24407 (not directly affecting XCP-ng in normal use).
              • gpumon: Rebuild for XAPI update.
              • intel-ice: Update ice driver to v1.15.5
              • kernel: improve timer handling for better compatibility with hardware.
              • plymouth: Packaging update. No visible changes.
              • psmisc: Update to version 23.6.
              • python-urllib3: Update to version 1.26.20.
              • rsync:
                • Update to version 3.4.1
                • Fixes for CVE-2024-12084, CVE-2024-12085, CVE-2024-12086, CVE-2024-12087, CVE-2024-12088, CVE-2024-12747
                • The rsyncd configuration and systemd unit files now come in a separate package named rsyncd-daemon, not installed by default
              • smartmontools: Update to version 7.4
              • xapi:
                • Drop FCoE support when fcoe_driver does not exist
                  • FCoE support will be removed from next versions
                • No more CPU checks for halted VMs in cross-pool migration
                • Move CPU check to the target host during cross-pool migration
                • Serialize all PCI and VUSB plugs to keep them ordered
                • Fixes multiple issues in periodic scheduler
                • Fixes multiple issues in the way XAPI handles RRD metrics
                • Improve SR.scan by reducing a racing window when updating the XAPI db
                • A lot of maintenance-related changes, XAPI being a very active project.
              • xcp-featured: rebuilt for XAPI
              • xcp-ng-release: update copyright years and EULA
              • xen:
                • Improve support for Zen 5 and Diamond Rapids CPUs
                • IOMMU logic improvements and fixes
                • add PCI quirks for problematic hardware (e.g. Cisco VIC UCSX-ML-V5D200GV2)
                • fix emulation of MOVBE
              • xenserver-status-report: maintenance update.
              • xha:
                • Support configurable syslog printing
                • Fixes issue where sub-threads can't be scheduled enough resources.

              Test on XCP-ng 8.3

              From an up-to-date host:

              yum clean metadata --enablerepo=xcp-ng-testing
              yum update --enablerepo=xcp-ng-testing
              reboot
              

              The usual update rules apply: pool coordinator first, etc.
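
For anyone scripting the order, here is a rough sketch of "coordinator first". It assumes `xe pool-list params=master --minimal` returns the coordinator's host UUID and `xe host-list params=uuid --minimal` returns a comma-separated list of host UUIDs; `coordinator_first` is just an illustrative helper name, not part of any tool.

```shell
#!/bin/sh
# coordinator_first COORD_UUID "uuid1,uuid2,..."
# Prints one host UUID per line, with the pool coordinator listed first
# and the remaining hosts in their original order.
coordinator_first() {
    coord="$1"
    printf '%s\n' "$coord"
    # Drop the coordinator from the rest of the list; grep exits non-zero
    # on a single-host pool, which is not an error here.
    printf '%s\n' "$2" | tr ',' '\n' | grep -v -x -F -- "$coord" || true
}

# Usage on a pool member:
#   coord=$(xe pool-list params=master --minimal)
#   hosts=$(xe host-list params=uuid --minimal)
#   coordinator_first "$coord" "$hosts"
```

The output is only an ordering; updating and rebooting each host is still done with the commands above.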

              Versions

              • biosdevname: 0.3.10-5.xcpng8.3
              • blktap: 3.55.5-2.1.xcpng8.3
              • cyrus-sasl: 2.1.26-24.el7_9
              • gpumon: 24.1.0-40.1.xcpng8.3
              • intel-ice: 1.15.5-2.xcpng8.3
              • kernel: 4.19.19-8.0.38.1.xcpng8.3
              • plymouth: 0.8.9-0.31.20140113.3.xcpng8.3
              • psmisc: 23.6-2.xcpng8.3
              • python-urllib3: 1.26.20-3.1.xcpng8.3
              • rsync: 3.4.1-1.1.xcpng8.3
              • smartmontools: 7.4-2.xcpng8.3
              • xapi: 25.6.0-1.4.xcpng8.3
              • xcp-featured: 1.1.8-1.xcpng8.3
              • xcp-ng-release: 8.3.0-31
              • xen: 4.17.5-9.1.xcpng8.3
              • xenserver-status-report: 2.0.11-1.xcpng8.3
              • xha: 25.0.0-1.1.xcpng8.3
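
A possible way to check a host against this list after the update, as a sketch only. It expects `name: version-release` lines on stdin (without the bullets). Note the assumption that each name is also the name of an installed RPM: the list above may use source package names, so some entries could report a mismatch even on a correctly updated host. `installed_versions` and `check_versions` are hypothetical helper names.

```shell
#!/bin/sh
# installed_versions PKG
# Prints VERSION-RELEASE for every installed instance of PKG (one per line),
# or nothing if the package is not installed.
installed_versions() {
    v=$(rpm -q --qf '%{VERSION}-%{RELEASE}\n' "$1" 2>/dev/null) || return 0
    printf '%s\n' "$v"
}

# check_versions
# Reads "name: version-release" lines on stdin and reports, per line, whether
# that exact version-release is installed. Returns 1 if anything mismatched.
check_versions() {
    status=0
    while IFS=': ' read -r name expected; do
        [ -n "$name" ] || continue
        if installed_versions "$name" | grep -q -x -F -- "$expected"; then
            echo "OK       $name $expected"
        else
            echo "MISMATCH $name (expected $expected)"
            status=1
        fi
    done
    return $status
}

# Usage:
#   printf 'rsync: 3.4.1-1.1.xcpng8.3\nsmartmontools: 7.4-2.xcpng8.3\n' | check_versions
```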

              What to test

              Normal use and anything else you want to test. The closer to your actual use of XCP-ng, the better.

              Test window before official release of the updates

              None defined, but early feedback is always better than late feedback, which is in turn better than no feedback 🙂

              We will not be very available on this forum until Monday to help fix issues if there are any, so don't update too fast if that's a possible problem for you.

              • F Online
                flakpyro @stormi
                last edited by

                stormi Updated both of my test hosts. Everything rebooted and came up fine.

                No VM stats in XO / XOA still, I see. I will be curious whether this round of updates fixes my EFI / Windows Server reboot hangs.

                • stormiS Offline
                  stormi Vates 🪐 XCP-ng Team @flakpyro
                  last edited by

                  flakpyro No VM stats despite a reboot of the hosts?

                  • abudefA Offline
                    abudef @stormi
                    last edited by abudef

                    stormi No stats until the toolstack restarts, as with previous candidates

                    • stormiS Offline
                      stormi Vates 🪐 XCP-ng Team
                      last edited by

                      Ok, let's involve Team-XAPI-Network

                      • G Offline
                        Greg_E @stormi
                        last edited by

                        stormi

                        My master host hung on reboot, after about 10 minutes I forced the power off, and then back on. Host is HP T740 Thinclient with AMD v1756b processor, 64GB of DDR4 SODIMM, Intel dual x520 PCIe card, and an Intel i226-v in the a+e slot, plus the onboard Realtek NIC, BIOS at 1.20 which I think is still current for this model. All three hosts are identical with possible exception of x520 card revisions, host 3 might have an older revision (donations accepted for x710 based cards 😀 )

                        Otherwise everything went as planned, all three are updated and all three needed to have the toolstack restarted to see the stats. I only have 2 small Linux VMs on this system right now, and both of them started fine.

                        There were no stats in XO-Lite either; I did the second and third hosts from XO-Lite to see if XO was getting mad. The second and third rebooted without issue. The delay on the master is something I'll look into with the next update; it could have been the VMs trying to auto-start, since I only remembered to turn that off for the other two hosts.

                        • B Offline
                          bufanda
                          last edited by

                          Installed on my test pool with 2 HP EliteDesk 800 G3 Mini. Apart from the missing stats in Xen Orchestra, I did some VM migration between hosts and reboot tests (only Linux VMs though), and so far no issues.

                          • A Offline
                            andriy.sultanov @abudef
                            last edited by andriy.sultanov

                            abudef Greg_E bufanda

                            Could you please attach /var/log/{xensource.log,daemon.log,xcp-rrdd-plugins.log} from the time after the reboot and before the toolstack restart?
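
One way to bundle those three files into a single archive, sketched under the assumption that they live directly in `/var/log` as in the path above. `collect_logs` and the output path are illustrative names only. The archive contains the whole files, so the window between reboot and toolstack restart still has to be located by timestamp.

```shell
#!/bin/sh
# collect_logs LOG_DIR OUT_TARBALL
# Archives xensource.log, daemon.log and xcp-rrdd-plugins.log from LOG_DIR,
# skipping any that are missing. Fails if none of them exist.
collect_logs() {
    log_dir="$1"
    out="$2"
    set --
    for f in xensource.log daemon.log xcp-rrdd-plugins.log; do
        if [ -f "$log_dir/$f" ]; then
            set -- "$@" "$f"
        fi
    done
    [ "$#" -gt 0 ] || { echo "no logs found in $log_dir" >&2; return 1; }
    tar -czf "$out" -C "$log_dir" "$@"
}

# Usage on the host, before restarting the toolstack:
#   collect_logs /var/log /tmp/rrd-logs.tar.gz
```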

                            Also: does this reproduce if you reboot again - would you still have no metrics until toolstack restart? Because if not, this is entering paranormal territory 🙂

                            UPD: I've just updated xapi from 24.39 to 25.6 myself - stats are working after the reboot and I can see them in XenOrchestra, no additional toolstack restart required.

                            • abudefA Offline
                              abudef @andriy.sultanov
                              last edited by

                              andriy.sultanov said in XCP-ng 8.3 updates announcements and testing:

                              does this reproduce if you reboot again

                              Yes, it does. Logs have been provided via PM.

                              • B Offline
                                bufanda @andriy.sultanov
                                last edited by

                                andriy.sultanov
                                Same issue after another reboot. The stats on host just flatline. No CPU usage or anything.
                                See screenshot
                                f58c86b0-ac72-4a6c-a554-ecee918e4db7-image.png

                                I attached log files
                                daemon.log.txt xcp-rrdd-plugins.log.txt xensource.log.txt

                                • A Offline
                                  andriy.sultanov @bufanda
                                  last edited by andriy.sultanov

                                  bufanda Thanks!
                                  Hmm, can't see anything suspicious either in your logs or abudef's

                                  Since you said it reproduces if you reboot the host again, I'd really appreciate if you could send the result of rrd2csv running for 30 seconds or so on a system that doesn't have the stats working - so before any toolstack restarts.
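
A time-boxed capture could look roughly like this. It is a sketch that assumes `rrd2csv` is on the host's PATH and keeps printing until it is stopped; `capture_for` and the output file name are made up for illustration, and `timeout` is the coreutils tool.

```shell
#!/bin/sh
# capture_for SECONDS OUTFILE CMD [ARGS...]
# Runs CMD, stops it after SECONDS, and saves its stdout to OUTFILE.
capture_for() {
    secs="$1"
    out="$2"
    shift 2
    rc=0
    timeout "$secs" "$@" > "$out" || rc=$?
    # timeout exits with 124 when it had to stop the command, which is the
    # expected outcome for a tool that runs until interrupted.
    [ "$rc" -eq 0 ] || [ "$rc" -eq 124 ]
}

# Usage on the affected host, before any toolstack restart:
#   capture_for 30 /tmp/rrd2csv.txt rrd2csv
```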

                                  • B Offline
                                    bufanda @andriy.sultanov
                                    last edited by

                                    andriy.sultanov
                                    I ran it for a minute.

                                    rrd2csv.txt

                                    • G Offline
                                      Greg_E @andriy.sultanov
                                      last edited by

                                      andriy.sultanov

                                      If still needed, I can probably get you the rrd2csv next week, I need to unrack my system and move it to a new mobile rack. A large expense that I'm not happy to have to do, but now needed going forward (it's a work thing, where I prototype workflows for my production system) 😠 😡 😣

                                      • A Offline
                                        andriy.sultanov @Greg_E
                                        last edited by

                                        Greg_E Thanks, but that will not be necessary - I think I've figured out where the problem lies now. Good luck with the move 🙂

                                        • G Offline
                                          Greg_E @andriy.sultanov
                                          last edited by

                                          andriy.sultanov

                                          The move is more about spending yet more personal money for something that is primarily used for learning what I need to know for work (a rack, UPS, and some odds and ends at $800). Or half of it is, half of it is a VMware system and that's to let me learn what I need to leave this job and find one that pays more. And then move that work to something easier to use that costs less. Still amazed at how many places just took the price increase and are still not making plans to move to something else, even if they are cutting core counts in half to save half of that new money. But I'm also seeing that the general trend in IT around where I live is to get into a Silo, and never ever take on another task to fill a need. This way you never want to move laterally to other products, just keep doing the same things the same ways until management gets tired of hearing "we can't do that" and fires everyone to replace them with an MSP (or other contractor).

                                          Sorry for the off-topic rant, feeling salty again today.
