Windows Server 2025 on XCP-ng
-
I created a VM using the RTM release ISO and installed the latest Citrix tools (9.4). I've left this VM running for 5 days now and I don't see any runaway conhost.exe processes, though granted the VM is just sitting there doing nothing.
-
Thanks, I didn't know there was a 9.4 yet, just grabbed that too.
-
Going to take a bit longer, need to get secure boot keys installed, reading through the directions now. This will be the first time I'm testing something that needs vTPM, so it should be an adventure!
[edit] Wow... VERY slow on my old hardware with secure boot and vTPM active. This is running on an HP DL360p Gen8 which doesn't support TPM2 modules and doesn't support UEFI natively, not sure if that stuff passes through. It's brutally slow and not from a storage point of view either.
Going to throw another few processor cores at it once the updates finish installing and I can shut it down. Hope that makes it work better.
This is with the 9.3.3 agent and drivers, on the same NFS storage that I've been using for years, running over a 10GbE connection. It reached around 3 Gbps during install, which is funny because that seemed really slow too.
I have never seen my processor running this high.
-
It is still slow, everything lags, even the command prompt to set w32tm to use my gps ntp source and set it reliable was laggy. I think this must be a combination of my old hardware that doesn't support some of the new features, vTPM, vSecureboot, and vUEFI, mostly I think it's the old hardware I'm running this on. Using RDP there is less lag than the XCP-NG Console from XO, so maybe not a huge issue after all.
I just set up ADDS, DHCP, and DNS. After the reboot I see 5 conhost and will monitor from here. If I see it grow, I'm going to install the 9.4 agent and drivers to see if it fixes anything.
Nice to see a new Functional Level added, been a long time since the 2016 version.
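For anyone who wants to watch the conhost count over time instead of eyeballing Task Manager, here is a minimal Python sketch. It assumes Python is installed on the Windows guest and that `tasklist /FO CSV /NH` prints the image name in the first CSV column; the function names are made up for illustration.

```python
import csv
import io
import subprocess


def count_processes(tasklist_csv: str, image_name: str = "conhost.exe") -> int:
    """Count rows of `tasklist /FO CSV /NH` output whose image name matches."""
    reader = csv.reader(io.StringIO(tasklist_csv))
    return sum(
        1
        for row in reader
        if row and row[0].strip().lower() == image_name.lower()
    )


def current_conhost_count() -> int:
    """Run tasklist on the Windows guest and count conhost.exe instances."""
    result = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True,
        text=True,
        check=True,
    )
    return count_processes(result.stdout)


# On the guest itself you would call: print(current_conhost_count())
```

Running it every few minutes (for example from a loop with a sleep) should show whether the number keeps climbing after a reboot.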
-
In the few minutes between the last post and now, I was up to 8 of these and growing. XO (XCP) also didn't detect the management agent, no RAM use was showing. I'm wondering if this is part of the issue. Rebooted to see if the management agent gets straightened out and going to let it sit for a couple hours and check after dinner (almost time to go home), it will probably be 4 or 5 hours of run time before I get back to this tonight.
[edit] still no management agent, I think this might be part of the problem. It was working "fine" before installing ADDS, DHCP, and DNS. Going to reinstall 9.3.3 and see what happens later tonight or tomorrow.
-
OK, I've reached the end of my testing, and it is still broken! There is something wrong with the Management Agent after you install and configure AD DS. It is in a constant state of "starting" and I'm pretty sure it is generating all the console host processes. I've now tried both the 9.3.3 and the 9.4 versions. The 9.4 was downloaded about half an hour ago, can't get much fresher than that.
So where do we go? We don't run Server 2025 until the Management Agent is updated to work with it. The problem I have is that I can't think of how we alert them to the problem. Point someone to this thread maybe?
For now, I'm going back to 2022 because this isn't really worth messing with until things work. Kind of a show stopper for me. If you disable that stuck service and find a way to stop it, then maybe it's usable for messing around. I had to restart each time I disabled that service. And once disabled, all the repeating console hosts stopped in their tracks.
All of this testing was done on the latest version of XCP-NG 8.3 with either NFS or SMB for my SR, going over a 10GbE network. TrueNAS 24.10 is the storage host, just to be complete.
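To confirm that the agent service really is stuck in the "starting" state described above, something like the following Python sketch could be used on the guest. Treat the service name `XenSvc` as an assumption (check `services.msc` for the real name on your install), and note that the parsing assumes the English-language layout of `sc query` output.

```python
import re
import subprocess
from typing import Optional

# Assumption: the Management Agent service is named "XenSvc" on this guest.
SERVICE_NAME = "XenSvc"


def parse_service_state(sc_query_output: str) -> Optional[str]:
    """Extract the state name (e.g. START_PENDING) from `sc query` output."""
    match = re.search(
        r"^\s*STATE\s*:\s*\d+\s+(\w+)", sc_query_output, re.MULTILINE
    )
    return match.group(1) if match else None


def query_service_state(service_name: str = SERVICE_NAME) -> Optional[str]:
    """Run `sc query` for the service and return its state, or None."""
    result = subprocess.run(
        ["sc.exe", "query", service_name],
        capture_output=True,
        text=True,
    )
    return parse_service_state(result.stdout)
```

A service that never leaves `START_PENDING` after several minutes of uptime would match what I was seeing.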
-
Hi Greg_E
Yeah, those are exactly the same findings as mine.
I think disabling the service could be a workaround for the moment. 9.4.0 did not make a difference compared to 9.3.0.
We will start our new project with 2025 with the workaround in place and monitor the situation until a new version becomes available in which the issue may have been fixed.
I'd be glad for any further comments on this topic when new information becomes available from any of the community members or the devs.
Have a nice evening!
-
I'm trying to make this happen on my test VM. Are you saying you need to promote the VM to a domain controller to kick off this bug? I created my VM using the Server 2022 template, which did not set up a vTPM or enable secure boot. I'm wondering if that's at all related?
-
Yes, you need to promote this to at least AD DS and go through the setup phases of this task. It was fine until I finished the AD set up and rebooted. I'm thinking there is a port that the new 2025 functional level uses that may be conflicting with the management agent. I didn't go too much farther because I need to move on with some other things in my lab.
If you disable that service, will the VM still be able to live migrate from one host to another during things like rolling pool reboot or rolling pool upgrade? Again, didn't have time to test right now, but have had issues in the past from this. Wish I remembered to try this while it was still built.
There may be one other way around this that might be worth testing... Build a Server 2022, set up AD DS and make sure everything is working. Then do an in-place upgrade to Server 2025. This will keep the functional level at 2016, so check to see if everything is working. Then upgrade the functional level to 2025 and see what happens. Depending on where my other tests go, I might give this a try because I'm more likely to do an in-place upgrade on my production machines than to do a fresh install and migrate FSMO roles. But not right now.
-
@Greg_E
For the heck of it, I set up a new VM with 2025 and made it a domain controller complete with AD DS. It seems to work fine but, like yours, there are thirty something conhost processes running.
-
You should also notice that you can no longer run any Task Scheduler tasks. The task starts, but the action is not taken, and after the timeout it ends. Also, no .msi package can be installed because the Windows Installer service is not able to start.
-
I noticed the installer when I was trying to update from the 9.3.3 to 9.4 agent, didn't know about schedulers being locked up.
-
@Chemikant784
So with this being Patch Tuesday, two updates got installed:
2024-11 Cumulative Update for Microsoft server operating system version 24H2 for x64-based Systems (KB5046617)
2024-11 Cumulative Update for .NET Framework 3.5 and 4.8.1 for Microsoft server operating system version 24H2 for x64 (KB5045934)
For whatever reason, those processes are no longer showing up and I can run things from the scheduler??
If you updated tonight, did yours go away?
-
@archw
Hi, I thought the same today so I gave it a try. Unfortunately, on my test server, fully patched with a fresh AD DC on it, the same issues are present. But I have to say that I did not do a detailed test this time.
-
The latest WS2025 patches fixed the issue for me.
-
Thanks, I'll have to give this a try when I get some time, I can set it up on the gigabit network on my lab. Might even do this today because it's a task I can do while getting interrupted.
-
Ok, this was again not as straightforward as we might like. System as follows:
XCP-NG 8.3 (current)
vTPM, vUEFI, vSecureboot
NFS share
Management Agent 9.4
Server 2025 with November 2024 cumulative update applied (not important)
AD DS, DNS, DHCP installed and configured
After several reboots I was still getting the problem:
Thinking a little more, I decided to switch the Management Agent service's startup type from Automatic to Automatic (Delayed Start)
The delayed start seems to have fixed the issue for me:
I've had this issue before with other services, we handle audio and video devices which have some demanding start routines so I should have thought of this earlier. Simply changing to delayed start lets another necessary service start which then allows the MA to start.
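If you would rather script the change than click through services.msc, here is a minimal Python sketch that wraps the standard `sc config ... start= delayed-auto` command. The service name `XenSvc` is an assumption (verify the actual name on your guest), the helper names are made up for illustration, and it needs to run from an elevated prompt.

```python
import subprocess
from typing import List

# Assumption: the Management Agent service is named "XenSvc" on this guest.
SERVICE_NAME = "XenSvc"


def delayed_start_command(service_name: str = SERVICE_NAME) -> List[str]:
    """Build the sc.exe command that sets Automatic (Delayed Start)."""
    # sc expects "start=" and its value as separate tokens.
    return ["sc.exe", "config", service_name, "start=", "delayed-auto"]


def set_delayed_start(service_name: str = SERVICE_NAME) -> None:
    """Apply the delayed start setting; raises if sc.exe reports failure."""
    subprocess.run(delayed_start_command(service_name), check=True)
```

The change only affects the next boot, so reboot the VM afterwards and check that the agent shows up in XO.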
Also, the horrible lagging I mentioned earlier was from a TrueNAS issue that I didn't know about until a couple of days ago; patch your Electric Eel if you haven't done this yet. It's still a little laggy, but not like it was when I mentioned the problem before. All work so far has been through the XO console, which is running on an old HP T630 that is pretty slow.
Going to monitor this for a while before I dig too deeply into setting this up. It's my second physical network on my lab, which is also the second physical network on my VMware system, so I'm sure I'll need to do more configuration than just DNS and DHCP down the road. I'll probably form a trust with the first network's AD DS, just to give that some practice and see what 2025 might bring. Eventually I'll need to upgrade my production network to 2025, so I'm justifying the time spent as research for the future. I'm already on Windows 11 for all my clients, so Server 2025 should play more nicely and might bring back some features that they removed from 2022 (some GPO to start).
-
@Greg_E
Hi Greg_E
Thanks for your detailed report and information. I will try it with delayed start. It could be a solution indeed. When I have news on this, I will share it here.
-
My three test systems seem to work properly when the management agent start type is set to delayed. So I think this can be a workaround for the moment.