Sluggish performance on VMs migrated from Hyper-V
-
Putting this first so you know what you're in for before you start reading:
TL;DR: I've migrated some VMs from Hyper-V to XCP-ng and (probably via my own ineptitude) we are experiencing worse performance. How can I fix this?
Recently, we started the process of migrating from a Hyper-V environment to XCP-ng.
We only have a couple of clients on our servers, both of which are relatively simple Windows Server DC + RDS setups.
One of these clients was long overdue for an OS upgrade, so we just built a new pair of VMs for them.
The other, we migrated using the guide found here https://docs.xcp-ng.org/installation/migrate-to-xcp-ng/#from-hyper-v
We may have made a few missteps along the way...
As the client is reporting (and we are observing) that performance is slow compared to pre-migration, I'll dump all the relevant info I can think of below:
VMs
- DC
  - Windows Server 2016
  - 4 vCPU
  - 8GB RAM
  - 250GB storage
- Apps/RDS
  - Windows Server 2022
  - 8 vCPU
  - 24GB RAM
  - 1.8TB storage across multiple drives
Hyper-V Environment
- HP DL380 Gen9
- Dual Xeon E5-2630 v4 @ 2.2GHz
- 256GB RAM
- All VM storage on SSD in RAID5 (NTFS)
XCP-ng Environment
- HP DL380 Gen9
- Dual Xeon E5-2680 v4 @ 2.4GHz
- 256GB RAM
- Storage (all thin provisioned with ext4):
  - VM OS on SSD RAID5
  - User Profiles on SSD RAID5
  - Frequently Accessed data on SSD RAID5
  - Old/Archived data on NFS drive over 10GB link
- Daily delta backups via XO
Process
Export and Convert
From Hyper-V, we exported the VMs and drives.
- At this stage, we are not clear on how to "remove all the Hyper-V tools from the VM", so no vmic services were removed prior to conversion.
These services have since been disabled, but were not running anyway as there was no Hyper-V host to trigger them.
We then converted the exported drives and imported into XCP-ng
VM Creation
From XO, we created virtual machines with the same spec but no drives attached.
- It appears these were created with the "Other OS" template instead of the Windows Server 2016/2022 templates.
For the VM with multiple drives, the OS was attached first, Citrix VM tools installed, and then the other drives attached.
Other Config
Network
Migration removed the network config, so both VMs were re-ip'ed after the installation of Citrix drivers.
This also broke the connection between the DC and the RDS VMs, which was fixed with the PowerShell command:
Test-ComputerSecureChannel -Repair
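For anyone repeating this step, here is a minimal sketch of a check-then-repair variant of the command above. It assumes Windows PowerShell 5.1 on the domain member (the RDS VM here) and an elevated session. The credential prompt is an addition for illustration, not part of the original fix.

```powershell
# Assumption: elevated Windows PowerShell 5.1 session on the domain member.
# Test-ComputerSecureChannel returns $true when the trust with the domain is healthy.
if (Test-ComputerSecureChannel) {
    Write-Output 'Secure channel to the domain is healthy; nothing to repair.'
}
else {
    # Assumption: prompt for a domain account that is allowed to reset the machine password.
    $cred = Get-Credential -Message 'Domain account allowed to reset the machine password'

    # -Repair re-establishes the secure channel and returns $true on success.
    $repaired = Test-ComputerSecureChannel -Repair -Credential $cred
    Write-Output "Repair attempted, result: $repaired"
}
```

Checking first avoids resetting the machine account password when the channel is already fine.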
File Restoration
We pulled the "Files and Folders" from our managed backup provider (Cove), bringing the server up to date.
Outlook had a fit over the OST files. Redownloading and indexing all emails was causing a major load, so we set a group policy forcing Outlook not to use cached mode, and then removed all OST files from the VM (backed up separately with Cove's 365 integration).
Error Events
VSS/ESENT were creating error events related to Hyper-V backup management.
This was resolved by removing the Hyper-V CLSID via a registry edit.
Findings and queries - Where to start?
Secure Boot
Neither VM is presently using Secure Boot. I've seen some (AI generated) suggestions that this may impact performance - am I likely to see an improvement if I enable this?
Templates
I have read that using the correct template should give a performance boost due to the relevant "keys" being installed.
Is there a way to fix this without re-creating VMs and re-attaching storage?
Hyper-V integration
If the services are disabled, are they likely to cause issues?
If I need to remove pre-conversion, am I correct in assuming it will only be on the OS drive?
How can I remove these tools?
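As a starting point for the questions above, here is a rough sketch for checking what is left of the Hyper-V integration services inside the guest. It assumes Windows PowerShell 5.1 in an elevated session, and that the leftover services are the ones whose names start with "vmic" (as mentioned earlier in this post). It only inspects and disables services. It does not uninstall anything, and whether disabling is sufficient is exactly the open question.

```powershell
# Assumption: the Hyper-V integration services are the ones named vmic*.
$services = Get-Service -Name 'vmic*' -ErrorAction SilentlyContinue

# Show the current state and startup type of each one.
$services | Select-Object Name, DisplayName, Status, StartType | Format-Table -AutoSize

# Stop and disable anything that is not already disabled.
$notDisabled = $services | Where-Object { $_.StartType -ne 'Disabled' }
foreach ($svc in $notDisabled) {
    if ($svc.Status -eq 'Running') {
        Stop-Service -Name $svc.Name -Force
    }
    Set-Service -Name $svc.Name -StartupType Disabled
}
```

If the first table already shows every service as Stopped and Disabled, the loop does nothing.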
Disk Usage
Monitoring the resource use, I frequently see drives being read at ~100MB/s and thought this might be the cause. I traced this back to the backup manager and stopped backups temporarily to test.
With backups disabled, performance did not seem to improve.
As there has been no change in backup frequency compared to pre-migration, I would not expect this to be the cause.
What else should I consider?
-
@Statitica Hi !
Did you install the XenServer or the XCP-ng tools on your Windows VM ?
Without the tools, performance is bad.
Also, the "Other OS" template can cause issues, because the Windows templates have some supplemental parameters for Xen (like Viridian, and other things I forgot)
-
@AtaxyaNetwork said in Sluggish performance on VMs migrated from Hyper-V:
@Statitica Hi !
Did you install the XenServer or the XCP-ng tools on your Windows VM ?
Without the tools, performance is bad.
I installed the XenServer tools as this forum post suggested it was the better of the two options.
https://xcp-ng.org/forum/topic/9902/citrix-or-xcp-ng-drivers-for-windows-server-2022/2
Also, the "Other OS" template can cause issues, because the Windows templates have some supplemental parameters for Xen (like Viridian, and other things I forgot)
Yep, I saw that mentioned in another forum post here https://xcp-ng.org/forum/topic/8854/performance-differences-and-higher-cpu-usage
I was hoping there would be a way to fix it without rebuilding the VM, but perhaps a rebuild is the safest thing to try first.
-
@Statitica I don't think you can change the template on the go (but I might be wrong)
The safest way is to create a new VM with the right template and the same spec, but without a disk. Don't boot it. Then shut down the VM with the wrong template, detach the disk (rename it if the name is not obvious), and attach it to the new VM with the right template.
-
Thanks @AtaxyaNetwork
Doing some more reading and testing at the moment (can't rebuild the VM until my clients log off for the night) and thought I'd drop some notes here for future reference.
I've seen a few different sites and blog posts mentioning that SMAPIv3 is on the horizon and should offer a significant performance increase on disk IO compared to SMAPIv1.
I guess the question is: would this potentially be causing slowness compared to Hyper-V?
How soon will we see SMAPIv3 in production (how long is a piece of string)?
I also fired up the old VMs from the other client mentioned. These were guests running Windows Server 2016 on a Hyper-V host. The older VMs perform far better than the new ones, again on similar (slightly slower) hardware. With the performance improvements Microsoft claim to have made with each iteration, it seems strange to me that an older version running (mostly) the same software on slightly slower hardware should be much more responsive than newly built VMs with the correct templates.
-
Swapping the template should be your priority. This might drastically change the result.
-
@olivierlambert yep - clients are just done for the day so I'm running a backup job and should be able to let you know how it goes in a few hours.
-
@olivierlambert I tried to boot the OS drive with the correct template, and it blue screened with "boot device inaccessible".
I reverted back to the incorrect template, and it's booting again.
Going to try re-importing and setting up the boot device with the correct template next.
-
@Statitica It's because of a change in the Windows storage driver. Deleting
HKLM\SYSTEM\CurrentControlSet\Services\stornvme\StartOverride
before booting on the new template should make it work.
-
@dinhngtu do I just do that before powering the VM down, or is there a way to do it on the disk while it is detached and powered down?
-
@Statitica You can do that before powering down. There's XenBootFix, which isn't really made for this, but may also work.
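As a rough sketch of doing this from inside the running guest before powering down: the snippet below checks whether the StartOverride key exists under stornvme, exports it as a backup, and then deletes it. It assumes an elevated PowerShell session in the Windows VM. The backup file name and location are hypothetical choices for illustration.

```powershell
# Key named earlier in this thread; the HKLM: drive is PowerShell's registry provider.
$key    = 'HKLM:\SYSTEM\CurrentControlSet\Services\stornvme\StartOverride'
# Hypothetical backup location, adjust as needed.
$backup = Join-Path $env:TEMP 'stornvme-StartOverride.reg'

if (Test-Path -Path $key) {
    # Show the current values so you know what is being removed.
    Get-ItemProperty -Path $key

    # Export a copy first (reg.exe uses the HKLM\... form, without the colon).
    reg.exe export 'HKLM\SYSTEM\CurrentControlSet\Services\stornvme\StartOverride' $backup /y

    # Delete the StartOverride key itself.
    Remove-Item -Path $key -Recurse
    Write-Output "StartOverride removed; backup saved to $backup"
}
else {
    Write-Output 'stornvme has no StartOverride key; nothing to delete.'
}
```

If the key is absent, the script changes nothing and just reports that.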
-
@dinhngtu stornvme doesn't have a StartOverride section.
Assume it is storahci?
-
@Statitica If you're using the Windows templates, then stornvme is correct. Do you have some unusual NVMe drivers (Intel drivers being one) installed?
-
@dinhngtu I originally used the "other install media" template when I imported this VM from Hyper-V.
Nothing is listed in msinfo, but there are entries in the registry keys under storahci.
I noticed there are also drivers in there from VMware (these VMs have been through a few hypervisors before I took over the company and decided to do this particular migration).
Out of interest I checked a Server25 install which was done using the "Windows Server 2022" template. That also has no StartOverride value in the registry.
-
@Statitica have you checked the Event Logs, specifically the System event logs? I recall having a performance hit when I migrated some VMs from Hyper-V to XCP-ng (years ago at this point).
I don't recall the exact issue off hand, but the remedy was easy enough.
-
@DustinB There hasn't been anything in the event logs since the VSS/ESENT issue was cleared.
-
@Statitica Not having
stornvme\StartOverride
is expected on systems that use NVMe controllers to boot (e.g. XCP-ng Windows templates). Can your imported VM boot on the correct Windows template once you have made sure
stornvme\StartOverride
is not present? Once it boots with the correct template and NVMe controller, let's see what the performance is like.