Nested Virtualization for Linux VMs with AMD SVM enabled in BIOS fails - hangs at nested VM boot for Docker Desktop and/or libvirt
-
There seems to be an issue with getting nested virtualization to work properly with xcp-ng using the AMD Ryzen 5 3600 CPU.
I've tried with the following Linux Distros in Guest VMs:
- Fedora v40
- Debian 12
- Ubuntu 22.04 LTS
I've had no success with libvirt or Docker Desktop (Docker Engine seems fine, though) in any of these Linux distros.
I want to clarify that I ran the test command for kvm_amd from the XCP-ng-hosted guest VM terminals:
'# cat /sys/module/kvm_amd/parameters/nested'
and got the expected "Y" or "1" result - which should indicate that nested virtualization is enabled properly (I set them all up for nested virtualization through XOA).
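For anyone wanting to reproduce, these are roughly the checks I ran inside each guest (a sketch; kvm_intel applies instead on Intel hosts):
# confirm the virtualization flag is exposed to the guest CPU (svm for AMD, vmx for Intel)
grep -cE 'svm|vmx' /proc/cpuinfo
# confirm the KVM modules are loaded and nested support reads "Y" or "1"
lsmod | grep kvm
cat /sys/module/kvm_amd/parameters/nested
# confirm /dev/kvm exists and is accessible
ls -l /dev/kvm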
Trying to get Docker Desktop to work on guest Linux VMs running on XCP-ng
I've tried to use Docker Desktop with Debian 12, Ubuntu 22, Rocky 9 and Fedora v40 guests, all with GNOME GUIs + the recommended GNOME extension (AppIndicator support: https://extensions.gnome.org/extension/615/appindicator-support/ ; per the Docker docs: https://docs.docker.com/desktop/install/fedora/ ),
... but they all either hang at "Docker Engine starting..." or lock the OS up with 100% CPU usage that requires a "Force Reboot".
Does anyone have a known fix for this issue? I can get Docker Desktop to work on bare-metal PCs/laptops with Linux and Windows just fine, but not within a guest Linux VM with nested virtualization enabled on an XCP-ng hypervisor.
I've tried everything in the official Docker docs:
https://docs.docker.com/desktop/troubleshoot/overview/
https://docs.docker.com/config/daemon/troubleshoot/
And in some cases, I was able to get a slight change in behavior: some distros that had been locking up before the GUI window even appeared would at least show the "Accept" dialog, but they would still eventually hang and fail to start with the same "Docker Engine is starting..." status message.
Oddly, the actual Docker Engine seems to work fine by itself. The current workaround is to use Podman Desktop (installed via Flatpak/Flathub) with both Podman and Docker installed together (with caveats for Fedora and Rocky - https://faun.pub/how-to-install-simultaneously-docker-and-podman-on-rhel-8-centos-8-cb67412f321e ). Podman Desktop provides a semi-functional feature set for managing containers in the GNOME desktop GUI, but it's not ideal.
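For anyone who wants to dig further than I did: Docker Desktop for Linux runs as a systemd user service, so these are the kinds of places I was poking around while it sat at "Docker Engine is starting..." (unit and context names are per the Docker docs and may vary by release):
# status and recent logs of the Docker Desktop user service
systemctl --user status docker-desktop
journalctl --user -u docker-desktop --no-pager | tail -n 50
# Docker Desktop uses its own context/VM, separate from a system-level dockerd
docker context ls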
The issue is seemingly only with Docker Desktop, itself, and not with Docker Engine.
Since Docker Desktop did not work, I then attempted using libvirt / virt-manager
After fresh installs of all 3 Linux distros and installing libvirt + virt-manager, I could not get a single nested guest VM to boot far enough to complete its install process. For the bootable ISO install images, I tried the following:
- Alpine Linux v3.17
- Debian 12
- Rocky 9
None of these images ever booted, whether launched with libvirt from the CLI or with virt-manager in a GNOME GUI, on any of the above-mentioned "nested hosts" (themselves VMs on XCP-ng):
- virt-manager shows just a blank / black screen
- The guest VMs show as "running", but there is no network (DHCP never assigns an IP) or disk activity for the "nested" VM if I attempt something like this article's approach (libvirt without the GUI; see the checks after this list):
- https://stackoverflow.com/questions/64792580/libvirt-virt-install-hangs-on-installation
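These are the checks I ran from inside the "nested host" guest to try to see where it stalls (a sketch; "nestedvm" is a placeholder for whatever the nested guest is named):
# sanity-check that the XCP-ng guest itself looks usable as a KVM host
virt-host-validate qemu
# confirm the nested guest is defined/running and try to reach its console
virsh list --all
virsh console nestedvm
# the QEMU log for the nested guest is usually the most telling
sudo tail -n 50 /var/log/libvirt/qemu/nestedvm.log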
While I did have some success running Docker Engine, and I am able to run Kubernetes directly on XCP-ng "non-nested" VMs, I have a need to separate each K8s cluster and isolate the networks (as much as possible) for testing purposes.
Has anyone else run into this issue of being unable to use nested virtualization within a Linux guest VM on an XCP-ng host, and found a solution?
Which logs should I check for errors that might point to the root cause of this issue?
Any assistance anyone might be able to offer is greatly appreciated!
-
@nesting4lyfe2024 I am currently having the same issue attempting to run Kali Linux in a Windows 10 environment using VirtualBox. I also attempted to use WSL but I get the error
"Please enable the Virtual Machine Platform Windows feature and ensure virtualization is enabled in the BIOS"
Did you ever find a fix for this?
-
Hi,
It's a complex situation; nested virtualization doesn't work well in many situations. It's in our backlog.
-
@krul8761 I'm not certain (but maybe someone at Vates / @olivierlambert could correct any misconceptions or misunderstandings I have on this), but after a lot of deep diving, research and experimentation, it appears that only certain motherboard/CPU combinations are capable of "NV" - Nested Virtualization - and with varying degrees of effort involved to get it to work.
I tried everything I could think of, everything I stumbled upon to "try" throughout the XCP-ng/XenServer and ProxMox forums, VirtualBox, even virsh/VMM/libvirt, and I never found any solution that worked on the Dell PowerEdge platform (R710 and R730 models, which have Intel CPUs/chipsets), nor any Asus motherboard combos (using AMD CPUs/APUs - tested nearly a dozen with XCP-ng), nor any other hardware.
It's worth mentioning that I have mostly tested "old and end-of-life" types of hardware that I am attempting to keep alive and useful for random/various purposes and "learning tools" (e.g. DNS servers, VPN servers, "apt/yum/dnf cachers", "virtual machine appliances" like security / firewall / networking VMs, K8s nodes, and the like).
My expectations were fairly low for getting XCP-ng to do "NV" on most of these systems after I went through the pain of trying various kernels, "manually compiling", tweaking/hacking sys configs, customization with modprobe-ing, all the things (within my limited scope of understanding, as I am no Linux kernel developer) - nothing worked.
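(For context, by "modprobe-ing" I mean the usual guest-side knobs, roughly like this - an AMD example; swap in kvm_intel on Intel:)
# persist nested support for the KVM module inside the guest
echo "options kvm_amd nested=1" | sudo tee /etc/modprobe.d/kvm-nested.conf
# reload the module so the setting takes effect, then verify
sudo modprobe -r kvm_amd && sudo modprobe kvm_amd
cat /sys/module/kvm_amd/parameters/nested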
So I tried to re-think everything, and weigh the options.
I will get real wordy and opinionated here, but maybe this will save others from spending a lot of time "getting stuck" or going down dead-end paths, as I certainly have. Here is how my thinking about "why Nested Virtualization is so important to me" evolved as it met reality:
My first thought: "Buy more stuff"
- Explore a known-good / user-reported / well documented hardware platform where Nested Virtualization works well with XCP-ng
I decided against this, as I could "feel" that even if "something finally worked!", its days would likely be numbered, and I couldn't rely on NV-ed VMs for key services: at any moment there might be some breaking change with a yum update, or, if I attempted to "hold back" certain packages to maintain the frail functionality I had worked so hard to accomplish, there would be security concerns (2024 was wild, and 2025 looks to be even more so, in terms of security concerns).
Then I shifted to what I felt was a more sane option:
A "More Sane" Option?: Use a different Hypervisor that has known-good and stable functionality with Nested Virtualization capabilities.
(Obvious choice, right? welllll...)
- What I've found here is that there really are NOT very many decent options for Nested Virtualization. There ARE working solutions for MOST edge cases, but no real "clear best fit".
How "Sane" is it to switch to VMware ESXi or "Workstation Pro", since that product line as "the most simple and functional NV" (arguably)?
--- There used to be ESXi, but Broadcom wrecked that... though they did give us "VMware Workstation Pro" in consolation. But for how long? And what's with the weird licensing? How secure is it REALLY, with ALL those CVEs and newsworthy security flaws involving extremely sophisticated attackers? It seems like ESXi and other VMware products got rolled out as "polished turds" before the "sale" to Broadcom, in terms of stability and security. It's not just a problem with VMware/Broadcom, either. I want to be clear that I'm not solely bashing VMware/Broadcom, per se; these are issues with any "integral platform" and/or hypervisor, but in the case of (older?) ESXi and VMware products, such issues are somewhat exacerbated by a lack of transparency with closed source.
How sane is Hyper-V? Hyper-V has Nested Virtualization for most modern PC platforms and is backed by LOTS of "supporting communities and documentation"
--- Hyper-V is actually not a terrible option, once you learn how to navigate the confusing GUI options and the weird "secret knowledge" of PowerShell commands that ACTUALLY unlock the feature you THOUGHT you enabled in the GUI menus - after rebooting 32 times to finally get the output of some seemingly unrelated feature/tool/status that you interpret as "I finished step 12 of 47!"
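- The one "secret" PowerShell incantation that actually matters for nesting is at least documented; roughly (run on the Hyper-V host with the VM powered off, and the VM name here is a placeholder):
# expose the virtualization extensions to a specific Hyper-V guest
Set-VMProcessor -VMName "NestedLab" -ExposeVirtualizationExtensions $true
# MAC address spoofing is usually needed so nested guests can reach the network
Get-VMNetworkAdapter -VMName "NestedLab" | Set-VMNetworkAdapter -MacAddressSpoofing On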
- But what comes next, once you DO get Nested Virtualization working in Hyper-V? Pain. I haven't tried the more modern iterations of "Windows Server 202X", but the earlier versions and the "starter version" on Windows 10/11 Pro DO have some interesting use cases where it's the path of least resistance.
- For example, Dell's (insanely resource-hungry) "OMSA" / OpenManage Enterprise appliance (or whatever confusing name it's better known by in your particular workplace and homelab circles) has a ready-to-go "Hyper-V Appliance" that is... nice... but... you probably need at least 3+ Dell "Enterprise Products" to justify the CPU and RAM requirements for the more useful features - so systems with 32GB or less RAM aren't going to be able to have "always-on appliances" like these (again, Dell isn't the only one that does this - these are "Enterprise Grade Solutions", so there is an inherent "infinite money" expectation when using "Enterprise Vendor Closed Source tools" - they NEED more spend to keep making hardware/software synergies).
Hyper-V is TOTALLY worth mentioning, and I realize I've been harsh on it, as it does have its place and use cases that will most likely fit what you are trying to accomplish @krul8761
But for a homelab or "experimental dev environment"? Hyper-V as a platform will take forever to "spin up". You will learn WAY more than you ever wanted to about strange MS Windows features, security measures (and flaws) and other various quirks, and the timesuck of trying to weed through which change happened in which release or "KB" for Windows is very real.
Hyper-V (and basically all Windows / Microsoft products) has some of the most extensively confusing blends of old, new, stale and "in development" support forums and documentation. Microsoft does a Disturbingly good job (capital D) at "covering its tracks": you will notice that 50+% of the time you search for something Windows/MS-specific with a non-EVIL search engine, the link will redirect you to the generic "search our forums!" page - HP is really getting good at doing this sort of "scrubbing", too. Some of it is fine, but most of the details I search for tend to go "missing", and I find plenty of "just turn it off and on again" type advice on issues that are 5+ years old.
All that said, is THAT the Hypervisor you want to trust? I don't.
BUT, again, Hyper-V DOES have use cases, like its integration with WSL and Docker Desktop, and there are some interesting developments happening with "shared / partitioned GPU configurations" (that's a whole other rabbit hole - if you're into gaming, or cheaping out on GPU costs for simple video streaming/transcoding stuff). But GOOD LUCK sorting all that out with the "documentation" - you often have better luck with the handful of people who slap together 15+ step guides, with the accompanying files and lists of commands needed to get it working, that end up in a GitHub repo completely unrelated to MS (case in point: https://github.com/DanielChrobak/Hyper-V-GPU-Partition-Guide ). These "techniques" DO in fact work, but there are SO MANY caveats and "gotchas" that it becomes unrealistic to maintain them for very long.
So "yes, Hyper-V can and does work!" - but it's a HOT MESS trying to debug and maintain such complicated, un-intuitive configurations that are part GUI, part PowerShell, part Windows Settings, part "custom scripts you download and run" (IMO). Again, if the newer versions of Windows Server 202X have a better toolset for Hyper-V, I'm not aware of it (nor do I have much interest), but I'm not bashing Hyper-V just because it isn't OSS. It's because it's a trainwreck. If one simple feature takes 2 days to try to understand, then track down all the "pieces", then test... only to find out "it doesn't work on my device/motherboard/etc", with nearly non-existent "feedback" to confirm or deny it's even possible... there's no way it's going to be my go-to or recommendation. But maybe whatever cool-but-poorly-implemented feature you want will work for your specific blend of hardware. It has VERY rarely worked out for me, and even more rarely been "worth the effort involved" (specifically the "GPU sharing" and "PCIe pass-through" implementations - https://woshub.com/passthrough-gpu-to-hyperv-vm/ - and this thread is worth reading too - https://gist.github.com/Ruffo324/1044ceea67d6dbc43d35cae8cb250212#file-hyper-v-pci-passthroug-ps1 )
"Pick your poison" applies, here.
What about virsh/libvirt/VMM for Nested Virtualization?
- The short answer is "yes, you can", but it's only slightly less convoluted than with Hyper-V (again, just IMO)
- This is an excellent article spelling it out with a basic Ubuntu set up using Nested Ubuntu and Windows installs - https://just.graphica.com.au/tips/nested-kvm/
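- The moving parts boil down to: nested enabled in the kvm module on the L1 host, plus the guest getting the host CPU model so svm/vmx shows up inside it - a rough sketch (the VM name is a placeholder):
# on the L1 host, confirm nested is on for the kvm module
cat /sys/module/kvm_amd/parameters/nested   # kvm_intel on Intel
# give the guest the host CPU model, either by editing its XML to use <cpu mode='host-passthrough'/> ...
virsh edit nested-lab
# ... or at creation time with virt-install
virt-install --name nested-lab --memory 4096 --vcpus 2 --cpu host-passthrough --disk size=20 --cdrom /path/to/installer.iso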
- BUT... while nesting / passthrough-ing and generally "enabling all the coolest features" with libvirt/virsh/KVM/QEMU (what is the actual "common name" for this hypervisor, anyway?), you will probably have a BRUTAL journey trying to match configurations to your specific "base OS" (if it's not Ubuntu, or perhaps Rocky/Fedora)
- The JANKY way you have to configure "bridges" for your network adapters and the super awkward UI (if you're using VMM and not just pure CLI) turns me off pretty hard.
- Just my personal experience: At one point, I had everything working "perfectly" from a single NIC: WakeOnLan, Shared IP / CIDR from "Host-to-Guest", "vNICs" for each of my VMs, great!
... but then on reboot?
It fubar-ed everything, and I gave up fighting with it. I finally used a 2nd USB NIC and that "worked", but... then there were odd quirks with all the other overly-complicated networking for libvirt/virsh/VMM, too (too many and too complicated to remember, let alone list out). So this fairly great "Type 1.5 hypervisor" (it's an odd one to nail down the "type" of, which reflects its power and capabilities) has a place for certain use cases, but given all of its issues and challenges, hard-to-find configurations, updates, and odd-but-not-awful "XML-based" feature enablement, it is no "joy" to work with, either (the permissions issues alone are probably where most people give up on this one).
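(For anyone wondering what I mean by "janky": the typical single-NIC bridge setup with NetworkManager looks roughly like this - interface and connection names are placeholders, and getting it wrong is exactly how you knock your host off the network:)
# create a bridge, enslave the physical NIC to it, and bring it up (this WILL drop the current connection)
nmcli con add type bridge ifname br0 con-name br0
nmcli con add type bridge-slave ifname enp3s0 master br0
nmcli con up br0
# then point the libvirt guest's NIC at it: <interface type='bridge'><source bridge='br0'/></interface>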
TrueNAS Though?
I'll throw an honorable mention out here to TrueNAS, too. But using TrueNAS as a virtualization platform is, again, in my opinion, similar to putting a NOS boost on a semi truck. Cool idea, but... TrueNAS is an EXCELLENT tool for managing Network Attached Storage ("N.A.S." - imagine that, right?). It can take a "pile of hard drives" and transform them into an awesome, resilient, sometimes-high-performing-but-better-used-for-reliability configuration. As a virtualization platform? It's more of a "bell and whistle". If all you WANT is a (very) solid storage solution for your home or office with a "side perk" of running a few Docker containers or low-resource VMs - private DNS servers, an internal website, a local database, etc. - then it's a great choice (and this is likely true for MOST people, even MOST small businesses). Last I checked, "Nested Virtualization is an experimental feature" in TrueNAS, just like with XCP-ng (likely for all the same reasons, too).
You can even do Kubernetes on TrueNAS, too ( https://www.truenas.com/community/threads/kubernetes-cables-the-challenges.109901/ )
But building full scale applications, or trying to do something with it that warrants a "need" for Nested Virtualization? You're probably barking up the wrong tree (Yes, even with TrueNAS CORE, instead of SCALE). You might find ways to "make it work", but you're also spending a lot of time and energy in fighting with your platform, rather than building your app/idea/business/personal relationships.
That said? I would call it "a good problem" if you are getting to the point where you have started to outgrow TrueNAS in an office/SOHO setting, and "leveling up" if your home lab and your desire for something more "production grade" or "cloud-like" are what your learning journey is pulling you towards.
VirtualBox, Maybe?
VirtualBox performance tends to be fairly awful. There are "things you can do" to help it out, sure, but I look back at all the years I used it and see how much of a "crutch" it really was (for a very long time). You can "sometimes" get Nested Virtualization to work with it, but it's tedious and painful (from memory - I admit it's been a while) to "enable NV correctly", and the performance is already bad enough in the "un-nested" VMs, so it gets exponentially worse as you have fewer and fewer system resources to throw at it to keep it "usable".
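(If you do want to try it, the nested toggle is at least a one-liner these days - via VBoxManage, with the VM powered off; the VM name is a placeholder:)
# expose AMD-V/VT-x to a VirtualBox guest
VBoxManage modifyvm "NestedLab" --nested-hw-virt on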
- So, again, if you just have a need for something "simple" like what I call an "inverted WSL", VBox is a solid choice (max 5 VMs total, never running more than 2-3 at a time). Such as if you are running Ubuntu/Debian (or whatever) Linux on bare metal, but still have a few tools you need/prefer in Windows (eg: "Proper" MS Office, MS SQL, HeidiSQL, mRemoteNG, XCP-ng Center, WinSCP, and the other long list of "Enterprise Tools" that you might need for your day job that are "Windows Only").
VirtualBox has a WAY better layout / GUI / toolset design than Hyper-V (IMO), but Hyper-V DOES have better performance (assuming the same disk and the same vCPU/RAM allocations) in my experience; neither is usually "great" (one just sucks less than the other). The worst part about using VirtualBox with Windows 11, aside from the "expensive" resource cost-to-performance, is that it can/will cause issues with other virtualization tools like WSL, Docker Desktop, Hyper-V and/or VMware products. So, once again, there are nuances, pros and cons, and no clear "one size fits all" solution if you need to use any of those tools together on Windows. But for Linux or macOS "host machines" running VirtualBox? Maybe that's a great fit, but it's probably <1% of people that actually do that, so you're likely going to be struggling in the wilderness of forums and random blogs for answers to issues that arise, too.
Summary of Some Rando on the Internet
What I've come to conclude at the present moment (2025Q1):
- If you really truly absolutely need Nested Virtualization on a Type 1 Hypervisor for "home lab activities", either concede to using ProxMox or VMware/Broadcom Workstation Pro (pros and cons to both)
If you just have a few edge cases where you need Nested Virtualization for a-handful-or-less VMs and no need for PCIe pass-through
(for example, a few VMs with their own Docker Desktop instances, or some kind of testing):
--- BM Windows: Hyper-V, if you want WSL and/or Docker Desktop and have the "sweet spot hardware" that does something "fancy" with PCIe (GPU/HBA/RAID passthrough); VMware WS Pro if you don't
--- BM Linux: VMware WS Pro or VirtualBox (or anything, really; maybe "just KVM" would do the trick, too)
If You Want a Desktop Environment with Many VMs Capable of Nesting (or Not) but with PCIe Pass-Through "Opportunities":
--- BM Windows: MAYBE Hyper-V will work for your PCIe PT requirements, but if it doesn't, you're basically SOL on Windows
--- BM Linux: Libvirt/VMM/KVM is a viable option, assuming you have more than one NIC and considerable Linux skills
If You Want/Need a Type 1 Hypervisor With Nested Virtualization and Pass-Through:
- ESXi is going to be the winner here. Warts and all. But it's not exactly realistic to think most people are going to fork out that kind of money for a full license to use in a home lab.
- ProxMox is a "close second", though. It's not technically "Production Ready" in its "Community Version", but it's "close enough".
If You Want a Feature-Rich, Fully Functional, Production-Grade Type 1 Hypervisor:
- I still prefer XCP-ng. It's far from perfect, but its tooling and design are flexible, and the performance tends to be more than marginally better than other Type 1s and MILES better than Type 2s. The backup features, ease of service integrations (iSCSI, NFS, SMB, etc.) and broad support for management systems (Xen API, XOA, XCP-ng Center, now XOA Lite) make a BIG difference. That's not even touching on the REALLY powerful stuff like "instant" VM cloning and Resource Pools.
What About Nesting to Combine "The Best Of"?
This is what I would likely recommend most home labbers do. For those with the skills, experience, grit and confidence: find a solid Linux distro that you really like for your workflows and install libvirt/virsh/VMM. Once that is all in place, you can create multiple virtualized Type 1 hypervisors and/or add in Type 2s for whatever edge cases you might have. That way, the sky is the limit and all possibilities are open; it's just a matter of deciding where to allocate the resources. This suggestion assumes a high-end desktop with 64 to 256GB of RAM, a 6- to 12-core CPU, and "plenty" of hard drives (not just capacity, but an actual "count" of drives, for ZFS/RAID, with bonus points for NVMe/L2ARC).
If you have a full Bare Metal system you want to dedicate to being a "proper server" then you could go with ProxMox, and virtualize XCP-ng, TrueNAS and even Windows Server. You will need a "Daily Driver" machine to access it "remotely", though. That can be a mini PC or a laptop (Theoretically, a smart phone could work, too... but... I'd never recommend it for more than an "in a pinch" type of situation).
If you are just starting out on your Linux journey, and still want to stay on Windows for awhile longer, then you could also try using XCP-ng as a VirtualBox or Hyper-V guest VM, but you will have a "smoother" time with VMware WS Pro.
It's worth noting that I have seen, and am hereby acknowledging, the great work done on getting "nested Hyper-V" working within Windows guest VMs on XCP-ng/Xen - but it still "feels" like it's a long way off from being a truly viable solution, rather than an experiment.
Those are my learnings from a lot of failures and successes working with "legit" hypervisors. There are others, sure, but most of them are awful enough not to have made the list. Hopefully this helps anyone else who stumbles upon these issues and breaks down enough to reach out and ask the communities for help in "solving the unsolvable". For the record, I am very much still looking forward to the day when Nested Virtualization is production-ready on XCP-ng!
-
Nested virt is a complex beast, even on the most "advanced" hypervisor (VMware). It will hopefully be a lot better in Xen in 2025, but there's no magic: when you add that level of indirection, many things get wonky, while you still need to stay secure and keep decent performance, which is really, really hard in nested.
-
@olivierlambert That makes complete sense
I should mention that even on the platforms that do have Nested Virtualization, the performance is usually AWFUL, though it can be useful for certain things (mostly testing and experiments): a "custom DNS server" or something of that nature (a "network service", some type of container running an "agent", etc.).
I see why people think they want it, and it makes logical sense, but I've spelled all that out to essentially say "You think you want it, but be careful what you wish for."
I'd love to have a handful or two of Windows guest VMs with Docker Desktop or Hyper-V to keep projects better separated and secure, but do I "need" that? Not really. There are portability, security and privacy elements that appeal to the way I try to organize guest VMs for certain tasks, but to really make nesting useful, there is an implicit amount of wasted resources, too. If I have a nested XCP-ng or TrueNAS guest VM, that's a large amount of resources allocated to a VM, most of which I'm probably not using at any one time.
That said, I use the "Copy VM" feature all the time, as I have built out certain settings and configurations as a sort of "appliance template" - so what I'd like to use nesting for is a ready-made environment that I could pass a GPU through to, for something like an AI model I'm "training and tuning" (or a similar passthrough-type use case), that I can copy quickly, tweak 2-3 things on, and have everything going in a few seconds.
So I find the idea of nesting convenient and novel, but I've learned it's not a necessity. Most server platforms (or even desktop motherboards) end up with wasted / inaccessible slots when using "powerful enough for AI or gaming" GPUs, and then there are the often difficult-to-satisfy power requirements, too. So I find that most of the things I want to try to do with hardware / hardware emulation are usually best done on their own standalone systems anyway.
But maybe one day, single slot GPUs will be more affordable, and virtualization will be a standard feature for a hardware platform, rather than such a heavy lift.
Are there any specific CPU/mobo combinations that tend to have "the most success" with Nested Virtualization for Hypervisors like XCP-ng? It seems that is the biggest factor for NV, as of today (but maybe I'm wrong and it doesn't matter as much as I think it does).
-
On 8.2, there's maybe more of a chance to get it working on a decently tested and mature platform (Intel on top of Dell hardware, for example). On 8.3, it's simply broken, regardless of the hardware you use, until we get it fixed upstream.