Hiding hypervisor from guest to prevent Nvidia Code 43
-
@olivierlambert
Okay, I haven't checked the Xen sources extensively, but I presume that those 'static values' you are talking about are hard-coded somewhere in there. From what I have read, Nvidia's detection method is to look for specific strings in specific places. I believe that (at least part of) the KVM patch is to randomize those values. So, at least theoretically, wouldn't it be possible to hard-code different values and recompile the whole source? That should provide at least a temporary fix. I understand that everyone would have to do it for themselves, but perhaps someone with more detailed knowledge of the project could create a guide, perhaps on the wiki, that people interested in this and with enough technical expertise could follow.
-
I have exactly 0 resources available right now to work on this problem. As I said, this will require a lot of time to reach a result that:
- won't be upstreamed if it's "hacky" (current Xen code base won't allow to do that properly)
- will require an entire Xen rebuild and package creation
As a small team today, should we waste time on something that won't last long or be upstreamed?
Really, we gladly accept contributions, but I won't put that as our priority 1 before UEFI and secure boot for VMs (current Xen work on our side) and other capital features that can be done with far less effort.
Please put your request in perspective: you aren't alone in the world.
-
@olivierlambert
Of course I am not alone, and of course I am not suggesting that you drop all other work to deal with this. I was just asking whether it is a possible (albeit "hacky") and strictly DIY solution (no upstreaming or anything like that). I will gladly look into this some more over the coming months, when I have the time and preferably appropriate hardware.
-
Actually, there are patches for Xen, and it works on Xen with the driver patcher, but sadly it doesn't work on XenServer/XCP.
https://github.com/sk1080/nvidia-kvm-patcher/issues/45#issuecomment-574680727
https://lists.xenproject.org/archives/html/xen-devel/2016-07/msg01713.html
I'm not familiar enough with the Xen code base (I took a look, and ugh) to know where to apply the hiding, but I don't think it should take months for someone familiar with the code base.
-
This is exactly the patch I asked the Xen team about, and the answer was: "this is an ugly hack that will never be upstream" (until Xen exposes an interface to make those changes).
edit: I'll ask again when the Xen code base is more ready for this.
-
From the GitHub post, it seems that blacklisting the GPU works without any modification to Xen, which is similar to how KVM does it?
-
So I asked some people in the Xen team: the CPUID/MSR changes needed for this use case aren't ready yet.
-
Is it not better to vote with your wallet and choose something other than Nvidia?
-
Thanks Olivier,
Sadly, ATI has been going downhill since AMD bought them.
-
@olivierlambert There is a huge and growing market for virtualised GPU: gaming, AI, 3D. Personally I'd pay $$ for this feature alone - having spent days trying to get this working. I do understand that Nvidia will use their might to squash this functionality - as it is not in their commercial interests - they'll claim it circumvents their EULA.
-
I'll look for updates on this topic... but for now I'm going to drop XCP-NV and go (back) to KVM.
-
It's XCP-ng, not "XCP-NV". We'll continue to track whether it's possible to do so one day.
-
@olivierlambert oops.. typo. Apologies
-
@olivierlambert @imtrobin I had worked on CPUID to expose CPU temperature data to the guest (Dom0). Maybe we can use the same approach.
# cpuid | grep -I hypervisor_id
hypervisor_id = "Microsoft Hv"
Is this the one which should be hidden?
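For context, the hypervisor_id printed by the cpuid tool is the 12-byte signature returned in EBX, ECX and EDX by CPUID leaf 0x40000000. The decoding can be sketched in Python as below. The function names and the lookup table are illustrative assumptions, not part of Xen or of the cpuid tool, and the register values have to come from another tool, since plain Python cannot execute the CPUID instruction.

```python
import struct

# Commonly seen signatures at CPUID leaf 0x40000000 (illustrative, not exhaustive).
KNOWN_SIGNATURES = {
    "XenVMMXenVMM": "Xen",
    "KVMKVMKVM": "KVM",
    "Microsoft Hv": "Hyper-V interface",
    "VMwareVMware": "VMware",
}


def decode_hypervisor_id(ebx: int, ecx: int, edx: int) -> str:
    """Join the three 32-bit registers (little-endian) into the signature string."""
    raw = struct.pack("<III", ebx, ecx, edx)
    # Some signatures are shorter than 12 bytes and padded with NULs.
    return raw.rstrip(b"\x00").decode("ascii", errors="replace")


def identify(signature: str) -> str:
    """Map a decoded signature to a vendor name, or 'unknown'."""
    return KNOWN_SIGNATURES.get(signature, "unknown")


if __name__ == "__main__":
    # Register values corresponding to the output quoted above.
    sig = decode_hypervisor_id(0x7263694D, 0x666F736F, 0x76482074)
    print(sig, "->", identify(sig))
```

A driver that refuses to run in a VM could in principle compare this string against known values, which is why changing or hiding it is the usual starting point.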
-
I'm not entirely sure about that, I need to ask a Xen dev.
-
@r1 said in Hiding hypervisor from guest to prevent Nvidia Code 43:
@olivierlambert @imtrobin I had worked on CPUID to expose CPU temperature data to the guest (Dom0). Maybe we can use the same approach.
# cpuid | grep -I hypervisor_id
hypervisor_id = "Microsoft Hv"
Is this the one which should be hidden?
I have no idea if that is enough, but maybe. How can we try this? For KVM, they do this
-
What is the current status on this? We are hoping for this soonish, and if it isn't coming at all or not for a long time, then we need to plan a migration to another platform.
Our use case is for business, but we are not a very big business and do not wish to spend thousands on Quadro cards when much cheaper GTX and other consumer cards would do the job perfectly if it weren't for Nvidia's artificial limitation.
We run a custom program for Kitchen Design as RemoteApps under an RDS setup. At this point, we are running Quadro 4000 cards, as we already had them, but because of this I cannot upgrade to Server 2019, as there are no drivers. To upgrade we have to upgrade the cards, and the best bang-for-buck cards for our use are P4000s, which do run under Server 2019 but are $1000+ each. We want to have 2 cards in each server, totalling 4 cards. We can get even better bang for buck from some RTX 2070s or the like for nearly half the cost.
The only thing stopping us is this stupid Code 43 "bug".
We are willing to wait as we haven't quite hit the limits of our current setup, but we are getting close, and we need to plan our next move. I love XCP-ng and honestly cannot fault it for anything, other than lacking the capability to hide the fact that the VM is a VM.
We have already run a test setup on Proxmox and the Code 43 fix works perfectly.
-
If it's for business, why not try to pour some resources into making this work? It's open source backed by pro support, so there are 2 approaches:
- contribute to speed up the process
- pay to get someone (individual or company) to do it
One priority can't be a priority for everyone. But you can have an influence on that with 2 different levers.
-
I've been using XCP-ng and Xen for a while now and unfortunately I have had to move everything from Xen to ESXi because this feature is not there. hypervisor.cpuid.v0=false is the only line I have to enter to hide the hypervisor from the GPU drivers in all our servers.
It's a shame. I understand it isn't a priority, but judging from the Google searches I have done, this has been asked for for many years. When this feature is available I will come back to Xen; unfortunately, until then I have to use ESXi.
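As far as is generally documented, the ESXi setting mentioned above works by clearing the "hypervisor present" flag, which is bit 31 of ECX in CPUID leaf 1, so that guest software checking that flag sees a bare-metal-looking CPU. A minimal sketch of that bit logic follows; the helper names are hypothetical and the code only manipulates integers, it does not change what any real hypervisor reports.

```python
# CPUID leaf 1, ECX bit 31: set by hypervisors to announce their presence.
HYPERVISOR_PRESENT_BIT = 1 << 31


def hypervisor_present(leaf1_ecx: int) -> bool:
    """Return True if the hypervisor-present bit is set in the given ECX value."""
    return bool(leaf1_ecx & HYPERVISOR_PRESENT_BIT)


def mask_hypervisor_bit(leaf1_ecx: int) -> int:
    """Return the ECX value a guest would see with the bit cleared (32-bit result)."""
    return leaf1_ecx & ~HYPERVISOR_PRESENT_BIT & 0xFFFFFFFF


if __name__ == "__main__":
    ecx = 0xFFFA3203  # example value with bit 31 set (assumed, for illustration)
    print(hypervisor_present(ecx))                       # bit is set
    print(hypervisor_present(mask_hypervisor_bit(ecx)))  # bit is cleared
```

An equivalent knob on the Xen side would have to apply the same kind of mask to the CPUID values presented to the guest, which is the CPUID/MSR work discussed earlier in the thread.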