@manilx Maybe this post on Intel iGPU passthrough gives some ideas? You probably lose the video output on the Protectli when you assign the iGPU to a VM.
Posts made by gskger
-
RE: Passthru of Graphics card
-
RE: NVIDIA Tesla M40 for AI work in Ubuntu VM - is it a good idea?
@CodeMercenary The M40 is a server card and (physically) compatible with the R730, so no extra cooling is required (or possible). The downside is that the R730 will most likely still run all fans at full blast regardless of the actual power consumption, since the server cannot read the GPU's temperature. But there are scripts to manage the fan speeds based on server or GPU temperature. And once you have ollama installed, you can ask it how to write that code.
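To give an idea of what such a fan script could look like, here is a minimal Python sketch. The fan curve values are made up and need tuning for the actual server. The raw IPMI byte sequences are the ones commonly circulated in community scripts for Dell iDRAC 7/8; I treat them as an assumption, so please verify them against your own hardware before running anything.

```python
import subprocess

# Assumed fan curve: (GPU temperature limit in °C, fan duty in percent).
# These numbers are placeholders, not tested recommendations.
FAN_CURVE = [(50, 20), (60, 30), (70, 45), (80, 70)]
MAX_DUTY = 100


def fan_duty_for(temp_c: int) -> int:
    """Return the duty of the first curve step whose limit is >= temp_c."""
    for limit, duty in FAN_CURVE:
        if temp_c <= limit:
            return duty
    return MAX_DUTY


def ipmi_fan_command(duty: int) -> list[str]:
    """Build the ipmitool call that sets a fixed fan duty.

    The raw bytes are a Dell-specific sequence taken from community
    scripts (assumption, unverified for your iDRAC version).
    """
    return ["ipmitool", "raw", "0x30", "0x30", "0x02", "0xff", f"0x{duty:02x}"]


def apply_fan_duty(duty: int) -> None:
    """Switch to manual fan control, then set the requested duty."""
    subprocess.run(["ipmitool", "raw", "0x30", "0x30", "0x01", "0x00"], check=True)
    subprocess.run(ipmi_fan_command(duty), check=True)
```

A loop that reads the GPU temperature every few seconds and calls `apply_fan_duty(fan_duty_for(temp))` would complete the script. The step-wise curve keeps the logic easy to reason about; a smoother interpolation is possible but probably not needed.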
-
RE: NVIDIA Tesla M40 for AI work in Ubuntu VM - is it a good idea?
@CodeMercenary Probably not insane if you want to learn using ollama or other LLM frameworks for inference. But the M40 is an ageing GPU with a low compute capability (v5.2), so over time it might no longer be supported by platforms like ollama, vLLM, llama.cpp or Aphrodite (I did not check if they all actually support that GPU, but Ollama has support for the M40). I doubt that you will get acceptable performance for stable diffusion (image generation) or training/fine-tuning. But what could you expect for $90?
The card has a power consumption (TDP) of 250W which is compatible with the 16x PCIe slot of riser #2. You have to be extra careful with the cable as it is not a standard cable. While most would suggest power supplies of 1100W for the Dell R730 to be on the safe side, I run two P40s with 750W power supplies in a Dell R720. But I also power limit the cards to 140W with little effect on the performance and have light workloads and no batch processing.
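As a rough illustration of the power limiting mentioned above, the following Python sketch builds the `nvidia-smi` call for a given GPU index and wattage and sums up the resulting GPU power budget. The helper names are my own; `nvidia-smi -i <index> -pl <watts>` needs root privileges, and as far as I know the limit has to be set again after a reboot.

```python
import subprocess


def power_limit_command(gpu_index: int, watts: int) -> list[str]:
    """Build the nvidia-smi call that sets a power limit for one GPU."""
    return ["nvidia-smi", "-i", str(gpu_index), "-pl", str(watts)]


def gpu_power_budget(gpu_count: int, watts_per_gpu: int) -> int:
    """Worst-case power draw of all GPUs at the given limit, in watts."""
    return gpu_count * watts_per_gpu


def set_power_limits(gpu_indices: list[int], watts: int) -> None:
    """Apply the same power limit to every listed GPU (requires root)."""
    for index in gpu_indices:
        subprocess.run(power_limit_command(index, watts), check=True)
```

With two cards, `gpu_power_budget(2, 140)` gives 280W instead of the 500W that two cards at their 250W TDP could draw, which is why the smaller power supplies work for me.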
-
RE: XCP-ng 8.3 betas and RCs feedback 🚀
@olivierlambert People might be a little crazy, but they also have trust in the test and RC lifecycle of XCP-ng, which has proven to be very reliable thanks to the dedication of the Vates team and also the community. But I agree, for production use it's better to wait for a GA announcement. Anyway, congratulations on this important milestone in the development of XCP-ng.
-
RE: XCP-ng 8.3 betas and RCs feedback 🚀
@bleader There is a xcp-ng-8.3.0.iso on the ISO repository. Is that the release of XCP-ng 8.3? Looking forward to an official announcement.
-
RE: from Hyper-V
@McHenry Maybe this old post helps Error importing large vhd file. It also links to the documentation on how to Migrate to XCP-ng from Hyper-V, but I guess you already read that.
-
RE: XCP-ng 8.2 updates announcements and testing
@bleader Update worked well on my two node homelab and everything looks and works normal after reboot. I did some basic stuff like VM and Storage migration, but nothing in depth. Let's see how things work out.
-
RE: XCP-ng 8.3 betas and RCs feedback 🚀
@marvine I am using two Nvidia P40s with passthrough to Debian VMs with XCP-ng 8.3 RC1. I had no issues with passthrough, installing and running the P40s. For Windows VMs, this older post might give some more information.
-
RE: Using ipmitool locally in a VM?
Mh, that sounds interesting, but I have never done this. Can you suggest a starting point, example or documentation to get started?
-
RE: Using ipmitool locally in a VM?
Yes, that is my challenge.
In the Debian VM, I can use nvidia-smi to get GPU info from the GPUs that are passed through by XCP-ng to the VM. I cannot use ipmitool locally in the Debian VM to get host/server info and control the fan speed (most likely because /dev/ipmi0 is visible in DOM0 on my Dell R720 but not visible in the VM). One option would be to use IPMI over LAN to give the VM access to the iDRAC interface, but that is in the management VLAN.
My thought is to dynamically control the fan speeds from within the VM that creates the thermal load, or even turn the VM off when the load exceeds a certain critical threshold.
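The GPU side of that idea can be sketched in Python as follows. It only covers reading the temperatures inside the VM and deciding what to do; how the decision reaches the fans is exactly the open question. The two thresholds are assumed values, not something I have tested.

```python
import subprocess

WARN_C = 80      # assumed threshold: ask for more cooling
CRITICAL_C = 90  # assumed threshold: stop the workload or the VM


def parse_temperatures(output: str) -> list[int]:
    """Parse nvidia-smi CSV output with one temperature per line."""
    return [int(line) for line in output.splitlines() if line.strip()]


def read_gpu_temperatures() -> list[int]:
    """Query the temperature of every visible GPU in °C."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=temperature.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_temperatures(result.stdout)


def decide(temps: list[int]) -> str:
    """Map the hottest GPU to an action: 'ok', 'throttle' or 'shutdown'."""
    hottest = max(temps)
    if hottest >= CRITICAL_C:
        return "shutdown"
    if hottest >= WARN_C:
        return "throttle"
    return "ok"
```

Using the hottest GPU rather than an average is deliberate: with passively cooled cards, one hot card is already a problem.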
-
Using ipmitool locally in a VM?
I am using a Dell PowerEdge R720 with two Nvidia P40s to learn about Large Language Models (LLMs) and machine learning in general. XCP-ng is installed and both P40s are pass-throughed (if that word exists) to a Debian 12 VM that runs Ollama in Docker. I do power limit both GPUs, but during inference the passively cooled GPUs sometimes still get hot.
On a bare metal Debian 12 install, I can use nvidia-smi to read the max. GPU temperature and ipmitool to read the max. inlet/outlet/CPU temperature and set the fan speed dynamically, again with ipmitool. That works reliably.
I don't want to use ipmitool -H {ip_address} -U {username} -P {password} {command} (which seems to be the #1 recommendation) because I don't want to punch a hole in the firewall to give this VM access to the management network. Another control VM that queries both temperature sources (iDRAC via ipmitool, Ollama Debian VM with P40 passthrough via SSH and nvidia-smi) would also work, but feels too complicated.
Any idea if something like running ipmitool locally in a Debian 12 VM can be achieved with XCP-ng?
-
RE: XCP-ng 8.2 updates announcements and testing
@bleader Installed on my playlab. Everything looks normal, let's see how it goes.
-
RE: EOL: XCP-ng Center has come to an end (New Maintainer!)
@Seneken Have you considered using Xen Orchestra (XO) instead of XCP-ng Center? XO is the recommended solution to manage XenServer or XCP-ng hosts and pools. The Xen Orchestra from Source comes with all the bells and whistles for a homelab, while the Xen Orchestra Appliance (XOA) is intended for enterprise scenarios with support. But you most likely already know that and are just interested in a Windows client.
-
RE: Xen Orchestra from source with Let's Encrypt certificates
Having XO from source or XOA act as a certification authority for the XCP-ng hosts is for sure a good approach. Would be great if that could include the VMs running on the XCP-ng hosts, which is my main goal (apart from being able to HTTPS into XO from source of course).
-
RE: Xen Orchestra from source with Let's Encrypt certificates
@kevdog The internal server cannot be reached from the internet (no port forwarding, so no HTTP challenge) and my hosting provider does not have an API for DNS challenges. That's why I use pfSense "on the edge" at the moment. I admit that a cheap VPS running acme.sh could do the trick, but my automation works and I am lazy.
-
RE: XCP-ng 8.2 updates announcements and testing
@bleader Updated my two node homelab and everything seems to work as expected. Let's see how things go over the next days.
-
RE: XCP-ng 8.3 betas and RCs feedback 🚀
@stormi Updated my XCP-ng 8.3 test server through the CLI (yum update) and rebooted. All VMs work as expected.
-
Nvidia P40s with XCP-ng 8.3 for inference and light training
Being curious about Large Language Models (LLMs) and machine learning, I wanted to add GPUs to my XCP-ng homelab. Finding the right GPU in June 2024 was not easy, given the numerous options and constraints, including a limited budget. My primary setup consists of two HP ProDesk 600 G6 running my 24/7 XCP-ng cluster with shared storage. My secondary setup features a set of Dell R210 II and Dell R720 servers, which I use for memory-intensive tasks. The Dell R720 can hold up to two full-sized GPUs, which led to my first requirement: a GPU must fit into the Dell R720.
With R720 compatibility as a requirement, gaming GPUs (RTX 2080, 3080/3090, 4080/4090) were not an option. Additionally, they are expensive and do not all come with a lot of VRAM. But I admit that they are more powerful than what I came up with.
Since I want to test different LLMs with various parameter sizes, my minimum memory size requirement was 24GB VRAM. That clearly reduced the GPU options again, even for compatible low power (~70W) GPUs like the Nvidia RTX A2000 (12GB) or the Nvidia Tesla T4 (16GB).
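The reasoning behind the memory requirement can be sketched with a rough rule of thumb: the model weights need about (number of parameters × bits per weight / 8) bytes. The Python snippet below is only an estimate under that assumption. It ignores the KV cache, context length and runtime overhead, and the quantization levels are examples, not what a specific model download actually uses.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: int, vram_gb: float) -> bool:
    """True if the weights alone are smaller than the available VRAM."""
    return weight_memory_gb(params_billion, bits_per_weight) < vram_gb


# Example: a 70B model at 4 bits per weight needs about 35 GB for the weights.
print(weight_memory_gb(70, 4))   # 35.0
print(fits(70, 4, 24))           # False
print(fits(70, 4, 48))           # True
```

So a single 24GB card is enough for small and medium models, while the larger ones only become realistic with two cards, and even then only in quantized form.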
My budget limit for getting started was around €300 for one GPU. That narrowed down my search to the Nvidia Tesla P40, a Pascal architecture GPU that, when released in 2016, cost around $5,699. In Europe, the P40 is available on Ebay for around €300 to €500, and I was very lucky to get two P40s for around €510 in total. Seeing two P40s on Ebay in the US for $299 with no delivery option for Europe was a painful experience though.
However, I had to make two compromises with the P40, which may be a problem in the future. First, the P40 lacks Tensor Cores, which greatly accelerate deep learning training compared to plain FP32 training. Additionally, the P40 is limited by its CUDA compute capability of 6.1, which is lower than that of newer GPUs like the H100 (9.0). At some point, software tools might stop supporting the P40.
To install a second GPU I had to swap the Dell R720 riser card #3 from the version with two PCIe x8 slots and a 150W power connector to the one with a single PCIe x16 slot and a 225W power connector. Like the K80/M40/M60/P100, the P40 has an 8-pin EPS connector, so you need a special power cable that can be sourced from Ebay. Using the standard Dell general-purpose GPU cable risks damaging the GPU or motherboard. Yesterday, the last part arrived so today is install day.
The process of swapping riser #3, installing the two P40 GPUs, and connecting the power cables was straightforward. During boot, the server checks PCI devices and updates the inventory, which might take some minutes. After the initial fan ramp-up, the fan speed dropped back to normal and the Dell R720 idles at about 126W with both GPUs installed.
Next step was installing and updating XCP-ng 8.3 beta, which was as easy as installing the GPUs. Adding the host to XO from source and activating the PCI pass-through in the host's advanced view required a reboot, but after that I could set up an Ollama VM to run LLMs and another Open WebUI VM to chat with the LLMs. With 48 GB of VRAM, I can run llama3-70b with some headroom and about 6 tokens/sec, while llama3-8b is much smaller and answers with 23 tokens/sec on this setup.
So what are the next steps? On one hand, I want to set up a development environment for Python and API based usage of LLMs (not only local LLMs, but also cloud based LLMs like ChatGPT or Claude). That will be fun, since I have zero experience with that. On the other hand, I will set up more GPU supported services like Perplexica or AUTOMATIC1111 or whisper. Apart from that, I will also try to improve my prompt engineering skills and learn about LLM multi agent frameworks.
The best thing about this setup is that XCP-ng 8.3 beta provides a robust foundation for running Large Language Models (LLMs) and other AI workloads on one machine. Looking forward to the release candidate!
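As a first idea of what the API based usage mentioned above could look like, here is a minimal Python sketch against the Ollama REST API using only the standard library. The host, the model tag and the prompt are assumptions for illustration. The tokens/sec calculation relies on the `eval_count` and `eval_duration` (nanoseconds) fields that Ollama returns for a non-streaming request.

```python
import json
import urllib.request

# Assumed address of the Ollama VM; 11434 is Ollama's default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Generation speed from Ollama's eval_count and eval_duration fields."""
    return eval_count / (eval_duration_ns / 1e9)


def generate(model: str, prompt: str) -> dict:
    """Send one non-streaming generate request and return the JSON reply."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode()
    request = urllib.request.Request(
        OLLAMA_URL, data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def main() -> None:
    # "llama3:8b" is an assumed model tag; use whatever `ollama list` shows.
    reply = generate("llama3:8b", "Why is the sky blue?")
    print(reply["response"])
    speed = tokens_per_second(reply["eval_count"], reply["eval_duration"])
    print(f"{speed:.1f} tokens/sec")
```

Calling `main()` on a machine that can reach the Ollama VM prints the answer and the measured speed, which makes it easy to compare models on the same hardware.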
-
RE: problem with export or moving VM between pools
@noam Are you using XOA or XO from source and is it fully up-to-date? You should also give more details on your setup (XCP-ng version etc.)
-
RE: XCP-ng 8.3 betas and RCs feedback 🚀
@stormi Updating my XCP-ng 8.3 test server through XO from source (88 patches) went really well and after a reboot (not sure if that was needed), my VMs started normally. I will report back if something comes up. Looking forward to the RC.