@Forza Sorry, you were correct, I just mixed in another new issue. NFS is currently used only for backups. All my SRs are in local storage. It just happens that my backups are now failing not only because of the NFS issue but also because of the VDI issue. I think the VDI issue is a side effect of the NFS problem: the backup gets interrupted, so the VDI is left stuck attached to dom0. I should have made that clearer or never mentioned the VDI issue at all.
Best posts made by CodeMercenary
-
RE: All NFS remotes started to timeout during backup but worked fine a few days ago
-
RE: All NFS remotes started to timeout during backup but worked fine a few days ago
@Forza Seems you are correct about `showmount`. On UNRAID, running v4, `showmount` says there are no clients connected. I previously assumed that meant XO only connected during the backup. When I look at `/proc/fs/nfsd/clients` I see the connections.
On Synology, running v3, `showmount` does show the XO IP connected. Synology is supposed to support v4 and I have the share set to allow v4 but XO had trouble connecting that way. Synology is pretty limited in what options it lets me set for NFS shares.
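For anyone comparing the two views: as far as I understand it, `showmount` asks the server's mount daemon, which NFSv3 clients go through and NFSv4 clients bypass, so on a v4 server the kernel's own client list is the place to look. Below is a rough sketch of reading it, assuming a Linux NFS server recent enough to expose `/proc/fs/nfsd/clients` with one `info` file per client. The function name `list_nfs4_clients` and the exact `address:` line format are assumptions for illustration.

```shell
# Print the address line of every NFSv4 client the kernel NFS server knows.
# The optional directory argument exists only so the function can be pointed
# at a copy of the tree; the default is the path mentioned in the post.
list_nfs4_clients() {
  clients_dir="${1:-/proc/fs/nfsd/clients}"
  for info in "$clients_dir"/*/info; do
    # Skip the unexpanded glob when there are no clients (or no such dir).
    [ -r "$info" ] || continue
    # Assumption: each info file carries an 'address: "ip:port"' line.
    grep '^address:' "$info"
  done
  return 0
}
```

Running `list_nfs4_clients` as root on the NFS server should then print one line per connected v4 client, if the assumptions above hold.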
Synology doesn't have `rpcdebug` available. I'll see if I can figure out how to get more logging info about NFS.
-
RE: One VM backup was stuck, now backups for that VM are failing with "parent VHD is missing"
@olivierlambert Well, last night the backup completed just fine despite me taking no action.
I updated XO to the latest commit when I got in this morning, so hopefully the issues I had back in June don't come back.
-
RE: Possible to use "xo-cli vm.set blockedOperarations=<object>"?
@julien-f Thank you, that's super helpful and even easier than I thought it would be.
-
RE: Import from ESXi 6 double importing vmdk file?
@florent Yeah, it's a lot of data, thankfully my other VMs are not nearly as large. I'm still not sure why it failed when none of the virtual drives are 2TB. The largest ones are configured with a 1.82TB max so even the capacity of the drive is less than the max.
I'm moving ahead with a file level sync attempt to see if that works.
To be clear, this post is at least as much about helping you figure out what's wrong, so other people don't hit the same issue, as it is about making this import work for my VM. With the flood you are getting from VMware refugees, I figure I'm not the only person with large drives to import. In other words, if there's something I can do to help you figure out why it fails then I'm willing to help.
-
RE: Help: Clean shutdown of Host, now no network or VMs are detected
@olivierlambert It was DR. I was testing DR a while ago and after running it once I disabled the backup job so these backups have just been sitting on the server. I don't think I've rebooted that server since running that backup.
Latest posts made by CodeMercenary
-
RE: VDIs attached to control domain can't be forgotten because they are attached to the control domain
@rtjdamen I have become suspicious that my backups might not be as messed up as I thought. Yesterday I noticed that my backups were again in `started` state long after they would normally be completed. Because some of those backups go to the UNRAID server I mentioned, I decided to do some digging. That UNRAID server does have incoming network activity, so I became suspicious that the backups are working, just very slowly. I checked it again this morning and one of the backups completed after 16 hours, the other one completed after 21 hours. In this case I think the issue is that UNRAID is using the 1Gb adapter instead of the 10Gb adapter.
In the future I'm going to be more careful about deciding that a backup is stuck. I'd like to figure out if there's a way to get more insight into what is happening in the backup, like an ongoing percentage complete and a data transfer speed or total. It would be nice not to have to look at the receiving side for traffic and assume that's the backup, plus on some of my backup targets it isn't as easy to see the incoming traffic.
Now I have to dare to reboot the UNRAID server again to see if I can get it to use the right network connection. It must have gotten out of whack when I reset the BIOS and I need to get it back in whack.
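One way to confirm which adapter ended up on which link speed, without guessing from traffic graphs, is to read the negotiated speed from sysfs. This is only a sketch, assuming the receiving box is Linux-based and exposes `/sys/class/net/<iface>/speed` (reported in Mb/s). The function name `link_speeds` is made up for this example.

```shell
# Print "<interface> <speed> Mb/s" for every interface that reports a speed.
# The optional directory argument exists only so the function can be pointed
# at a copy of the tree; the default is the real sysfs location.
link_speeds() {
  net_dir="${1:-/sys/class/net}"
  for dev in "$net_dir"/*; do
    [ -r "$dev/speed" ] || continue
    # Reading speed can fail on interfaces without a link (e.g. loopback);
    # skip those instead of aborting.
    speed="$(cat "$dev/speed" 2>/dev/null)" || continue
    printf '%s %s Mb/s\n' "$(basename "$dev")" "$speed"
  done
  return 0
}
```

A 10Gb adapter that negotiated correctly would be expected to show `10000 Mb/s`, a 1Gb one `1000 Mb/s`; which interface actually carries the backup traffic still has to be checked against the routing and IP configuration.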
-
RE: VDIs attached to control domain can't be forgotten because they are attached to the control domain
@rtjdamen Good to know, thank you. I'm not likely to try to do that without fully understanding what I'm doing. I will just continue to reboot the host if this happens again. It did happen again a week or two ago. My infrastructure isn't so complex that it's impossible to reboot the host, it's just annoying because I have to stay late so nobody is using the servers and I've had servers that failed to boot up after a restart that should have been trivial so I'm always a bit nervous. That was years ago and not when running XCP-ng but it left an emotional scar. I just had the same thing happen with my UNRAID server last Friday, had to clear the BIOS settings to get it to boot again.
-
RE: VDIs attached to control domain can't be forgotten because they are attached to the control domain
@andrewperry Sorry for the delay in responding; I think you posted while I was on vacation. I ended up rebooting the host and have not had the problem return since. Uh oh, I hope I didn't just jinx myself.
-
RE: NVIDIA Tesla M40 for AI work in Ubuntu VM - is it a good idea?
@gskger Great to know, thank you.
I actually already have Ollama installed and running with Open WebUI so I can ask it ahead of time to see what it would be like. I've installed some more code specific models that are better suited to that kind of question.
Running the case fans full blast all the time would be a non-starter so I'm glad you let me know that as I count the costs of attempting this.
-
RE: NVIDIA Tesla M40 for AI work in Ubuntu VM - is it a good idea?
@gskger Thanks for the input. I also have dual 750W power supplies and I'd be totally fine limiting the power of the card. I'm not looking for crazy performance, just better than what I get with CPUs and enabling things that are a lot harder to even get working with CPU only, like stable diffusion.
I will absolutely trade performance for less noise and heat in my small server closet. Ideally I'd want the GPU card to consume nothing and need very minimal cooling if I'm not actively running a task in ollama. I can currently hear the fans spool up in that server every time I give ollama a task to run. I don't mind the extra noise when doing something as long as it doesn't permanently make the server more noisy.
I do have concerns about possibly needing external cooling for it as suggested in the article, I'd love to not need active cooling. If limiting the power consumption means that the existing case airflow is sufficient then I'd be happy with that.
-
NVIDIA Tesla M40 for AI work in Ubuntu VM - is it a good idea?
I built an AI vm a little while ago running ollama on Ubuntu 24.04. I have no GPU in that host so I threw 32GB RAM and 32 cores at it and it works, slowly. I knew that a GPU is really the way to go, especially if I want to be able to do image generation. Then today I saw this article. https://blog.briancmoses.com/2024/09/self-hosting-ai-with-spare-parts.html
A GPU for around $90 could be interesting. Any of you guys have a guess at the difficulty of getting an NVIDIA Tesla M40 (https://www.ebay.com/itm/274978990880) to work in an Ubuntu VM in a PowerEdge 730xd host? I see that the card has power requirements but don't know if the supplies in the server take care of that since this card is 10 years old and the server likely is too, and the card is designed for server use - maybe it wouldn't need tweaking.
Am I insane to be considering this?
-
RE: Does "xo-cli emergencyShutdownHost" immediately return control to a script?
I guess I'll partially answer my own question by pointing out that I can background the commands with & so it should run them all at once anyway. Sorry, still new to bash scripting and my Linux skills are a bit rusty but improving.
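A minimal sketch of that backgrounding idea, assuming the host uuids are already known and passed in as arguments. The `XO_CLI` variable and the function name `shutdown_all_hosts` are invented here so the loop can be dry-run with `XO_CLI=echo`; only the `emergencyShutdownHost host=$uuid` call comes from the thread.

```shell
# Command to invoke; override with XO_CLI=echo for a harmless dry run.
XO_CLI="${XO_CLI:-xo-cli}"

# Start an emergency shutdown for every uuid given, all in parallel.
shutdown_all_hosts() {
  for uuid in "$@"; do
    # '&' puts each call in the background so the loop does not block
    # on a slow host before reaching the next one.
    "$XO_CLI" emergencyShutdownHost host="$uuid" &
  done
  # Wait for all background calls to return before the script exits.
  wait
}
```

Whether the `wait` ever completes on the host that runs the script itself is exactly the open question, so this only guarantees that every request is at least sent without waiting on the previous one.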
-
Does "xo-cli emergencyShutdownHost" immediately return control to a script?
Finishing up my NUT scripting in an XO VM. When NUT sends `lowbatt`, `shutdowncritical`, or `powerdown` I'm going to pull the list of host uuids and then run `xo-cli emergencyShutdownHost host=$uuid` for each of them.
The complication is that one of these hosts holds the very VM that is running NUT. I'm wondering if xo-cli returns control right away or if it waits for any significant time. What I obviously want to avoid is that the first host in the list is the one with XO+NUT on it and the VM gets suspended before telling the other hosts to shut down.
What would add insult to injury is that, when that XO VM resumes after power is restored, it would likely continue its work happily and ask the other hosts to emergency shutdown.
I can certainly alter the script to carve out the current host and shut it down last, that's probably a good idea regardless, but if the emergency shutdown exits really quickly then I might be able to finish in time. I only have 3 hosts.
Granted, I could put a VM on each of the hosts and have them all using the NUT monitor so I'd have redundancy but I'm trying to make this as simple as possible without making it so simple that it fails me in a power outage.
I noticed, for instance, that when using `xo-cli vm.suspend`, it waits until the suspend is done so it can return `true`. I want to make sure the same isn't true of host shutdown.
Obviously I can test this to find out but I'd rather not emergency shutdown one of my hosts during a workday, nor do it from home when I'd have to come back and turn it on. Some evening I can stay late and experiment with it but not this evening or this weekend so I thought I'd ask.
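The "carve out the current host and shut it down last" idea can be sketched independently of how xo-cli behaves. The helper below only reorders a list of uuids so that the host running the XO+NUT VM comes last. The name `order_hosts_self_last` and the `MY_HOST_UUID` and `ALL_HOST_UUIDS` placeholders are hypothetical; how the uuids are obtained is left out on purpose.

```shell
# Print the given host uuids one per line, with the first argument
# (the uuid of the host running this VM) moved to the end.
order_hosts_self_last() {
  self="$1"
  shift
  found=0
  for uuid in "$@"; do
    if [ "$uuid" = "$self" ]; then
      found=1
    else
      printf '%s\n' "$uuid"
    fi
  done
  # Only emit our own host if it was actually in the list.
  if [ "$found" -eq 1 ]; then
    printf '%s\n' "$self"
  fi
}

# Hypothetical usage (placeholders, not real variables from the thread):
# order_hosts_self_last "$MY_HOST_UUID" $ALL_HOST_UUIDS |
#   while read -r uuid; do xo-cli emergencyShutdownHost host="$uuid"; done
```

With this ordering the other hosts have been asked to shut down before the local one is, regardless of whether the call blocks.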
-
Possible to use "xo-cli vm.set blockedOperarations=<object>"?
I'm looking for a way to turn off blocking VM suspend without blocking VM stop. I see from `xo-cli --list-commands` that there is a `vm.set` described as:
`vm.set id=<string> [blockedOperations=<object>]`
I have, so far, been unable to figure out what to pass to blockedOperations to have it change a value. Not sure what it means by <object>.
Passing a json string to it doesn't seem to work. I get an error that it must be an object.
I believe I was able to blow away all blocked operations by passing nothing as that arg but I don't want to remove all of it, just the suspend. I realize that when wanting to block `stop` it makes sense to also block `suspend` because they, in essence, have the same result. I was hoping to extract an existing object from the vm properties, turn off `suspend`, then pass that object back to `vm.set`. Is my quest hopeless?
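For reference, one possible shape of such a call, offered as a sketch rather than a confirmed answer. It assumes that xo-cli parses parameter values prefixed with `json:` as JSON (which would explain the "must be an object" error for a plain string), and that a `null` value clears the block for just that operation. The helper name `blocked_ops_arg` and the `VM_UUID` placeholder are invented for this example.

```shell
# Build the blockedOperations argument for a single operation.
# Assumptions: the "json:" prefix makes xo-cli parse the value as JSON,
# and null removes the block for that one operation only.
blocked_ops_arg() {
  op="$1"
  value="${2:-null}"
  printf 'blockedOperations=json:{"%s":%s}\n' "$op" "$value"
}

# Hypothetical usage; VM_UUID stands in for the real VM id:
# xo-cli vm.set id="$VM_UUID" "$(blocked_ops_arg suspend)"
```

Passing the whole argument inside double quotes keeps the shell from splitting or interpreting the braces and quotes in the JSON.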