@Forza Sorry, you were correct, I just mixed in another new issue. NFS is currently used only for backups. All my SRs are in local storage. It just happened that my backups are now failing not only because of the NFS issue but also because of the VDI issue. I think that's a side effect of the NFS problem: the backup got interrupted, so now the VDI is stuck attached to dom0. I should have made that clearer or never mentioned the VDI issue at all.
Best posts made by CodeMercenary
-
RE: All NFS remotes started to timeout during backup but worked fine a few days ago
-
RE: All NFS remotes started to timeout during backup but worked fine a few days ago
@Forza Seems you are correct about showmount. On UNRAID running v4, showmount says there are no clients connected. I previously assumed that meant XO only connected during the backup. When I look at /proc/fs/nfsd/clients I see the connections.
On Synology, running v3, showmount does show the XO IP connected. Synology is supposed to support v4 and I have the share set to allow v4 but XO had trouble connecting that way. Synology is pretty limited in what options it lets me set for NFS shares.
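For anyone comparing the two views: showmount asks the server's mount daemon for its client list, and NFSv4 clients do not register with that daemon, which is likely why they only show up under /proc/fs/nfsd/clients. Below is a minimal sketch for listing them on a Linux NFS server. It assumes a kernel that exposes one numbered directory per client, each holding an info file with an address: line. list_nfsd_clients is a made-up helper name, not an existing tool.

```shell
# Sketch: print the address of every NFSv4 client the kernel NFS server
# currently tracks. The directory argument is optional and exists mainly
# so the function can be pointed at a test directory.
list_nfsd_clients() {
  local dir="${1:-/proc/fs/nfsd/clients}"
  local info
  for info in "$dir"/*/info; do
    # Skip the unexpanded glob (no clients) and unreadable entries.
    [ -r "$info" ] || continue
    # Assumed info format: a line such as  address: "192.0.2.10:812"
    sed -n 's/^address: *//p' "$info"
  done
}
```

Calling list_nfsd_clients with no argument on the NFS server (probably as root) should print one address per connected v4 client, and nothing when there are none.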
Synology doesn't have rpcdebug available. I'll see if I can figure out how to get more logging info about NFS.
-
RE: One VM backup was stuck, now backups for that VM are failing with "parent VHD is missing"
@olivierlambert Well, last night the backup completed just fine despite me taking no action.
I updated XO to the latest commit when I got in this morning, so hopefully the issue I had back in June doesn't come back.
-
RE: Possible to use "xo-cli vm.set blockedOperations=<object>"?
@julien-f Thank you, that's super helpful and even easier than I thought it would be.
-
RE: Import from ESXi 6 double importing vmdk file?
@florent Yeah, it's a lot of data, thankfully my other VMs are not nearly as large. I'm still not sure why it failed when none of the virtual drives are 2TB. The largest ones are configured with a 1.82TB max so even the capacity of the drive is less than the max.
I'm moving ahead with a file level sync attempt to see if that works.
To be clear, this post is as much, or more, about helping you figure out what's wrong so other people don't hit the same issue as it is about making this import work for my VM. With the flood you are getting from VMware refugees, I figure I'm not the only person with large drives to import. In other words, if there's something I can do to help you figure out why it fails then I'm willing to help.
-
RE: Help: Clean shutdown of Host, now no network or VMs are detected
@olivierlambert It was DR. I was testing DR a while ago and after running it once I disabled the backup job so these backups have just been sitting on the server. I don't think I've rebooted that server since running that backup.
Latest posts made by CodeMercenary
-
RE: VDIs attached to control domain can't be forgotten because they are attached to the control domain
@andrewperry Sorry for the delay in responding; I think you posted while I was on vacation. I ended up rebooting the host and have not had the problem return since. Uh oh, I hope I didn't just jinx myself.
-
RE: NVIDIA Tesla M40 for AI work in Ubuntu VM - is it a good idea?
@gskger Great to know, thank you.
I actually already have Ollama installed and running with Open WebUI so I can ask it ahead of time to see what it would be like. I've installed some more code specific models that are better suited to that kind of question.
Running the case fans full blast all the time would be a non-starter so I'm glad you let me know that as I count the costs of attempting this.
-
RE: NVIDIA Tesla M40 for AI work in Ubuntu VM - is it a good idea?
@gskger Thanks for the input. I also have dual 750W power supplies and I'd be totally fine limiting the power of the card. I'm not looking for crazy performance, just better than what I get with CPUs and enabling things that are a lot harder to even get working with CPU only, like stable diffusion.
I will absolutely trade performance for less noise and heat in my small server closet. Ideally I'd want the GPU card to consume nothing and need very minimal cooling if I'm not actively running a task in ollama. I can currently hear the fans spool up in that server every time I give ollama a task to run. I don't mind the extra noise when doing something as long as it doesn't permanently make the server more noisy.
I do have concerns about possibly needing external cooling for it, as suggested in the article. I'd love to not need active cooling. If limiting the power consumption means that the existing case airflow is sufficient then I'd be happy with that.
-
NVIDIA Tesla M40 for AI work in Ubuntu VM - is it a good idea?
I built an AI VM a little while ago running ollama on Ubuntu 24.04. I have no GPU in that host so I threw 32GB RAM and 32 cores at it and it works, slowly. I knew that a GPU is really the way to go, especially if I want to be able to do image generation. Then today I saw this article: https://blog.briancmoses.com/2024/09/self-hosting-ai-with-spare-parts.html
A GPU for around $90 could be interesting. Any of you guys have a guess at the difficulty of getting an NVIDIA Tesla M40 (https://www.ebay.com/itm/274978990880) to work in an Ubuntu VM in a PowerEdge 730xd host? I see that the card has power requirements but don't know if the supplies in the server take care of that since this card is 10 years old and the server likely is too, and the card is designed for server use - maybe it wouldn't need tweaking.
Am I insane to be considering this?
-
RE: Possible to use "xo-cli vm.set blockedOperations=<object>"?
@julien-f Thank you, that's super helpful and even easier than I thought it would be.
-
RE: Does "xo-cli emergencyShutdownHost" immediately return control to a script?
I guess I'll partially answer my own question by pointing out that I can background the commands with & so it should run them all at once anyway. Sorry, still new to bash scripting and my Linux skills are a bit rusty but improving.
-
Does "xo-cli emergencyShutdownHost" immediately return control to a script?
Finishing up my NUT scripting in an XO VM. When NUT sends lowbatt, shutdowncritical, or powerdown I'm going to pull the list of host uuids and then run xo-cli emergencyShutdownHost host=$uuid for each of them.
The complication is that one of these hosts holds the very VM that is running NUT. I'm wondering if xo-cli returns control right away or if it waits for any significant time. What I obviously want to avoid is that the first host in the list is the one with XO+NUT on it and the VM gets suspended before telling the other hosts to shut down.
What would add insult to injury is that likely when that XO VM resumes after power is restored, it would happily continue its work and ask the other hosts to emergency shutdown.
I can certainly alter the script to carve out the current host and shut it down last, that's probably a good idea regardless, but if the emergency shutdown exits really quickly then I might be able to finish in time. I only have 3 hosts.
Granted, I could put a VM on each of the hosts and have them all using the NUT monitor so I'd have redundancy but I'm trying to make this as simple as possible without making it so simple that it fails me in a power outage.
I noticed, for instance, that when using xo-cli vm.suspend, it waits until the suspend is done so it can return true. I want to make sure the same isn't true of host shutdown.
Obviously I can test this to find out but I'd rather not emergency shutdown one of my hosts during a workday, nor do it from home when I'd have to come back and turn it on. Some evening I can stay late and experiment with it but not this evening or this weekend so I thought I'd ask.
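The "shut down the current host last" idea could be sketched roughly as below. This is only an illustration under assumptions: the host uuids are passed in as arguments (how they get pulled is left out), shutdown_hosts and XO_CLI are made-up names, and XO_CLI defaults to a dry run that just echoes the command so nothing is actually shut down. The xo-cli call itself is the one quoted in the post.

```shell
# Dry run by default: the command is only echoed. Set XO_CLI=xo-cli
# to run it for real. (XO_CLI is a hypothetical variable name.)
XO_CLI="${XO_CLI:-echo xo-cli}"

# shutdown_hosts SELF_UUID UUID...
# Sends the shutdown to every other host in the background, waits for
# those commands to return, then handles the host running this VM last.
shutdown_hosts() {
  local self="$1"
  local uuid
  shift
  for uuid in "$@"; do
    if [ "$uuid" != "$self" ]; then
      # Backgrounded with & so a slow call does not hold up the rest.
      $XO_CLI emergencyShutdownHost host="$uuid" &
    fi
  done
  # Blocks only until each xo-cli call has returned.
  wait
  $XO_CLI emergencyShutdownHost host="$self"
}
```

For example, shutdown_hosts "$self_uuid" "$uuid1" "$uuid2" "$uuid3" would leave the host named by self_uuid for the end wherever it appears in the list. Whether wait returns quickly still depends on how long xo-cli itself takes to return, which is the open question here.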
-
Possible to use "xo-cli vm.set blockedOperations=<object>"?
I'm looking for a way to turn off blocking VM suspend without blocking VM stop. I see from xo-cli --list-commands that there is a vm.set described as:
vm.set id=<string> [blockedOperations=<object>]
I have, so far, been unable to figure out what to pass to blockedOperations to have it change a value. Not sure what it means by <object>.
Passing a json string to it doesn't seem to work. I get an error that it must be an object.
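One possible way to pass an object, offered as a guess rather than a confirmed answer: xo-cli accepts a json: prefix on a parameter value for non-string values, so the object may need to be written as blockedOperations=json:{...} rather than as a bare JSON string. The sketch below only builds and prints such a command. It assumes that a null value for an operation unblocks it, which is not confirmed here. unblock_arg and VM_UUID are made-up names.

```shell
# Build the blockedOperations argument that would clear one operation.
# Assumptions: the json: prefix applies here, and null means "unblock".
unblock_arg() {
  printf 'blockedOperations=json:{"%s":null}\n' "$1"
}

# Dry run: print the command instead of running it.
# VM_UUID is a placeholder for the real VM id.
VM_UUID="${VM_UUID:-<vm-uuid>}"
echo xo-cli vm.set id="$VM_UUID" "$(unblock_arg suspend)"
```

When typing the command by hand, the JSON part would need single quotes around it so the shell leaves the double quotes alone.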
I believe I was able to blow away all blocked operations by passing nothing as that arg but I don't want to remove all of it, just the suspend. I realize that when wanting to block stop it makes sense to also block suspend because they, in essence, have the same result. I was hoping to extract an existing object from the vm properties, turn off suspend, then pass that object back to vm.set. Is my quest hopeless?
-
RE: Switching to XCP-NG, want to hear your problems
Interesting to know that v3 seems to be more reliable than v4. I had repeated problems with using NFS for a backup remote and those problems only went away when I changed the remotes to use SMB. I know NFS would be better to use but a backup that happens through an inferior protocol is way better than one that fails using a better protocol.
Maybe I should give NFS another chance but force it to use v3.
In my case, I'd get backups working on NFS and then several days later a backup would fail. Then backups would fail every day until I intervened, usually by rebooting the XO VM. Sometimes I'd then have to do cleanup, like releasing a VDI or something. Then it might or might not start working again, but if it did start working I'd have another failure a few days later. It's been 2.5 weeks since I switched it to SMB and I've had no failures. That's definitely the longest I've gone without a failure from a delta backup to a networked drive.
Note, I also have backups going to a local drive mounted in XO so with all those remote failures I always had a clean backup somewhere. This was in the process of trying to decide if I could trust sending delta backups to a network remote rather than using full backups to a local remote. My initial feelings were that the delta backups didn't work reliably but now I believe the issue was with NFS, not with deltas specifically.
-
RE: All NFS remotes started to timeout during backup but worked fine a few days ago
A bit after this I started having trouble again. I scrubbed my delta backup sets, switched to using SMB to access the same share and so far I'm a week in with no trouble at all. Remains to be seen if the strange failures will crop up again but so far I think this is the longest I've gone with the delta backups not having some random failure. Usually, it was a backup getting stuck in Started status for a day or two, then switching to Interrupted because I'd reboot the XO VM to unstick it, then one of the VM backups would fail because a VDI was attached to DOM0. I know theoretically NFS is better than SMB but for now the one that breaks least often is the better option. Of course, maybe my issue had nothing to do with NFS but for now it's felt like SMB is more reliable.