Best posts made by CodeMercenary
-
RE: All NFS remotes started to timeout during backup but worked fine a few days ago
@Forza Sorry, you were correct, I just mixed in another new issue. NFS is currently used only for backups; all my SRs are in local storage. It just happened that I now have backups failing not only because of the NFS issue but also because of the VDI issue, though I think the VDI problem is a side effect of the NFS problem interrupting the backup, which left the VDI stuck attached to dom0. I should have made that clearer, or never mentioned the VDI issue at all.
-
RE: All NFS remotes started to timeout during backup but worked fine a few days ago
@Forza Seems you are correct about showmount. On UNRAID, running v4, showmount says there are no clients connected. I previously assumed that meant XO only connected during the backup. When I look at /proc/fs/nfsd/clients, I see the connections.
On Synology, running v3, showmount does show the XO IP connected. Synology is supposed to support v4, and I have the share set to allow v4, but XO had trouble connecting that way. Synology is pretty limited in what options it lets me set for NFS shares.
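Since v4 clients don't show up in showmount, one server-side way to list them is to read the nfsd client info files directly. This is only a sketch: the /proc/fs/nfsd/clients path is from the post, but the `address: "ip:port"` line format parsed below is an assumption based on what recent Linux kernels expose, so check a real info file before relying on it.

```shell
#!/bin/sh
# List NFSv4 client addresses from the server's nfsd state (run as root on
# the NFS server). Assumes each info file contains a line like:
#   address: "192.168.1.10:683"

nfs4_client_addrs() {
    # keep only the IP part of the address line, dropping quotes and :port
    sed -n 's/^address: "\(.*\):[0-9]*"$/\1/p' "$@"
}

# Usage on the server:
#   nfs4_client_addrs /proc/fs/nfsd/clients/*/info
```

On UNRAID this should show the XO connection even while showmount reports no clients, since showmount only knows about the v3 mount protocol.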
Synology doesn't have rpcdebug available. I'll see if I can figure out how to get more logging info about NFS.
-
RE: One VM backup was stuck, now backups for that VM are failing with "parent VHD is missing"
@olivierlambert Well, last night the backup completed just fine despite me taking no action.
I updated XO to the latest commit when I got in this morning, so hopefully the issue I had back in June doesn't come back.
-
RE: Possible to use "xo-cli vm.set blockedOperations=<object>"?
@julien-f Thank you, that's super helpful and even easier than I thought it would be.
-
RE: Import from ESXi 6 double importing vmdk file?
@florent Yeah, it's a lot of data, thankfully my other VMs are not nearly as large. I'm still not sure why it failed when none of the virtual drives are 2TB. The largest ones are configured with a 1.82TB max so even the capacity of the drive is less than the max.
I'm moving ahead with a file level sync attempt to see if that works.
To be clear, this post was at least as much about helping you figure out what's wrong, so other people don't have the same issue, as it was about making this import work for my VM. With the flood you are getting from VMware refugees, I figure I'm not the only person with large drives to import. In other words, if there's something I can do to help you figure out why it fails, I'm willing to help.
-
RE: NVIDIA Tesla M40 for AI work in Ubuntu VM - is it a good idea?
@gskger Yeah, looks like it would be too tight. Ouch, those T4s are an order of magnitude more expensive. I'm definitely not interested in going that route.
-
RE: Help: Clean shutdown of Host, now no network or VMs are detected
@olivierlambert It was DR. I was testing DR a while ago and after running it once I disabled the backup job so these backups have just been sitting on the server. I don't think I've rebooted that server since running that backup.
Latest posts made by CodeMercenary
-
RE: Why does emergencyShutdown take a lot longer than shutting down the host from the console?
@manilx For host.stop, what does the bypassBackupCheck parameter do? I think bypassEvacuate is pretty clear; that must mean "don't try to migrate running VMs to another server before stopping." I feel like I would want to bypass whatever the backup check is, because I've lost power and the server needs to be shut down.
-
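For what it's worth, xo-cli passes named parameters as key=value pairs (the same calling convention as the vm.set example elsewhere on this page), so a forced stop could be sketched as below. The UUID is a placeholder, and my reading of bypassBackupCheck, that it skips the check for a backup currently using the host, is an assumption rather than confirmed behavior.

```shell
#!/bin/sh
# Dry-run sketch of a forced host stop via xo-cli. The echo only prints the
# command; remove it to actually issue the call. bypassBackupCheck semantics
# are an assumption here; verify on a test host first.
emergency_host_stop() {
    echo "xo-cli host.stop id=$1 bypassEvacuate=true bypassBackupCheck=true"
}

emergency_host_stop "hypothetical-host-uuid"
```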
RE: Why does emergencyShutdown take a lot longer than shutting down the host from the console?
@manilx That is great to hear, thank you.
I will definitely share it with the community. Still trying to iron out some of the wrinkles. Testing requires me to let all my servers get shut down so I'm limited in how frequently I can test the solution.
This weekend was my third pull-the-plug test and it was the closest to totally working. In fact, I think this test did totally work, but due to network changes I had to shut down xcp, because I've found it gets really mad if you change anything about its network while it's running. It was when using the physical console to shut it down that I was shocked at how fast it shut down. That's why I posted to ask about the difference.
I think changing my script to just use host.stop will resolve my last concerns about the script. Having a faster shutdown for xcp might also allow me to go back to my original design, where I let the more important VMs live a bit longer. I originally staged when VMs got shut down so the important ones could survive a 5 or 10 minute power outage. It turns out that with xcp taking 8 minutes to shut down after the VMs were down, I had to change my script to start closing everything as soon as it was clear that this wasn't just a small power blip.
When I post it though, everyone will need to recognize that I'm no bash coder. I write code, but this is the only bash stuff I've done, so it could be rough.
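The staged design described above could be sketched roughly like this; the VM UUIDs, the grace period, and the use of XO's vm.stop method for a clean guest shutdown are all placeholders or assumptions to adapt.

```shell
#!/bin/sh
# Rough sketch of a staged shutdown (UUIDs and timings are placeholders).
# Non-essential VMs stop right away; important ones get a grace period in
# case the outage is only a short blip.
NONESSENTIAL_VMS="uuid-a uuid-b"
IMPORTANT_VMS="uuid-c"
GRACE_SECONDS=300   # ride out a ~5 minute outage before stopping important VMs

stop_vms() {
    for vm in $1; do
        # vm.stop asks the guest for a clean shutdown via xo-cli
        xo-cli vm.stop id="$vm" || echo "failed to stop $vm" >&2
    done
}

# Guarded so sourcing this file for testing doesn't shut anything down
if [ "${1:-}" = "onbatt" ]; then
    stop_vms "$NONESSENTIAL_VMS"
    sleep "$GRACE_SECONDS"   # better: poll the UPS here and abort if power returns
    stop_vms "$IMPORTANT_VMS"
fi
```

In a real NUT script the blind sleep should be replaced by polling UPS status, so the important VMs are only stopped if the outage actually persists.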
-
RE: Why does emergencyShutdown take a lot longer than shutting down the host from the console?
@olivierlambert I understood what emergencyShutdownHost does; I was surprised that it seems to take a long time even if all the VMs were stopped before executing it. There should be nothing to suspend, but it still takes about 8 minutes for the server to finish the shutdown.
I will start using host.stop instead of host.emergencyShutdownHost for the hosts that have no running VMs. Realistically, when having NUT shut it down, I'd rather the host just issue a clean shutdown command to any running VMs. I'm not sure what host.stop will do if there is a running VM. If it would politely ask it to stop, that would be perfect; if it yanks the virtual power cord, I wouldn't like that.
The ideal is that no host has a running VM by the time I want to shut it down, but since NUT runs in a VM at the moment, one host will have that one running VM. I'll be in a bit of a race condition if I issue host.stop immediately followed by shutdown now. It's virtual murder-suicide, but the murderer's life depends on the murdered. Can Linux shut down before xcp kills it? Might be an interesting test.
-
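One hedged way around that race: instead of having the NUT VM call host.stop (a call that dies with the VM), arm a delayed shutdown on dom0 itself so the timer survives the VM going away, then shut the VM down cleanly. Everything in this sketch is an assumption to verify: key-based root SSH to dom0, the 120-second delay, $HOSTNAME matching the host's name-label, and the xe host-disable / xe host-shutdown pair.

```shell
#!/bin/sh
# Sketch: avoid the murder-suicide race by arming a delayed shutdown in dom0,
# which keeps running after this VM is gone. All details are assumptions:
# root SSH access to dom0, the 120 s delay, and $HOSTNAME matching the
# host's name-label for the xe selectors.
HOST_IP="192.0.2.10"   # hypothetical dom0 address

last_stand() {
    # arm the timer in dom0; nohup + & so it outlives the ssh session
    ssh root@"$HOST_IP" \
        "nohup sh -c 'sleep 120; xe host-disable host=\$HOSTNAME; xe host-shutdown host=\$HOSTNAME' >/dev/null 2>&1 &"
    # now take this VM down cleanly inside that 120 s window
    shutdown now
}

# Called from the NUT event handler, e.g.: last_stand
```

Whether the VM finishes powering off inside the window depends on how long its own shutdown takes, so the delay needs tuning against a real pull-the-plug test.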
RE: Performing automated shutdown during a power failure using a USB-UPS with NUT - XCP-ng 8.2
@nomad What do you mean by the grand reconfiguration? Is there a new version that changes how things work?
-
RE: Why does emergencyShutdown take a lot longer than shutting down the host from the console?
@olivierlambert No, I don't use any VM as an SR in XO. Other than the local storage SRs, DVD, removable media and such, the only SRs are hosted on a Synology array or on my UNRAID server.
-
Why does emergencyShutdown take a lot longer than shutting down the host from the console?
This weekend I was working on my servers and I needed to shut them down. I closed all the VMs, then from the physical console I used the UI to tell the host to shut down. It shut down in about 30 to 60 seconds and powered off.
In testing my NUT scripts, they shut down all the VMs then issue the host.emergencyShutdown command and it takes the hosts about 8 minutes to shut down.
Any reason for that difference? Is there a command I can issue through the xo-cli that would cause the faster shutdown?
Another advantage of the console shutdown is that it actually turns the power off, while emergencyShutdown shuts everything down but doesn't power down the hardware; at least it never has for me.
-
RE: XO instance UI unreachable during backups
@Danp Ah, thank you for that.
I need to restructure things a bit but I was already thinking I would do that. The issue is that this VM also runs NUT so it's the last VM running before shutting down the servers. I reduced the memory because it takes a LONG time to suspend a VM with 16GB RAM but doesn't take long to shut one down. Between the 8 to 10 minutes it takes for XCP-ng to shut down and the time it takes to suspend a VM with 16GB RAM, I don't think my batteries will last that long.
I'll have to move NUT into a leaner VM that doesn't handle backups. That's something I was thinking I would do anyway because if there was a power outage during a backup I don't think my NUT script would be able to do what it needs to do. Based on the cron job I run to make sure my xo-cli registration is good, the xo-cli stuff won't run when the VM is hammered like this.
Thanks for helping me understand why this happens and how to fix it.
-
XO instance UI unreachable during backups
I've noticed recently that when my backups are running, they totally slam the CPUs and the web UI is inaccessible. What can I do to improve this? Should I give the instance more than 4 CPU cores? I used to give the instance 16GB RAM but it never went higher than 2GB so I reduced it. Could that cause this?
I can still SSH into the instance but I have little way to know how much of the backup is complete or which backups are finished. I've seen this multiple times a week for the last month or so.
This goes on for hours, and without the web UI I can't even gauge how much time might be left. I came in this weekend because I'm trying to improve my network setup, hopefully to help with things like this, but I can't shut down the servers and tear the network apart when XO is at some unknown point in the backup. I'm going to try to come back tomorrow and see if it's finished; well, I'll be smarter than that and check the status from home first.
Currently running XO from source commit 1bc0f (two commits behind current due to taking a couple days off last week).
-
RE: NVIDIA Tesla M40 for AI work in Ubuntu VM - is it a good idea?
@gskger said in NVIDIA Tesla M40 for AI work in Ubuntu VM - is it a good idea?:
Nvidia RTX A2000 12GB
I am curious how your testing goes. That sounds like a great card. Not as expensive as the T4 so might be more reasonable for me to consider.
-
RE: Moving management network to another adapter and backups now fail
I'm bummed to hear that it isn't tolerant of changes. When I set up xcp originally, I gave it the first 10Gb port as the management interface, and that's on the main LAN, no VLAN. Now I want to move management off the main LAN and onto a dedicated VLAN on the second 10Gb port. I've been nervous to make that change because I don't want to break something; it seems that concern was well founded. I was actually planning to post today to ask how best to move the management interface onto a VLAN on a separate port.
Feels like I just have to live with everything on the same port, and I won't be able to isolate the management or backup traffic like I want to. Maybe I could move the backups onto a separate VLAN, or does backup traffic go through the management interface? I think I need to dive back into the docs.