VDI_COPY_FAILED: Failed to find parent
-
@olivierlambert I have not been able to create a successful backup yet, so backups are a moot point right now. I already have a support ticket open (#7732723).
All hosts are shut down now so I can run the memory tests. Once they complete on all three hosts, I'll boot them up, do what you've suggested, and report back.
Now, for my own edification, what is the significance of this file OLD_26c304ef-36d2-42ca-9930-9aa76ba933e8.vhd, and why do you think moving it will have a positive effect?
-
Because it seems to be a leftover from a failed coalesce; at least, it seems you have a problem with a chain. Maybe it's this chain, and moving the file (not removing it, just in case) will make it right. The best solution would be to check which chain contains the UUID reported in the error, remove that broken chain, and rescan the SR.
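To make the "which chain contains this UUID" check concrete, here is a minimal Python sketch. It assumes you have already collected a child-to-parent mapping for the VHDs on the SR by some other means; the `parents` dict, the function names, and the shortened placeholder UUIDs are all hypothetical and not part of XCP-ng or XO.

```python
def chain_of(uuid, parents):
    """Walk from a VDI up through its parents and return the UUIDs visited.

    `parents` maps child UUID -> parent UUID (None for a base disk).
    A parent that is not itself a key in the map (a missing VHD) is
    included as the last element; the walk also stops if a loop is found.
    """
    chain, seen = [], set()
    current = uuid
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = parents.get(current)
    return chain


def chains_containing(target, parents):
    """Return the chain of every leaf VDI whose chain includes `target`."""
    all_parents = {p for p in parents.values() if p is not None}
    leaves = [u for u in parents if u not in all_parents]
    return [c for c in (chain_of(leaf, parents) for leaf in leaves) if target in c]


# Hypothetical placeholder UUIDs, shortened for readability.
parents = {
    "leaf-a": "mid-a",
    "mid-a": "base-a",
    "base-a": None,
    "leaf-b": "gone-b",  # "gone-b" has no entry of its own: its VHD is missing
}
print(chains_containing("gone-b", parents))  # [['leaf-b', 'gone-b']]
```

With real data, the UUID from the error message would go in place of "gone-b", and the returned chains are the ones to look at before removing anything.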
-
@olivierlambert Roger that. The memtest is still running, and I've left the office to go get my kids from school. I will be returning to the lab in a couple of hours, and if the memtest is done on host1, I'll bring up the pool and move OLD_26c304ef-36d2-42ca-9930-9aa76ba933e8.vhd and report back.
-
@olivierlambert As requested, I moved the VDI named OLD_26c304ef-36d2-42ca-9930-9aa76ba933e8.vhd out of the source SR 30d873a3-3523-1952-e923-bcaa02b9255a and then rescanned the SR. I then retried the copy, both from the XOA web UI and from the master host (VMH01) using xe vdi-copy. The result was the same failure: the parent couldn't be found.
Is there a place I can find out what the UUID is of the parent VDI that seems to be missing? Maybe it's still on the NFS server but has somehow gotten corrupted?
Also, I'd like to point out that I have an unresolved issue with that particular SR - where some of the VDIs keep disappearing and reappearing (Ticket #7729741). Perhaps it could be related?
-
So now you have to figure out the chain to find which VDI has a parent pointing to this missing VDI. Hard to tell if it's related, but you clearly have an issue on this storage (why is not easy to answer).
@Danp I think I remember a tool we could use to display the chain?
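As a rough illustration of "find which VDI has a parent pointing to this missing VDI", the check boils down to looking for parent UUIDs that no VHD on the SR actually has. The sketch below assumes a child-to-parent mapping has already been gathered; `find_orphans`, the `parents` dict, and the `vdi-N` names are hypothetical placeholders, not real tooling.

```python
def find_orphans(parents):
    """Return sorted (child, missing_parent) pairs.

    `parents` maps each VHD's UUID to its parent UUID, or None for a base disk.
    A child is reported when its parent UUID is not itself a key in the map,
    i.e. no VHD with that UUID was found on the SR.
    """
    return sorted(
        (child, parent)
        for child, parent in parents.items()
        if parent is not None and parent not in parents
    )


# Hypothetical placeholder names standing in for real VDI UUIDs.
parents = {
    "vdi-1": "vdi-2",
    "vdi-2": None,
    "vdi-3": "vdi-9",  # "vdi-9" does not exist in the map
}
for child, missing in find_orphans(parents):
    print(f"{child} points to missing parent {missing}")
```

Any pair it reports names both the broken child and the UUID of the parent that could not be found.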
-
@olivierlambert Yes, the tool is called xapi-explore-sr. Installation and usage details can be found at https://github.com/vatesfr/xen-orchestra/tree/master/packages/xapi-explore-sr.
-
So now you have to figure out the chain to find which VDI has a parent pointing to this missing VDI.
@olivierlambert How do I figure out the chain of a VDI? I've looked over the entire XOA UI to see if there's a way of seeing who the parent of any VDI is, but I don't see anything obvious. Must this be done via CLI? If so, how?
Also, I don't know enough about XCP-ng or XO to confidently agree that it's the storage, but I can certainly see why you think so. That said, I have another SR - targeting the same NFS server - where I'm not having these issues.
I want to take advantage of this situation to help you figure out if there is a problem with the underlying code. I could very easily blow away the SR I'm having issues with, but that will rob us of this opportunity.
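For the CLI route, one possible approach is to ask vhd-util for each VHD's parent directly on a host. The following is only a sketch under stated assumptions: that `vhd-util query -n <file> -p` prints either the parent's path or a line ending in "has no parent", and that the VHD file names are the VDI UUIDs. The helper names `query_parent` and `parent_map` are hypothetical.

```python
import subprocess
from pathlib import Path


def query_parent(vhd_path, run=subprocess.run):
    """Return the parent UUID of a VHD file, or None if it has no parent.

    Wraps `vhd-util query -n <file> -p`. ASSUMPTION: the command prints
    either the parent's path or a line ending in "has no parent".
    `run` is injectable so the function can be exercised without a host.
    """
    result = run(
        ["vhd-util", "query", "-n", str(vhd_path), "-p"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        raise RuntimeError(f"vhd-util failed for {vhd_path}: {result.stderr.strip()}")
    output = result.stdout.strip()
    if not output or output.endswith("has no parent"):
        return None
    # ASSUMPTION: the parent's file name (without .vhd) is its VDI UUID.
    return Path(output).stem


def parent_map(sr_dir, run=subprocess.run):
    """Map each VHD's UUID (its file name stem) to its parent UUID."""
    return {p.stem: query_parent(p, run) for p in sorted(Path(sr_dir).glob("*.vhd"))}
```

On a host, this would be called with the directory where the NFS SR is mounted (assumed here to be somewhere like /run/sr-mount/ followed by the SR UUID). Any parent UUID in the result that has no entry of its own is a candidate for the missing VDI.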
-
@Danp and @olivierlambert
Haha, looks like we were both typing at the same time. I'll check out the link provided and educate myself on how to use the tool, thanks again.
-
One problem: I'm in an air-gapped environment. I cannot pull packages from the internet directly. How can I get this utility installed?
Also, I figure that this utility needs to be run on XOA, since I don't see NPM installed on the hosts?
-
In theory, it could be installed on any machine. Is there a way to install it on a laptop with Internet access and then plug that laptop into your air-gapped network?
-
@olivierlambert Technically, yes, I suppose I could do that. However, that would not mimic what a SysAdmin would have to do, should they run into this trouble in production.
Is there no other alternative to get this utility, perhaps as a .deb package that I can manually install on XOA?
Lastly, I performed the following testing just now:
- Created a new VDI on a known working VM and put the VDI on the problematic SR (let's call it NFS-DS1).
- I then attempted to migrate the newly created VDI to the known good SR (NFS-DS2); it failed with error VM_LACKS_FEATURE.
- So I created another VDI, attached to the same VM, but this time I put it on SR NFS-DS2 (the good one) and then attempted to migrate it to SR NFS-DS1; IT WORKED.
- So I retried migrating it back to NFS-DS2, and it worked as well.
- Finally, I went back to the original new VDI I started with, and attempted to migrate it from SR NFS-DS1 to NFS-DS2; it worked!
So, it seems the issue remains with the VDIs that seem to have lost their parent.
P.S. I haven't noticed the VDIs disappearing from SR NFS-DS1 for over an hour now.
-