Cancelling frozen tasks, and a frozen NFS SR scan task after an SR outage
-
Hello everyone,
I am testing (beating on) a fresh install of community XO and XCP-ng.
After the last restore bug, which left me with many stuck tasks, I decided to keep at it and give it more testing. On top of trying the GUI "cancel task", I also tried "xe task-cancel force=true uuid=taskuuid" over ssh,
and both failed. Pulling power on the hosts did the trick lol
Then, when everything was back online, I pulled the network plug on the NFS SR for a while and plugged it back in, which froze the XO VM running on it and also left a frozen Async.SR.scan task for that SR.
I tried to "force shutdown" that XO VM, which led to another frozen task (Async.VM.hard_shutdown),
and lastly I tried "xe vm-reset-powerstate --force --multiple",
which worked after also being frozen for about 4-6 minutes. I am very new to the XCP-ng and XO projects, so my question is: did I miss anything? So far the only solutions I have found to work are not very "safe", and I wouldn't use them on production VMs and/or hosts.
Has anyone experienced "frozen", unrebootable VMs during an NFS SR outage?
What troubleshooting steps do you recommend I try when this situation is recreated? Thanks a lot!
-
You might want to research switching your NFS mount from soft to hard.
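Before changing anything, it may help to confirm which mode the SR is currently mounted with. This is only a minimal sketch: it assumes you have a shell on the host (dom0) and that the SR appears in /proc/mounts as an nfs or nfs4 filesystem. The function name nfs_mount_mode is made up for illustration.

```shell
# Read /proc/mounts-style lines (device mountpoint fstype options dump pass)
# from stdin and print each NFS mount point followed by "soft" or "hard".
nfs_mount_mode() {
  awk '$3 ~ /^nfs4?$/ {
    mode = "hard"   # the Linux NFS client retries forever unless "soft" is given
    n = split($4, opts, ",")
    for (i = 1; i <= n; i++)
      if (opts[i] == "soft" || opts[i] == "hard") mode = opts[i]
    print $2, mode
  }'
}

# Usage on the host:
#   nfs_mount_mode < /proc/mounts
```

Roughly speaking, a soft mount gives up and returns an I/O error after its retries run out, while a hard mount keeps retrying until the server comes back, so the two behave quite differently during an outage like the one described.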
-
@Danp Thanks for the tip bud!
I will search that topic and experiment some more.