XOA Create VM and Delete VM Struggling..... Tasks Getting Stuck.....?
-
Bit baffled by this one.....
I've spun up a lot of VMs historically with no problems at all.
Yet when I tried recently on XOA from Sources (installed on a VM on the bare-metal host), it failed on 3 attempted VM creations, and their subsequent deletions have also failed to kick in.
Initially thought this could be a network issue as backups were taking place at the same time, but even after those finished (well, 99% finished according to the Tasks), all of these basic tasks were just getting stuck partway through.
VM.start: 54% (Stuck at....)
Async.VM.hard_shutdown (on HOST) 33% (Stuck at....)
VBD.unplug (on HOST) 0% (Stuck at.....)
Async.VM.destroy (on HOST) 0% (Stuck at.....)
As I mentioned, a handful of backups are stuck at 99%, so not sure if that is somehow related and/or blocking things - but can't see how it would be, as the backups are only for currently running VMs.
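For anyone following along, the same task list can be pulled straight from the host over SSH rather than from the XOA Tasks view. A rough sketch, assuming the stock `xe` CLI on the XCP-ng host; the function name `list_tasks` is made up for illustration, and the `params=` selection is just my guess at which fields are useful here:

```shell
# Hypothetical helper: print every task XAPI currently knows about.
# Assumes it runs on the XCP-ng host itself, where the xe CLI lives.
list_tasks() {
  if ! command -v xe >/dev/null 2>&1; then
    echo "xe not found - run this on the XCP-ng host" >&2
    return 127
  fi
  # Ask only for the fields that matter when something looks stuck.
  xe task-list params=uuid,name-label,status,progress
}
```

Running `list_tasks` a couple of minutes apart should show whether `progress` is moving at all. `xe task-cancel uuid=<uuid>` exists for tasks that can be cancelled, though I wouldn't assume a task blocked on storage will respond to it.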
It's all a bit odd.....
And while odd is normal when working in tech, I'm getting zero information from XOA about what the issue is, and ultimately how to solve it. A quick Google just turns up "turn the Host off and back on again", which isn't really an option in the real world.
Any ideas?
Regards,
Michael
-
XO from sources means you are probably on XO6. Have you tried from the XO5 link, or tried from XO Lite?
Is the storage where you are trying to put it properly connected and working?
I haven't had to create a VM since version 6 came out, but I thought there was a thread mentioning some difficulty making a new VM from xo6.
-
@Greg_E Looks like I'm still on XOA 5 as I can see the "Try XOA 6" in the side menu.
Just tried to create a VM via XCP-ng Center, getting the same issues.
Can't create or Delete VMs.
I've SSH'd into the HostOS, and I can successfully ping the Storage Repository.
Then just tried removing the existing Storage Repository where the ISOs are hosted, and it wouldn't disconnect, just hanging again.
Tried to create a new Storage Repository (same as the old ones for the ISOs) and that wouldn't connect.
Certainly feels like something SR related.
Just noticed too that one of the backup jobs that sends the backups to a different SR on the same machine where the ISOs live also failed.
Something happened yesterday which means the HostOS is now no longer able to connect to the SR.
I've just tried disabling Norton Firewall on the machine where the SR lives, as that has got in the way historically, but that made no difference.
Also just checked the Windows CIFS / SMB Settings and noticed that one of the three options was unchecked (previously all three were checked and I know that has not been disabled manually) - so will give a reboot a go to see if that solves things.
Bit baffled what else to try to get the SR re-connected as literally nothing manually has changed, so it has to be some Windows or Norton janky auto update that has kicked in and screwed things up.
Hmmm.....
-
Keep us posted!
-
Hmm.....
yum install -y nc
nc -zv {IP of Windows Machine where SRs live} 445
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to {IP of Windows Machine where SRs live}:445.
Ncat: 0 bytes sent, 0 bytes received in 0.01 seconds.
Is that suggesting with 0 bytes sent that it's not getting out from XCP-ng? So never actually hitting the Windows machine where the SMB SR lives?
What could be causing that?
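Worth noting on the Ncat output above: `-z` is Ncat's zero-I/O mode, so it only opens the TCP connection and closes it again without sending any payload. "0 bytes sent" is therefore expected, and "Connected to ...:445" means the TCP handshake did reach the Windows machine. Below is a minimal sketch of the same connect-only check without installing anything, assuming bash (with its `/dev/tcp` feature) and coreutils `timeout` are available on the host; `check_port` is a hypothetical name:

```shell
# Hypothetical helper: succeed only if a TCP connection to host:port opens
# within 3 seconds. Like `nc -z`, it sends no data after connecting.
check_port() {
  host=$1
  port=$2
  # bash treats /dev/tcp/HOST/PORT as "open a TCP socket"; the socket is
  # closed again as soon as the inner shell exits.
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}
```

Usage would be something like `check_port 192.0.2.10 445 && echo open || echo closed` (192.0.2.10 is a placeholder address). A successful connect only proves the port is reachable; it says nothing about whether SMB negotiation or authentication works afterwards.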
-
Nothing weird getting blocked on the XCP-ng firewall for outbound connections
iptables -L
Chain INPUT (policy ACCEPT)
target                prot opt source    destination
xapi_nbd_input_chain  tcp  --  anywhere  anywhere  tcp dpt:nbd
ACCEPT                gre  --  anywhere  anywhere
ACCEPT                tcp  --  anywhere  anywhere  tcp dpt:rtps-dd-mt
RH-Firewall-1-INPUT   all  --  anywhere  anywhere
ACCEPT                tcp  --  anywhere  anywhere  tcp dpt:rtps-dd-mt

Chain FORWARD (policy ACCEPT)
target                prot opt source    destination
RH-Firewall-1-INPUT   all  --  anywhere  anywhere

Chain OUTPUT (policy ACCEPT)
target                prot opt source    destination
xapi_nbd_output_chain tcp  --  anywhere  anywhere  tcp spt:nbd

Chain RH-Firewall-1-INPUT (2 references)
target                prot opt source    destination
ACCEPT                all  --  anywhere  anywhere
ACCEPT                icmp --  anywhere  anywhere  icmp any
ACCEPT                udp  --  anywhere  anywhere  udp dpt:bootps
ACCEPT                all  --  anywhere  anywhere  ctstate RELATED,ESTABLISHED
ACCEPT                udp  --  anywhere  anywhere  ctstate NEW udp dpt:ha-cluster
ACCEPT                tcp  --  anywhere  anywhere  ctstate NEW tcp dpt:ssh
ACCEPT                tcp  --  anywhere  anywhere  ctstate NEW tcp dpt:http
ACCEPT                tcp  --  anywhere  anywhere  ctstate NEW tcp dpt:https
ACCEPT                tcp  --  anywhere  anywhere  tcp dpt:21064
ACCEPT                udp  --  anywhere  anywhere  multiport dports hpoms-dps-lstn,netsupport
REJECT                all  --  anywhere  anywhere  reject-with icmp-host-prohibited

Chain xapi_nbd_input_chain (1 references)
target                prot opt source    destination
REJECT                all  --  anywhere  anywhere  reject-with icmp-port-unreachable

Chain xapi_nbd_output_chain (1 references)
target                prot opt source    destination
REJECT                all  --  anywhere  anywhere  reject-with icmp-port-unreachable

All the above is just out of the box default, no additional configurations.
-
I've spotted various similar issues on Reddit over the last 12-18 months re. Windows updates breaking SMB Mounts. Impossible to track down the specifics so not even going to bother.
Then found this issue too, https://xcp-ng.org/forum/topic/10545/long-delays-at-46-when-creating-or-starting-a-new-vm
Which was solved by moving from SMB Mount to NFS Mount.
That isn't going to work with my infrastructure setup as the Windows machine is on a Home Edition, and NFS is only supported on Business or Enterprise edition.
As it stands right now, and given nothing I've tried has got the SMB Mount working again, it feels like I'm at a dead end on getting this fixed. Looking like I'll need to re-address the architecture for storing the ISOs and one of the backup routes (thankfully there are many types of backups in place for redundancy).
Not the end of the world, but a tad annoying. It feels like it's probably a Windows update that broke this, via one of the almost daily automated updates/shutdowns/restarts that kick in. It only started happening yesterday; all previous backups via the SR were working fine until then. That's the only thing I can put this down to really.
-
Bit more info.....
Reading up on this, the issue appears to be related to SMB v1, which is commonly referred to as "Insecure" (More like Less Secure if we're being accurate.....)
I've just run the following via PowerShell (as an Admin) on the Windows machine where the SMB Shares live
Get-WindowsOptionalFeature -Online -FeatureName SMB1Protocol
Which has clarified that this is currently enabled, which throws a spanner in the works as I was expecting that to be Disabled, which would have backed up the theory in the previous comment.
FeatureName      : SMB1Protocol
DisplayName      : SMB 1.0/CIFS File Sharing Support
Description      : Support for the SMB 1.0/CIFS file sharing protocol and the Computer Browser protocol.
RestartRequired  : Possible
State            : Enabled
CustomProperties :
                   ServerComponent\Description : Support for the SMB 1.0/CIFS file sharing protocol and the Computer Browser protocol.
                   ServerComponent\DisplayName : SMB 1.0/CIFS File Sharing Support
                   ServerComponent\Id : 487
                   ServerComponent\Type : Feature
                   ServerComponent\UniqueName : FS-SMB1
                   ServerComponent\Deploys\Update\Name : SMB1Protocol
-
SMB v2 + 3 are also turned on;
Get-SmbServerConfiguration | Select EnableSMB2Protocol
Which outputs
EnableSMB2Protocol
------------------
              True
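Since the Windows side reports SMB1 and SMB2/3 all enabled, the other half of the picture is which dialect the XCP-ng host actually negotiated for any SMB mount that is still attached. A small sketch, assuming the host's kernel lists the dialect as a `vers=` option in `/proc/mounts` (that is an assumption on my part; some kernels may omit it); `cifs_vers` is a hypothetical name:

```shell
# Hypothetical helper: read /proc/mounts-style lines on stdin and print
# "<mount point> <SMB dialect>" for every cifs mount.
cifs_vers() {
  awk '$3 == "cifs" {
    vers = "unknown"                  # used when no vers= option is listed
    n = split($4, opts, ",")          # field 4 is the comma-separated option list
    for (i = 1; i <= n; i++)
      if (opts[i] ~ /^vers=/) vers = substr(opts[i], 6)
    print $2, vers
  }'
}

# On the host: cifs_vers < /proc/mounts
```

If nothing is printed, there is no cifs mount at all at that moment, which would itself say something about the state of the SR.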