NFS SR not mounting
-
I have a server, Xen2, which will not connect to my TrueNAS server via NFS. Xen1 connects with no issues, as does any Linux machine.
I spun up a TrueNAS VM, made all the settings the same, and Xen2 connects to it fine.
I ssh into Xen2 and run
showmount -e <server ip>
I get all the exports.
I can run this as well with no errors:
mount -t nfs <server ip>:</path/to/export> </mountpoint>
I have tried changing TrueNAS to use NFSv3 and NFSv4.
I have told the mount command to use version 3 and version 4.
I have changed the file permissions to nobody:nogroup 777.
However, if I try to ls or cd into the mountpoint, the process hangs and I have to Ctrl-C to kill the ls/cd.
Then, once I unmount it, I can view or cd into the mount directory normally.
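For anyone retesting after each settings change, a small sketch like the one below can report a hanging mountpoint without tying up the shell. It is only an illustration: the function name and the 10-second default are assumptions, it needs Python 3, and if the ls process ends up in uninterruptible I/O wait the cleanup itself may still block.

```python
import subprocess

def listing_completes(path, timeout=10.0):
    """Return True if `ls path` finishes within `timeout` seconds.

    A False result means the listing hung, which matches the symptom
    described above. A non-zero exit from ls still counts as "completed".
    """
    try:
        subprocess.run(
            ["ls", path],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        return True
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return False

# Example (the mountpoint path is whatever was passed to mount):
# print(listing_completes("/mnt/test"))
```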
In XOA, when I add an SR, the Path list populates. However, when I select any Path, the little spinner pops up, and I then have to wait about 10 minutes for the SR.probe task to time out with the error: "SR_BACKEND_FAILURE_73(, NFS mount error [opterr=Failed to detect NFS service on server 172.16.60.200], )"
Of course, the server is running though.
In XCP-ng Center, adding the SR fails with "SM has thrown a generic python exception Check your settings and try again".
I have turned logging on for mountd, rpc.statd and rpc.lockd. The only item in the log is that the export request succeeded.
Xen2 was running 8.0; I upgraded it to 8.2 with the latest patches last night.
Xen1 is running 8.2 with the latest patches.
XOA was updated last week.
TrueNAS is version 12u4 with the latest patches, on both the server and the test VM.
It's been very frustrating, as I have to wait 10 minutes for the SR.probe task to time out every time I test a settings change. (No, I can't kill the task; I've tried.)
Any help would be greatly appreciated at this point. If there is any logfile that would help please let me know and I'll pop it up here.
Or do you think this is a TrueNAS error and I should bug them?
Thanks,
Rick
-
Smells like an MTU issue (or network related issue)
-
@olivierlambert The networking is 10G to a UniFi 10G switch. The MTU is set to 9000. I changed it to 1500 and it did not affect anything.
-
Check /var/log/SMlog on your host to see if you have a more interesting message.
I'm not aware of any specific bug regarding NFS, so I assume your environment/setup is not correctly configured (but it's hard to tell where)
-
Jun 21 09:41:43 xen2 SM: [24970] sr_create {'sr_uuid': 'bddee608-8dc2-845d-a8fb-b0c1fc70e469', 'subtask_of': 'DummyRef:|3b143df9-8faa-425e-bbda-6eb11b7855e4|SR.create', 'args': ['0'], 'host_ref': 'OpaqueRef:52f78c5e-b94c-423e-bc70-1d233282575c', 'session_ref': 'OpaqueRef:664daad6-a0c8-4$
Jun 21 09:41:43 xen2 SM: [24970] _testHost: Testing host/port: 172.16.60.200,2049
Jun 21 09:41:43 xen2 SM: [24970] ['/usr/sbin/rpcinfo', '-s', '172.16.60.200']
Jun 21 09:42:43 xen2 SM: [24970] FAILED in util.pread: (rc 1) stdout: '', stderr: 'rpcinfo: can't contact rpcbind: : RPC: Timed out
Looks like rpcbind is having issues, although I'm not sure why. There shouldn't be anything blocking it on TrueNAS or Xen2.
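Since the log shows rpcinfo timing out against rpcbind, one quick way to narrow it down is to test plain TCP reachability of the two ports involved from both Xen1 and Xen2. The sketch below is only an illustration: the function names are made up for this example, it assumes Python 3 is available on the host, and it checks TCP only (111 is the standard rpcbind port, 2049 the NFS port that appears in the log).

```python
import socket

def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False

def check_nfs_ports(host, timeout=3.0):
    """Check the rpcbind (111) and NFS (2049) TCP ports on the given host."""
    ports = {"rpcbind": 111, "nfs": 2049}
    return {name: tcp_port_open(host, port, timeout) for name, port in ports.items()}

# Example, using the server address from the log:
# print(check_nfs_ports("172.16.60.200"))
```

If Xen1 reports both ports reachable and Xen2 does not, that points at something between Xen2 and the NAS rather than at the NFS service itself.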
Both TrueNas and Xen2 were rebooted over the weekend.
-
Any idea @fohdeesha ?
-
@olivierlambert OK, I thought I would try SMB for kicks. It still didn't work, which is even weirder, but it did give me a more detailed error.
Jun 21 09:56:18 xen2 SM: [12450] dconf contains username
Jun 21 09:56:18 xen2 SM: [12450] dconf contains secret
Jun 21 09:56:18 xen2 SM: [12450] CIFS user = *******
Jun 21 09:56:18 xen2 SM: [12450] Obtained CIFS password via secret
Jun 21 09:56:18 xen2 SM: [12450] Obtained CIFS plain text password
Jun 21 09:56:18 xen2 SM: [12450] dconf contains username
Jun 21 09:56:18 xen2 SM: [12450] dconf contains secret
Jun 21 09:56:18 xen2 SM: [12450] ['mount.cifs', '\\\\silo10g\\test', '/var/run/sr-mount/SMB/silo10g/test/a4f93a75-e749-f043-b879-58d69c0becc0', '-o', 'cache=loose,vers=3.0,actimeo=0,domain=***************']
Jun 21 09:56:18 xen2 SM: [12450] FAILED in util.pread: (rc 32) stdout: '', stderr: 'mount error(5): Input/output error
Jun 21 09:56:18 xen2 SM: [12450] Refer to the mount.cifs(8) manual page (e.g. man mount.cifs)
Jun 21 09:56:18 xen2 SM: [12450] '
Jun 21 09:56:19 xen2 SM: [12450] ['mount.cifs', '\\\\silo10g\\test', '/var/run/sr-mount/SMB/silo10g/test/a4f93a75-e749-f043-b879-58d69c0becc0', '-o', 'cache=loose,vers=3.0,actimeo=0,domain=***************']
Jun 21 09:56:19 xen2 SM: [12450] FAILED in util.pread: (rc 32) stdout: '', stderr: 'mount error(5): Input/output error
Jun 21 09:56:19 xen2 SM: [12450] Refer to the mount.cifs(8) manual page (e.g. man mount.cifs)
Jun 21 09:56:19 xen2 SM: [12450] '
Jun 21 09:56:19 xen2 SM: [12450] Raising exception [111, SMB mount error [opterr=mount failed with return code 32]]
Jun 21 09:56:19 xen2 SM: [12450] lock: released /var/lock/sm/a4f93a75-e749-f043-b879-58d69c0becc0/sr
Jun 21 09:56:19 xen2 SM: [12450] ***** generic exception: sr_create: EXCEPTION <class 'SR.SROSError'>, SMB mount error [opterr=mount failed with return code 32]
Jun 21 09:56:19 xen2 SM: [12450] File "/opt/xensource/sm/SRCommand.py", line 110, in run
Jun 21 09:56:19 xen2 SM: [12450] return self._run_locked(sr)
Jun 21 09:56:19 xen2 SM: [12450] File "/opt/xensource/sm/SRCommand.py", line 159, in _run_locked
Jun 21 09:56:19 xen2 SM: [12450] rv = self._run(sr, target)
Jun 21 09:56:19 xen2 SM: [12450] File "/opt/xensource/sm/SRCommand.py", line 323, in _run
Jun 21 09:56:19 xen2 SM: [12450] return sr.create(self.params['sr_uuid'], long(self.params['args'][0]))
Jun 21 09:56:19 xen2 SM: [12450] File "/opt/xensource/sm/SMBSR", line 246, in create
Jun 21 09:56:19 xen2 SM: [12450] raise xs_errors.XenError('SMBMount', opterr=exc.errstr)
Looks like some crazy mount error, although I'm at a loss as to what to check after this. I'm about ready to nuke the TrueNAS box and start from scratch.
BTW, thanks for the help so far, I really appreciate it!
Rick
-
@thorcraftit This seems like it could be a permissions error. What do the share properties on TrueNAS look like for this host? Are the allowed IP or denied IP boxes populated at all? Attach a screenshot if you can. Here's what mine looks like:
Note that the mapall user and group must have permissions on the shared folder. I typically create a user on the FreeNAS system just for Xen shares and give it ownership of the folder/dataset that's being shared. This ensures you don't have any weird permissions issues. In your case, though, I think the key boxes to examine are the authorized networks.
What does your Services > NFS > Settings look like? Here's mine:
Lastly, and perhaps most importantly, can you run "yum install nmap" and then "nmap 172.16.60.200" on both a working host and this non-working host? Paste the results of both here. If they differ, for instance if the non-working server doesn't see any RPC services running like I suspect, that will point heavily towards a firewall/network/ACL issue.
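If the two nmap outputs are long, the comparison can be sketched in a few lines of Python. This is only an illustration: the function names are made up, it assumes nmap's default text output (lines such as "111/tcp open rpcbind"), and the two sample outputs are hypothetical placeholders, not results from this thread.

```python
import re

# Matches nmap's normal-output port lines, e.g. "2049/tcp open  nfs".
PORT_LINE = re.compile(r"^(\d+)/(tcp|udp)\s+(\S+)\s+(\S+)")

def open_ports(nmap_output):
    """Map 'port/proto' to service name for every port reported as open."""
    found = {}
    for line in nmap_output.splitlines():
        m = PORT_LINE.match(line.strip())
        if m and m.group(3) == "open":
            found["{}/{}".format(m.group(1), m.group(2))] = m.group(4)
    return found

def missing_ports(working_output, failing_output):
    """Ports the working host sees as open that the failing host does not."""
    good = open_ports(working_output)
    bad = open_ports(failing_output)
    return {port: svc for port, svc in good.items() if port not in bad}

# Hypothetical sample data for illustration only.
SAMPLE_WORKING = """PORT     STATE SERVICE
22/tcp   open  ssh
111/tcp  open  rpcbind
2049/tcp open  nfs
"""
SAMPLE_FAILING = """PORT     STATE SERVICE
22/tcp   open  ssh
"""

print(missing_ports(SAMPLE_WORKING, SAMPLE_FAILING))
```

With the placeholder data, the result lists 111/tcp and 2049/tcp, which is the pattern that would suggest a firewall or ACL between the failing host and the NAS.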