    CEPH FS Storage Driver

    Development
jmccoy555 last edited by

Hi @r1, just tried this on my pool of two servers, but no luck. Should it work? I've verified that the same command works fine on a host not in a pool.

      [14:20 xcp-ng-bad-1 /]# xe sr-create type=nfs device-config:server=10.10.1.141,10.10.1.142,10.10.1.143 device-config:serverpath=/xcp device-config:options=name=xcp,secretfile=/etc/ceph/xcp.secret name-label=CephFS
      Error: Required parameter not found: host-uuid
      [14:20 xcp-ng-bad-1 /]# xe sr-create type=nfs device-config:server=10.10.1.141,10.10.1.142,10.10.1.143 device-config:serverpath=/xcp device-config:options=name=xcp,secretfile=/etc/ceph/xcp.secret name-label=CephFS host-uuid=c6977e4e-972f-4dcc-a71f-42120b51eacf
      Error code: SR_BACKEND_FAILURE_140
      Error parameters: , Incorrect DNS name, unable to resolve.,
      

I've verified that mount.ceph works, and I can mount manually with the mount command.
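
For reference, the manual mount that works looks like this (a sketch of my command; the /mnt/xcp mount point is just for illustration):

# mount -t ceph 10.10.1.141:/xcp /mnt/xcp -o name=xcp,secretfile=/etc/ceph/xcp.secret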

From /var/log/SMlog:

      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369] lock: opening lock file /var/lock/sm/fa472dc0-f80b-b667-99d8-0b36cb01c5d4/sr
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369] lock: acquired /var/lock/sm/fa472dc0-f80b-b667-99d8-0b36cb01c5d4/sr
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369] sr_create {'sr_uuid': 'fa472dc0-f80b-b667-99d8-0b36cb01c5d4', 'subtask_of': 'DummyRef:|cf1a6d4a-fca9-410e-8307-88e3421bff4e|SR.create', 'args': ['0'], 'host_ref': 'OpaqueRef:5527aabc-8bd0-416e-88bf-b6a0cb2b72b1', 'session_ref': 'OpaqueRef:49495fa6-ec85-4340-b59d-ee1f037c0bb7', 'device_config': {'server': '10.10.1.141', 'SRmaster': 'true', 'serverpath': '/xcp', 'options': 'name=xcp,secretfile=/etc/ceph/xcp.secret'}, 'command': 'sr_create', 'sr_ref': 'OpaqueRef:f57d72be-8465-4a79-87c4-84a34c93baac'}
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369] _testHost: Testing host/port: 10.10.1.141,2049
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369] _testHost: Connect failed after 2 seconds (10.10.1.141) - [Errno 111] Connection refused
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369] Raising exception [108, Unable to detect an NFS service on this target.]
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369] lock: released /var/lock/sm/fa472dc0-f80b-b667-99d8-0b36cb01c5d4/sr
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369] ***** generic exception: sr_create: EXCEPTION <class 'SR.SROSError'>, Unable to detect an NFS service on this target.
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369]   File "/opt/xensource/sm/SRCommand.py", line 110, in run
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369]     return self._run_locked(sr)
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369]   File "/opt/xensource/sm/SRCommand.py", line 159, in _run_locked
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369]     rv = self._run(sr, target)
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369]   File "/opt/xensource/sm/SRCommand.py", line 323, in _run
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369]     return sr.create(self.params['sr_uuid'], long(self.params['args'][0]))
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369]   File "/opt/xensource/sm/NFSSR", line 198, in create
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369]     util._testHost(self.dconf['server'], NFSPORT, 'NFSTarget')
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369]   File "/opt/xensource/sm/util.py", line 915, in _testHost
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369]     raise xs_errors.XenError(errstring)
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369]
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369] ***** NFS VHD: EXCEPTION <class 'SR.SROSError'>, Unable to detect an NFS service on this target.
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369]   File "/opt/xensource/sm/SRCommand.py", line 372, in run
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369]     ret = cmd.run(sr)
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369]   File "/opt/xensource/sm/SRCommand.py", line 110, in run
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369]     return self._run_locked(sr)
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369]   File "/opt/xensource/sm/SRCommand.py", line 159, in _run_locked
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369]     rv = self._run(sr, target)
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369]   File "/opt/xensource/sm/SRCommand.py", line 323, in _run
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369]     return sr.create(self.params['sr_uuid'], long(self.params['args'][0]))
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369]   File "/opt/xensource/sm/NFSSR", line 198, in create
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369]     util._testHost(self.dconf['server'], NFSPORT, 'NFSTarget')
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369]   File "/opt/xensource/sm/util.py", line 915, in _testHost
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369]     raise xs_errors.XenError(errstring)
      Apr  4 14:32:02 xcp-ng-bad-1 SM: [25369]
      
peder last edited by

        @jmccoy555 said in CEPH FS Storage Driver:

        device-config:server=10.10.1.141,10.10.1.142,10.10.1.143

        Are you sure that's supposed to work?
Try using only one IP address and see if the command works as intended.
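
For example (untested on my side, just your command trimmed to the first monitor; substitute your own host UUID):

# xe sr-create type=nfs device-config:server=10.10.1.141 device-config:serverpath=/xcp device-config:options=name=xcp,secretfile=/etc/ceph/xcp.secret name-label=CephFS host-uuid=<host-uuid>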

jmccoy555 @r1 last edited by

          @r1 said in CEPH FS Storage Driver:

          while for CEPHSR we can use #mount.ceph addr1,addr2,addr3,addr4:remotepath localpath

          Yep, as far as I know that is how you configure the failover, and as I said it works (with one or more IPs) from a host not in a pool.

P.S. Yes, I also tried with one IP just to be sure.

r1 XCP-ng Team last edited by

            @jmccoy555 Can you try the latest patch?

Before applying it, restore sm to its stock state:
            # yum reinstall sm

            # cd /
            # wget "https://gist.githubusercontent.com/rushikeshjadhav/ea8a6e15c3b5e7f6e61fe0cb873173d2/raw/dabe5c915b30a0efc932cab169ebe94c17d8c1ca/ceph-8.1.patch"
            # patch -p0 < ceph-8.1.patch
            
            # yum install centos-release-ceph-nautilus --enablerepo=extras
            # yum install ceph-common
            

Note: keep the secret in /etc/ceph/admin.secret with permissions 600.
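
For example (run on a Ceph admin node and copy the file over, or paste the key in manually; client.admin is assumed):

# ceph auth get-key client.admin > /etc/ceph/admin.secret
# chmod 600 /etc/ceph/admin.secret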

To handle the NFS port conflict, specifying the port is mandatory, e.g. device-config:serverport=6789

            Ceph Example:
            # xe sr-create type=nfs device-config:server=10.10.10.10,10.10.10.26 device-config:serverpath=/ device-config:serverport=6789 device-config:options=name=admin,secretfile=/etc/ceph/admin.secret name-label=Ceph

            NFS Example:
            # xe sr-create type=nfs device-config:server=10.10.10.5 device-config:serverpath=/root/nfs name-label=NFS

jmccoy555 last edited by

              @r1 Tried from my pool....

              [14:38 xcp-ng-bad-1 /]# xe sr-create type=nfs device-config:server=10.10.1.141,10.10.1.142,10.10.1.143 device-config:serverpath=/xcp device-config:serverport=6789 device-config:options=name=xcp,secretfile=/etc/ceph/xcp.secret name-label=CephFS host-uuid=c6977e4e-972f-4dcc-a71f-42120b51eacf
              Error code: SR_BACKEND_FAILURE_1200
              Error parameters: , not all arguments converted during string formatting,
              
              
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906] lock: opening lock file /var/lock/sm/a6e19bdc-0831-4d87-087d-86fca8cfb6fd/sr
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906] lock: acquired /var/lock/sm/a6e19bdc-0831-4d87-087d-86fca8cfb6fd/sr
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906] sr_create {'sr_uuid': 'a6e19bdc-0831-4d87-087d-86fca8cfb6fd', 'subtask_of': 'DummyRef:|572cd61e-b30c-48cb-934f-d597218facc0|SR.create', 'args': ['0'], 'host_ref': 'OpaqueRef:5527aabc-8bd0-416e-88bf-b6a0cb2b72b1', 'session_ref': 'OpaqueRef:e83f61b2-b546-4f22-b14f-31b5d5e7ae4f', 'device_config': {'server': '10.10.1.141,10.10.1.142,10.10.1.143', 'serverpath': '/xcp', 'SRmaster': 'true', 'serverport': '6789', 'options': 'name=xcp,secretfile=/etc/ceph/xcp.secret'}, 'command': 'sr_create', 'sr_ref': 'OpaqueRef:f77a27d8-d427-4c68-ab26-059c1c576c30'}
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906] _testHost: Testing host/port: 10.10.1.141,6789
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906] ['/usr/sbin/rpcinfo', '-p', '10.10.1.141']
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906] FAILED in util.pread: (rc 1) stdout: '', stderr: 'rpcinfo: can't contact portmapper: RPC: Remote system error - Connection refused
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906] '
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906] Unable to obtain list of valid nfs versions
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906] lock: released /var/lock/sm/a6e19bdc-0831-4d87-087d-86fca8cfb6fd/sr
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906] ***** generic exception: sr_create: EXCEPTION <type 'exceptions.TypeError'>, not all arguments converted during string formatting
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906]   File "/opt/xensource/sm/SRCommand.py", line 110, in run
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906]     return self._run_locked(sr)
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906]   File "/opt/xensource/sm/SRCommand.py", line 159, in _run_locked
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906]     rv = self._run(sr, target)
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906]   File "/opt/xensource/sm/SRCommand.py", line 323, in _run
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906]     return sr.create(self.params['sr_uuid'], long(self.params['args'][0]))
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906]   File "/opt/xensource/sm/NFSSR", line 222, in create
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906]     raise exn
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906]
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906] ***** NFS VHD: EXCEPTION <type 'exceptions.TypeError'>, not all arguments converted during string formatting
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906]   File "/opt/xensource/sm/SRCommand.py", line 372, in run
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906]     ret = cmd.run(sr)
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906]   File "/opt/xensource/sm/SRCommand.py", line 110, in run
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906]     return self._run_locked(sr)
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906]   File "/opt/xensource/sm/SRCommand.py", line 159, in _run_locked
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906]     rv = self._run(sr, target)
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906]   File "/opt/xensource/sm/SRCommand.py", line 323, in _run
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906]     return sr.create(self.params['sr_uuid'], long(self.params['args'][0]))
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906]   File "/opt/xensource/sm/NFSSR", line 222, in create
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906]     raise exn
              Apr  6 14:38:26 xcp-ng-bad-1 SM: [8906]
              

              Will try from my standalone host with a reboot later to see if the NFS reconnect issue has gone.

r1 XCP-ng Team last edited by

@jmccoy555 I have the rpcbind service running. Can you check on your Ceph node?
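
On a systemd-based distro that would be something like:

# systemctl status rpcbind
# systemctl enable --now rpcbind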

jmccoy555 last edited by

@r1 Yep, that's it: rpcbind is needed. I have a very minimal Debian 10 VM hosting my Ceph (in Docker containers), as is now the way with Octopus.

Also had to swap host-uuid= for shared=true to get it to connect to all hosts within the pool (might be useful for the notes); see the command below.
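
i.e. (my working command, reassembled from the bits above, so treat it as a sketch):

# xe sr-create type=nfs shared=true device-config:server=10.10.1.141,10.10.1.142,10.10.1.143 device-config:serverpath=/xcp device-config:serverport=6789 device-config:options=name=xcp,secretfile=/etc/ceph/xcp.secret name-label=CephFS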

                  Will test and also check that everything is good after a reboot and report back.

jmccoy555 last edited by jmccoy555

Just to report back... So far so good. I've moved a few VDIs over and not had any problems.

                    I've rebooted hosts and Ceph nodes and all is good.

                    NFS is also all good now.

Hope this gets merged soon so I don't have to worry about updates 😄

On a side note, I've also set up two pools, one of SSDs and one of HDDs, using File Layouts to assign different directories (VM SRs) to different pools; see the sketch below.
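
For anyone curious, the assignment is just an extended attribute on a directory in the mounted CephFS (the pool and directory names here are from my setup, adjust to taste):

# setfattr -n ceph.dir.layout.pool -v cephfs-ssd /mnt/cephfs/vm-ssd
# getfattr -n ceph.dir.layout.pool /mnt/cephfs/vm-ssd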

r1 XCP-ng Team last edited by

@jmccoy555 Glad to know. I don't have much knowledge of File Layouts, but that looks good.

The NFS edits won't be merged, as they were just a POC. I'm working on a dedicated CephFS SR driver, which hopefully won't be impacted by sm or other upgrades. Keep watching this space.

olivierlambert XCP-ng Team Admin Vates Team last edited by

                        We can write a simple "driver" like we did for Gluster 🙂
