XCP-ng

    CEPH FS Storage Driver

      scboley @olivierlambert

      @olivierlambert what are the plans to elevate it? I have a feeling it's really starting to gain traction in the storage world.

        olivierlambert Vates 🪐 Co-Founder CEO

        Very likely once the platform is more modern (upgraded kernel and platform, and SMAPIv3 in use).

          scboley @olivierlambert

          @olivierlambert OK, I see that even with 8.x you are still based on CentOS 7. When is it going up to 8? I'd assume Rocky would be the choice since the Red Hat Stream snafu, cough cough.

            olivierlambert Vates 🪐 Co-Founder CEO

            No, not really, see https://xcp-ng.org/blog/2020/12/17/centos-and-xcpng-future/ (so no biggie)

              scboley @olivierlambert

              @olivierlambert so what are your plans for going to a Stream 8 version, which would give the updated kernel and platform and, hopefully soon after, SMAPIv3? I/O throughput on 8 is vastly superior to 7, and the move is not nearly as big a change as 6 to 7 was.

                olivierlambert Vates 🪐 Co-Founder CEO

                We don't use any kernel from the CentOS project (nor the Xen package). We only use "the rest".

                So in order, it will be:

                • newer Xen version (easiest thing)
                • more recent kernel (some patches are needed at different places)
                • more recent user space/base distro (bigger work, but started already, like migrating all Python 2 stuff to Python 3!)

                SMAPIv3 is done in parallel and with XS teams too 🙂

                  scboley @olivierlambert

                  @olivierlambert I know kernel.org maintains a lot of very new kernels for CentOS versions, way newer than the locked and backported mess the default kernels are. Do you build off that base, change out the virtual parts, and put them in your builds?

                    olivierlambert Vates 🪐 Co-Founder CEO

                    We use an officially supported kernel (4.19, an LTS release) and yes, sometimes we even backport stuff to it specifically for XCP-ng 🙂

                    A kernel isn't "linked" to a distro; it's up to the distro maintainers to choose which kernel they want. We do that for XCP-ng and XenServer (with Citrix).

                    In short: we make our own choices regarding Xen and the kernel, entirely outside the CentOS project.

                      scboley

                      OK, I've got this set up, with a cluster serving the CephFS, and here are my errors:
                      xe sr-create type=cephfs name-label=ceph device-config:server=172.30.254.23,172.30.254.24,172.30.254.25 device-config:serverport=6789 device-config:serverpath=/fsgw/xcpsr device-config:options=name=admin,secretfile=/etc/ceph/admin.secret
                      Error code: SR_BACKEND_FAILURE_111
                      Error parameters: , CephFS mount error [opterr=mount failed with return code 1],

                        scboley @scboley

                        @scboley I finally figured it out. I used another key created by the cluster and got it to connect and mount the CephFS.
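
A mount failure at SR creation can often be narrowed down by trying the same CephFS mount by hand on the host. The sketch below only builds and prints the command (it mounts nothing), assuming the kernel client's `mount -t ceph mon:port,...:/path` source form. The helper name `ceph_source` and the scratch mount point `/mnt/cephfs-test` are hypothetical; the monitor addresses, port, path, and options are the ones from the `xe sr-create` call in this thread.

```shell
# Dry-run sketch: print a manual CephFS mount that mirrors the SR parameters.
# Nothing is mounted here; run the printed command yourself on the host.

# Build "mon1:port,mon2:port:/path" from a comma-separated monitor list.
# The body runs in a subshell so the IFS change does not leak out.
ceph_source() (
  servers=$1 port=$2 path=$3
  out=""
  IFS=','
  for mon in $servers; do
    out="${out:+$out,}${mon}:${port}"
  done
  printf '%s:%s\n' "$out" "$path"
)

# Values taken from the sr-create call; /mnt/cephfs-test is a hypothetical scratch directory.
src="$(ceph_source "172.30.254.23,172.30.254.24,172.30.254.25" 6789 /fsgw/xcpsr)"
echo "mount -t ceph $src /mnt/cephfs-test -o name=admin,secretfile=/etc/ceph/admin.secret"
```

If the printed command fails when run by hand, the problem lies with the key, path, or network rather than with the SR driver, and swapping in a different `name=` and `secretfile=` is a quick way to test another key.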

                          scboley @scboley

                          @olivierlambert I'm adding another host to the pool and it fails to connect to the Ceph shared storage:

                          Nov 21 09:57:48 xcp4-1 xapi: [debug||116026 /var/lib/xcp/xapi|SR.scan R:05af02328263|helpers] Waiting for up to 12.902806 seconds before retrying...
                          Nov 21 09:57:59 xcp4-1 xapi: [debug||116027 /var/lib/xcp/xapi||dummytaskhelper] task dispatch:session.logout D:79aefd48b34b created by task D:f32e5efdeec8
                          Nov 21 09:57:59 xcp4-1 xapi: [ info||116027 /var/lib/xcp/xapi|session.logout D:67032978d90c|xapi_session] Session.destroy trackid=c8a5d1fe7e932298b267edb677909a4b
                          Nov 21 09:57:59 xcp4-1 xapi: [debug||116028 /var/lib/xcp/xapi||dummytaskhelper] task dispatch:session.slave_login D:0366d884ee46 created by task D:f32e5efdeec8
                          Nov 21 09:57:59 xcp4-1 xapi: [ info||116028 /var/lib/xcp/xapi|session.slave_login D:b39585e0b07e|xapi_session] Session.create trackid=fc78c651286146c61742b0ca74212bb9 pool=true uname= originator=xapi is_local_superuser=true auth_user_sid= parent=trackid=9834f5af41c964e225f24279aefe4e49
                          Nov 21 09:57:59 xcp4-1 xapi: [debug||116029 /var/lib/xcp/xapi||dummytaskhelper] task dis

                          Nov 21 09:59:34 xcp4-1 xapi: [ info||116009 HTTPS 192.168.254.101->|Async.PBD.plug R:631710626e67|xapi_session] Session.destroy trackid=726402fee499e51bb72de7fd054a93d0
                          Nov 21 09:59:34 xcp4-1 xapi: [debug||116009 HTTPS 192.168.254.101->|Async.PBD.plug R:631710626e67|message_forwarding] Unmarking SR after PBD.plug (task=OpaqueRef:63171062-6e67-4cbd-b3be-91bb534a94bf)
                          Nov 21 09:59:34 xcp4-1 xapi: [error||116009 ||backtrace] Async.PBD.plug R:631710626e67 failed with exception Server_error(SR_BACKEND_FAILURE_12, [ ; mount failed with return code 32; ])
                          Nov 21 09:59:34 xcp4-1 xapi: [error||116009 ||backtrace] Raised Server_error(SR_BACKEND_FAILURE_12, [ ; mount failed with return code 32; ])
                          Nov 21 09:59:34 xcp4-1 xapi: [error||116009 ||backtrace] 1/1 xapi Raised at file (Thread 116009 has no backtrace table. Was with_backtraces called?, line 0
                          Nov 21 09:59:34 xcp4-1 xapi: [error||116009 ||backtrace]

                            olivierlambert Vates 🪐 Co-Founder CEO @scboley

                            @scboley Storage-related errors will be in SMlog.
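
A small filter can make the storage log quicker to read. This is a sketch, assuming the log is at `/var/log/SMlog` (its usual location on XCP-ng hosts); the function name `sm_failures` and the keyword list are illustrative choices, not part of any XCP-ng tooling.

```shell
# Print the most recent storage-manager log lines that mention a failure.
#   $1: log file (defaults to /var/log/SMlog)
#   $2: how many matching lines to show (defaults to 40)
sm_failures() {
  log="${1:-/var/log/SMlog}"
  grep -iE 'failed|error|exception' "$log" | tail -n "${2:-40}"
}

# Only run against the default log when it is readable on this machine.
if [ -r /var/log/SMlog ]; then
  sm_failures
fi
```

Running it right after a failed `PBD.plug` or `SR.scan` keeps the relevant lines near the bottom of the output.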

                              scboley @olivierlambert

                              @olivierlambert

                              Nov 21 10:20:11 xcp4-1 SM: [30250] vhd=/var/run/sr-mount/51b80ad1-820d-c29a-1f9c-a50d6454f927/.vhd scan-error=-5 error-message='failure scanning target'
                              Nov 21 10:20:11 xcp4-1 SM: [30250] scan failed: -5
                              Nov 21 10:20:11 xcp4-1 SM: [30250] ', stderr: ''
                              Nov 21 10:20:12 xcp4-1 SM: [30250] ['/usr/bin/vhd-util', 'scan', '-f', '-m', '/var/run/sr-mount/51b80ad1-820d-c29a-1f9c-a50d6454f927/.vhd']
                              Nov 21 10:20:12 xcp4-1 SM: [30250] FAILED in util.pread: (rc 5) stdout: 'vhd=/var/run/sr-mount/51b80ad1-820d-c29a-1f9c-a50d6454f927 scan-error=2 error-message='failure scanning target'
                              Nov 21 10:20:12 xcp4-1 SM: [30250] vhd=/var/run/sr-mount/51b80ad1-820d-c29a-1f9c-a50d6454f927/*.vhd scan-error=-5 error-message='failure scanning target'
                              Nov 21 10:20:12 xcp4-1 SM: [30250] scan failed: -5

                                scboley @olivierlambert

                                @olivierlambert never mind, I fixed it. I had forgotten to add the public side of my Ceph network to the new host; once I did, it all scanned correctly. Thanks for being responsive and for the bit of education, it always helps.
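
A quick reachability check from a newly added host can catch a missing storage network before the SR is plugged. This is a sketch, assuming `bash` (for its `/dev/tcp` redirection) and coreutils' `timeout` are available on the host; the function names `mon_reachable` and `check_mons` are hypothetical, and the 3-second limit is an arbitrary choice.

```shell
# Return 0 if a TCP connection to host ($1) and port ($2) opens within 3 seconds.
# The connection attempt runs in a child bash so it works from any POSIX shell.
mon_reachable() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Report reachability for each monitor in a comma-separated list ($1) on port ($2).
# The body runs in a subshell so the IFS change stays local.
check_mons() (
  IFS=','
  for mon in $1; do
    if mon_reachable "$mon" "$2"; then
      echo "$mon:$2 reachable"
    else
      echo "$mon:$2 unreachable"
    fi
  done
)

# Example with the monitor addresses and port from this thread:
#   check_mons "172.30.254.23,172.30.254.24,172.30.254.25" 6789
```

Any monitor reported as unreachable points at routing, VLAN, or interface configuration on the new host rather than at the SR driver.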

                                  scboley @scboley

                                  @olivierlambert this is holding up quite well. I've pounded it hard, doing 3 exports and an import simultaneously, and it maintained about 60 MB/s writes while reading and writing at the same time, as the imports and exports are on the same CephFS. I had an issue with the Ceph repository moving to the pool when I created the pool, and wound up hosing my pool manager trying to fix it, but after setting a new pool manager and rebuilding the other host it has been flawless. I've built XO from source and it has come a long way since the last time I looked at it. We went with a small vendor out of Canada called 45Drives for the Ceph hardware and deployment. They have a nice Ansible-derived delivery and config package that is pretty slick, if anyone is looking for a supported solution.

                                    olivierlambert Vates 🪐 Co-Founder CEO

                                    So just to be sure I get it, you have a dedicated Ceph storage on dedicated hardware and you wanted to connect to it via XCP-ng without using NFS or iSCSI, right?

                                      scboley @olivierlambert

                                      @olivierlambert that is right, and it's up and running using the driver you guys built, taking a good load with no issues. Hats off to you and your team for making this work. I used NFS for exporting off my old Ceph system, but the importing is strictly on your native CephFS driver. On my old setup I was on XCP-ng 7.6 with NFS sitting over the CephFS and just a gigabit network; it worked, but with no live migration at all. Now I can live migrate with no problems, with bonded dual 10Gb fiber and single 10Gb fiber to the hosts.

                                        olivierlambert Vates 🪐 Co-Founder CEO

                                        Okay, good to know. We hope to do an even better native integration on SMAPIv3, but the hardest part isn't writing the driver; it's improving SMAPIv3 itself to support what's missing (live storage migration and so on).

                                          scboley @olivierlambert

                                          @olivierlambert so are you completely on your own release schedule now, or are you still tied to Citrix version releases? I've used you since the 7.x versions and have had zero issues. I still have one 6.5 host that I'm going to migrate a VM off of; it was on local storage, it's quite large, and I didn't want to redo it because of the storage changes from 6 to 7. After that I'll patch up my other 8.2 host and rebuild the 6.5 one to 8.2, and then I'll have a pool that is all 8.2.

                                            olivierlambert Vates 🪐 Co-Founder CEO

                                            We always try to work with Citrix (the "XenServer" division now) to push things upstream and get them merged.
