CephFS Storage Driver

usbalbin

@olivierlambert With the (experimental) CephFS driver added in 8.2.0, and reading this in the documentation:

WARNING

This way of using Ceph requires installing ceph-common inside dom0 from outside the official XCP-ng repositories. It is reported to be working by some users, but isn't recommended officially (see Additional packages). You will also need to be **careful about system updates and upgrades.**

are there any plans to put ceph-common into the official XCP-ng repositories to make updates less scary?

I have been testing this for almost 8 months now, first with only one or two VMs, now with about 8-10 smaller VMs. The Ceph cluster itself runs as 3 VMs (themselves not stored on CephFS) with passed-through SATA controllers, on 3 different hosts.

This has been working great, except in situations where the XCP-ng hosts are unable to reach the Ceph cluster. At one point the Ceph nodes had crashed (my fault), but I was unable to restart them because all VM operations were blocked, taking forever without ever succeeding, even though the Ceph nodes themselves are not stored on the inaccessible SR. To me it seems the XCP-ng hosts keep trying to connect endlessly without ever timing out, which makes them unresponsive.
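
As a workaround idea (not a fix for the hang itself), here is a minimal sketch of a pre-check that probes the Ceph monitors with a hard timeout before starting any VM operation. The function names and the default 3-second timeout are my own choices, nothing here comes from XCP-ng, and it assumes bash and coreutils `timeout` are available in dom0.

```shell
#!/usr/bin/env bash
# Probe Ceph monitors over TCP with a hard timeout, so the check itself
# cannot block the way a stuck mount does.
# Hypothetical helper names; this is not part of XCP-ng or Ceph.

# mon_reachable HOST PORT [TIMEOUT_SECONDS]
# Returns 0 if a TCP connection succeeds within the timeout, non-zero otherwise.
mon_reachable() {
    local host="$1" port="$2" secs="${3:-3}"
    timeout "$secs" bash -c 'exec 3<>/dev/tcp/$1/$2' _ "$host" "$port" 2>/dev/null
}

# count_reachable PORT HOST...
# Prints how many of the given monitors accepted a connection.
count_reachable() {
    local port="$1" n=0 host
    shift
    for host in "$@"; do
        if mon_reachable "$host" "$port"; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}
```

Calling `count_reachable 6789 <mon1> <mon2> <mon3>` with your own monitor addresses prints the number of monitors that answered. If it prints 0, the SR is almost certainly going to block, so it is better to hold off on VM operations until the cluster is back.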

olivierlambert

Hi,

Short term: no. Longer term when we have SMAPIv3: very likely, yes, at least as a community driver.

What about performance? Can you describe your setup and config in more detail?

scboley

I'm about to deploy the latest Ceph on 45Drives hardware and will use 8.2, finally with a decent network backbone, to start building a new virtual world. I've been using NFS over CephFS on a single gigabit public link and a single gigabit private link. It performs OK for what we do, but I cannot do any failover or live migration of VMs. This should alleviate those issues as well as give me lots more options for snapshots and recovery.

So with the latest 8.2 patches and updates, what do I need to do other than install ceph-common? Will the Ceph repository show up in XCP-ng Center or Xen Orchestra?

I've had power outages and UPS failures and this stuff just self-heals. The only issue is having to mount the CephFS after boot and then restart NFS to recover the NFS repositories, and it just comes up. It's scalable and way less trouble to deal with than fiber SANs or iSCSI.

jmccoy555

@scboley https://xcp-ng.org/docs/storage.html#cephfs

Once you do the manual stuff it will show up like any other SR in Xen Orchestra etc.

scboley

@jmccoy555 Should I go ahead and update 8.2 to the latest patches before doing this? I have yet to run a single patch on XCP-ng over many years. Is it straightforward?

jmccoy555

@scboley I would assume so, but I can't say yes. I don't think it was available before 8.2 without following the above.

scboley

@jmccoy555 I'm talking about 8.2.1, 8.2.2 and so forth. Is that a simple yum update on the system? I've just left it at the default version and never updated. I was on 7.6 for a long time and just took it all to 8.2, with one straggler XenServer 6.5 still in production. I've loved the stability I've had with XCP-ng without even messing with it at all.

msgerbs

@scboley Yes, it's mostly just a matter of doing yum update: https://xcp-ng.org/docs/updates.html

scboley

OK, I see the package listed in the documentation is still Nautilus. Has that been updated to any newer Ceph version yet? @olivierlambert

olivierlambert

I don't think we updated anything on that front, since Ceph isn't a "main" supported SR.

scboley

@olivierlambert What are the plans to elevate it? I have a feeling it's really starting to gain traction in the storage world.

olivierlambert

Very likely when the platform is more modern (upgraded kernel and platform, and using SMAPIv3).

scboley

@olivierlambert OK, I see that even with 8.x you are still based on CentOS 7. When is it going up to 8? I'd assume Rocky would be the choice since the Red Hat CentOS Stream snafu, cough cough.

olivierlambert

No, not really, see https://xcp-ng.org/blog/2020/12/17/centos-and-xcpng-future/ (so no biggie)

scboley

@olivierlambert So what are your plans for going to a Stream 8 version, which would give the updated kernel and platform and hopefully, soon after, SMAPIv3? I/O throughput on 8 is vastly superior to 7, and the changes are not nearly as big as they were from 6 to 7.

olivierlambert

We don't use any kernel from the CentOS project (nor the Xen package). We only use "the rest".

So in order, it will be:

1. newer Xen version (easiest thing)
2. more recent kernel (some patches are needed at different places)
3. more recent user space/base distro (bigger work, but started already, like migrating all Python 2 stuff to Python 3!)

SMAPIv3 is being done in parallel, together with the XS teams.

scboley

@olivierlambert I know kernel.org maintains a lot of very new kernels for CentOS versions, way newer than the locked and backported mess that the default kernels are. So do you build off that base, change out the virtualization parts, and put them in your builds?

olivierlambert

We use an officially supported kernel (4.19 LTS) and yes, sometimes we even backport stuff to it specifically for XCP-ng.

A kernel isn't "linked" to a distro; it's up to the distro maintainers to choose which kernel they want. We do that for XCP-ng and XenServer (with Citrix).

In short: we make our own choices regarding Xen and the kernel, entirely outside the CentOS project.

scboley

OK, I've got this set up, with a cluster serving CephFS, and here are my errors:
xe sr-create type=cephfs name-label=ceph device-config:server=172.30.254.23,172.30.254.24,172.30.254.25 device-config:serverport=6789 device-config:serverpath=/fsgw/xcpsr device-config:options=name=admin,secretfile=/etc/ceph/admin.secret
Error code: SR_BACKEND_FAILURE_111
Error parameters: , CephFS mount error [opterr=mount failed with return code 1],

scboley

@scboley I figured it out finally. I used another key created by the cluster and got it to connect and mount the CephFS.
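
For anyone else hitting SR_BACKEND_FAILURE_111: one way to narrow it down is to try the same mount by hand, outside the SR driver, so you see the real mount error. Below is a minimal sketch that builds the equivalent `mount -t ceph` command from the same values passed to `xe sr-create`. The helper names and the /mnt/cephtest mount point are made up for illustration, and I am assuming the driver mounts through the kernel CephFS client in roughly this form. Also note that the `secretfile` option of mount.ceph expects a file containing only the key itself, not a full keyring; on a Ceph node, `ceph auth get-key client.<name>` prints just that key.

```shell
#!/usr/bin/env bash
# Build a manual CephFS mount command from xe sr-create style parameters.
# Hypothetical helpers for debugging; they only print the command.

# mon_list SERVERS PORT
# Turns "a,b,c" plus a port into "a:PORT,b:PORT,c:PORT".
mon_list() {
    local servers="$1" port="$2" out="" host
    local IFS=','
    for host in $servers; do
        out="${out:+${out},}${host}:${port}"
    done
    printf '%s\n' "$out"
}

# cephfs_mount_cmd SERVERS PORT SERVERPATH OPTIONS MOUNTPOINT
# Prints the mount command; run it yourself once it looks right.
cephfs_mount_cmd() {
    local servers="$1" port="$2" path="$3" options="$4" mountpoint="$5"
    printf 'mount -t ceph %s:%s %s -o %s\n' \
        "$(mon_list "$servers" "$port")" "$path" "$mountpoint" "$options"
}
```

With the values from the failing command above, `cephfs_mount_cmd 172.30.254.23,172.30.254.24,172.30.254.25 6789 /fsgw/xcpsr name=admin,secretfile=/etc/ceph/admin.secret /mnt/cephtest` prints `mount -t ceph 172.30.254.23:6789,172.30.254.24:6789,172.30.254.25:6789:/fsgw/xcpsr /mnt/cephtest -o name=admin,secretfile=/etc/ceph/admin.secret`. Running that on the host (after creating the mount point) and checking dmesg usually shows whether the key, the path, or the network is the problem.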