So, I am making some progress with my driver, but as I understand the SM API better, I am also discovering several flaws in its implementation that limit its ability to adapt to advanced storage such as CEPH - probably the main reason why CEPH was never integrated properly and why the existing community-made integrations are so unnecessarily complex.
I somewhat hope that SMAPIv3 solves this problem, but I am afraid it doesn't. For now I added this comment to the header of that SMAPIv1 driver, which explains the problem:
# Important notes:
# Snapshot logic in this SM driver is imperfect at this moment, because the way snapshots are implemented in Xen
# is fundamentally different from how snapshots work in CEPH, and sadly Xen API doesn't let the SM driver
# implement this logic itself and instead forces its own (somewhat flawed and inefficient) logic.
# That means when "revert to snapshot" is executed via the admin UI, this driver is not notified about it in any way.
# If it was, it could execute a trivial "rbd snap rollback" CEPH action, which would result in an instant rollback.
# Instead Xen decides to create a clone from the snapshot by calling clone(), which creates another RBD image
# that depends on the parent snapshot, which in turn depends on the original RBD image we wanted to roll back.
# Then it calls delete on the original image, which is the parent of this entire new hierarchy.
# This image is now impossible to delete, because it has a snapshot. That means we need to perform a background
# flatten operation, which performs a physical 1:1 copy of the entire image to the new clone and then destroys
# the snapshot and the original image.
# This is brutally inefficient in comparison to native rollback (as in hours instead of seconds), but it seems that
# with the current SM driver implementation it's not possible to do this efficiently; this requires a fix in the SM API.
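To make the difference concrete, here is a rough sketch of the two command sequences described in that comment. The helper names (`native_rollback_cmds`, `clone_based_revert_cmds`) and the pool/image names are made up for illustration; only the `rbd` subcommands themselves are real. The `snap protect` / `snap unprotect` steps are an assumption that applies to the older clone format, and the functions only build the argument lists, they don't run anything.

```python
def native_rollback_cmds(pool, image, snap):
    """Commands for rolling an RBD image back to one of its own snapshots."""
    spec = "{}/{}@{}".format(pool, image, snap)
    return [["rbd", "snap", "rollback", spec]]


def clone_based_revert_cmds(pool, image, snap, new_image):
    """Commands roughly matching the clone() + delete() path forced by XAPI."""
    spec = "{}/{}@{}".format(pool, image, snap)
    clone = "{}/{}".format(pool, new_image)
    return [
        ["rbd", "snap", "protect", spec],    # assumption: needed for older clone format
        ["rbd", "clone", spec, clone],       # new image depending on the snapshot
        ["rbd", "flatten", clone],           # full copy so the clone has no parent
        ["rbd", "snap", "unprotect", spec],
        ["rbd", "snap", "rm", spec],         # only now can the snapshot go away
        ["rbd", "rm", "{}/{}".format(pool, image)],  # and finally the original image
    ]


if __name__ == "__main__":
    for cmd in clone_based_revert_cmds("rbd", "vdi-1", "snap-1", "vdi-2"):
        print(" ".join(cmd))
```

One command versus six, and the `flatten` step is the one that copies the whole image.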
Basically - XAPI has its own logic for how snapshots are created and managed, and it forces this exact implementation on everyone - even when the underlying storage has its own snapshot mechanisms that would be FAR more efficient. Because this logic is impossible to override, hook into, or change, there isn't really any efficient way to implement snapshot logic at the CEPH level.
My suggestion - instead of forcing some internal snapshot logic on SM drivers, abstract it away and just send high-level requests to SM drivers, such as:
- Create a snapshot of this VDI
- Revert this VDI to this snapshot
I understand that for many SM drivers this could be a problem, as the same logic would need to be repeated in each of them. Maybe you could make it so that if an SM driver doesn't implement its own snapshot logic, you fall back to the default one that is implemented now?
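Something like the following is what I have in mind. This is only a sketch of the idea, not real SM code: `GenericVDI`, `CephVDI`, `clone_from`, `revert` and `revert_to_snapshot` are all hypothetical names, and the methods just record what they would do instead of touching any storage.

```python
class GenericVDI(object):
    """Hypothetical base class: revert falls back to today's clone + delete logic."""

    def __init__(self, name):
        self.name = name
        self.calls = []  # records the operations, for illustration only

    def clone_from(self, snapshot):
        self.calls.append(("clone", snapshot))
        return snapshot + "-clone"

    def delete(self):
        self.calls.append(("delete", self.name))

    def revert(self, snapshot):
        # Default behaviour: same as what XAPI does now.
        new_name = self.clone_from(snapshot)
        self.delete()
        return new_name


class CephVDI(GenericVDI):
    """Hypothetical driver that overrides revert with a storage-native rollback."""

    def revert(self, snapshot):
        spec = "%s@%s" % (self.name, snapshot)
        self.calls.append(("rbd snap rollback", spec))
        return self.name  # same image, rolled back in place


def revert_to_snapshot(vdi, snapshot):
    """What the upper layer would call: one high-level request, driver decides how."""
    return vdi.revert(snapshot)
```

Drivers that don't care keep the current behaviour for free, and drivers with native snapshot support override a single method.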
Anyway - the way the SM subsystem works (at least V1, but I suspect V3 isn't any better in this), you can't utilize efficient storage-level features - instead you are reinventing the wheel and implementing the same logic in software in an extremely inefficient way.
But maybe I am just overlooking something; that's just how it appears to me, as there is absolutely no overridable "revert to snapshot" entry point in SM right now.