-
@splastunov The DotHill certainly wasn't great for scaling out, because of the age of the firmware and the old-school treatment of the disk arrays.
We did deploy several disk expansion trays that were part populated at purchase as some level of expansion, and operated most of the array as a single underlying pool, which was then split up into iSCSI presented volumes.
So the limitations of the DotHill controllers and firmware we were using didn't address the storage pool scaling issue that CEPH / GPFS / etal are designed to.
However, some of the newer filer heads that treat their entire available disk array as a consolidated pool and support iSCSI would have that scalability, whilst keeping the hypervisor side relatively straight-forward.
CEPH etal are also in a much more mature place than they were back in the early 2010s too, so if I was doing it again, maybe we would go for a solution architecture based around that. Or XOSTOR, of course!
My intent was to present an iSCSI setup instance as something we found relatively bomb-proof, and post-initial configuration, relatively easy to scale. We provisioned 24 port A and B side network switches with jumbo frame support to give us the scalability to add further pools of hypervisors or storage, as required.
-
The typical hypervisor node was provisioned with at least 4 network ports, but most had 8 to allow trunking/resilience for every connection:
Management + live migration (Isolated)
World (Trunk)
iSCSI-A
iSCSI-BThe DotHills were 4 ports per controller, and the two controllers were configured as:
Management
iSCSI-A
iSCSI-B
SpareThe iSCSI-A and iSCSI-B were simple web-managed procurves with 24 ports each.
We had this scaled to 12 hypervisor nodes in 3 pools, all running off the same underlying dothill (dual redundant controller) hardware. We had provisioned some local SSD storage in each hypervisor for any nodes that were struggling for IOPs but ultimately never needed that.