After installing ZFS, creating the zpools, installing GlusterFS, and creating the volumes, we have a solution that delivers respectable performance and can sustain a node failure while still serving data to its clients.
For the setup, we used Azure as the cloud provider. While each provider has its own set of configuration challenges, the core concepts apply to other cloud providers as well.
However, this design has a disadvantage. When new disks are added to an existing zpool, the stripes no longer align, so subsequent reads and writes perform worse. Adding an entire set of disks at once avoids the problem, and the lower read performance is mostly masked by the in-RAM read cache (ARC) and the cache disk (L2ARC).
For GlusterFS, we used a dispersed layout that balances performance with high availability. In this three-node cluster setup, we can sustain...