High availability, scalability, and capacity planning
Highly available systems must also be scalable. The load on most complex distributed systems can vary dramatically with time of day, weekday versus weekend, seasonal effects, marketing campaigns, and many other factors. Successful systems gain users over time and accumulate more and more data. That means the physical resources of the cluster, mostly nodes and storage, will have to grow over time too. If your cluster is under-provisioned, it will not be able to satisfy all the demand, and it will effectively be unavailable because requests will time out or queue up and not be processed fast enough.
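To make the under-provisioning failure mode concrete, here is a minimal sketch of how a request backlog builds up when demand exceeds a fixed capacity. The function name `simulate_backlog`, the tick-based model, and all the numbers are hypothetical and chosen only for illustration; real systems have far messier arrival and service patterns.

```python
def simulate_backlog(demand_per_tick, capacity_per_tick):
    """Return the queue length after each tick for a fixed-capacity cluster.

    Simplified model (an assumption, not a real cluster): each tick the
    cluster serves up to capacity_per_tick requests; the rest wait.
    """
    backlog = 0
    history = []
    for arrivals in demand_per_tick:
        # Waiting work = leftover backlog plus new arrivals, minus what was served.
        backlog = max(0, backlog + arrivals - capacity_per_tick)
        history.append(backlog)
    return history


# Hypothetical load in requests per tick: quiet, a spike, then quiet again.
demand = [80, 90, 150, 160, 90, 80]

print(simulate_backlog(demand, capacity_per_tick=100))  # [0, 0, 50, 110, 100, 80]
print(simulate_backlog(demand, capacity_per_tick=170))  # [0, 0, 0, 0, 0, 0]
```

With a capacity of 100, the spike leaves a backlog that is still draining several ticks after demand has dropped, which is when requests start to time out. With a capacity of 170, the same spike is absorbed without any queueing.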
This is the realm of capacity planning. One simple approach is to over-provision your cluster: anticipate the demand and make sure you have enough of a buffer for spikes of activity. However, this approach suffers from several deficiencies:
- For highly dynamic and complicated distributed systems, it's difficult to predict the demand...