Limiting service resources
So far, we have not spent any time on service isolation with regard to the resources available to services, but it is a very important topic to cover. Without resource limits, a malicious or misbehaving service could bring the whole cluster down, so great care needs to be taken to specify exactly how much of each resource individual service tasks are allowed to use.
The generally accepted strategy for handling cluster resources is the following:
- Any resource whose use beyond its intended values may cause errors or failures in other services should be limited at the service level. This is usually the RAM allocation, but may include CPU or other resources.
- Any resource, especially a hardware one, that is subject to an external limit should be limited for Docker containers as well (e.g. you are only allowed to use a specific portion of a 1-Gbps NAS connection).
- Anything that needs to run on a specific device...
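As a rough illustration of the first point in the list, the sketch below assembles a `docker service create` command that caps memory and CPU per task using the `--limit-memory` and `--limit-cpu` flags. The helper name `build_service_create_args`, the service name `web`, the image `nginx:latest`, and the limit values are all hypothetical choices made for this example, not recommendations. The script only prints the command and does not run it.

```python
import shlex


def build_service_create_args(name, image, memory_limit=None, cpu_limit=None):
    """Build the argument list for `docker service create` with resource limits.

    Hypothetical helper for illustration only. A limit left as None is
    omitted from the command entirely.
    """
    args = ["docker", "service", "create", "--name", name]
    if memory_limit is not None:
        # Hard cap on RAM per task, e.g. "256m".
        args += ["--limit-memory", memory_limit]
    if cpu_limit is not None:
        # Hard cap on CPU per task, expressed in cores, e.g. "0.5".
        args += ["--limit-cpu", cpu_limit]
    # The image must come after all options.
    args.append(image)
    return args


if __name__ == "__main__":
    # Assumed example values: a service named "web" capped at 256 MB and half a core.
    command = build_service_create_args(
        "web", "nginx:latest", memory_limit="256m", cpu_limit="0.5"
    )
    print(shlex.join(command))
```

Building the command as a list rather than a single string keeps each value a separate argument, so it could later be passed to `subprocess.run` without shell quoting concerns.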