Applying back pressure
We discussed back pressure briefly in the last chapter. Without back pressure, we cannot build a load-tolerant system with predictable stability and performance. In this section, we will see how to apply back pressure in different scenarios in an application. At a fundamental level, we should set a threshold for the maximum number of concurrent jobs in the system and, based on that threshold, reject new requests once the arrival rate exceeds what the system can handle. The client may retry a rejected request; if we have no control over the client, the request is simply dropped. When applying back pressure to user-facing services, it may be useful to detect system load and deny auxiliary services first, in order to conserve capacity and degrade gracefully under high load.
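The threshold idea can be illustrated with a minimal Java sketch built on a counting semaphore. The names AdmissionGate, maxConcurrent, and run are hypothetical, chosen only for this example. It is one possible way to cap concurrent jobs, not a prescribed implementation: a job is admitted only if a permit is free, and is otherwise rejected immediately instead of being left to wait.

```java
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical helper: admits at most maxConcurrent jobs at a time and rejects the rest. */
final class AdmissionGate {
    private final Semaphore permits;

    AdmissionGate(int maxConcurrent) {
        // One permit per job that is allowed to run concurrently.
        this.permits = new Semaphore(maxConcurrent);
    }

    /**
     * Runs the job on the calling thread if a permit is available.
     * Throws RejectedExecutionException right away when the gate is full,
     * so the caller can retry later or drop the request.
     */
    <T> T run(Supplier<T> job) {
        if (!permits.tryAcquire()) {          // non-blocking: never waits for capacity
            throw new RejectedExecutionException("too many concurrent jobs");
        }
        try {
            return job.get();
        } finally {
            permits.release();                // free the slot even if the job throws
        }
    }
}
```

A caller would wrap each unit of work, for example gate.run(() -> handle(request)), where handle and request stand in for application code, and translate the RejectedExecutionException into whatever "try again later" response suits the client. Rejecting immediately rather than blocking keeps the amount of pending work bounded.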
Thread pool queues
JVM thread pools are backed by queues, which means that when we submit a job to a thread pool that already has the maximum number of jobs running, the new job lands in the queue. The queue is...