Communication failure
As well as shutting a service down, tests should introduce another common, realistic problem – communication failure. Like system outages, communication failures happen in real deployments: blocking communication simulates a network outage in which all the services are still running.
Blocking communication also allows for a more specific test – you can let most of the system continue as normal and restrict access only between two particular processes. This targets specific failure scenarios, although they may be less realistic. You should work through different options, ranging from large, realistic outages to small, specific failures that are less likely in practice but exercise distinct failure modes within the system.
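To make the difference between a targeted block and a large outage concrete, here is a minimal sketch in Python. It assumes the test harness routes every message through an in-memory stand-in for the network; the class FakeNetwork, the helper can_reach, and the service names are all hypothetical and only illustrate the idea.

```python
class FakeNetwork:
    """Hypothetical in-memory stand-in for the network between named processes."""

    def __init__(self):
        self.handlers = {}    # process name -> function that handles a message
        self.blocked = set()  # unordered pairs of processes that cannot communicate

    def register(self, name, handler):
        self.handlers[name] = handler

    def block(self, a, b):
        """Small, specific failure: cut the link between two processes only."""
        self.blocked.add(frozenset((a, b)))

    def block_all(self):
        """Large outage: every process stays up, but none can reach another."""
        names = list(self.handlers)
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                self.block(a, b)

    def heal(self):
        """Restore all communication."""
        self.blocked.clear()

    def send(self, src, dst, message):
        if frozenset((src, dst)) in self.blocked:
            raise ConnectionError(f"{src} cannot reach {dst}")
        return self.handlers[dst](message)


def can_reach(net, src, dst):
    """Return True if a message from src currently gets through to dst."""
    try:
        net.send(src, dst, "ping")
        return True
    except ConnectionError:
        return False


# Three illustrative services; the names are assumptions for this sketch.
net = FakeNetwork()
for name in ("orders", "billing", "audit"):
    net.register(name, lambda message, name=name: f"{name} received {message}")

# Block one pair and leave the rest of the system running as normal.
net.block("orders", "billing")
```

With only the one pair blocked, a test can check that "orders" still reaches "audit" while its calls to "billing" fail, and then call block_all() to repeat the same checks under a full network outage.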
You can also alter when communication failures occur. During a message sequence, take each message in turn and test what happens if its reply is never received. As with white-box testing of state machines, this checks that there is correct error handling at every stage. This will need careful...
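The idea of losing the reply at each stage of a message sequence can be sketched as a loop over drop points. The following Python is an illustration under stated assumptions, not a definitive harness: the four-message sequence, the DroppingChannel class, the ReplyLost exception, and the client function run_session are all hypothetical.

```python
class ReplyLost(Exception):
    """Raised by the fake channel in place of a reply that never arrives."""


class DroppingChannel:
    """Hypothetical channel that loses the reply to exactly one request."""

    def __init__(self, drop_index):
        self.drop_index = drop_index  # position of the request whose reply is lost
        self.sent = []                # every request seen so far, in order

    def request(self, message):
        index = len(self.sent)
        self.sent.append(message)
        if index == self.drop_index:
            raise ReplyLost(message)
        return f"ack:{message}"


# An assumed four-step message sequence, purely for illustration.
SEQUENCE = ["hello", "login", "upload", "bye"]


def run_session(channel):
    """Hypothetical client: send each message in order and stop on a lost reply."""
    completed = []
    for message in SEQUENCE:
        try:
            channel.request(message)
        except ReplyLost:
            return {"state": "aborted", "completed": completed, "failed_at": message}
        completed.append(message)
    return {"state": "done", "completed": completed, "failed_at": None}


def results_for_every_drop_point():
    """Run the whole sequence once per stage, losing a different reply each time."""
    return [run_session(DroppingChannel(i)) for i in range(len(SEQUENCE))]


# Each run should end in a defined error state rather than hanging or crashing.
for stage, result in enumerate(results_for_every_drop_point()):
    assert result["state"] == "aborted"
    assert result["failed_at"] == SEQUENCE[stage]
```

Driving the loop from the sequence itself means that a message added to the protocol later automatically gets its own lost-reply test.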