We have already adopted a few tools. We have metrics stored in Prometheus. We deployed Swarm Listener, which propagates information to Prometheus. We have Alertmanager, which receives notifications whenever a certain threshold is reached. While those tools moved us closer to our goals, they are not enough. Now we need to figure out what to do with those alerts. Receiving them in Slack is only the last resort. We need a tool that is capable of receiving an alert, processing the data that comes with it, applying certain logic, and deciding what to do.
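To make the "receive, process, apply logic, decide" flow more concrete, here is a minimal Python sketch. It assumes the alert arrives as JSON in the shape of an Alertmanager webhook notification (an `alerts` list whose entries carry `status` and `labels`). The label names `service` and `scale`, the action names, and the `decide` function itself are hypothetical and chosen only for illustration, with scaling used as the example action. This is not the tool we will end up using.

```python
import json
from typing import List, Tuple

# Hypothetical label names; the real ones depend on how the alerts are defined.
SERVICE_LABEL = "service"
SCALE_LABEL = "scale"

# Hypothetical mapping from a label value to the action we would take.
ACTIONS = {"up": "scale_up", "down": "scale_down"}


def decide(payload: str) -> List[Tuple[str, str]]:
    """Return one (service, action) pair for each alert in the payload."""
    body = json.loads(payload)
    decisions = []
    for alert in body.get("alerts", []):
        labels = alert.get("labels", {})
        service = labels.get(SERVICE_LABEL, "unknown")
        if alert.get("status") != "firing":
            # Resolved alerts require no action.
            action = "none"
        else:
            # Anything other than a known label value means "do nothing".
            action = ACTIONS.get(labels.get(SCALE_LABEL), "none")
        decisions.append((service, action))
    return decisions


if __name__ == "__main__":
    sample = json.dumps({
        "status": "firing",
        "alerts": [
            {"status": "firing", "labels": {"service": "my-service", "scale": "up"}},
            {"status": "resolved", "labels": {"service": "my-service", "scale": "down"}},
        ],
    })
    for service, action in decide(sample):
        print(service, action)
```

The sketch only decides; it does not act. Whatever executes the decision (for example, changing the number of replicas of a service) would be a separate step.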
In most cases, self-adaptation is all about scaling. Since we are limiting ourselves to services, the system needs to be capable of deciding, whenever it receives an alert, whether to scale up, scale down, or do nothing. We need a tool that can accept remote requests, that is capable of running code that...