The software life cycle
After a system has successfully gone through the last phases of software development (including performance testing, tuning, and acceptance testing), it will be deployed in production where its hopefully long and successful life will begin for real.
Upgrades
Over its lifetime, the system will most likely need to be upgraded for one reason or another. Upgrading might involve changes to the hardware, code, and configuration. Before this upgraded system is put into production, it should be as thoroughly tested as it was when it was first released in order to ensure that it will meet old, and any new, requirements. Naturally, this includes performance testing and tuning, when needed.
Metrics
During its life in production, many things about the system will be of interest to the business, QA, and the different IT departments. Some important questions that these groups need answered could be:
- Business: What use cases are actually utilized and to what extent? For what reasons are important functions not used? Are they avoided due to poor response times, perhaps? Do the system and its components really give the expected return on investment (ROI), or can optimizations be made?
- QA and IT: Are the error rates under control? Is the hardware utilization actually in line with what was estimated, or is there a need for more or less of something? What about the response times and usage of components, caches, and other software resources? What is the health of the system at any given time?
The information needed to answer these questions and more can quite easily be provided by the system itself. Some information might be available for extraction directly out of the box from the system or from underlying resources, while other information might need to be enabled by configuration or by more or less advanced instrumentation in code.
The information is often extracted or collected by logging, by monitoring through a protocol such as SNMP (mostly used by hardware and operating system services), or by using an API such as the Java Management Extensions (JMX) API.
WildFly exposes information about quite a few resources through JMX, and instrumenting your application code to expose values using JMX is very easy and powerful. JMX can also be used externally from a system to give it instructions such as clearing a cache, starting/stopping a service, and so on.
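As a minimal sketch of such instrumentation, the following standalone example exposes a few cache metrics as JMX attributes and a clear operation that a management client could invoke. The class names, the `com.example.app` object name, and the counters are hypothetical and chosen only for illustration; the example registers the MBean with the JVM's platform MBean server, which is an assumption about where you would want it to appear.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.StandardMBean;

public class JmxCacheMetricsSketch {

    // Management interface (hypothetical): getters become read-only JMX
    // attributes, other methods become JMX operations.
    public interface CacheMetricsMBean {
        long getHits();
        long getMisses();
        double getHitRatio();
        void clear();
    }

    // Application-side implementation that the cache code would update.
    public static class CacheMetrics implements CacheMetricsMBean {
        private final AtomicLong hits = new AtomicLong();
        private final AtomicLong misses = new AtomicLong();

        public void recordHit() { hits.incrementAndGet(); }
        public void recordMiss() { misses.incrementAndGet(); }

        @Override public long getHits() { return hits.get(); }
        @Override public long getMisses() { return misses.get(); }

        @Override public double getHitRatio() {
            long h = hits.get();
            long total = h + misses.get();
            return total == 0 ? 0.0 : (double) h / total;
        }

        // A real cache would also evict its entries here; this sketch
        // only resets the counters.
        @Override public void clear() {
            hits.set(0);
            misses.set(0);
        }
    }

    // Registers the metrics object with the platform MBean server under
    // the given object name, for example "com.example.app:type=CacheMetrics".
    public static ObjectName register(CacheMetrics metrics, String name) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName objectName = new ObjectName(name);
            server.registerMBean(
                    new StandardMBean(metrics, CacheMetricsMBean.class), objectName);
            return objectName;
        } catch (JMException e) {
            throw new IllegalStateException("Could not register " + name, e);
        }
    }

    public static void main(String[] args) throws Exception {
        CacheMetrics metrics = new CacheMetrics();
        ObjectName name = register(metrics, "com.example.app:type=CacheMetrics");

        metrics.recordHit();
        metrics.recordHit();
        metrics.recordMiss();

        // Read an attribute and invoke an operation the way a JMX client would.
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        System.out.println("Hits = " + server.getAttribute(name, "Hits"));
        server.invoke(name, "clear", new Object[0], new String[0]);
        System.out.println("Hits after clear = " + server.getAttribute(name, "Hits"));
    }
}
```

While the process is running, a JMX client such as JConsole should list the MBean under the chosen domain, show its attributes, and offer the clear operation.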
Quantifiable information from and about a system, regardless of how it is retrieved, is called a metric. A metric can be useful in a single situation, such as triggering a monitoring alert when something goes wrong. However, it is also important to collect metrics over time, both as proof that the system lives up to its SLA and to enable various analyses related to the business, quality, or technology.
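Collecting metrics over time can be sketched with values that the JVM already exposes through its standard platform MXBeans. The following example takes a fixed number of timestamped samples of heap usage, thread count, and system load average. The class name, the `Sample` type, and the sampling parameters are hypothetical; a production setup would more likely rely on a monitoring tool and persistent storage than on an in-memory list.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.OperatingSystemMXBean;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

public class MetricSamplerSketch {

    // One timestamped observation of a few JVM-level metrics.
    public static final class Sample {
        public final long timestampMillis;
        public final long heapUsedBytes;
        public final int threadCount;
        // Negative when the platform cannot report a load average.
        public final double systemLoadAverage;

        Sample(long timestampMillis, long heapUsedBytes,
               int threadCount, double systemLoadAverage) {
            this.timestampMillis = timestampMillis;
            this.heapUsedBytes = heapUsedBytes;
            this.threadCount = threadCount;
            this.systemLoadAverage = systemLoadAverage;
        }

        @Override public String toString() {
            return timestampMillis + " heapUsed=" + heapUsedBytes
                    + " threads=" + threadCount + " load=" + systemLoadAverage;
        }
    }

    // Takes `count` samples, pausing `intervalMillis` between them.
    public static List<Sample> collect(int count, long intervalMillis) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();

        List<Sample> samples = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            samples.add(new Sample(
                    System.currentTimeMillis(),
                    memory.getHeapMemoryUsage().getUsed(),
                    threads.getThreadCount(),
                    os.getSystemLoadAverage()));
            if (i < count - 1) {
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt and stop sampling early.
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return samples;
    }

    public static void main(String[] args) {
        // Five samples, one second apart; both values are arbitrary.
        for (Sample s : collect(5, 1000)) {
            System.out.println(s);
        }
    }
}
```

Keeping the timestamp with every value is what makes the later analysis possible, whether the goal is an SLA report or a comparison between two releases.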
Performance testing and tuning is one of the areas that can benefit hugely from having metrics available. Metrics are, for example, very valuable when designing or modifying test cases and when setting realistic baselines.