Performance
In the world of Information Technology (IT), performance is often used as a generic term of measure. This measure can be experienced somewhat differently depending on the role an individual has.
A user of a certain application will think more or less favorably of the application depending on how fast it responds, or flows, from his or her individual perspective and interactions. For a developer, an administrator, or someone else with more technical insight into the application, performance can mean several things, so it needs to be defined and quantified in more detail. These expert roles primarily need to distinguish between response time, throughput, and resource utilization efficiency.
Response time
Response time is normally measured in seconds (often expressed with a prefix, such as milli or nano). It is the sum of the time it takes to send a request to an operation, the execution time of the operation in a specific environment, and the time it takes to deliver the response to the requester once the operation has completed. The request, the execution of the operation, and the response are collectively called a roundtrip: a there-and-back-again trip.
A typical example (depicted in the following diagram) is a user who has filled out a form on a web page. When the user submits the form data by clicking on the Submit button, the timer starts ticking. As the data is received by a server, it populates a JavaBean in a Java Servlet. From the servlet, subsequent calls to other components, such as other servlets and EJBs, will occur. Some data will then be persisted in a database. Other data can be retrieved from the same or other databases, and everything will be transformed into a new set of data in the shape of an HTML page that is sent back to the browser of the end user. As this response materializes at the user's end, the timer is stopped and the response time of the roundtrip can be revealed.
What, or how much, should be included in a use case for measuring the response time depends on the test or problem at hand.
The total response time for the roundtrip of a system can be defined as the total time it takes to execute a call from an end user through all layers of the network and code to a database or legacy system, and all the way back again. This is an important and common value that is often used in service level agreements (SLA). However, it provides a far from complete picture of the performance and health of a system. It is important to have a set of use cases with the measured response time from various points in the system that covers its common and most vital functionality and components. These will be extremely helpful during tuning and when changes or problems occur.
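As a minimal sketch of how a roundtrip could be timed from fixed start and stop points, the following Java class wraps an operation and records the elapsed time with `System.nanoTime()`. The class name `RoundtripTimer`, the nested `Timed` type, and the simulated 50 ms operation are hypothetical and only serve as an illustration; a real measurement would wrap the actual call (for example, an HTTP request) at whatever start and stop points the use case defines.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper for illustration; not part of any Java EE API.
final class RoundtripTimer {

    /** Result of one timed roundtrip: the returned value and the elapsed time. */
    static final class Timed<T> {
        final T value;
        final long elapsedNanos;

        Timed(T value, long elapsedNanos) {
            this.value = value;
            this.elapsedNanos = elapsedNanos;
        }

        long elapsedMillis() {
            return TimeUnit.NANOSECONDS.toMillis(elapsedNanos);
        }
    }

    /** Starts the timer, runs the roundtrip, and stops the timer when it returns. */
    static <T> Timed<T> time(Supplier<T> roundtrip) {
        long start = System.nanoTime();           // start point of the use case
        T value = roundtrip.get();                // request, execution, and response
        long elapsed = System.nanoTime() - start; // stop point of the use case
        return new Timed<>(value, elapsed);
    }

    public static void main(String[] args) {
        // Simulated roundtrip: sleeps for 50 ms and returns a stand-in HTML page.
        Timed<String> result = time(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "<html>ok</html>";
        });
        System.out.println("Response time: " + result.elapsedMillis() + " ms");
    }
}
```

`System.nanoTime()` is used here because it is intended for measuring elapsed time, whereas `System.currentTimeMillis()` follows the wall clock and can jump if the system time is adjusted.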
Note
It is important to remember that the roundtrip in a use case must be constant, in terms of start and stop points, between measurements. Changing the definition will render the measurements useless as they must be comparable!
So, what can affect the response time? In short, any change might affect the response time in a positive or negative way. Changes that you, as a technician, can make to the software, hardware, and related infrastructure of a system (code, configuration, hardware, network topology, and so on) will have an effect, and quite often not the one you expect.
Even with all of these things held constant, the response time can still vary. Here, we're mostly concerned with the load on the system. As the load on the system increases, its response times will eventually rise as the system throughput decreases.
When an application performs work, that work is often triggered by external or internal clients. The work requires resources, such as CPU, volatile memory, network, and persistent storage. The level at which these resources are utilized is the load of the system. Load can be measured for one or several of these resources.
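To illustrate measuring load for a couple of these resources, here is a rough sketch that samples heap utilization and the operating system load average using only standard JDK classes. The class name `LoadSnapshot` and its method names are hypothetical, and the values are a coarse snapshot as seen from inside one JVM, not a substitute for proper monitoring tools.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper for illustration only.
final class LoadSnapshot {

    /** Fraction of the maximum heap currently in use by this JVM. */
    static double heapUtilization() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        // maxMemory() may be very large if no heap limit applies,
        // in which case this ratio will be close to zero.
        return (double) used / rt.maxMemory();
    }

    /**
     * System load average over the last minute, divided by the number of
     * available processors. Returns -1.0 when the platform does not
     * provide a load average.
     */
    static double cpuLoadPerCore() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        double loadAverage = os.getSystemLoadAverage();
        if (loadAverage < 0) {
            return -1.0;
        }
        return loadAverage / os.getAvailableProcessors();
    }

    public static void main(String[] args) {
        System.out.printf("Heap utilization: %.1f%%%n", heapUtilization() * 100);
        System.out.printf("CPU load per core: %.2f%n", cpuLoadPerCore());
    }
}
```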
Take, for example, an increasing number of users and the interactions they perform in a system. The increase in system transactions could eventually exhaust the available connections of a pool. The excess transactions would then need to queue until resources are released, or might even time out. This turns into a bottleneck, where the system isn't able to handle the increasing number of transactions quickly enough.
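The pool exhaustion described here can be sketched with a counting semaphore standing in for a connection pool. The class `BoundedPool` and its methods are hypothetical, and real connection pools are considerably more involved; the sketch only shows how a request that arrives when all permits are taken must wait and eventually times out.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a connection pool; illustration only.
final class BoundedPool {

    private final Semaphore permits;

    BoundedPool(int size) {
        // Fair semaphore: waiting callers are served in arrival order.
        this.permits = new Semaphore(size, true);
    }

    /**
     * Tries to take a connection, waiting up to the given timeout.
     * Returns false if the pool stayed exhausted or the wait was interrupted.
     */
    boolean acquire(long timeout, TimeUnit unit) {
        try {
            return permits.tryAcquire(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Returns a connection to the pool. */
    void release() {
        permits.release();
    }

    int available() {
        return permits.availablePermits();
    }

    public static void main(String[] args) {
        BoundedPool pool = new BoundedPool(2);
        for (int i = 1; i <= 3; i++) {
            boolean ok = pool.acquire(100, TimeUnit.MILLISECONDS);
            System.out.println("Transaction " + i + (ok ? " got a connection" : " timed out"));
        }
    }
}
```

With a pool of two and three transactions that never release their connection, the third transaction waits for 100 ms and then gives up, which is the bottleneck in miniature.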
Throughput
The load a system can manage is coupled to the measure of the system throughput. Throughput is generically measured in transactions per given time unit, where a transaction can be a task or operation, or even a set of operations that act as one.
The time unit is often seconds but can be significantly smaller (milliseconds, nanoseconds) or bigger (minutes, hours, and so forth). Commonly, throughput is denoted as transactions per second (TPS).
Here, a transaction or operation can be of any size, such as a small computational function or a big business case spanning several components or systems. The size is not of importance, but the number of operations is.
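As a small worked example of the TPS calculation, the sketch below divides a transaction count by the elapsed time. The class `Throughput`, its method, and the trivial stand-in operation in `main` are hypothetical; what counts as one transaction is whatever the measurement defines it to be.

```java
// Hypothetical helper for illustration only.
final class Throughput {

    /** Transactions per second for a count of transactions over an elapsed time in nanoseconds. */
    static double transactionsPerSecond(long transactions, long elapsedNanos) {
        if (elapsedNanos <= 0) {
            throw new IllegalArgumentException("elapsedNanos must be positive");
        }
        return transactions * 1_000_000_000.0 / elapsedNanos;
    }

    public static void main(String[] args) {
        long windowNanos = 200_000_000L; // measure for roughly 200 ms
        long start = System.nanoTime();
        long count = 0;
        long checksum = 0;

        // Stand-in "transaction": a trivial computation, counted once per iteration.
        while (System.nanoTime() - start < windowNanos) {
            checksum += count * 31;
            count++;
        }

        long elapsed = System.nanoTime() - start;
        System.out.printf("%d transactions, %.0f TPS (checksum %d)%n",
                count, transactionsPerSecond(count, elapsed), checksum);
    }
}
```

For instance, 500 transactions completed in 2 seconds give 500 / 2 = 250 TPS, regardless of how large each transaction is.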
An alternative measure of throughput that is often used is the amount of data transferred per second, such as bytes per second. Just as an SLA often states one or more response times for a set of use cases, it normally also states TPS values for the system as a whole and possibly some for subsystems or components that are important from a business perspective.
From a technical or IT-operations perspective, it is equally important to know what throughput certain systems or subsystems regularly have, and at what levels these might start to have problems and failures. These will be important indicators for when upgrades are in order.
With a focus on Java EE, it is important to remember that the Java EE specification, and therefore most application servers that implement it, were designed for overall throughput rather than for guaranteed response times.
Utilization efficiency
Low response times and high throughput are normally what anyone wants from a system, as this will help in keeping customers happy and the business thriving. If money is of no concern, it might not be a big deal from a technical perspective to have a lot of hardware and to use more resources than needed, as long as business is booming.
This poor efficiency will, however, incur unnecessary costs, be it to the environment, the workforce, development time, system management, administration, and so on. Sooner or later, a business that wants to be or stay profitable must have an efficient organization that can rely on cost-efficient IT departments and systems. This includes utilizing available resources in the most efficient way while keeping the customers happy. It's a balancing act that business and IT departments must perform together.
Having a bit less computational force, memory, and IT staff available might (among many things) cause higher response times and worse throughput. Consequently, software must be more efficient on available hardware. To make the software efficient, we need to test and improve its performance.