Scalability and performance
How do we measure the scalability of a system? Let's take an example, and see how this is done.
Let's say our application is a simple report generation system for employees. It loads employee data from a database and generates a variety of reports in bulk, such as pay slips, tax deduction reports, and employee leave reports.
The system is able to generate 120 reports per minute. This is the throughput, or capacity, of the system: the number of successfully completed operations in a given unit of time. Let's say the time it takes to generate a single report on the server side (the latency) is roughly 2 seconds.
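To make the two measurements concrete, here is a minimal sketch in Python that derives throughput from a count of completed reports and a measurement window. The function names are hypothetical and chosen only for illustration. The second function applies Little's law, which holds only under the assumption that the system is in a steady state; nothing in the example itself says how many reports are actually processed concurrently.

```python
def throughput_per_minute(completed_ops: int, elapsed_seconds: float) -> float:
    """Successfully completed operations per minute over a measured window."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return completed_ops * 60.0 / elapsed_seconds


def implied_concurrency(per_minute: float, latency_seconds: float) -> float:
    """Average number of operations in flight.

    Uses Little's law (in flight = arrival rate x time in system),
    which assumes the system is in a steady state.
    """
    return (per_minute / 60.0) * latency_seconds


# Figures from the example: 120 reports completed in a 60-second window,
# with each report taking roughly 2 seconds on the server side.
reports_per_minute = throughput_per_minute(completed_ops=120, elapsed_seconds=60)
in_flight = implied_concurrency(reports_per_minute, latency_seconds=2.0)

print(reports_per_minute)  # 120.0
print(in_flight)           # 4.0
```

Under that steady-state assumption, 2 reports per second at 2 seconds each suggests about 4 reports being worked on at any moment, which is why throughput and latency can move independently of each other.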
Let's say the architect decides to scale up the system by doubling the RAM on its server.
Once this is done, a test shows that the throughput has increased to 180 reports per minute. The latency remains the same at 2 seconds.
So, at this point, the system has scaled close to linear in terms of the memory...
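The arithmetic behind this comparison can be sketched in a few lines of Python. The function names are hypothetical, and "scaling efficiency" here is simply one reasonable way to relate the throughput gain to the resource increase (1.0 would mean perfectly linear scaling); it is not a definition taken from the example itself.

```python
def throughput_ratio(before: float, after: float) -> float:
    """How many times the throughput grew after the change."""
    if before <= 0:
        raise ValueError("before must be positive")
    return after / before


def scaling_efficiency(before: float, after: float, resource_factor: float) -> float:
    """Throughput gain divided by resource gain; 1.0 means linear scaling."""
    if resource_factor <= 0:
        raise ValueError("resource_factor must be positive")
    return throughput_ratio(before, after) / resource_factor


# Figures from the example: 120 -> 180 reports per minute after doubling RAM.
ratio = throughput_ratio(before=120, after=180)
efficiency = scaling_efficiency(before=120, after=180, resource_factor=2.0)

print(ratio)       # 1.5
print(efficiency)  # 0.75
```

With these figures, doubling the memory yields 1.5 times the throughput, which is 75% of what ideal linear scaling would give.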