Scalability is the attribute of a software system that allows it to handle an increased amount of work with proportionally more resources, while still maintaining the service level agreements (SLAs) that the system offered. A scalable system allows you to solve an increased amount of traffic/work by throwing money at the problem; that is, by adding more hardware. A non-scalable system simply cannot handle the load, even with increased resources.
For example, consider a backend software service that provides an API that is useful for an app. But it is also important that the API returns data within a guaranteed amount of time so that users don't experience latency or unresponsiveness at the app. A system not designed with scalability in mind will behave as shown here:
With an increase in traffic, the response times go through the roof! In contrast, with...