Division into smaller units
The main technique for software architecture is to divide the whole system into smaller elements and describe how they interact with each other. Each smaller element, or unit, should have a clear function and interface.
For example, a common architecture for a typical system could be a web service architecture composed of:
- A database that stores all the data in MySQL
- A web worker that serves dynamic HTML content written in PHP
- An Apache web server that handles all the web requests, returns any static files, like CSS and images, and forwards the dynamic requests to the web worker
Figure 1.1: Typical web architecture
This architecture and tech stack has been extremely popular since the early 2000s and was called LAMP, an acronym made from the different open source projects involved: (L)inux as an operating system, (A)pache, (M)ySQL, and (P)HP. Nowadays, the technologies can be swapped for equivalent ones, like using PostgreSQL instead of MySQL or Nginx instead of Apache, but still using the LAMP name. The LAMP architecture can be considered the default starting point when designing web-based client/server systems using HTTP, creating a solid and proven foundation to start building a more complex system.
As you can see, every different element has a distinct function in the system. They interact with each other in clearly defined ways. This is known as the Single-Responsibility principle. When presented with new features, most use cases will fall clearly within one of the elements of the system. Any style changes will be handled by the web server and dynamic changes by the web worker. There are dependencies between the elements, as the data stored in the database may need to be changed to support dynamic requests, but they can be detected early in the process.
We will describe this architecture in greater detail in Chapter 9.
Each element has different requirements and characteristics:
- The database needs to be reliable, as it stores all the data. Maintenance work like backup- and recovery-related work will be important. The database won't be updated very frequently, as databases are very stable. Changes to the table schemas will be made through restarts in the web worker.
- The web worker needs to be scalable and not store any state. Instead, any data will be sent and received from the database. This element will be updated often. Multiple copies can be run, either in the same machine or in multiple ones to allow horizontal scalability.
- The web server will require some changes for new styling, but that won't happen very often. Once the configuration is properly set up, this element will remain quite stable. Only one web server per machine is required, as it's capable of load-balancing between multiple web workers.
As we can see, the work balance between elements is very different, as the web worker will be the focus for most new work, while the other two elements will be much more stable. The database will require specific work for us to be sure that it's in good shape, as it's arguably the most critical element of the three. The other two can recover quickly if there's a problem, but any corruption in the database will generate a lot of problems.
The most critical and valuable element of a system is almost always the stored data.
The communication protocols are also unique. The web worker talks to the database using SQL statements. The web server talks to the web worker using a dedicated interface, normally FastCGI or a similar protocol. The web server communicates with the external clients via HTTP requests. The web server and the database don't talk to each other.
These three protocols are different. This doesn't have to be the case for all systems; different components can share the same protocol. For example, there can be multiple RESTful interfaces, which is common in microservices.
In-process communication
The typical way of looking at different units is as different processes running independently, but that's not the only option. Two different modules inside the same process can still follow the Single-Responsibility principle.
The Single-Responsibility principle can be applied at different levels and is used to define the divisions between functions or other blocks. So, it can be applied in smaller and smaller scopes. It's turtles all the way down! But, from the point of view of architecture, the higher-level elements are the most important, as it's the higher level that defines the structure. Knowing how far to go in terms of detail is clearly important, but when taking an architectural approach, it is better to err on the "big picture" side rather than the "too much detail" one.
A clear example of this would be a library that's maintained independently, but it could also be certain modules within a code base. For example, you could create a module that performs all the external HTTP calls and handles all the complexity of keeping connections, retries, handling errors, and so on, or you could create a module to produce reports in multiple formats, based on some parameters.
The important characteristic is that in order to create an independent element, the API needs to be clearly defined and the responsibility needs to be well defined. It should be possible for the module to be extracted into a different repo and installed as a third-party element for it to be considered truly independent.
Creating a big component with internal divisions only is a well-known pattern called a monolithic architecture. The LAMP architecture described above is an example of that, as most of the code is defined inside the web worker. Monoliths are the usual de facto starts of projects, as normally at the start there's no big plan and dividing things strictly into multiple components doesn't have a big advantage when the code base is small. As the code base and system grow more and more complex, the division of elements inside the monolith starts to make sense, and later it may start to make sense to split it into several components. We will discuss monoliths further in Chapter 9, Microservices vs Monolith.
Inside the same component, communication is typically straightforward, as internal APIs will be used. In the vast majority of cases, the same programming language will be used.