Describing a data solution
There are two types of database solutions: transactional solutions and analytical solutions. In the following sections, we will understand in detail what these solutions are and the requirements for choosing between them.
Transactional databases
Transactional databases are used by systems for basic operations: creating, reading, updating, and deleting. Transactional systems are considered the core of the informatization of business processes. With these basic operations, we can create entities such as customers, products, stores, and sales transactions, among others, to store important data.
A transactional database is commonly known as online transaction processing (OLTP) considering that this type of database serves online transactional operations between the application and the database.
For an organization, transactional databases usually have their data segmented into entities, which can be tables (or not), with or without a relationship between these entities to facilitate the correlation between this data.
For example, an e-commerce database can be structured with a table called Shopping_Cart
, which represents the products that are being selected in the store during user navigation, and another called Purchases
with the completed transaction records.
The process of segmenting entities in a database is called normalization, which will be covered in Chapter 3, Working with Relational Data.
The format of a normalized transactional database is optimized for transactional operations, but it is not the best format for data exploration and analysis.
The following is an example of a relational transactional database:
Figure 1.4 – Example of a relational transactional database
The preceding figure demonstrates a relational database of transactional workloads in a sales and delivery system. We can see the main entity, Orders, joined to Employees, Shippers, Customers, and Order Details, which then detail all products of this order in the relationship with the Products entity, which looks for information in the Categories and Suppliers entities.
Analytical databases
When the data solution requires a good interface for queries, explorations, and data analysis, the data storage organization is different from transactional databases. To meet this requirement, we prioritize the data aggregations and relationships for data consumption and exploration; this specialized data storage is called an analytical database.
Analytical databases use a process called online analytical processing (OLAP) and have undergone a great evolution in recent years with the emergence of data warehouses and big data platforms.
Analytical databases are constituted through a process of data ingestion, and they are responsible for processing and transforming the data into insights and information and then making this processed information available for consumption. The following steps describe this process:
- Data ingestion – The process responsible for connecting to transactional databases or other data sources to collect raw transaction information and include it in the analytical database
- Data processing – The process performed by the OLAP platform to create a data model, organize entities, perform indicator calculations, and define metrics for data consumption
- Data query – After the data model is loaded with the proper organization for querying, data manipulation and reporting tools can connect to the OLAP platform to perform your queries
The following diagram is an example of a structured data model in an OLAP database:
Figure 1.5 – Example of an analytical relationship
The following diagram is a simple comparison of OLTP and OLAP databases:
Figure 1.6 – Data flow between OLTP and OLAP
The preceding figure demonstrates the traditional flow of data, which is sourced and stored in transactional OLTP databases and then moved to OLAP analytical databases for data intelligence generation.
Important note
There are modern data storage platforms that aim to unite OLTP and OLAP on the same platform, but these databases, often called NewSQL, still need to mature their structures to deliver the best of transactional and analytical worlds in the same database. The industry standard is to keep transactional and analytical data structures separate.
In this section, we defined what transactional and analytical data solutions are and the characteristics of each solution. In the next section, we will detail the recommended data types and storage for each of these types.