Cosmos DB extends the database service that was known as Azure DocumentDB. However, it is important to note that Cosmos DB adds a large number of features to those offered by its predecessor. In fact, Cosmos DB continues to add new features and has quickly become one of the most innovative Azure services targeting mission-critical applications at a global scale.
Cosmos DB is a NoSQL database service included in Azure. In the case of this database service, NoSQL really does mean "not only SQL", because Cosmos DB provides a SQL API that allows us to query documents by using SQL in one of the models that the database service supports. Cosmos DB is a multi-model database service, and therefore it supports different non-relational models, which we will analyze later.
Let's perform a bottom-to-top analysis to gain a better understanding of this database service. The following three main features of Cosmos DB are the pillars that support its additional features:
- Partitioning
- Replication
- Resource governance
Partitioning makes it possible for Cosmos DB to scale out storage and throughput elastically by distributing the data across multiple logical partitions and their underlying physical partitions. We can start with something very small and grow elastically and seamlessly to something very large, increasing both storage and throughput as required. For example, we can start with a total storage size measured in gigabytes and end up with petabytes. Likewise, we can start with a small throughput requirement per second and end up with a huge one.
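The idea of spreading logical partitions over physical partitions can be illustrated with a minimal sketch. It assumes a simple hash-and-modulo placement, which is not necessarily how Cosmos DB assigns partitions internally; the function names and the `customer-N` keys are hypothetical and serve only as an illustration.

```python
import hashlib
from collections import defaultdict


def physical_partition_for(partition_key: str, partition_count: int) -> int:
    """Map a logical partition key to a physical partition index.

    Assumption: plain hash-and-modulo placement, used here for illustration only.
    """
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def distribute(keys, partition_count):
    """Group logical partition keys by the physical partition that would hold them."""
    placement = defaultdict(list)
    for key in keys:
        placement[physical_partition_for(key, partition_count)].append(key)
    return dict(placement)


# Hypothetical logical partition keys.
keys = [f"customer-{i}" for i in range(1000)]

# The same keys spread over few or many physical partitions.
small = distribute(keys, 2)
large = distribute(keys, 8)

for index in sorted(large):
    print(f"physical partition {index}: {len(large[index])} logical partitions")
```

In this sketch, growing from two to eight physical partitions spreads the same logical partitions over more storage and compute, which is the intuition behind elastic scale-out. A real service would also need to limit how much data moves when the partition count changes, which this sketch does not attempt.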
Replication makes it possible to deliver turnkey global distribution and replicate data across any number of regions in which Cosmos DB is available. The number of regions is continuously increasing and there are no limitations on the number of regions to which we can replicate data. For example, we can have a Cosmos DB database service working with the West US, East US, Brazil South, Japan East, and Japan West regions. The following diagram shows icons with sample regions in which a Cosmos DB database can be replicated (at the time of writing this book).
The hexagons represent the regions in which a database can be replicated:
Cosmos DB offers five consistency models so that we can select the most appropriate one based on the write performance we need and the consistency we want. This way, we can manage the trade-off between performance and consistency. We will analyze them in detail later in this chapter.
Resource governance makes it possible to provide high availability. Cosmos DB can provide 99.99% availability (also known as four nines) in a single region and 99.999% availability (also known as five nines) in multiple regions. Availability is one of the most important aspects of a database. Cosmos DB provides high availability in a transparent and automatic way that doesn't require manual changes in the configuration; that is, we don't need to make changes or redeploy, and we can continue using the same endpoint.
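To put the four-nines and five-nines figures in perspective, the following short calculation converts an availability percentage into the maximum downtime it allows per year. It is plain arithmetic, assumes a 365-day year, and uses a hypothetical helper name.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def max_downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the downtime budget, in minutes per year, for a given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


four_nines = max_downtime_minutes_per_year(99.99)    # about 52.56 minutes
five_nines = max_downtime_minutes_per_year(99.999)   # about 5.26 minutes

print(f"99.99%  -> {four_nines:.2f} minutes of downtime per year")
print(f"99.999% -> {five_nines:.2f} minutes of downtime per year")
```

Moving from four nines to five nines reduces the yearly downtime budget by a factor of ten, from roughly 53 minutes to roughly 5 minutes.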
Of course, one of the key aspects of a database service is performance. Cosmos DB provides the necessary features for achieving predictable performance. The database service implements resource governance at a very fine level of granularity and on a per-request basis. This way, the database service guarantees a pre-configured throughput as well as the latency for each individual request. As a result, capacity planning is straightforward.
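Per-request resource governance can be pictured as a budget that every request draws from. The sketch below is a simplified token-bucket model and is not Cosmos DB's actual implementation: the `ThroughputBudget` class, its refill rule, and the sample request costs are assumptions made for illustration only.

```python
class ThroughputBudget:
    """Hypothetical per-second throughput budget enforced on each request."""

    def __init__(self, provisioned_units_per_second: float):
        self.capacity = provisioned_units_per_second
        self.available = provisioned_units_per_second
        self.last_refill = 0.0

    def try_consume(self, request_cost: float, now: float) -> bool:
        """Admit the request if enough budget remains, otherwise reject it."""
        # Refill in proportion to elapsed time, never beyond the provisioned capacity.
        elapsed = max(0.0, now - self.last_refill)
        self.available = min(self.capacity, self.available + elapsed * self.capacity)
        self.last_refill = max(now, self.last_refill)
        if request_cost <= self.available:
            self.available -= request_cost
            return True
        return False


# Assumed values: 400 units per second provisioned, each request costs 150 units.
budget = ThroughputBudget(400)

# Three requests arrive at the same instant; the third exceeds the remaining budget.
results = [budget.try_consume(150, now=0.0) for _ in range(3)]

# Half a second later the budget has partially refilled, so a request fits again.
later = budget.try_consume(150, now=0.5)

print(results, later)
```

Because each request has a known cost and the budget is fixed in advance, the number of requests per second that will be admitted can be estimated ahead of time, which is what makes capacity planning simple in a model like this.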