Getting Started with CockroachDB

Getting Started with CockroachDB: A guide to using a modern, cloud-native, and distributed SQL database for your data-intensive apps

Kishen Das Kondabagilu Rajanna
Paperback Mar 2022 246 pages 1st Edition


Chapter 1: CockroachDB – A Brief Introduction

In this chapter, we will go over the history of databases, where we will learn about the evolution of SQL, NoSQL, and NewSQL databases, various relational models, different categories for classifying databases, and timelines. Later, we will discuss the CAP theorem. Finally, we will briefly discuss the motivation for creating a new database and learn about the basic architecture of CockroachDB.

The following topics will be covered in this chapter:

  • The history and evolution of databases
  • Database concepts
  • CAP theorem
  • CockroachDB

The history and evolution of databases

A database is a collection of data that can be organized, managed, modified, and retrieved using a computer. The system that helps with managing data in a database is called a database management system (DBMS).

In the 1950s and 1960s, several advancements were made in terms of processors, storage, memory, and networks. We also had some of our first programming languages, such as COBOL and FORTRAN. The development of hard disk drives for data storage further spurred the development of databases. Around the same time, the first notion of a modern-day computer with a mouse and graphical user interface came into existence, making computers easier for the general public to use. In this section, we will discuss how various types of databases evolved.

SQL

The first database was designed by Charles William Bachman III, an American computer scientist. In 1963, he developed the Integrated Data Store (IDS), which gave rise to the concept of the navigational database. In navigational databases, we find records by chasing references from other objects. For example, let's say that in a school database, you want to find all the students from a specific grade in a specific school. In a navigational database, you first go to the group of students that belong to a particular school and then to the group that belongs to a particular grade. So, records are accessed by hierarchical navigation. Based on IDS, the CODASYL database model was published in 1969 by a task group in which Bachman took part. CODASYL stands for Conference/Committee on Data Systems Languages, which was a consortium formed to guide the development of programming languages.

Around the same time, IBM developed the Information Management System (IMS), which was based on the hierarchical database model. A hierarchical database model is a data model in which the data is organized in a tree-like structure. In 1970, Edgar F. Codd, an IBM employee, proposed the relational model. In the early 1970s, Donald D. Chamberlin and Raymond F. Boyce developed Structured Query Language (SQL) based on what they'd learned about Codd's relational model. They initially called it Structured English Query Language (SEQUEL), and it became the query language of System R, which was developed by a group at the IBM San Jose research laboratory. In 1976, QUEL, which is a relational database query language designed by Michael Ralph Stonebraker, was developed as part of the Interactive Graphics and Retrieval System (INGRES) database management system at the University of California, Berkeley.

Based on QUEL and SQL, several databases were implemented. Some of the most prominent ones include Post Ingres (Postgres), Sybase, Microsoft SQL Server, IBM DB2, Oracle, MariaDB, and MySQL.

Object-oriented databases

In the 1980s, object-oriented database systems (OODBMSes) grew in popularity. In OODBMSes, information is represented as objects, rather than as tables as in relational databases. Some of the important ones include GemStone/S, Objectivity/DB, InterSystems Caché, Perst, ZODB, Wakanda, ObjectDB, ODABA, and Realm.

NoSQL

The concept of non-SQL or non-relational databases has existed since the 1960s, but the term NoSQL has become much more popular in the last decade. NoSQL databases focus on performance and scaling and mostly rely on a non-relational data model such as a document, key-value, wide-column, or graph to organize the data. Some of the most popular ones in this category include Cassandra, MongoDB, Couchbase, Dynamo, FoundationDB, Neo4j, and HBase.

NewSQL

With the introduction of the on-demand availability of compute, storage, and network resources and the pay-as-you-go model, which are collectively known as cloud computing, the amount of data that we collect, process, manage, and analyze has been growing exponentially. Although it was relatively easy for some of the NoSQL databases to adapt to the cloud, it has been much harder for traditional SQL databases to do so. Many of them are better suited for vertical scaling and were not designed with geographically distributed data, the shared-nothing architecture, or enormous scale in mind. This created a void. We needed SQL databases that are cloud-native, scale well with data growth, and are easy to manage. Many companies developed in-house solutions on top of existing SQL databases:

  • Facebook developed TAO, a NoSQL graph API built on top of sharded MySQL.
  • YouTube developed Vitess to easily scale and manage MySQL clusters.
  • Dropbox developed Edgestore, a metadata store to power their services and products, which again was built on top of MySQL.
  • Greenplum developed a massively parallel data platform by the same name for analytics, machine learning, and AI on top of Postgres.

However, it was still relatively hard and painful to manage the data as the underlying database was not built to scale.

In 2012, Google published a seminal paper on Google Spanner: a globally distributed database service and storage solution. Spanner essentially combined the important features of SQL databases such as ACID transactions, strongly consistent reads, and the SQL interface with some of the features that were only available with NoSQL databases, such as scaling across geographical locations, multi-site replication, and failover. It created a new category of databases called NewSQL, which is meant to indicate a combination of SQL features at NoSQL scale. YugabyteDB and CockroachDB were developed later, both of which got their inspiration from Google Spanner.

Database concepts

In this section, we will learn about some of the core database concepts, including cardinality, database models, and various processing models.

Cardinality

Before we discuss database models, it is important to know about cardinality. Cardinality describes how many rows in one entity or table can be related to rows in another entity or table. The most common types are one-to-one, one-to-many, and many-to-many.

One-to-one relationship

In the case of a one-to-one relationship, a row or entry in one entity or table can be related to only one row in another entity or table. For example, in a Department of Motor Vehicles database, let's say there are two tables called License Info and Driver Info, as shown in the following diagram:

Figure 1.1 – An example of a one-to-one relationship


Here, Driver ID can only be assigned to one driver as it has to uniquely identify a driver. Also, a driver can only be assigned one Driver ID. So, here, any row in the License Info table will be associated with a specific row in the Driver Info table.
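A one-to-one relationship can be sketched with Python's built-in sqlite3 module (used here only because it runs anywhere, not because CockroachDB requires it). The table and column names below are assumptions chosen to mirror the License Info and Driver Info tables; the actual columns in the figure may differ. The UNIQUE constraint on the foreign key is what limits each driver to a single license row:

```python
import sqlite3

# In-memory SQLite database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE driver_info (
    driver_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE license_info (
    license_number TEXT PRIMARY KEY,
    -- UNIQUE makes this one-to-one: at most one license row per driver.
    driver_id INTEGER NOT NULL UNIQUE REFERENCES driver_info(driver_id)
);
""")
conn.execute("INSERT INTO driver_info VALUES (1, 'Alice')")
conn.execute("INSERT INTO license_info VALUES ('L-100', 1)")


def second_license_is_rejected():
    """Return True if a second license row for driver 1 violates the constraint."""
    try:
        conn.execute("INSERT INTO license_info VALUES ('L-200', 1)")
    except sqlite3.IntegrityError:
        return True
    return False


second_license_rejected = second_license_is_rejected()
license_rows = conn.execute("SELECT COUNT(*) FROM license_info").fetchone()[0]
```

After running this, the second insert has been rejected and license_info still holds exactly one row for the driver.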

One-to-many relationship

In a one-to-many relationship, a single row from one entity or table can be associated with multiple rows in another entity or table.

For example, let's consider the Driver Info and City Info tables shown in the following diagram:

Figure 1.2 – An example of a one-to-many relationship


Here, for every row in City Info, there will be multiple rows in Driver Info, as there can be many drivers that live in a particular city.
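The same idea can be sketched for the one-to-many case, again with sqlite3 and with hypothetical column names. Here the foreign key in the drivers table has no UNIQUE constraint, so any number of drivers may reference the same city:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE city_info (
    city_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE driver_info (
    driver_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    -- No UNIQUE here, so many drivers can point at the same city.
    city_id   INTEGER NOT NULL REFERENCES city_info(city_id)
);
INSERT INTO city_info VALUES (1, 'Springfield');
INSERT INTO driver_info VALUES (1, 'Alice', 1), (2, 'Bob', 1), (3, 'Carol', 1);
""")

# One city row joins to several driver rows.
drivers_in_city = [
    row[0]
    for row in conn.execute(
        "SELECT d.name FROM driver_info AS d "
        "JOIN city_info AS c ON c.city_id = d.city_id "
        "WHERE c.name = ? ORDER BY d.name",
        ("Springfield",),
    )
]
print(drivers_in_city)
```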

Many-to-many relationship

In a many-to-many relationship, a single row in one entity or table can be associated with multiple rows in another entity or table and vice versa.

For example, let's consider two tables: Vehicle Ownership History, where we are maintaining the history of ownership of a given vehicle, and Driver Ownership History, where we are maintaining the history of vehicles owned by a given driver:

Figure 1.3 – An example of a many-to-many relationship


Here, a driver can own multiple vehicles and a vehicle can have multiple owners over time. So, a given row in the Vehicle Ownership History table can be associated with multiple rows in the Driver Ownership History table. Similarly, a given row in the Driver Ownership History table can be associated with multiple rows in the Vehicle Ownership History table.
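One common way to store a many-to-many relationship is a single linking table, which is a different layout from the two history tables described above. The sketch below assumes hypothetical driver, vehicle, and ownership tables, where each ownership row ties one driver to one vehicle for one period:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE driver  (driver_id  INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE vehicle (vehicle_id INTEGER PRIMARY KEY, plate TEXT NOT NULL);
-- Each row records one ownership period linking one driver to one vehicle.
CREATE TABLE ownership (
    driver_id  INTEGER NOT NULL REFERENCES driver(driver_id),
    vehicle_id INTEGER NOT NULL REFERENCES vehicle(vehicle_id),
    from_year  INTEGER NOT NULL,
    PRIMARY KEY (driver_id, vehicle_id, from_year)
);
INSERT INTO driver  VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO vehicle VALUES (10, 'ABC-123'), (20, 'XYZ-789');
INSERT INTO ownership VALUES (1, 10, 2015), (2, 10, 2019), (1, 20, 2020);
""")


def owners_of(vehicle_id):
    """All drivers who have owned the vehicle, oldest ownership first."""
    rows = conn.execute(
        "SELECT d.name FROM ownership AS o "
        "JOIN driver AS d ON d.driver_id = o.driver_id "
        "WHERE o.vehicle_id = ? ORDER BY o.from_year",
        (vehicle_id,),
    )
    return [r[0] for r in rows]


def vehicles_of(driver_id):
    """All vehicles the driver has owned, oldest ownership first."""
    rows = conn.execute(
        "SELECT v.plate FROM ownership AS o "
        "JOIN vehicle AS v ON v.vehicle_id = o.vehicle_id "
        "WHERE o.driver_id = ? ORDER BY o.from_year",
        (driver_id,),
    )
    return [r[0] for r in rows]
```

Calling owners_of(10) returns two drivers and vehicles_of(1) returns two vehicles, which is the "multiple rows in both directions" property of a many-to-many relationship.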

Now, let's take a look at some of the most important database models.

Overview of database models

A database model determines how the data is stored, organized, and modified. Databases are typically implemented based on a specific data model. It is also possible to borrow concepts from multiple database models when you are designing a new database. The relational database model happens to be the most widely known and has been popularized by databases such as Oracle, IBM DB2, and MySQL.

Hierarchical database model

In the hierarchical database model, the data is organized in the form of a tree. There is a root at the first level and multiple children at the subsequent levels. Since a single parent can have multiple children, one-to-many relationships can easily be represented here. A child cannot have multiple parents, so this results in the disadvantage of not being able to model many-to-many relationships.

IBM's Information Management System (IMS) was the first database that implemented this data model.

The following diagram shows an example of a hierarchical database model:

Figure 1.4 – An example of a hierarchical database model


Typically, the tree starts with a single root and the data is organized into this tree. Any node except the leaves can have multiple children, but a child can have only one parent.
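The single-parent rule can be illustrated with a small in-memory tree in Python. This is only a sketch of the data model, not of how IMS stores data; the Node class and the school example names are hypothetical:

```python
class Node:
    """A record in a hierarchical model: many children, at most one parent."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

    def add_child(self, child):
        # Enforce the defining restriction: a child cannot have two parents.
        if child.parent is not None:
            raise ValueError(f"{child.name} already has a parent")
        child.parent = self
        self.children.append(child)
        return child

    def path(self):
        """Walk up to the root and return the full access path."""
        node, parts = self, []
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))


school = Node("School A")
grade5 = school.add_child(Node("Grade 5"))
student = grade5.add_child(Node("Student 42"))
student_path = student.path()  # "School A/Grade 5/Student 42"
```

Trying to attach grade5 under a second school raises ValueError, which is why many-to-many relationships do not fit this model.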

Network model

The network model was developed as an enhancement of the hierarchical database model to accommodate many-to-many relationships. The network model relies on a graph structure to organize its data. So, there is no concept of a single root, and a child can have multiple parents and a parent can have multiple children. Integrated Data Store (IDS), Integrated Database Management System (IDMS), and Raima Database Manager (RDM) are some of the popular databases that use the network model.

As shown in the following diagram, there is no single root and a given child (for example, Object 2) can have multiple parents (that is, Object 1 and Object 3):

Figure 1.5 – An example of a network model
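The difference from the hierarchical model can be sketched as a graph in which each record keeps a set of parents rather than a single one. Object 1, Object 2, and Object 3 follow the description above; Object 4 and the link helper are hypothetical additions for illustration:

```python
from collections import defaultdict

# Adjacency sets in both directions: a record may have many parents and many children.
parents = defaultdict(set)
children = defaultdict(set)


def link(parent, child):
    """Record an owner/member relationship between two objects."""
    parents[child].add(parent)
    children[parent].add(child)


link("Object 1", "Object 2")
link("Object 3", "Object 2")  # a second parent is allowed in the network model
link("Object 1", "Object 4")  # Object 4 is a hypothetical extra record

object2_parents = sorted(parents["Object 2"])
object1_children = sorted(children["Object 1"])
```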


Relational model

Although the network model was an improvement over the hierarchical model, it was still a little restrictive when it came to representing data. In the relational model, any record can have a relationship with any other with the help of a common field. This drastically reduced the design's complexity and made it easier to independently add, update, and access records, without having to walk down the tree or traverse the graph. SQL was combined with the relational database model to provide a simple query interface to add and retrieve data.

All the popular traditional databases such as Oracle Database, IBM DB2, MySQL, MariaDB, and Microsoft SQL Server implement the relational data model.

Let's look at two tables called Employee and Employee Info:

Figure 1.6 – Employee tables showing the column names


Here, Employee ID is the common field or column between the Employee and Employee Info tables. The Employee table is responsible for ensuring that a given Employee ID is unique, while Employee Info is responsible for more detailed information about a given employee.
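As a rough illustration, the two tables can be joined on the shared Employee ID column without walking a tree or traversing a graph. The sketch uses sqlite3, and the name and department columns are assumptions, since only Employee ID is named in the text:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY          -- guarantees uniqueness
);
CREATE TABLE employee_info (
    employee_id INTEGER NOT NULL REFERENCES employee(employee_id),
    name        TEXT NOT NULL,               -- hypothetical column
    department  TEXT NOT NULL                -- hypothetical column
);
INSERT INTO employee VALUES (1), (2);
INSERT INTO employee_info VALUES (1, 'Alice', 'Engineering'), (2, 'Bob', 'Sales');
""")


def lookup(employee_id):
    """Fetch one employee's details by joining on the common field."""
    return conn.execute(
        "SELECT e.employee_id, i.name, i.department "
        "FROM employee AS e "
        "JOIN employee_info AS i ON i.employee_id = e.employee_id "
        "WHERE e.employee_id = ?",
        (employee_id,),
    ).fetchone()
```

lookup(2) returns the joined row for that employee, and an unknown ID returns None.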

Object-relational model

The object-relational model, as the name suggests, combines the best of the relational and object data models. The concepts of objects, classes, and inheritance are directly supported as first-class citizens as part of the database and in queries. SQL:1999, the fourth revision of SQL, introduced several features for embedding object concepts into the relational database. One of the main features was the ability to create structured user-defined types with CREATE TYPE to define an object's structure.

Over time, relational databases have added more support for objects. There is a varying degree of support for object concepts in Oracle Database, IBM DB2, PostgreSQL, and Microsoft SQL Server.

Given the scope of this book, we will not discuss the entity-relational model, object model, document model, star schema, snowflake schema, and many other less well-known models.

Now, let's look at how databases can be classified based on what kinds of workload they can be used for.

Processing models

Based on how you want to consume and process data, databases can be categorized into four different processing systems. Let's take a look.

Online transaction processing (OLTP)

OLTP systems support the concept of transactions. A transaction refers to the ability to atomically apply changes (insert, update, delete, and read) to a given system. One popular example is a bank, where withdrawing or depositing money to a given bank account must be done atomically to ensure data is not lost or incorrect. So, the main purpose here is to maintain data integrity and consistency. Also, these systems are generally suited for fast-running queries.
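The bank example can be sketched with sqlite3 to show what "atomically" means in practice. The account table, the CHECK constraint, and the transfer helper are assumptions for illustration. If the debit fails, the credit that was already applied in the same transaction is rolled back too:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (
    id      INTEGER PRIMARY KEY,
    balance INTEGER NOT NULL CHECK (balance >= 0)  -- no overdrafts
);
INSERT INTO account VALUES (1, 100), (2, 50);
""")


def transfer(src, dst, amount):
    """Move amount from src to dst atomically; return True if committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))


ok = transfer(1, 2, 30)        # succeeds: balances become {1: 70, 2: 80}
failed = transfer(1, 2, 500)   # debit violates the CHECK, so both updates roll back
final_balances = balances()
```

The second transfer leaves both balances untouched even though the credit statement ran first, which is the all-or-nothing behavior that OLTP systems rely on.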

Online analytical processing (OLAP)

OLAP focuses mostly on running queries to analyze multi-dimensional data and to extract some intelligence or patterns from it. Typically, such systems support generating reports that can be used for marketing, sales, financing, budgeting, management, and more. Data mining and data analytics applications typically need an OLAP system in some form. OLAP doesn't focus on transactions; the emphasis is on analyzing large amounts of data from different sources to extract business intelligence. Some databases also provide built-in support for MapReduce to run queries across a large set of data.

A data warehouse is a piece of software that's used for reporting and data analysis. Warehouses are typically developed for OLAP. It is also very common to retrieve the data from OLTP in batches or bulk, run it through an Extract, Load, and Transform (ELT) or Extract, Transform, and Load (ETL) data transformation pipeline, and store it in an OLAP system.
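To make the contrast with OLTP concrete, an analytical query typically scans many rows and aggregates them along a few dimensions instead of touching a single record. The sketch below uses sqlite3 with a made-up sales table; a real OLAP system would run this kind of query over far larger data sets:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (region TEXT, product TEXT, year INTEGER, amount INTEGER);
INSERT INTO sales VALUES
    ('east', 'widget', 2021, 100),
    ('east', 'gadget', 2021, 150),
    ('west', 'widget', 2021, 200),
    ('west', 'widget', 2022, 250),
    ('east', 'widget', 2022, 120);
""")

# Aggregate along two dimensions (region and year) to build a small report.
report = conn.execute(
    "SELECT region, year, SUM(amount) AS total "
    "FROM sales GROUP BY region, year ORDER BY region, year"
).fetchall()

for region, year, total in report:
    print(region, year, total)
```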

Online event processing (OLEP)

OLEP guarantees strong consistency without the traditional atomic commit protocols or distributed locking. OLEP also focuses on high performance, larger scales, and fault tolerance.

Hybrid transaction/analytical processing (HTAP)

As the name suggests, this system tries to provide the best of both transactions and analytical processing. Most of the NoSQL and NewSQL databases provide support for managing both transactional and analytical workloads. Vitess is a database clustering system that can be used to scale and shard MySQL instances. Vitess provides HTAP features on top of MySQL by allowing a given MySQL instance to be configured as master or read-only, where read-only can be used for analytical queries and MapReduce. It is possible to use CockroachDB as HTAP by propagating changes with the help of change data capture (CDC) in the OLTP cluster or primary cluster to a separate cluster, which is solely used for analytical processing.

Now, let's learn a bit about embedded and mobile databases, including why they exist and some of the most popular ones in this space.

Embedded and mobile databases

Embedded databases usually refer to databases that can be tightly integrated into an application, without needing separate hardware to support them. Also, they don't have to be managed separately. Some of the most popular embedded databases include SQLite, Berkeley DB from Oracle Corporation, and SQL Server Compact from Microsoft Corporation. Embedded databases are also very useful for testing purposes as they can be started within test suites.

Mobile database refers to the class of databases that work with a very limited memory footprint and compute and can be deployed within a mobile device. They are typically used for storing user data for apps running on mobile devices. SQLite, SQL Server Compact, Oracle Database Lite, Couchbase Lite, SQL Anywhere, SQL Server Express, and DB2 Everyplace belong to this category.

Database storage engines

A database storage engine is a component within a database management system that is responsible for Create, Read, Update, Delete (CRUD) operations and transferring data between disk and memory, without compromising data integrity. Some of the most popular ones include Apache Derby, HSQLDB, InfinityDB, LevelDB, RocksDB, and SQLite. CockroachDB initially started with RocksDB as its storage engine, but from release 20.2 onward, Pebble is the default storage engine. Pebble, as per Cockroach Labs, is a RocksDB-inspired and RocksDB-compatible key-value store focused on the needs of CockroachDB. RocksDB was implemented in C++, whereas Pebble was implemented in Go. Since CockroachDB itself is written in Go, this makes the storage engine easier to manage and maintain, as the whole code base now uses a single language.

CAP theorem

Eric A. Brewer gave a keynote talk in 2000 titled Towards Robust Distributed Systems at the Symposium on Principles of Distributed Computing, summarizing his years of learning about distributed systems. Brewer talked about three key aspects of a distributed system: consistency, availability, and tolerance toward network partition. Consistency means that every read should see the data from the most recent write; otherwise, it should error out. Availability means every requested read or write should receive a non-error response. Partition tolerance means that the system should continue to serve requests, irrespective of delays and communication failures between nodes in the system. The Consistency, Availability, and Partition Tolerance (CAP) theorem states that a distributed system can provide at most two of these three properties at the same time.

Consistency and partition tolerance (CP)

A CP database provides consistency and partition tolerance but cannot provide availability. This is also called a CAP-consistent system. Let's understand this by looking at an example:

Figure 1.7 – CP system


Let's consider the system shown in the preceding diagram, where two servers are serving read and write traffic. For this example, let's say writes only land on Server 1 and reads only land on Server 2. So long as Server 1 can talk to Server 2, all the writes that come to Server 1 can be propagated synchronously to Server 2. This ensures that any reads that come to Server 2 are always consistent, which means they see the latest data written by the latest write in Server 1:

Figure 1.8 – CP system during a communication failure


Now, let's say that, as shown in the preceding diagram, the communication between Server 1 and Server 2 has broken down and now Server 1 is no longer able to propagate the writes synchronously. This results in partitioning. Since the data cannot be propagated between the two servers, read or write traffic cannot be served until we resolve the partition issue as we have to ensure data consistency.
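The two-server scenario can be simulated in a few lines of Python. This is a toy model, not how any real database detects partitions; the CPCluster class, the Unavailable exception, and the partitioned flag are hypothetical names. The point is that during a partition the system prefers returning an error over returning possibly stale data:

```python
class Unavailable(Exception):
    """Raised when the system refuses a request to protect consistency."""


class CPCluster:
    """Server 1 takes writes, Server 2 serves reads; writes replicate synchronously."""

    def __init__(self):
        self.server1 = {}
        self.server2 = {}
        self.partitioned = False  # True when the servers cannot talk to each other

    def write(self, key, value):
        if self.partitioned:
            raise Unavailable("cannot replicate the write to Server 2")
        self.server1[key] = value
        self.server2[key] = value  # synchronous propagation

    def read(self, key):
        if self.partitioned:
            raise Unavailable("Server 2 cannot confirm it has the latest write")
        return self.server2[key]


cp = CPCluster()
cp.write("balance", 100)
before_partition = cp.read("balance")  # sees the latest write

cp.partitioned = True
try:
    cp.read("balance")
    read_served = True
except Unavailable:
    read_served = False  # consistency is kept, availability is given up
```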

Some of the most popular databases that have CP characteristics are HBase, Couchbase, and MongoDB. CockroachDB also falls into this category.

Availability and partition tolerance (AP)

In this case, a database is guaranteed to always be available and it can tolerate partitioning, but at the cost of consistency. This is also known as a CAP-available system. Here, the application is expected to deal with data consistency:

Figure 1.9 – AP system during a communication failure


Similar to the previous example, if the communication between Server 1 and Server 2 breaks down, Server 1 and Server 2 continue to serve the traffic but reads to Server 1 and Server 2 might return different versions of the data, based on when the communication has failed and whether there was any change to that data, after the communication failure. Cassandra, Riak, and CouchDB are popular examples of AP databases.
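The AP behavior can be simulated with the same kind of toy model. The APCluster class below is a hypothetical sketch in which every request is answered locally, so nothing fails during a partition, but the two servers can drift apart:

```python
class APCluster:
    """Two servers that keep answering during a partition, at the cost of consistency."""

    def __init__(self):
        self.stores = {1: {}, 2: {}}
        self.partitioned = False

    def write(self, server, key, value):
        # The write is always accepted locally, so the system stays available.
        self.stores[server][key] = value
        if not self.partitioned:
            other = 2 if server == 1 else 1
            self.stores[other][key] = value

    def read(self, server, key):
        # Reads never fail, but they may return stale data.
        return self.stores[server].get(key)


ap = APCluster()
ap.write(1, "balance", 100)   # replicated to both servers
ap.partitioned = True
ap.write(1, "balance", 80)    # only Server 1 sees this write
views = (ap.read(1, "balance"), ap.read(2, "balance"))
print(views)  # (80, 100)
```

The two servers now return different versions of the same key, and it is up to the application (or a later reconciliation step) to resolve the difference.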

Consistency and availability (CA)

In the case of a CA database, the system cannot tolerate partitioning but can guarantee consistency and availability. Traditional databases with single-server deployments with no replication or slaves can be classified as CA. Now, many traditional RDBMS databases can be configured in various ways to have CA, CP, or AP as desired.

CockroachDB

The name CockroachDB was inspired by the insect that goes by the same name. Just like how cockroaches have been surviving for millions of years and colonizing the entire planet and thriving, CockroachDB instances are supposed to replicate and repair data, spread naturally across multiple availability zones, and survive total regional failures. Also, once CockroachDB becomes part of a given software ecosystem, it's impossible to get rid of or replace it, just like cockroaches. Here, we will discuss why there is a need for yet another database, look at what inspired CockroachDB, and provide a high-level overview of CockroachDB.

Why yet another database?

As more companies shift from on-premises to the cloud, they are looking for SQL datastores on various cloud platforms to manage their transactional data. Most of the traditional databases such as MySQL, Postgres, and Oracle were not built for the cloud. This necessitates a cloud-native, consistent, distributed SQL database that can scale with the growth of data. CockroachDB fills this gap.

Inspiration

As we previously discussed in the NewSQL section, in 2012, Google published a seminal paper on Google Spanner: a globally distributed database service and storage solution. Although Google Spanner combined the best of both SQL and NoSQL and was very useful for a lot of applications, it was not available for public usage. Also, Google Spanner was and still is not an open source project and has only been available on Google Cloud Platform since 2017. So, this created a necessity for an open source Spanner-like database that can be used in different cloud providers and on-premises. Around 2012, Spencer Kimball, Peter Mattis, and Ben Darnell were working at Google on the Google File System and Google Reader projects. They also got acquainted with both Bigtable and Spanner during their tenure at Google. They decided to build something very similar to Spanner to make it available for everyone and started an open source project on GitHub in 2014. After a year, they decided to leave Google and founded Cockroach Labs in 2015 before officially working on CockroachDB in June 2015.

Key terms and concepts

Before we look at the various functional layers, let's look at some of the key concepts and terms. A CockroachDB cluster refers to a group of nodes that act as a single logical unit. A node is a single machine that runs an instance of CockroachDB. CockroachDB stores all the data as sorted key-value pairs. These keys are divided into ranges. CockroachDB replicates each range and stores each replica on a different node. For each range, there will be a leaseholder, which acts as the primary owner of a given range and receives and coordinates all the traffic for that range. For each range, one of the replicas acts as a leader for write requests and ensures that the majority of the replicas are in consensus before committing a given write. For each range, there will be a time-ordered log of writes, called a Raft log, that the majority of replicas have agreed upon.
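These terms can be tied together with a small Python sketch. Everything in it is a simplifying assumption for illustration: the split keys, the node names, the replication factor of three, and the round-robin placement are hypothetical and do not reflect how CockroachDB actually chooses range boundaries or places replicas. It only shows sorted keys being divided into ranges, each range having replicas on different nodes, and a write needing a majority of replicas:

```python
import bisect

# Hypothetical split points over a sorted keyspace:
# range 0 = [start, "g"), range 1 = ["g", "p"), range 2 = ["p", end)
SPLIT_KEYS = ["g", "p"]
NODES = ["n1", "n2", "n3", "n4", "n5"]
REPLICATION_FACTOR = 3  # assumed for this sketch


def range_for(key):
    """Find which range a key falls into using binary search over the split keys."""
    return bisect.bisect_right(SPLIT_KEYS, key)


def replicas_for(range_id):
    """Place each replica on a different node (simple round-robin stand-in)."""
    return [NODES[(range_id + i) % len(NODES)] for i in range(REPLICATION_FACTOR)]


def quorum(replica_count):
    """Smallest number of replicas that forms a majority."""
    return replica_count // 2 + 1


def can_commit(acks, replica_count=REPLICATION_FACTOR):
    """A write commits only once a majority of replicas have acknowledged it."""
    return acks >= quorum(replica_count)
```

With three replicas, quorum(3) is 2, so a range can still accept writes when one replica is unreachable but not when two are.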

High-level overview

CockroachDB is a cloud-native, consistent, highly scalable relational database. Some of the primary goals of CockroachDB are to provide strong consistency, geo-distribution of data, high availability, SQL support, easy deployment, and less maintenance. Since we will be dealing with CockroachDB internals in detail in subsequent chapters, we will just provide a high-level overview here:

Figure 1.10 – High-level overview of the CockroachDB architecture


CockroachDB exposes a SQL interface through which clients can interact with the database. Client requests can land on any node within a given cluster and work just fine since all the nodes are symmetrical.

CockroachDB can be divided into five functional layers:

  • SQL
  • Transactional
  • Distribution
  • Replication
  • Storage

The SQL layer is responsible for receiving SQL queries and converting them into key-value operations. The transactional layer ensures that all CRUD operations that happen on multiple key-value pairs are transactional. The distribution layer is responsible for ensuring ranges are evenly distributed among all the available nodes in a cluster. The replication layer ensures that ranges are replicated synchronously, whenever there is a change. Finally, the storage layer is responsible for managing key-value data on the disk.
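To give a feel for what "converting SQL into key-value operations" could look like, the sketch below flattens one table row into sorted key-value pairs. The key format used here (table name, primary key, column name) is a simplified, hypothetical encoding chosen for readability; it is not CockroachDB's actual on-disk format, and the row_to_kv helper is invented for this example:

```python
def row_to_kv(table, primary_key, row):
    """Flatten one row into sorted (key, value) pairs, one pair per column.

    The "/table/primary_key/column" key layout is an illustrative assumption.
    """
    return sorted(
        (f"/{table}/{primary_key}/{column}", value)
        for column, value in row.items()
    )


# Roughly what an INSERT of (id=1, name='Alice', city='NYC') into users might become.
kv_pairs = row_to_kv("users", 1, {"name": "Alice", "city": "NYC"})
for key, value in kv_pairs:
    print(key, "->", value)
```

Because the keys sort by table and then by primary key, all the pairs for one row sit next to each other, which is what lets the lower layers treat the data as one sorted key-value space split into ranges.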

Summary

In this chapter, we learned about the evolution of databases, how databases can be categorized based on various criteria, CAP theorem, and a brief introduction to CockroachDB. By now, you should also be familiar with database and processing models, what the CP, CA, and AP systems in CAP theorem offer, and the functional layers of CockroachDB.

In the next chapter, we will take a deep dive into CockroachDB's architecture and design concepts.


Key benefits

  • Gain insights into CockroachDB and build highly reliable cloud-native applications
  • Explore the power of a scalable and highly available cloud-native SQL database to distribute data and workloads automatically
  • Build high-speed database services using CockroachDB and troubleshoot performance issues

Description

Getting Started with CockroachDB will introduce you to the inner workings of CockroachDB and help you to understand how it provides faster access to distributed data through a SQL interface. The book will also uncover how you can use the database to provide solutions where the data is highly available. Starting with CockroachDB's installation, setup, and configuration, this SQL book will familiarize you with the database architecture and database design principles. You'll then discover several options that CockroachDB provides to store multiple copies of your data to ensure fast data access. The book covers the internals of CockroachDB, how to deploy and manage it on the cloud, performance tuning to get the best out of CockroachDB, and how to scale data across continents and serve it locally. In addition to this, you'll get to grips with fault tolerance and auto-rebalancing, how indexes work, and the CockroachDB Admin UI. The book will guide you in building scalable cloud services on top of CockroachDB, covering administrative and security aspects and tips for troubleshooting, performance enhancements, and a brief guideline on migrating from traditional databases. By the end of this book, you'll have gained sufficient knowledge to manage your data on CockroachDB and interact with it from your application layer.

Who is this book for?

Software engineers, database developers, database administrators, and anyone who wishes to learn about the features of CockroachDB and how to build database solutions that are fast, highly available, and cater to business-critical applications, will find this book useful. Although no prior exposure to CockroachDB is required, familiarity with database concepts will help you to get the most out of this book.

What you will learn

  • Become well-versed with the overall architecture and design concepts of CockroachDB
  • Understand how auto-rebalancing of data can avoid performance bottlenecks
  • Get to know how CockroachDB achieves atomicity, consistency, isolation, and durability
  • Partition your data across multiple geolocations to ensure very low latency when serving data
  • Find out how indexes are stored and the optimizations used to serve query results faster
  • Discover the key concepts of deploying and managing CockroachDB clusters

Product Details

Publication date : Mar 11, 2022
Length: 246 pages
Edition : 1st
Language : English
ISBN-13 : 9781800560659
Vendor : Cockroach Labs

What do you get with a Packt Subscription?

Free for first 7 days. $19.99 p/m after that. Cancel any time!
Product feature icon Unlimited ad-free access to the largest independent learning library in tech. Access this title and thousands more!
Product feature icon 50+ new titles added per month, including many first-to-market concepts and exclusive early access to books as they are being written.
Product feature icon Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.
Product feature icon Thousands of reference materials covering every tech concept you need to stay up to date.
Subscribe now
View plans & pricing

Product Details

Publication date : Mar 11, 2022
Length: 246 pages
Edition : 1st
Language : English
ISBN-13 : 9781800560659
Vendor :
Cockroach Labs
Category :
Languages :
Concepts :

Table of Contents

15 Chapters
Section 1: Getting to Know CockroachDB
Chapter 1: CockroachDB – A Brief Introduction
Chapter 2: How Does CockroachDB Work Internally?
Section 2: Exploring the Important Features of CockroachDB
Chapter 3: Atomicity, Consistency, Isolation, and Durability (ACID)
Chapter 4: Geo-Partitioning
Chapter 5: Fault Tolerance and Auto-Rebalancing
Chapter 6: How Indexes Work in CockroachDB
Section 3: Working with CockroachDB
Chapter 7: Schema Creation and Management
Chapter 8: Exploring the Admin User Interface
Chapter 9: An Overview Of Security Aspects
Chapter 10: Troubleshooting Issues
Chapter 11: Performance Benchmarking and Migration
Other Books You May Enjoy

Customer reviews

Rating distribution
4.7 out of 5 (6 Ratings)
5 star 66.7%
4 star 33.3%
3 star 0%
2 star 0%
1 star 0%
Top Reviews

Bosco Apr 23, 2022
Rating: 5 out of 5
The introductory Chapter 1 provides the reader with a quick catchup on the history and evolution of databases through SQL, NoSQL, and NewSQL - a modern globally distributed database service and storage solutions. The introductory chapter starts with basics on understanding the relational model, making this a great book for absolute beginners who’d like to learn and get started on CockroachDB or get familiar with NewSQL databases. CAP-consistent systems are explained clearly with clear visuals, and how this applies to CockroachDB. This section also covers key terms and concepts, and provides an overview of the CockroachDB architecture.
Chapter 2 goes into how CockroachDB works, and setting up of a single-node CockroachDB cluster using Docker. This gives the beginner some hands-on experience with CockroachDB, and also goes into the details of how data is distributed across multiple nodes, and describes the Raft distributed consensus protocol in clear terms. Also included in this chapter is the storage engine, how these work with various storage types.
Section 2 consists of several chapters that explores several important features of CockroachDB including ACID support, Geo-Partitioning, Fault Tolerance, Auto-Rebalancing, and indexing in CockroachDB.
Section 3 consists of several chapters related to working with CockroachDB including schema creation and management, the Admin user interface, security aspects, troubleshooting issues, performance benchmarking, and migration.
I’ve worked with CockroachDB on setting up a non-production, multi-node setup, and I was surprised at how easy it was to setup. However, it is important to understand a lot of concepts that make CockroachDB an ACID compliant, with strongly consistent reads, geographically distributed, and a fault-tolerant database. All of these concepts are explained very well in this book, “Getting Started with CockroachDB”. There appears to be just enough detail to get these concepts across.
Whether you are an experienced database professional or a complete beginner interested in learning about CockroachDB, you will find this book extremely helpful.
Amazon Verified review
QSeller Apr 07, 2022
Rating: 5 out of 5
The book describes concepts associated with relational databases and evaluation of the latest distributed SQL databases. This book explains the details of the CockroachDB cluster internal mechanics and the rich features of the CockroachDB. The data partitions and auto-rebalancing chapters are explained very well.
The other sections of schema creation and management are also explained in detail specific to CockroachDB. The admin user features and security of the DB features are also explained very well. I would like to recommend adding more troubleshooting issues. This will help users address the issues they face, but these details will change very frequently with the releases of new CockroachDB versions. Overall the book is an excellent guide to learn the CockroachDB.
Amazon Verified review
Luiz Ribas Apr 05, 2022
Rating: 5 out of 5
Is an excellent book for database administrators that wants to learn about a different database prepared for the cloud, and extremely recommended for developers that need to understand a little bit more about databases. I believe that CockroachDB is the real next-generation database prepared for the cloud and obtaining a piece of knowledge about it is magnific, this book covers important topics about this database and how to put it in a productive mode.
I highly recommend this book, to database administrators, developers, and software architects, after reading you will try to put it in the next project.
Amazon Verified review
Donald E Lutz Apr 21, 2022
Rating: 5 out of 5
This book gives a very complete overview of CockroachDB and its inner workings as well as explaining how to deal with distributing data across different locations using SQL. It describes the setup, installation, and configuration of CockroachDB. It completely describes the database architecture and design principles to distribute data for fast access. It covers all the system internals and describes the administration and performance tuning and further how to manage it in the cloud. Overall, a great read to help understand CockroachDB and its usage.
Amazon Verified review
John Apr 13, 2022
Rating: 4 out of 5
I found this book to be tremendously helpful on understanding the foundational fundamentals when implementing cockroachDB into any project. The book walked me through different use cases and scenarios including how cockroachDB internally processes. For my situation, the most helpful sections were the storage options for a hash table, B+ tree, heap, Log-structured merge-tree (LSM-tree), solid-state drive, flash storage, hard disk, and remote storage.
Amazon Verified review

FAQs

What is included in a Packt subscription?

A subscription provides you with full access to view all Packt and licensed content online, including exclusive access to Early Access titles. Depending on the tier chosen, you can also earn credits and discounts to use for owning content.

How can I cancel my subscription?

To cancel your subscription, go to the account page, found in the top right of the page or at https://subscription.packtpub.com/my-account/subscription. From there you will see the ‘cancel subscription’ button in the grey box containing your subscription information.

What are credits?

Credits can be earned by reading 40 sections of any title within the payment cycle (a month starting from the day of subscription payment). You also earn a credit every month if you subscribe to our annual or 18-month plans. Credits can be used to buy books DRM-free, the same way that you would pay for a book. Your credits can be found on the subscription homepage (subscription.packtpub.com) by clicking on the ‘my library’ dropdown and selecting ‘credits’.

What happens if an Early Access Course is cancelled?

Projects are rarely cancelled, but sometimes it's unavoidable. If an Early Access course is cancelled or excessively delayed, you can exchange your purchase for another course. For further details, please contact us here.

Where can I send feedback about an Early Access title?

If you have any feedback about the product you're reading, or Early Access in general, then please fill out a contact form here and we'll make sure the feedback gets to the right team. 

Can I download the code files for Early Access titles?

We try to ensure that all books in Early Access have code available to use, download, and fork on GitHub. This helps us be more agile in the development of the book, and helps keep the often changing code base of new versions and new technologies as up to date as possible. Unfortunately, however, there will be rare cases when it is not possible for us to have downloadable code samples available until publication.

When we publish the book, the code files will also be available to download from the Packt website.

How accurate is the publication date?

The publication date is as accurate as we can be at any point in the project. Unfortunately, delays can happen. Often those delays are out of our control, such as changes to the technology code base or delays in the tech release. We do our best to give you an accurate estimate of the publication date at any given time, and as more chapters are delivered, the more accurate the delivery date will become.

How will I know when new chapters are ready?

We'll let you know every time there has been an update to a course that you've bought in Early Access. You'll get an email to let you know there has been a new chapter, or a change to a previous chapter. The new chapters are automatically added to your account, so you can also check back there any time you're ready and download or read them online.

I am a Packt subscriber, do I get Early Access?

Yes, all Early Access content is fully available through your subscription. You will need a paid or active trial subscription in order to access all titles.

How is Early Access delivered?

Early Access is currently only available as a PDF or through our online reader. As we make changes or add new chapters, the files in your Packt account will be updated so you can download them again or view them online immediately.

How do I buy Early Access content?

Early Access is a way of us getting our content to you quicker, but the method of buying the Early Access course is still the same. Just find the course you want to buy, go through the check-out steps, and you’ll get a confirmation email from us with information and a link to the relevant Early Access courses.

What is Early Access?

Keeping up to date with the latest technology is difficult: new versions, new frameworks, new techniques. This feature gives you a head start on our content as it's being created. With Early Access you'll receive each chapter as it's written, and get regular updates throughout the product's development, as well as the final course as soon as it's ready.

We created Early Access as a means of giving you the information you need as soon as it's available. As we go through the process of developing a course, 99% of it can be ready but we can't publish until that last 1% falls into place. Early Access helps to unlock the potential of our content early, to help you start your learning when you need it most. You not only get access to every chapter as it's delivered, edited, and updated, but you'll also get the finalized, DRM-free product to download in any format you want when it's published. As a member of Packt, you'll also be eligible for our exclusive offers, including a free course every day, and discounts on new and popular titles.