Preface
Before the invention of NoSQL, almost all databases were structured. This means that developers had to define the structure (the schema) of the database before using it. Despite the benefits of this approach, it sometimes caused problems. For instance, storing schemaless data was difficult, if not impossible.
Later, the NoSQL concept and the technologies related to it were invented to address these problems.
The following quote gives a brief history of the term NoSQL; it is taken from http://en.wikipedia.org/wiki/NoSQL.
"Carlo Strozzi used the term NoSQL in 1998 to name his lightweight, open source relational database that did not expose the standard SQL interface."
NoSQL databases are classified into the following types:
- Column (HBase, Cassandra)
- Document (MongoDB, Couchbase)
- Key-value (Redis, Riak, MemcacheDB)
- Graph (Neo4j, OrientDB)
Have a look at the following image:
So, why use NoSQL instead of relational databases? Opinions differ on the relative benefits of relational and non-relational databases, but in short, the following are the major reasons to use NoSQL:
- A more flexible data model and a dynamic schema
- Scalability
- Better efficiency and performance
Compared to relational database systems, NoSQL databases have a remarkable feature: developers can insert data without defining the data model first, and can change the data model after data has been inserted. This comes in handy when your data model is likely to change over time.
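To make the idea of a dynamic schema concrete, here is a minimal sketch in plain Python. It assumes a list of dictionaries standing in for a document collection; the names `collection` and `field_names` are hypothetical and are not part of any MongoDB API.

```python
# A plain list of dicts stands in for a document collection.
# This is an illustration only, not a MongoDB API.
collection = []

# First document: only a name and an email.
collection.append({"name": "Alice", "email": "alice@example.com"})

# Later the data model grows: new fields appear without any migration step,
# and the earlier document does not need to be rewritten.
collection.append(
    {"name": "Bob", "phones": ["555-0100"], "address": {"city": "Paris"}}
)


def field_names(docs):
    """Return the set of top-level field names seen across all documents."""
    names = set()
    for doc in docs:
        names.update(doc.keys())
    return names


fields = field_names(collection)
```

Both documents live side by side even though they share only the `name` field, which is the flexibility described above.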
One of the great strengths of NoSQL databases is scaling. Almost all NoSQL technologies provide a built-in mechanism to scale a database horizontally (by adding more servers) rather than vertically (by upgrading a single server). Auto-sharding is responsible for this task.
Additionally, NoSQL databases support integrated caching, which improves the read/write performance of a database. The database keeps frequently used data in memory and serves it from there when reading, instead of fetching it from the disk. This improves the overall speed of the database when reading and writing data.
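The caching idea can be sketched as a small read-through cache. This is a simplified illustration under stated assumptions, not how any particular database implements its cache: a dictionary stands in for the disk, the eviction policy is least recently used, only reads are modeled, and the class name `ReadThroughCache` is hypothetical.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Tiny LRU cache in front of a slow 'disk' lookup (hypothetical names)."""

    def __init__(self, disk, capacity=2):
        self.disk = disk              # dict standing in for on-disk storage
        self.capacity = capacity      # maximum number of entries kept in memory
        self.memory = OrderedDict()   # most recently used keys sit at the end
        self.disk_reads = 0           # counts how often we had to go to disk

    def get(self, key):
        if key in self.memory:
            self.memory.move_to_end(key)      # mark as recently used
            return self.memory[key]
        self.disk_reads += 1                  # cache miss: fall back to disk
        value = self.disk[key]
        self.memory[key] = value
        if len(self.memory) > self.capacity:
            self.memory.popitem(last=False)   # evict the least recently used
        return value


disk = {"a": 1, "b": 2, "c": 3}
cache = ReadThroughCache(disk)
cache.get("a")  # first read goes to disk
cache.get("a")  # second read is served from memory
```

After the two reads, `cache.disk_reads` is 1, because the second lookup never touches the simulated disk.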
MongoDB is one of the pioneers in implementing the NoSQL concept, using the "document" as the basic unit for storing and retrieving data. MongoDB is a cross-platform, document-oriented database system. 10gen, a software company, began developing MongoDB in October 2007. At the time of writing, the latest stable version of MongoDB is 2.4.9, which was released on January 10, 2014.
MongoDB is the leading NoSQL database, with a solid implementation and a vibrant community. As you know, one of the basic reasons to choose a technology is an active and lively community, so that there is always someone who can help you and answer your questions. The graph shown in the following screenshot is taken from http://www.mongodb.com/leading-nosql-database:
You can ask your questions on Stack Overflow or on the dedicated MongoDB forums, and you will usually get an answer quickly. Furthermore, there are various books and articles available about MongoDB.
The following are some remarkable features of MongoDB:
- Schemaless data: Developers are able to store data in any shape and to change the schema during or after inserting data.
- Replication: MongoDB provides high availability with replica sets. A replica set holds two or more copies of the data; at any time, one member acts as the primary and the others act as secondaries.
- Load balancing: Using sharding, MongoDB can scale horizontally; data is split into ranges based on a shard key, and the ranges are distributed across servers.
- File storage: MongoDB has a feature called GridFS, which lets you use MongoDB like a filesystem to store and retrieve files.
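The range-based splitting mentioned under load balancing can be sketched as follows. This is a minimal illustration, assuming fixed split points and three shards; the names `SPLIT_POINTS`, `SHARDS`, `shard_for`, and `distribute` are hypothetical, and a real cluster manages and rebalances its ranges automatically.

```python
import bisect

# Hypothetical split points on the shard key: keys below "h" go to shard0,
# keys from "h" up to (but not including) "p" go to shard1, the rest to shard2.
SPLIT_POINTS = ["h", "p"]
SHARDS = ["shard0", "shard1", "shard2"]


def shard_for(shard_key):
    """Return the name of the shard whose key range contains shard_key."""
    return SHARDS[bisect.bisect_right(SPLIT_POINTS, shard_key)]


def distribute(docs, key_field):
    """Group documents by the shard that owns their shard-key value."""
    placement = {name: [] for name in SHARDS}
    for doc in docs:
        placement[shard_for(doc[key_field])].append(doc)
    return placement


users = [
    {"username": "alice"},
    {"username": "henry"},
    {"username": "zoe"},
    {"username": "h"},
]
placement = distribute(users, "username")
```

Each document lands on exactly one shard, so adding servers and narrowing the ranges spreads both the data and the load.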
In this book, we will discuss remedies and solutions for running a highly available MongoDB server. First, we will go through the problems that cause server downtime, such as errors and server crashes. In the chapters that follow, we will introduce remedies and work through each problem with a real-world example.