Summary
We have started learning about Cassandra. You can now set up your local machine, play with CQL3 in cqlsh, and write a simple program that uses Cassandra on the backend. It may seem like we are all done, but that is not so. Unlike an RDBMS, Cassandra is not primarily about ease of modeling or simplicity of coding. It is about speed, availability, and reliability. The only thing that matters in a production setup is how quickly and reliably your application can serve a fickle-minded user. It does not matter if you have an elegant database architecture in third normal form, or if you use a functional programming language and follow the Don't Repeat Yourself (DRY) principle religiously.
Cassandra and many other modern databases, especially in the NoSQL space, are there to provide you with speed. Cassandra's performance increases almost linearly with the addition of new nodes, which makes it suitable for high-throughput applications without committing to a lot of expensive infrastructure up front. For more information, visit http://vldb.org/pvldb/vol5/p1724_tilmannrabl_vldb2012.pdf.
The rest of the book is aimed at giving you a solid understanding of the following aspects of Cassandra, one chapter at a time:
- The internals of Cassandra and the general programming pattern for Cassandra
- Setting up a cluster and tweaking Cassandra and Java settings to get the most out of Cassandra for your use case
- Infrastructure maintenance: handling nodes going down, scaling up and down, backing up data, monitoring, and getting notified about interesting events on your Cassandra setup
- Using Cassandra with the Apache Hadoop and Apache Pig tools, with simple examples
The best thing about these chapters is that there are no prerequisites. Most of them start from the basics to get you familiar with the concept and then take you to an advanced level. So, if you have never used Hadoop, do not worry. You can still get a simple setup up and running with Cassandra.
In the next chapter, we will look at Cassandra's internals and what makes it so fast.