Preface
Back in 2007, Twitter users would occasionally see the "fail whale," captioned with "Too many tweets...". On August 3, 2013, Twitter set a new tweet-rate record of 143,199 tweets per second, and by then we rarely saw the fail whale. Many things have changed since 2007. The number of people and things connected to the Internet has increased exponentially. Cloud computing and on-demand hardware have become cheap and easily available. Distributed computing and the NoSQL paradigm have taken off, with a plethora of freely available, robust, proven, open source projects to store large datasets, process them, and visualize them. "Big Data" has become a cliché. As people and machines generate massive amounts of data at very high speed, our capability to store and analyze that data has grown as well. Cassandra is one of the most successful of these data stores: it scales linearly, is easy to deploy and manage, and is blazing fast.
This book is about Cassandra and its ecosystem. Its aim is to take you from the basics of Apache Cassandra to an understanding of what goes on under the hood. The book has three broad goals. First, to help you make the right design decisions and understand the patterns and antipatterns. Second, to enable you to manage your infrastructure on a rainy day. Third, to introduce you to some of the tools that work with Cassandra to monitor and manage it, and to analyze the big data that you store in it.
This book takes a practical approach rather than a purist one. You will come across proprietary tools, GitHub projects, shell scripts, third-party monitoring tools, and enough references to dive deeper if you want.