Introduction to Cassandra
Cassandra is an open-source, distributed, non-relational, partitioned row store. Rows are organized into tables and indexed by a key. Cassandra uses an append-only, log-based storage engine. Data is distributed across multiple masterless nodes, so there is no single point of failure. It is a top-level Apache project, and its development is overseen by the Apache Software Foundation (ASF).
Each individual machine running Cassandra is known as a node. Nodes configured to work together and support the same dataset are joined into a cluster (also called a ring). A cluster can be further subdivided by geographic location: each node is assigned to a logical data center, and potentially to a logical rack within that data center. Nodes within the same data center share the same replication factor, a setting that tells Cassandra how many copies of a piece of data to store on the nodes in that data center. Nodes within a cluster are kept informed...
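
As an illustration of the per-data-center replication factor described above, the following is a minimal sketch in Python that builds a CQL CREATE KEYSPACE statement using NetworkTopologyStrategy, the replication strategy that takes one replica count per data center. The helper name build_create_keyspace, the keyspace name shop, and the data center names dc_east and dc_west are hypothetical, chosen only for this example. The sketch produces the statement text and does not connect to a cluster.

```python
from __future__ import annotations


def build_create_keyspace(keyspace: str, replicas_per_dc: dict[str, int]) -> str:
    """Return a CQL CREATE KEYSPACE statement with one replica count per data center.

    Hypothetical helper for illustration only; it formats text and does not
    validate names against a real cluster.
    """
    if not replicas_per_dc:
        raise ValueError("at least one data center is required")

    # NetworkTopologyStrategy accepts a replication factor for each data center.
    options = ["'class': 'NetworkTopologyStrategy'"]
    for dc_name, replication_factor in replicas_per_dc.items():
        if replication_factor < 1:
            raise ValueError(f"replication factor for {dc_name} must be >= 1")
        options.append(f"'{dc_name}': {replication_factor}")

    joined = ", ".join(options)
    return f"CREATE KEYSPACE IF NOT EXISTS {keyspace} WITH replication = {{{joined}}};"


if __name__ == "__main__":
    # Assumed topology: three copies in dc_east, two copies in dc_west.
    print(build_create_keyspace("shop", {"dc_east": 3, "dc_west": 2}))
```

With these assumed names, each piece of data in the shop keyspace would be stored on three nodes in dc_east and two nodes in dc_west. The data center names in such a statement have to match the names the nodes in the cluster actually report.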