In the previous sections, we learned how to work with an MQTT stream and Kafka to apply rule-based analytics. Now we want to store our data in a scalable and fault-tolerant database. We can use various NoSQL databases, such as HBase, Parquet, or Kudu, but for our platform we are going to use Apache Cassandra.
Storing time-series data on Apache Cassandra
Apache Cassandra
Apache Cassandra is a decentralized NoSQL database that has a good level of scalability and high availability without compromising performance. Apache Cassandra supports thousands of nodes and different levels of replicas and consistency. It also has a high level of data sharding.
Apache Cassandra is organized as a ring of nodes. Each node takes care of a portion...