Evaluating requirements
It is generally a good idea to estimate the kind of load Cassandra will face once it is deployed on a production server. The estimate does not have to be accurate, but some sense of the traffic clarifies what you expect from Cassandra (criteria for load tests), whether you really need Cassandra (the halo effect), and whether you can bear the expenses that a running Cassandra cluster incurs on a daily basis (the value proposition). Let's see how to choose hardware specifications for a specific need.
Hard disk capacity
A rough calculation of the disk space needed for the user data stored in Cassandra involves adding up the data held in four components on disk: commit logs, SSTables, index files, and bloom filters. When comparing the incoming data with the data on disk, you need to take into account the database overheads associated with each type of data. The data on disk can be about two times as large as the raw data. Disk usage can be...