On the 4th of September 2018, Spotify labs announced that cstar- the Cassandra orchestration tool for the command line, will be made freely available to the public.
In Cassandra, it is complicated to understand how to achieve the perfect performance, security, and data consistency. You need to run a specific set of shell commands on every node of a cluster, usually in some coordination to avoid the cluster being down. This task can be easy for small clusters, but can get tricky and time consuming for the big clusters. Imagine having to run those commands on all Cassandra nodes in the company! It would be time consuming and labor intensive.
A scheduled upgrade of the entire Cassandra fleet at Spotify included a precise procedure that involved numerous steps. Since Spotify has clusters with hundreds of nodes, upgrading one node at a time is unrealistic. Upgrading all nodes at once also wasn't a probable option, since that would take down the whole cluster.
In addition to the outlined performance problems, other complications while dealing with Cassandra involved:
Spotify was in dire need of an efficient and robust method to counteract these performance issues on thousands of computers in a coordinated manner.
Ansible and Fabric are not topology-aware. They can be made to run commands in parallel on groups of machines. Some wrapper scripts and elbow grease, can help split a Cassandra cluster into multiple groups, and execute a script on all machines in one group in parallel. But on the downside, this solution doesn’t wait for Cassandra nodes to come back up before proceeding nor does it notice if random Cassandra nodes go down during execution.
cstar is based on paramiko-a Python (2.7, 3.4+) implementation of the SSHv2 protocol, and shares the same ssh/scp implementation that Fabric uses. Being a command line tool, it runs an arbitrary script on all hosts in a Cassandra cluster in “topology aware” fashion.
Example of cstar running on a 9 node cluster with replication factor of 3, with the assumption that the script brings down the Cassandra process. Notice how there are always 2 available replicas for each token range.
Source: Spotify Labs
Installing cstar and running a command on a cluster is easy and can be done by following this quick example
Source: Spotify Labs
Execution of a script on one or more clusters is a job. Job control in cstar works like in Unix shells. A user can pause running jobs and then resume them at a later point in time. It is also possible to configure cstar to pause a job after a certain number of nodes have completed. This helps users to:
The features of Cstar has made it really easy for Spotify to work with Cassandra clusters. You can find more insights to this article on Spotify Labs.
Mozilla releases Firefox 62.0 with better scrolling on Android, a dark theme on macOS, and more
Baidu releases EZDL – a platform that lets you build AI and machine learning models without any coding knowledge
PrimeTek releases PrimeReact 2.0.0 Beta 3 version