Counting things often gets you where you need to be, but sometimes the job calls for more complex tools. Fortunately, we can write our own tools in the UNIX paradigm and use them in our pipelines alongside our other command-line tools.
One such tool is Python, along with popular data science libraries such as pandas, NumPy, and scikit-learn. This isn't a text on all the great things those libraries can do for you (if you'd like to learn, a good place to start is the official Python tutorial (https://docs.python.org/3/tutorial/) and the basics of pandas data structures in the pandas documentation (https://pandas.pydata.org/pandas-docs/stable/basics.html)). Make sure you have Python, pip, and pandas installed before you continue (see Chapter 1, Data Science at the Command Line and Setting It Up)...
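To make the idea of "our own tool in the UNIX paradigm" concrete, here is a minimal sketch of a Python filter that reads text on standard input and writes a one-line result to standard output, so it can sit anywhere in a pipe. The script name (summarize.py), the function names, and the output format are assumptions made for illustration. It uses only the standard library to keep the example small; a pandas-based tool would follow the same read-stdin, write-stdout shape.

```python
#!/usr/bin/env python3
"""Hypothetical filter: read one number per line on stdin, print summary stats."""
import statistics
import sys


def summarize(lines):
    """Return (count, mean, median) of the numeric lines; skip everything else."""
    values = []
    for line in lines:
        try:
            values.append(float(line.strip()))
        except ValueError:
            continue  # ignore headers, blank lines, and malformed rows
    if not values:
        return 0, None, None
    return len(values), statistics.mean(values), statistics.median(values)


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read from stdin and write a single summary line to stdout."""
    count, mean, median = summarize(stdin)
    stdout.write(f"count={count} mean={mean} median={median}\n")


if __name__ == "__main__":
    if sys.stdin.isatty():
        # Nothing was piped in, so show a hint instead of waiting for input.
        print("usage: some-command | python summarize.py", file=sys.stderr)
    else:
        main()
```

Assuming the script is saved as summarize.py and a hypothetical data.csv holds numbers in its third column, it could be used like any other command-line tool: `cut -d, -f3 data.csv | python summarize.py`. Because results go to standard output and the usage hint goes to standard error, the output can be piped onward to further tools.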