The only requirement throughout this book is a recent version of Python: at least Python 3.6, although a newer version is preferable. Some readers might prefer to use the Anaconda distribution of Python, which comes with many of the packages and tools required in this book. If this is the case, you should use the conda package manager to install any additional packages. Python is supported on all major operating systems (Windows, macOS, and Linux) and on many other platforms. The following table lists the main libraries covered in this book, the versions used at the time of writing, and the chapters in which they appear:
Software/libraries covered in the book | Version | Chapter |
Python | 3.6 or higher | All |
NumPy | 1.18.3 | All |
SciPy | 1.4.1 | All |
Matplotlib | 3.2.1 | All |
Pandas | 1.0.3 | 6-10 |
Bokeh | 2.1.0 | 6 |
Scikit-Learn | 0.22.1 | 7 |
Dask | 2.18.1 | 10 |
Apache Kafka | 2.5.0 | 10 |
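To compare your own environment against the table above, a short script along the following lines may help. It is only a sketch: the helper names `installed_version` and `report` are made up for this illustration, and it assumes each package exposes a `__version__` attribute, which is common but not guaranteed. Apache Kafka is left out because it is a separate service rather than an importable Python package.

```python
import importlib
import sys

# Import names for the Python libraries in the table above.
# Note that Scikit-Learn is imported as "sklearn".
MODULES = ["numpy", "scipy", "matplotlib", "pandas", "bokeh", "sklearn", "dask"]


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported.

    Modules that import successfully but define no __version__ attribute
    are reported as "unknown".
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")


def report(module_names=MODULES):
    """Build a list of 'name: version' lines, starting with the Python version."""
    lines = ["Python {}.{}.{}".format(*sys.version_info[:3])]
    for name in module_names:
        version = installed_version(name)
        lines.append("{}: {}".format(name, version or "not installed"))
    return lines
```

Calling `print("\n".join(report()))` prints one line per library, so any missing or outdated package is easy to spot before you start a chapter.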
If you are using the digital version of this book, we advise you to type the code yourself or access the code via the GitHub repository (link available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.
Some readers may prefer to work through the code samples in this book in a Jupyter notebook rather than in a plain Python file. In one or two places, you may then need to repeat plotting commands, because a notebook typically renders a figure when its cell finishes running and a later cell cannot update it. These places are marked in the instructions.