We will start by installing the required software: the Python distribution, some fundamental Python libraries, and external bioinformatics software. We will also be concerned with the world outside Python. In bioinformatics and big data, R is also a major player; therefore, you will learn how to interact with it via rpy2, a Python/R bridge. We will also explore the advantages that the IPython framework (via Jupyter Notebook) gives us for interfacing efficiently with R. This chapter sets the stage for all of the computational biology that we will perform in the rest of this book.
As different users have different requirements, we will cover two approaches for installing the software. One is the Anaconda Python distribution (http://docs.continuum.io/anaconda/); the other is Docker (https://www.docker.com/), a server virtualization method based on containers that share the same operating system kernel. If you are using a Windows-based operating system, you are strongly encouraged to consider changing your operating system or to use Docker via one of the existing options on Windows. On macOS, you might be able to install most of the software natively, though Docker is also available.
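Before choosing between the two approaches, it can help to see which of the tools is already present on your machine. The following is a minimal sketch, not part of the installation procedure itself: the helper name `find_install_tools` is hypothetical, and it assumes that the executables are called `conda` and `docker` and would be reachable through your PATH. It relies only on `shutil.which` from the Python standard library.

```python
import shutil


def find_install_tools(tools=("conda", "docker")):
    """Map each tool name to the path of its executable, or None if absent.

    The tool names are assumptions: adjust them if your installation
    exposes differently named executables.
    """
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    for tool, path in find_install_tools().items():
        # shutil.which returns None when the executable is not on PATH
        status = path if path else "not found on PATH"
        print(f"{tool}: {status}")
```

A tool reported as not found may still be installed but simply missing from PATH, so treat the output as a hint rather than a definitive answer.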