Technical requirements
This section covers the Python packages and other ML libraries that you will need to apply ML to genomics in this chapter.
Python packages
The following are some common Python packages that data scientists and genomic researchers use not only for genomic analysis but for any kind of data analysis.
Pandas
Pandas is one of the most popular data analysis tools in Python. It needs little introduction, as it is part and parcel of every data scientist’s toolkit. The great thing about Pandas is that it provides functions and methods for loading, cleaning, and analyzing data, irrespective of the type of data. It’s also easy to install: simply enter pip install pandas in your terminal. Then, you can include import pandas as pd in your Python script, which you will see later in the chapter.
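As a quick check that the installation works, the following minimal sketch imports Pandas and summarizes a small table. The gene names, positions, and column names are made up purely for illustration and are not taken from the chapter's datasets.

```python
import pandas as pd

# Hypothetical toy data: these variant records are invented for illustration only.
variants = pd.DataFrame(
    {
        "gene": ["BRCA1", "TP53", "BRCA1", "EGFR"],
        "position": [1200, 340, 1875, 910],
        "ref": ["A", "C", "G", "T"],
        "alt": ["G", "T", "A", "C"],
    }
)

# Count how many variant rows were recorded for each gene.
variants_per_gene = variants.groupby("gene").size()
print(variants_per_gene)
```

If the script runs without an ImportError, Pandas is installed and ready to use. The same DataFrame pattern applies whether the rows describe variants, expression levels, or any other tabular data.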
Matplotlib
We will be using Matplotlib, a very popular Python library for visualization...