Using the pandas module for data and graphical analysis
The pandas module is a popular Python library for data analysis because of its easy-to-apply utility functions and its high-performance tabular data structure called DataFrame. It is built on the numpy module, a low-level library that provides a multi-dimensional array object called ndarray together with its mathematical operations, and its plotting features rely on matplotlib, a library for visualizations. So, install these two modules first:
pip install numpy matplotlib
Then, install the pandas module:
pip install pandas
Since our data will come from XLSX sheets, install openpyxl, the module that pandas depends on for reading and writing XLSX documents:
pip install openpyxl
After installing all the dependency modules, we can start creating the DataFrame object.
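Before moving on, it can help to confirm that the four modules installed above are visible to the interpreter. The following is a minimal sketch that uses only the standard library; the names REQUIRED_MODULES and find_missing() are hypothetical helpers introduced here for illustration and are not part of pandas or of the original text.

```python
from importlib.util import find_spec

# Modules installed with pip earlier in this section.
REQUIRED_MODULES = ("numpy", "matplotlib", "pandas", "openpyxl")


def find_missing(modules=REQUIRED_MODULES):
    """Return the names of modules that cannot be located for import."""
    # find_spec() returns None when a top-level module is not installed.
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are installed.")
```

If the script reports a missing module, rerun the corresponding pip install command in the same environment that runs the script.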
Utilizing the DataFrame
To read an XLSX document, the pandas module has a read_excel() function with parameters such as usecols, which indicates the...