After downloading the HTML pages from the server, we have to extract the required data from them. Python has many modules that help with this; here we use the BeautifulSoup package.
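As a rough illustration of what extracting data with BeautifulSoup looks like, the following minimal sketch parses a small HTML string and pulls out the page title and the link targets. The HTML content here is made up for the example and stands in for a downloaded page; the choice of the built-in html.parser backend is also an assumption, not something this recipe prescribes.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for a page downloaded from a server.
html = """
<html>
  <head><title>Sample page</title></head>
  <body>
    <a href="/first">First link</a>
    <a href="/second">Second link</a>
  </body>
</html>
"""

# "html.parser" is the parser shipped with Python, so no extra
# parser package is needed for this sketch.
soup = BeautifulSoup(html, "html.parser")

# Text of the <title> element.
title = soup.title.string

# The href attribute of every <a> element, in document order.
links = [a["href"] for a in soup.find_all("a")]

print(title)
print(links)
```

The same find_all() call works for any tag name, which is what makes it convenient for walking the rows and cells of a table.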
Parsing HTML tables
Getting ready
As usual, make sure that you install all the required packages. For this script, we require BeautifulSoup and pandas. You can install them with pip:
pip install bs4
pip install pandas
pandas is an open source data analysis library in Python.
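To give a feel for why pandas is useful here, this small sketch builds a DataFrame from a list of rows, which is roughly the shape that data takes once it has been read out of a table. The row values and column names are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical rows, shaped like the cells of a small HTML table.
rows = [
    ["Alice", 30],
    ["Bob", 25],
]

# Build a DataFrame, naming the columns explicitly.
df = pd.DataFrame(rows, columns=["name", "age"])

# Once the data is in a DataFrame, column-wise operations are one-liners.
mean_age = df["age"].mean()

print(df)
print(mean_age)
```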
How to do it...
We can parse HTML tables from the downloaded pages as follows:
- As usual...