Chapter 3: Web Scraping and Interactive Visualizations
Activity 3: Web Scraping with Jupyter Notebooks
- For this page, the data can be scraped using the following code snippet:
    data = []
    for i, row in enumerate(soup.find_all('tr')):
        row_data = row.find_all('td')
        try:
            d1, d2, d3 = row_data[1], row_data[5], row_data[6]
            d1 = d1.find('a').text
            d2 = float(d2.text)
            d3 = d3.find_all('span')[1].text.replace('+', '')
            data.append([d1, d2, d3])
        except Exception:
            # Rows that do not match the expected layout (for example,
            # header rows with no td cells) raise here and are skipped.
            print('Ignoring row {}'.format(i))
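The skip-on-failure pattern in the snippet above can be tried without the notebook's soup object. What follows is a minimal sketch, not the activity's own code: it swaps BeautifulSoup for the standard library's html.parser, and the RowCollector class, the SAMPLE_HTML table, and its values are hypothetical placeholders invented for illustration.

```python
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Hypothetical helper: collect the text of each <td>, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_td = False

    def handle_starttag(self, tag, attrs):
        if tag == 'tr':
            self.rows.append([])          # start a new row
        elif tag == 'td' and self.rows:
            self.rows[-1].append('')      # start a new cell in the current row
            self._in_td = True

    def handle_endtag(self, tag):
        if tag == 'td':
            self._in_td = False

    def handle_data(self, data):
        if self._in_td:
            self.rows[-1][-1] += data.strip()


# Made-up sample table: a header row, two clean rows, and one bad value.
SAMPLE_HTML = """
<table>
<tr><th>Name</th><th>Value</th><th>Change</th></tr>
<tr><td>Alpha</td><td>12.5</td><td>+1.2 %</td></tr>
<tr><td>Beta</td><td>n/a</td><td>-0.3 %</td></tr>
<tr><td>Gamma</td><td>7</td><td>+0.4 %</td></tr>
</table>
"""

parser = RowCollector()
parser.feed(SAMPLE_HTML)

data, skipped = [], []
for i, cells in enumerate(parser.rows):
    try:
        name, value, change = cells[0], cells[1], cells[2]
        data.append([name, float(value), change.replace('+', '')])
    except (IndexError, ValueError):
        # The header row has no td cells (IndexError);
        # 'n/a' cannot be converted to float (ValueError).
        skipped.append(i)

print(data)
print('Ignored rows:', skipped)
```

Recording the skipped row indices, rather than only printing them, makes it easier to check afterwards that nothing beyond header or malformed rows was dropped.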
- In the lesson-3-workbook.ipynb Jupyter Notebook, scroll to Activity A: Web scraping with Python.
- Set the url variable and load an IFrame of our page in the notebook by running the following code:
    from IPython.display import IFrame
    url = 'http://www.worldometers.info/world-population/population-by-country/'
    IFrame(url, height=300, width=800)
The page should load in the notebook. Scrolling down, we can see the Countries in the world by population heading and...