Search icon CANCEL
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
The Applied Data Science Workshop

You're reading from   The Applied Data Science Workshop Get started with the applications of data science and techniques to explore and assess data effectively

Arrow left icon
Product type Paperback
Published in Jul 2020
Publisher Packt
ISBN-13 9781800202504
Length 352 pages
Edition 2nd Edition
Languages
Tools
Arrow right icon
Author (1):
Arrow left icon
Alex Galea Alex Galea
Author Profile Icon Alex Galea
Alex Galea
Arrow right icon
View More author details
Toc

Summary

In this chapter, we worked through the process of pulling tables from Wikipedia using web scraping techniques, cleaning up the resulting data with pandas, and producing a final analysis.

We started by looking at how HTTP requests work, focusing on GET requests and their response status codes. Then, we went into the Jupyter Notebook and made HTTP requests with Python using the requests library. We saw how Jupyter can be used to render HTML in the notebook, along with actual web pages that can be interacted with. In order to learn about web scraping, we saw how BeautifulSoup can be used to parse text from the HTML, and used this library to scrape tabular data from Wikipedia.

After pulling two tables of data, we processed them for analysis with pandas. The first table contained the central bank interest rates for each country, while the second table contained the populations. We combined these into a single table that was then used for the final analysis, which involved...

lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at €18.99/month. Cancel anytime