Chapter 4
Data Acquisition Features: Web APIs and Scraping
Data analysis often works with data from numerous sources, including databases, web services, and files prepared by other applications. This chapter guides you through two projects that add data sources to the baseline application from the previous chapter. The new sources are a web service query and data scraped from a web page.
This chapter’s projects cover the following essential skills:
Using the requests package for Web API integration. We’ll look at the Kaggle API, which requires signing up to create an API token.
Using the Beautiful Soup package to parse an HTML web page.
Adding features to an existing application and extending the test suite to cover these new alternative data sources.
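As a preview of the first two skills, here is a minimal sketch rather than the implementation developed in the projects. It assumes the requests and beautifulsoup4 packages are installed. The Kaggle endpoint URL, the use of HTTP basic authentication with a username and key, the function names, and the sample HTML are all assumptions made for illustration.

```python
import requests
from bs4 import BeautifulSoup

# Assumed endpoint: check the Kaggle API documentation before relying on it.
KAGGLE_URL = "https://www.kaggle.com/api/v1/datasets/list"


def build_kaggle_request(username: str, key: str, search: str) -> requests.PreparedRequest:
    """Prepare (but do not send) a GET request carrying the API token.

    Assumes the service accepts HTTP basic auth built from the
    username and key stored in the downloaded token file.
    """
    request = requests.Request(
        "GET", KAGGLE_URL, params={"search": search}, auth=(username, key)
    )
    return request.prepare()


def fetch_json(prepared: requests.PreparedRequest) -> object:
    """Send a prepared request and decode the JSON body."""
    with requests.Session() as session:
        response = session.send(prepared, timeout=10)
        response.raise_for_status()  # surface HTTP errors early
        return response.json()


def table_rows(html: str) -> list[list[str]]:
    """Extract the text of each non-empty table row from an HTML document."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        if cells:
            rows.append(cells)
    return rows


# Hypothetical page content standing in for a real web page.
SAMPLE_HTML = (
    "<table>"
    "<tr><th>x</th><th>y</th></tr>"
    "<tr><td>10.0</td><td>8.04</td></tr>"
    "</table>"
)

if __name__ == "__main__":
    prepared = build_kaggle_request("some-user", "some-key", "anscombe")
    print(prepared.url)
    print(table_rows(SAMPLE_HTML))
```

Preparing the request separately from sending it keeps the token handling testable without network access, which fits the third skill: extending the test suite to cover the new sources.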
It’s important to recognize this application has a narrow focus on data acquisition. In later chapters, we’ll validate the data and convert it to a more useful form. This reflects...