Chapter 5. Getting Comfortable with Different Kinds of Data Sources
Note
Learning Objectives
By the end of this chapter, you will be able to:
Read CSV, Excel, and JSON files into pandas DataFrames
Read PDF documents and HTML tables into pandas DataFrames
Perform basic web scraping using powerful yet easy-to-use libraries such as Beautiful Soup
Extract structured and textual information from web portals
Note
In this chapter, you will be exposed to real-life data wrangling techniques, as applied to web scraping.
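As a small preview of the first objective, the sketch below reads CSV and JSON content into pandas DataFrames. The sample records and the variable names (csv_text, json_text, df_csv, df_json) are hypothetical and are inlined as strings so that the snippet runs without any external files; with real data you would typically pass a file path to the same functions.

```python
import io

import pandas as pd

# Hypothetical sample data, inlined so the sketch needs no files on disk.
csv_text = "name,age\nAlice,30\nBob,25\n"
json_text = '[{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]'

# read_csv and read_json accept a file path or a file-like object,
# so the strings are wrapped in StringIO buffers here.
df_csv = pd.read_csv(io.StringIO(csv_text))
df_json = pd.read_json(io.StringIO(json_text))

print(df_csv)
print(df_json)
```

Both calls produce a DataFrame with a name column and an age column, which illustrates that the source format stops mattering once the data is loaded.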