5. Getting Comfortable with Different Kinds of Data Sources
Overview
This chapter gives you the skills to read CSV, Excel, and JSON files into pandas DataFrames. You will also learn how to read PDF documents and HTML tables into DataFrames, and how to perform basic web scraping with easy-to-use libraries such as Beautiful Soup. Along the way, you will see how to extract both structured and textual information from web portals. By the end of this chapter, you will be able to apply data wrangling techniques such as web scraping to real-world data sources.
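As a small preview of the kind of code this chapter works toward, the sketch below loads the same two records from CSV text and from JSON text into pandas DataFrames. The sample data, the variable names, and the use of in-memory strings in place of files on disk are assumptions made purely for illustration.

```python
import io

import pandas as pd

# Hypothetical sample data, held in memory so the sketch needs no files on disk.
csv_text = "name,score\nAda,91\nLin,84\n"
json_text = '[{"name": "Ada", "score": 91}, {"name": "Lin", "score": 84}]'

# read_csv accepts a path or any file-like object; StringIO stands in for a file here.
csv_df = pd.read_csv(io.StringIO(csv_text))

# read_json parses a JSON array of records into one row per record.
json_df = pd.read_json(io.StringIO(json_text))

print(csv_df)
print(json_df)
```

With real files, the same calls would typically take a file path in place of the StringIO object.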