Even Easier Scraping!
In the previous chapter, we covered the basics of web scraping, which is the act of harvesting data from the web for your own uses and projects. In this chapter, we will explore even easier approaches to web scraping and will also introduce you to social media scraping. The previous chapter was very long, as we had a lot to cover, from defining scraping to explaining how the Natural Language Toolkit (NLTK), the Requests library, and BeautifulSoup can be used to collect web data. In this chapter, I will show simpler approaches to getting useful text data with less cleaning involved. Keep in mind that these easier ways do not necessarily replace what was explained in the previous chapter. When you are working with data or on software projects and things do not immediately work out, it is useful to have options. But for now, we’re going to push forward with a simpler approach to scraping web content, as well as an introduction to scraping social media text.
First, we will cover...