What does the term web scraping mean?
Web scraping is the process of collecting information directly from HTML web pages. As in mining, we first have to collect the ore (the raw HTML) and then refine it into the valuable data points.
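The "ore, then refine" idea can be sketched in a few lines. This is a minimal illustration, not a definitive implementation: the HTML string, the `ItemExtractor` class, and the `refine` function are all hypothetical, and to stay dependency-free the sketch uses the standard library's `html.parser` rather than a dedicated scraping library.

```python
from html.parser import HTMLParser

# Hypothetical page content; in practice this string would come from an HTTP response.
RAW_HTML = """
<html><body>
  <h1>Price list</h1>
  <ul>
    <li class="item">Copper: 8.50</li>
    <li class="item">Iron: 1.20</li>
  </ul>
</body></html>
"""


class ItemExtractor(HTMLParser):
    """Collects the text of every <li class="item"> element."""

    def __init__(self):
        super().__init__()
        self.in_item = False
        self.items = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        if tag == "li" and dict(attrs).get("class") == "item":
            self.in_item = True

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_item = False

    def handle_data(self, data):
        # Keep only non-empty text found inside a matching <li>
        if self.in_item and data.strip():
            self.items.append(data.strip())


def refine(html):
    """Turn raw HTML (the 'ore') into a dict of name -> float (the data points)."""
    parser = ItemExtractor()
    parser.feed(html)
    result = {}
    for text in parser.items:
        name, value = text.split(":")
        result[name.strip()] = float(value)
    return result


prices = refine(RAW_HTML)
print(prices)
```

Note how much of the code is specific to this one page layout: the tag name, the class, and the `name: value` text format are all assumptions about the markup.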
What are the main differences between scraping and using a web API? What are the challenges?
The main difference is the lack of any guarantees: unlike a web API, a web page comes with no promise that its structure will stay the same, or that it will be served at all. In fact, many services actively try to prevent web scraping. Another challenge is turning raw HTML into valuable information, which often requires custom code.
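One way to cope with the missing guarantees is to write extraction code that fails softly when the markup changes. The sketch below assumes Beautiful Soup is installed; the `span.price` selector, the `extract_price` helper, and both HTML snippets are made up for illustration.

```python
from bs4 import BeautifulSoup


def extract_price(html):
    """Return the price as a float, or None if the expected element is missing."""
    soup = BeautifulSoup(html, "html.parser")
    # select_one returns None when nothing matches the CSS selector
    tag = soup.select_one("span.price")  # hypothetical selector
    if tag is None:
        return None
    try:
        return float(tag.get_text(strip=True))
    except ValueError:
        # The element exists, but its content is no longer a plain number
        return None


old_layout = '<div><span class="price">19.99</span></div>'
new_layout = '<div><p class="cost">19.99</p></div>'  # same data, different markup

print(extract_price(old_layout))
print(extract_price(new_layout))
```

The second call finds nothing, even though the data is still on the page. Returning `None` instead of raising lets the caller log the miss and notice that the page layout has changed.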
What exactly does Beautiful Soup do? Can we scrape without it?
In our stack (requests and BeautifulSoup), requests downloads the page, while BeautifulSoup parses the HTML and lets us navigate and query the document, pulling out specific values. We can definitely...