So far, we have learned about web-development technologies, data-finding techniques, and how to use various Python libraries to scrape data from the web.
In this chapter, we will explore two Python libraries that are popular for document parsing and scraping: Scrapy and Beautiful Soup.
Beautiful Soup deals with document parsing: it turns an HTML or XML document into a tree so that its elements can be traversed and their content extracted. Scrapy is a web crawling framework written in Python. It gives web scraping a project-oriented structure and provides plenty of built-in resources, such as email support, selectors, and items. It can be used for tasks ranging from simple page scraping to API-based content extraction.
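As a first taste of what parsing for traversal and extraction means, here is a minimal sketch using Beautiful Soup. The HTML fragment, the `quote` class name, and the variable names are hypothetical and exist only for illustration; the sketch assumes the `beautifulsoup4` package is installed and uses Python's built-in `html.parser` backend.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML fragment, used only for illustration.
html = """
<html><body>
  <h1>Quotes</h1>
  <ul>
    <li class="quote"><a href="/q/1">First quote</a></li>
    <li class="quote"><a href="/q/2">Second quote</a></li>
  </ul>
</body></html>
"""

# Parse the document into a tree of elements.
soup = BeautifulSoup(html, "html.parser")

# Extract the text of the first <h1> element.
heading = soup.find("h1").get_text(strip=True)

# Traverse every <li class="quote"> and collect its link text and href.
links = []
for item in soup.find_all("li", class_="quote"):
    anchor = item.find("a")
    links.append((anchor.get_text(strip=True), anchor["href"]))

print(heading)
print(links)
```

The same idea scales to real pages: fetch the HTML first, then parse, traverse, and extract as above.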
We will cover the following topics:
- Web scraping using Beautiful Soup
- Web scraping using Scrapy
- Deploying a web crawler (learning...