By now, you should have a broad understanding of how to build a solid web scraper. You have learned how to collect information from the internet efficiently, safely, and respectfully. The tools at your disposal are enough to build web scrapers on a small to medium scale, which may be just what you need to accomplish your goals. However, there may come a day when you need to scale your application up to handle large, production-sized projects. You may be lucky enough to make a living from offering web scraping services, and, as that business grows, you will need an architecture that is robust and manageable. In this chapter, we will review the architectural components that make up a good web scraping system and look at example projects from the open source community. Here are the topics we will discuss:
- Components of a web scraping system...