The following figure illustrates the high-level design of the pipeline that we will be building throughout the first half of this chapter:
![](https://static.packt-cdn.com/products/9781838554491/graphics/assets/c88d9228-c7fa-46e8-813d-1e7464d29804.png)
Figure 1: A generic, multistage pipeline
Keep in mind that this is neither the only nor necessarily the best way to implement a data-processing pipeline. Pipelines are inherently application-specific, so there is no one-size-fits-all guide for constructing efficient ones.
Having said that, the proposed design is applicable to a wide variety of use cases, including, but not limited to, the crawler component for the Links 'R' Us project. Let's examine the preceding figure in a bit more detail and identify the basic components that the pipeline comprises:
- The input source: Inputs essentially function as data sources that pump data into the pipeline...