Web spidering starts with a URL, or a list of URLs, to visit. When the spider fetches a page, it parses the page to identify all of its hyperlinks and adds those links to the list of URLs to be crawled. This process continues recursively for as long as new URLs are found.
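To make this loop concrete, the following is a minimal sketch using only the Python standard library; it is not the Scrapy recipe that follows, and the LinkParser helper, the max_pages limit, and the crawl function name are hypothetical choices made for this illustration:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, max_pages=10):
    to_visit = [start_url]   # frontier: URLs still waiting to be crawled
    visited = set()          # URLs already fetched
    while to_visit and len(visited) < max_pages:
        url = to_visit.pop(0)
        if url in visited:
            continue
        visited.add(url)
        try:
            html = urlopen(url).read().decode("utf-8", errors="ignore")
        except OSError:
            continue         # skip pages that fail to download
        parser = LinkParser()
        parser.feed(html)
        # Every hyperlink found on the page is added to the frontier,
        # so the crawl keeps going as long as new URLs appear.
        for link in parser.links:
            absolute = urljoin(url, link)
            if absolute not in visited:
                to_visit.append(absolute)
    return visited
```

The max_pages cap is only there to keep the example finite; a real spider would instead apply politeness rules, domain filters, and deduplication before adding URLs to the frontier.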
A web spider can discover new URLs and queue them for crawling, or it can download useful data from the pages it visits. In the following recipe, we will use Scrapy to create a web spider.
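To give a feel for the shape of such a spider before the recipe proper, here is a minimal sketch of a Scrapy spider; the BookSpider name, the books.toscrape.com practice site, and the CSS selectors are assumptions chosen for illustration, not the code built in the recipe:

```python
import scrapy


class BookSpider(scrapy.Spider):
    name = "books"
    start_urls = ["http://books.toscrape.com/"]

    def parse(self, response):
        # Download useful data from the current page
        for title in response.css("h3 a::attr(title)").getall():
            yield {"title": title}

        # Find a new URL and add it to the crawl, continuing recursively
        next_page = response.css("li.next a::attr(href)").get()
        if next_page is not None:
            yield response.follow(next_page, callback=self.parse)
```

Saved as a standalone file, a sketch like this can be run with scrapy runspider spider.py -o items.json, which writes the scraped items to a JSON file.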