As their name indicates, link extractors are objects used to extract links from a Scrapy response object. Scrapy ships with a built-in link extractor, scrapy.linkextractors.LinkExtractor.
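To make the idea concrete before we turn to Scrapy's own class, here is a minimal sketch of what a link extractor does conceptually: collect the href of every anchor tag, resolve it against the page URL, filter by domain, and drop duplicates. It uses only the Python standard library and is not how Scrapy implements it; the names AnchorCollector, extract_links, and SAMPLE_HTML are hypothetical and exist only for this illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class AnchorCollector(HTMLParser):
    """Collect the raw href value of every <a> tag (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url, allowed_domains=()):
    """Return absolute, de-duplicated links in document order.

    Simplification: a link is kept only if its host exactly matches one
    of allowed_domains (an empty tuple keeps every link).
    """
    collector = AnchorCollector()
    collector.feed(html)
    links = []
    for href in collector.hrefs:
        url = urljoin(base_url, href)  # resolve relative links
        host = urlparse(url).netloc
        if allowed_domains and host not in allowed_domains:
            continue  # skip links that point outside the allowed domains
        if url not in links:
            links.append(url)
    return links


# Made-up sample markup, assumed for illustration only
SAMPLE_HTML = """
<a href="catalogue/page-2.html">next</a>
<a href="/index.html">home</a>
<a href="http://example.com/elsewhere">external</a>
<a href="catalogue/page-2.html">next again</a>
"""

links = extract_links(
    SAMPLE_HTML,
    "http://books.toscrape.com/",
    allowed_domains=("books.toscrape.com",),
)
for link in links:
    print(link)
```

With the sample markup, the external link and the repeated link are dropped, so only the two links on books.toscrape.com remain. Scrapy's built-in extractor handles many more cases for us, which is why the recipe uses it rather than hand-written parsing.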
Link extractor with Scrapy
How to do it...
Let's build a simple link extractor with Scrapy:
- As in the previous recipe, create another spider, this time to collect all the links.
- In the new spider file, import the required modules:
import scrapy
# LinkExtractor lives in the scrapy.linkextractors module
from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import Rule, CrawlSpider
- Create a new spider class and initialize the variables:
class HomeSpider2(CrawlSpider):
    name = 'home2'
    allowed_domains = ['books.toscrape.com']
    start_urls...