When should you use Scrapy's CrawlSpider?

CrawlSpiders
You can quickly generate a CrawlSpider template with the following command:

scrapy genspider -t crawl tencent tencent.com

In the previous example, we built each new URL ourselves with a regular expression and passed it to a Request. With CrawlSpider, we can replace that approach.

class scrapy.spiders.CrawlSpider

It is a subclass of Spider. By default, the Spider class crawls only the pages listed in start_urls, while the CrawlSpider class lets you define a set of rules (Rule objects) that provide a convenient mechanism for following links. It is therefore the better fit when you want to extract links from the pages you have already crawled and keep crawling them.

Mar.20,2021