Solutions to common problems when implementing distributed crawling with scrapy-redis

  1. After a scrapy-redis distributed crawl has started, can you run scrapy runspider xx.py on a new machine to add a slave while the crawl is in progress? Will the new slave crawl the same URLs?
  2. A project has the scrapy-redis settings (REDIS_HOST, etc.) configured in settings.py, but the spider in spider.py inherits from scrapy.Spider rather than RedisSpider. If you run the crawler on one machine and later start it on another machine (not at the same time), will they crawl the same URLs? (A sketch of this setup follows the list.)
  3. In a crawler project, what is the difference between scrapy crawl and scrapy runspider?
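For reference, here is a minimal sketch of the setup questions 1 and 2 describe. The concrete values (REDIS_HOST, spider names, URLs, the redis_key) are placeholder assumptions for illustration; the class paths themselves (scrapy_redis.scheduler.Scheduler, scrapy_redis.dupefilter.RFPDupeFilter, scrapy_redis.spiders.RedisSpider) are the ones scrapy-redis ships.

```python
# settings.py -- the scrapy-redis settings question 2 refers to.
# REDIS_HOST/REDIS_PORT below are placeholder values, not from the original project.
SCHEDULER = "scrapy_redis.scheduler.Scheduler"              # request queue lives in Redis, shared by all machines
DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter"  # request fingerprints are deduplicated in Redis
SCHEDULER_PERSIST = True                                    # keep the queue and dedup set when a spider stops
REDIS_HOST = "127.0.0.1"                                    # the shared Redis server every slave points at
REDIS_PORT = 6379
```

And the two base classes contrasted in question 2; the key difference is where start URLs come from:

```python
# spider.py -- contrasting the two base classes from question 2.
# Spider names, the example URL, and the redis_key are hypothetical.
import scrapy
from scrapy_redis.spiders import RedisSpider


class PlainSpider(scrapy.Spider):
    # Inherits scrapy.Spider: start URLs are hard-coded in the class.
    name = "plain"
    start_urls = ["https://example.com"]

    def parse(self, response):
        yield {"url": response.url}


class SharedSpider(RedisSpider):
    # Inherits RedisSpider: start URLs are popped from a Redis list,
    # so new slaves can be fed work at any time, e.g.:
    #   redis-cli lpush shared:start_urls https://example.com
    name = "shared"
    redis_key = "shared:start_urls"

    def parse(self, response):
        yield {"url": response.url}
```

Note that it is SCHEDULER and DUPEFILTER_CLASS, not the spider's base class, that move the request queue and the dedup set into Redis; RedisSpider only changes where the start URLs are read from.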