In what order does Scrapy automatically turn pages and crawl?

I recently read Learning Scrapy, which describes a crawler that automatically turns pages and crawls the items on each page. The book says that Scrapy uses a last-in, first-out (LIFO) queue.

Suppose there are 30 items on each page and start_url is set to the first page. My understanding of LIFO is that the first item out should be the item at the bottom of the last page, but when I run the routine, the first result is the last item of the first page. In fact, the overall order is page one, then page two, and so on through the last page, while within each page the items are crawled from last to first.

This order is actually fine, but I thought the overall result should start from the last page: once no next link can be extracted on the last page, why doesn't the spider go straight to item_selector, extract the item links of the last page, and hand them to parse_item? Why is the first page handled first?

Is there a problem in my understanding of yield that leads to this misunderstanding? I hope to get your help.

import urlparse  # Python 2; on Python 3 use: from urllib.parse import urljoin

from scrapy.http import Request


def parse(self, response):
    # Get the next index URLs and yield Requests
    next_selector = response.xpath('//*[contains(@class,"next")]//@href')
    for url in next_selector.extract():
        yield Request(urlparse.urljoin(response.url, url))

    # Get item URLs and yield Requests
    item_selector = response.xpath('//*[@itemprop="url"]/@href')
    for url in item_selector.extract():
        yield Request(urlparse.urljoin(response.url, url),
                      callback=self.parse_item)
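To make the question concrete, here is a stripped-down model of the yield order in the parse() above (plain Python, no Scrapy; the strings merely stand in for Request objects):

def parse_order():
    # Mirrors parse() above: the next-page request is yielded first,
    # then the item requests in page order.
    yield "next-page"
    for i in range(1, 4):
        yield "item-%d" % i

print(list(parse_order()))
# ['next-page', 'item-1', 'item-2', 'item-3']
# yield only hands requests to the scheduler one by one; it downloads
# nothing itself, so this generator order is not the download order.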

I looked at the source code and drew the following conclusions:
1. For each response at a given depth, the default priority of every request issued from parse is -(depth+1).
2. Scrapy's scheduler queue is a PriorityQueue (a priority queue; smaller priority values pop first). The elements in this queue are LifoQueues, that is, the last-in-first-out queues you mention; each LifoQueue corresponds to one priority.
3. Each yielded request is pushed onto the LifoQueue for its priority, so among requests with the same priority the one yielded last is popped first. On page one, the item requests are yielded after the next-page request, so they are downloaded first, from the last item back to the first, and only then is page two fetched.
I hope you can understand what I'm saying.
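
To make this concrete, here is a toy model of the scheduler described above (plain Python; ToyScheduler and its internals are made up for illustration and are not Scrapy's actual queuelib-based classes):

import heapq
from collections import defaultdict

class ToyScheduler:
    """A priority queue of LIFO stacks: the smallest priority value
    pops first, and within one priority the most recently pushed
    request pops first."""
    def __init__(self):
        self._stacks = defaultdict(list)  # priority -> LIFO stack
        self._heap = []                   # min-heap of priorities in use

    def push(self, request, depth):
        priority = -(depth + 1)           # the default described above
        if not self._stacks[priority]:
            heapq.heappush(self._heap, priority)
        self._stacks[priority].append(request)

    def pop(self):
        priority = self._heap[0]
        request = self._stacks[priority].pop()
        if not self._stacks[priority]:
            heapq.heappop(self._heap)
        return request

sched = ToyScheduler()
# parse() on page one yields the next-page request first, then the items:
sched.push("page-2", depth=1)
for i in range(1, 31):
    sched.push("page1-item-%d" % i, depth=1)

print(sched.pop())  # page1-item-30: same priority, yielded last, pops first
print(sched.pop())  # page1-item-29

All 30 item requests pop before "page-2" does, which reproduces the order you observed: page one's items from last to first, then page two. Once page two's response arrives, its requests get priority -3, which is smaller and therefore jumps ahead of anything still queued at -2.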
