Does a Scrapy retry request carry a new request header and proxy IP?

My Scrapy project uses a custom middleware that subclasses RetryMiddleware.

Its purpose is to re-issue the current request when a CAPTCHA page is encountered, so that the crawled data is more complete.

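import random
import time

from scrapy.downloadermiddlewares.retry import RetryMiddleware


class LocalRetryMiddleware(RetryMiddleware):

    def process_response(self, request, response, spider):
        if request.meta.get("dont_retry", False):
            return response
        print("response body:", response.body)
        # Check whether the page contains the CAPTCHA image
        img = response.xpath('//img[@src="/Account/ValidateImage"]')
        print(img)
        if img:
            print("CAPTCHA detected")
            # Note: time.sleep() blocks Scrapy's event loop while it waits
            time.sleep(random.choice(range(6)))
            print("ip:", request.meta.get("proxy"))
            # _retry() returns a new request, or None once retries are used up
            return self._retry(request, "captcha", spider) or response
        # No CAPTCHA: pass the response through unchanged
        return response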

Does the code above actually re-issue the request?
If so, does the retried request carry a random User-Agent and a new proxy IP? A retry count is configured, but every returned result still contains a CAPTCHA.

Jul.15,2021

If your random UA and proxy IP are set by downloader middlewares, those middlewares run again on every retry, so a random choice will pick a new UA and proxy each time.
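One caveat, as far as I know: the retried request is a copy of the original, so it already carries the previous User-Agent header and meta["proxy"]. A middleware that only fills these in when they are missing would keep the old values. Below is a minimal sketch of a middleware that overwrites both on every pass. The class name, the USER_AGENTS and PROXIES pools, and their values are hypothetical and only for illustration.

```python
import random

# Hypothetical pools, for illustration only; replace with real values.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleBrowser/1.0",
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/2.0",
]
PROXIES = [
    "http://127.0.0.1:8001",
    "http://127.0.0.1:8002",
]


class RandomUserAgentProxyMiddleware:
    """Downloader middleware sketch: pick a fresh UA and proxy per request."""

    def process_request(self, request, spider):
        # Assign unconditionally: a retried request is a copy of the original,
        # so it already holds the previous User-Agent and meta["proxy"].
        request.headers["User-Agent"] = random.choice(USER_AGENTS)
        request.meta["proxy"] = random.choice(PROXIES)
        # Returning None lets Scrapy continue handling this request.
        return None
```

To try it, the class would be registered in the DOWNLOADER_MIDDLEWARES setting of settings.py under whatever module path your project uses. With only two entries in each pool a retry can still draw the same value again, so larger pools make a real change more likely.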
