How to remove a URL from the DupeFilter when a scrapy-redis crawl fails

Problem: when crawling a page, the request may return empty content due to network issues, yet the request is still recorded in the Redis DupeFilter, so the page can never be crawled again.
Question: while the spider is running, how can I manually remove the failed URL from the xx:dupefilter key in Redis?
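
(For background: scrapy-redis's RFPDupeFilter does not store raw URLs; it stores request fingerprints, SHA1 hashes over the request's method, URL and body, in a Redis set. A rough sketch of what its request_seen check does, assuming the default <spider>:dupefilter key layout and a spider named myspider:)

    import redis
    from scrapy.utils.request import request_fingerprint

    server = redis.StrictRedis(host='localhost', port=6379)

    def request_seen(request, key='myspider:dupefilter'):
        # SHA1 fingerprint of the request's method, URL and body
        fp = request_fingerprint(request)
        # SADD returns 1 if the fingerprint was new, 0 if already present
        added = server.sadd(key, fp)
        return added == 0

This is why the failed URL is skipped on the next attempt: its fingerprint is already in the set, so request_seen returns True.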

Apr.25,2022

Finally figured it out: import request_fingerprint from scrapy.utils.request; it computes the same fingerprint that scrapy-redis's dupefilter stores.

In the spider, manually check whether the response meets the crawling requirements, and if it does not, delete its fingerprint from Redis:

import json

from scrapy.utils.request import request_fingerprint

    def parse(self, response):
        ajaxT = json.loads(response.text)
        if ajaxT['status'] == 'success':
            # response is valid: parse the data as usual
            pass
        else:
            # failed response: remove its fingerprint from the Redis
            # dupefilter so the URL can be crawled again
            fp = request_fingerprint(response.request, include_headers=None)
            self.server.srem(self.name + ':dupefilter', fp)
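
Removing the fingerprint only makes the URL eligible to be crawled again; to retry it in the same run, you can also re-yield the request right after the srem. A minimal sketch of that else branch:

        else:
            fp = request_fingerprint(response.request, include_headers=None)
            self.server.srem(self.name + ':dupefilter', fp)
            # re-yield the request: with the fingerprint gone, the scheduler's
            # dupefilter accepts it again and records a fresh fingerprint
            yield response.request.replace(callback=self.parse)

Alternatively, yield response.request.replace(dont_filter=True) to bypass the dupefilter entirely for the retry; in that case the srem still keeps the Redis set consistent for later runs.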