Python crawler reports 404

2019-01-05 15:50:16 [csrc][scrapy.extensions.logstats] INFO: Crawled 167 pages (at 10 pages/min), scraped 0 items (at 0 items/min)
2019-01-05 15:50:19 [csrc][scrapy.core.engine] DEBUG: Crawled (404) <GET http://www.csrc.gov.cn/pub/zjhpublic/G00306202/201806/t20180622_340240.htm> (referer: http://www.csrc.gov.cn/pub/newsite/xxpl/yxpl/index_9.html)
2019-01-05 15:50:19 [csrc][scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 http://www.csrc.gov.cn/pub/zjhpublic/G00306202/201806/t20180622_340240.htm>: HTTP status code is not handled or not allowed
2019-01-05 15:50:24 [csrc][scrapy.core.engine] DEBUG: Crawled (200) <GET http://www.csrc.gov.cn/pub/zjhpublic/G00306202/201806/t20180622_340241.htm> (referer: http://www.csrc.gov.cn/pub/newsite/xxpl/yxpl/index_9.html)
2019-01-05 15:50:29 [csrc][scrapy.core.engine] DEBUG: Crawled (404) <GET http://www.csrc.gov.cn/pub/zjhpublic/G00306202/201806/t20180622_340243.htm> (referer: http://www.csrc.gov.cn/pub/newsite/xxpl/yxpl/index_9.html)
2019-01-05 15:50:29 [csrc][scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 http://www.csrc.gov.cn/pub/zjhpublic/G00306202/201806/t20180622_340243.htm>: HTTP status code is not handled or not allowed
2019-01-05 15:50:36 [csrc][scrapy.core.engine] DEBUG: Crawled (404) <GET http://www.csrc.gov.cn/pub/zjhpublic/G00306202/201806/t20180622_340245.htm> (referer: http://www.csrc.gov.cn/pub/newsite/xxpl/yxpl/index_9.html)
2019-01-05 15:50:36 [csrc][scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 http://www.csrc.gov.cn/pub/zjhpublic/G00306202/201806/t20180622_340245.htm>: HTTP status code is not handled or not allowed
2019-01-05 15:50:42 [csrc][scrapy.core.engine] DEBUG: Crawled (404) <GET http://www.csrc.gov.cn/pub/zjhpublic/G00306202/201806/t20180622_340247.htm> (referer: http://www.csrc.gov.cn/pub/newsite/xxpl/yxpl/index_9.html)
2019-01-05 15:50:42 [csrc][scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 http://www.csrc.gov.cn/pub/zjhpublic/G00306202/201806/t20180622_340247.htm>: HTTP status code is not handled or not allowed
2019-01-05 15:50:49 [csrc][scrapy.core.engine] DEBUG: Crawled (404) <GET http://www.csrc.gov.cn/pub/zjhpublic/G00306202/201806/t20180622_340246.htm> (referer: http://www.csrc.gov.cn/pub/newsite/xxpl/yxpl/index_9.html)
2019-01-05 15:50:49 [csrc][scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 http://www.csrc.gov.cn/pub/zjhpublic/G00306202/201806/t20180622_340246.htm>: HTTP status code is not handled or not allowed


Apr.02,2022

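Those articles have been deleted from the site, so the server returns 404 for their URLs. Scrapy then ignores the responses, which is why the log says "HTTP status code is not handled or not allowed".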
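
To check which pages are actually gone, one option is to tally the status codes in the crawl log. The sketch below is only an illustration using the standard library: the `summarize` function, the `CRAWLED` pattern, and the two-line `sample` are assumptions made for this example, not part of the original spider.

```python
import re
from collections import Counter

# Matches Scrapy engine lines of the form:
#   ... DEBUG: Crawled (404) <GET http://host/page.htm> (referer: ...)
# The periodic logstats line ("Crawled 167 pages ...") does not match.
CRAWLED = re.compile(r"Crawled \((\d{3})\) <GET (\S+)>")


def summarize(log_text):
    """Return (Counter of status codes, list of URLs that returned 404)."""
    counts = Counter()
    missing = []
    for line in log_text.splitlines():
        match = CRAWLED.search(line)
        if match is None:
            continue  # not an engine "Crawled (...)" line
        status = int(match.group(1))
        url = match.group(2)
        counts[status] += 1
        if status == 404:
            missing.append(url)
    return counts, missing


# Two lines taken from the log above, used as sample input.
sample = (
    "2019-01-05 15:50:19 [csrc][scrapy.core.engine] DEBUG: Crawled (404) "
    "<GET http://www.csrc.gov.cn/pub/zjhpublic/G00306202/201806/t20180622_340240.htm> "
    "(referer: http://www.csrc.gov.cn/pub/newsite/xxpl/yxpl/index_9.html)\n"
    "2019-01-05 15:50:24 [csrc][scrapy.core.engine] DEBUG: Crawled (200) "
    "<GET http://www.csrc.gov.cn/pub/zjhpublic/G00306202/201806/t20180622_340241.htm> "
    "(referer: http://www.csrc.gov.cn/pub/newsite/xxpl/yxpl/index_9.html)\n"
)

counts, missing = summarize(sample)
print(dict(counts))
print(missing)
```

If the spider itself needs to see these responses (for example, to record the dead links), Scrapy lets a spider declare `handle_httpstatus_list = [404]` so that 404 responses reach the callback instead of being dropped by the HttpError middleware.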
