Python 3 crawler suddenly raises HTTPError

A website I used to crawl without problems has suddenly added anti-crawler protection. Now, even when I only use urlopen from urllib and only try to fetch the home page HTML, this error occurs.
When headers are added to the Request:

Request headers

But every browser can access the site normally.

I would like to ask the experts: in this situation, is there really nothing urllib can do? Is the only option to switch to scrapy or selenium?

Sep.24,2021

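304 means that the resource has not been modified. You are most likely sending the wrong headers. A server normally returns 304 only in response to a conditional request, so check whether you copied headers such as If-Modified-Since or If-None-Match from the browser.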
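To illustrate the point about headers, here is a minimal sketch (not the asker's actual code) that drops conditional headers before the request is sent. The names build_request, fetch and CONDITIONAL_HEADERS are hypothetical helpers made up for this example; only the urllib calls are from the standard library.

```python
import urllib.error
import urllib.request

# Headers that turn a request into a conditional one. A server may answer
# such a request with 304 Not Modified and an empty body.
CONDITIONAL_HEADERS = {"if-modified-since", "if-none-match"}


def build_request(url, headers):
    """Return a Request carrying `headers` minus any conditional headers."""
    cleaned = {
        name: value
        for name, value in headers.items()
        if name.lower() not in CONDITIONAL_HEADERS
    }
    return urllib.request.Request(url, headers=cleaned)


def fetch(url, headers):
    """Fetch `url` and return (status, body) instead of raising HTTPError."""
    request = build_request(url, headers)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for status codes it does not handle
        # itself, which includes 304 as well as 4xx and 5xx responses.
        return err.code, err.read()
```

If fetch still reports 304 or another error code after the conditional headers are removed, the site is probably checking something other than those headers.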


Capture the packets and analyze them. What is the difference between the request your crawler makes and the one the browser makes?
Are you also using a proxy pool?
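One way to make that comparison concrete is to copy the browser's request headers from its developer tools into a dict and diff them against what the crawler sends. The sketch below assumes both header sets are already available as plain dicts; diff_headers is a hypothetical helper, and the header values in the usage lines are placeholders rather than captured data.

```python
def diff_headers(browser, crawler):
    """Compare two header dicts, ignoring the case of header names."""
    b = {name.lower(): value for name, value in browser.items()}
    c = {name.lower(): value for name, value in crawler.items()}
    return {
        # Sent by the browser but not by the crawler.
        "missing_in_crawler": sorted(b.keys() - c.keys()),
        # Sent by the crawler but not by the browser.
        "extra_in_crawler": sorted(c.keys() - b.keys()),
        # Sent by both, but with different values.
        "different_values": sorted(
            name for name in b.keys() & c.keys() if b[name] != c[name]
        ),
    }


# Placeholder header sets for illustration only.
browser_headers = {
    "User-Agent": "Mozilla/5.0",
    "Accept-Language": "en-US",
    "Cookie": "session=placeholder",
}
crawler_headers = {
    "user-agent": "Python-urllib/3.9",
    "If-None-Match": "placeholder-etag",
}

report = diff_headers(browser_headers, crawler_headers)
for category, names in report.items():
    print(category, names)
```

Headers only cover part of the picture. Cookies set by earlier responses or the IP address the request comes from can also differ, which is why the proxy question matters.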
