Given the current structure of Sina Weibo, how do I crawl all the posts of a single user?

Recently, I followed a well-informed blogger on Sina Weibo. He has more than 20,000 Weibo posts, mostly in plain text.

If any of you have worked on data collection and crawlers, could you share your thinking and approach for this kind of task? (I am still feeling my way through it myself.)

Mar.17,2021

I once wrote an article on crawling Weibo using puppeteer.js, which fully simulates user behavior, so the crawler is not detected and blocked.
You can take a look at this library.


It is illegal to crawl Weibo. Please read Weibo's user agreement carefully. So if you do it, do it quietly, not with so much fanfare.


Java
I have never crawled Weibo, but the idea is to first obtain the authentication cookie, token, and so on, then capture packets with Fiddler, mainly to find the interface that requests the data, and then extract the Weibo post content with Jsoup and persist it.
As for the source, there should be an App interface, a PC page, or an H5 page; see which one is easiest to work with.
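
The flow described above (reuse an authenticated cookie, call the data interface page by page, persist the posts) might be sketched roughly as follows. This is a minimal Python sketch, not the answerer's Java/Jsoup code. The URL template, the `page` placeholder, and the assumption that each page comes back as a JSON list are all hypothetical; the real values have to come from whatever you capture in Fiddler.

```python
import json
import time
import urllib.request
from typing import Callable, Iterable, Iterator


def build_request(url: str, cookie: str) -> urllib.request.Request:
    """Attach the cookie copied from a logged-in browser session."""
    return urllib.request.Request(
        url,
        headers={"Cookie": cookie, "User-Agent": "Mozilla/5.0"},
    )


def make_http_fetcher(url_template: str, cookie: str) -> Callable[[int], list]:
    """Build a fetch_page function.

    Assumption: url_template contains a "{page}" placeholder and the
    endpoint answers with a JSON list of posts. Both are hypothetical.
    """
    def fetch_page(page: int) -> list:
        request = build_request(url_template.format(page=page), cookie)
        with urllib.request.urlopen(request, timeout=30) as response:
            return json.loads(response.read().decode("utf-8"))
    return fetch_page


def crawl_pages(fetch_page: Callable[[int], list],
                max_pages: int = 1000,
                delay_seconds: float = 0.0) -> Iterator[dict]:
    """Yield posts page by page and stop at the first empty page."""
    for page in range(1, max_pages + 1):
        posts = fetch_page(page)
        if not posts:
            break
        yield from posts
        time.sleep(delay_seconds)  # pause between pages


def save_jsonl(posts: Iterable[dict], path: str) -> int:
    """Persist one post per line and return how many were written."""
    count = 0
    with open(path, "w", encoding="utf-8") as out:
        for post in posts:
            out.write(json.dumps(post, ensure_ascii=False) + "\n")
            count += 1
    return count
```

Passing the fetch function in as a parameter keeps the paging and persistence logic independent of which source (App interface, PC page, or H5 page) ends up being used.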


I previously wrote a simulated login in Java and crawled my own private messages.
Because I was lazy, instead of using Weibo's API, I used Fiddler to capture packets, analyze the parameters, simulate a browser login, send requests, and parse the JSON.
The disadvantage is that this approach is rather passive: as soon as the other side changes a parameter, the program stops working.
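
One way to soften the drawback above is to validate the shape of each response, so the crawler fails with a clear message as soon as the interface changes instead of silently storing bad data. The sketch below is in Python and rests on assumptions: the field names `data`, `list`, `id`, and `text` are hypothetical stand-ins for whatever the captured responses actually contain.

```python
import json


class ResponseShapeError(ValueError):
    """Raised when a response no longer matches what the crawler expects."""


def extract_posts(raw_body: str) -> list:
    """Pull id/text pairs out of a response body, checking its shape.

    Assumption: posts live under payload["data"]["list"] and each post
    has "id" and "text" keys. These names are hypothetical.
    """
    try:
        payload = json.loads(raw_body)
    except json.JSONDecodeError as exc:
        raise ResponseShapeError(f"response is not JSON: {exc}") from exc

    try:
        items = payload["data"]["list"]
    except (KeyError, TypeError) as exc:
        raise ResponseShapeError("expected payload['data']['list']") from exc

    if not isinstance(items, list):
        raise ResponseShapeError("payload['data']['list'] is not a list")

    posts = []
    for item in items:
        if not isinstance(item, dict) or "id" not in item or "text" not in item:
            raise ResponseShapeError(f"unexpected post entry: {item!r}")
        # Keep only the fields the crawler relies on.
        posts.append({"id": item["id"], "text": item["text"]})
    return posts
```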

If I were asked to write another one now, I would choose to write a Chrome extension. After all, it runs in the browser, so there is no need to worry about authentication; just crawl.

If you can't be bothered to write an extension, you can take a look at this:
"Without writing code: Webscraper grabs Li Xiaolai in 30 seconds"


Weibo has its own open platform, so you can get the data through Weibo's API. There is no need to use a crawler.
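
As a rough illustration of the API route, the snippet below only builds a request URL for a user's timeline. It is a sketch under assumptions: the endpoint path and the parameter names (`access_token`, `uid`, `page`, `count`) are written from memory of the open platform documentation and should be checked there, along with any limits on whose posts and how many can be returned.

```python
from urllib.parse import urlencode

# Assumption: endpoint path as recalled from the open platform docs;
# verify it before relying on it.
USER_TIMELINE_URL = "https://api.weibo.com/2/statuses/user_timeline.json"


def user_timeline_url(access_token: str, uid: str,
                      page: int = 1, count: int = 20) -> str:
    """Build the query URL for one page of a user's posts."""
    query = urlencode({
        "access_token": access_token,  # token issued by the open platform
        "uid": uid,                    # numeric ID of the target user
        "page": page,
        "count": count,
    })
    return f"{USER_TIMELINE_URL}?{query}"
```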