How does a crawler determine when to stop?

I'm writing a crawler and want to know: how do I tell when the crawler should stop?
The initial state is a single URL, and then there is a loop like:

while(isNotEmpty(urlList)){
    // do something
}

My idea is the loop above, but the rate at which URLs are enqueued can't keep up with the rate at which they are consumed, so urlList empties and the crawler stops early. I'd like to ask anyone who has written a crawler framework: under what conditions should the crawler stop running?
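For reference, here is a minimal sketch of the loop in the question, assuming a hypothetical `fetch_links(url)` function that returns the URLs found on a page. With a visited set for deduplication, the loop terminates naturally once every reachable URL has been crawled, because no URL is ever enqueued twice:

```python
from collections import deque

def crawl(start_url, fetch_links):
    visited = {start_url}          # dedup: never enqueue a URL twice
    url_list = deque([start_url])  # the frontier ("urlList" in the question)
    order = []
    while url_list:                # while(isNotEmpty(urlList))
        url = url_list.popleft()
        order.append(url)          # "do something" with the page here
        for link in fetch_links(url):
            if link not in visited:
                visited.add(link)
                url_list.append(link)
    return order                   # loop exits when the frontier drains

# Usage with a tiny in-memory "site" instead of real HTTP:
site = {"/": ["/a", "/b"], "/a": ["/b"], "/b": ["/"]}
print(crawl("/", lambda u: site.get(u, [])))  # → ['/', '/a', '/b']
```

Because the crawled URL space is finite and deduplicated, the frontier must eventually drain; an empty urlList then genuinely means "done", not "the producer fell behind".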

Apr.02,2021

The idea is a little strange: the links in urlList are ones you put there yourself — put one in, crawl one. The crawler stops when you stop putting links into urlList; once no new links are discovered, the list drains and the loop exits.


It depends on the specific circumstances of what is being crawled:

1:
2: a Kafka topic
How the crawler stops depends on your own business requirements.
As long as the crawler deduplicates URLs, it will eventually finish.
If the crawler needs to be controllable, use a single process instead of multithreading; then you can stop the crawler by killing the process.
Deploy the crawler project under screen.
