Headless browser cannot perform a search

Company search on Qichacha does not work with a Selenium headless browser

https://www.qichacha.com/

Mar.18,2021

The site may have anti-crawler measures. Selenium still has some telltale characteristics, such as special properties on global objects, that a site can check for.
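
As an illustration of what such a check could look at, here is a minimal sketch. It assumes the site reads `navigator.webdriver` (which WebDriver-controlled browsers set to true) and the user agent (headless Chrome has historically reported `HeadlessChrome` there). Whether Qichacha actually checks either is not known; the names `PROBE_JS`, `automation_traces` and `check_site` are made up for this example.

```python
# Sketch: list automation traces that a page script could read.
# Assumption: the site inspects navigator.webdriver and the user agent;
# the real anti-crawler logic is unknown.
PROBE_JS = """
return {
    webdriver: navigator.webdriver === true,
    userAgent: navigator.userAgent
};
"""


def automation_traces(driver):
    """Return human-readable traces found in the driver's current page."""
    info = driver.execute_script(PROBE_JS)
    traces = []
    if info.get("webdriver"):
        traces.append("navigator.webdriver is true")
    if "HeadlessChrome" in info.get("userAgent", ""):
        traces.append("user agent contains 'HeadlessChrome'")
    return traces


def check_site(url):
    """Open `url` in headless Chrome and report the traces (hypothetical helper)."""
    from selenium import webdriver  # imported lazily; needs selenium and Chrome

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return automation_traces(driver)
    finally:
        driver.quit()
```

Calling `check_site('https://www.qichacha.com/')` returns a list of the traces found; an empty list only means these two particular markers were absent, not that the browser is undetectable.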


I see that you have asked a lot of questions about crawling, so here is a reminder:
if you use ChromeDriver in headless mode, pages that load JS scripts inserted through document.write() may not work properly. Refer to a question on Stack Overflow.
Example:

>>> from selenium import webdriver
>>> option = webdriver.ChromeOptions()
>>> option.add_argument('--headless')
>>> driver = webdriver.Chrome(options=option)
[0608/163830.206:ERROR:gpu_process_transport_factory.cc(1007)] Lost UI shared context.

DevTools listening on ws://127.0.0.1:60357/devtools/browser/36a1f861-d1ab-4cef-a5a9-3072bbada0fc
>>> driver.get('https://www.baidu.com')
[0608/163849.677:INFO:CONSOLE(715)] "A parser-blocking, cross site (i.e. different eTLD+1) script, https://ss1.bdstatic.com/5eN1bjq8AAUYm2zgoY3K/r/www/cache/static/protocol/https/global/js/all_async_search_8d20902.js, is invoked via document.write. The network request for this script MAY be blocked by the browser in this or a future page load due to poor network connectivity. If blocked in this page load, it will be confirmed in a subsequent console message. See https://www.chromestatus.com/feature/5718547946799104 for more details.", source: https://www.baidu.com/ (715)

Here, https://ss1.bdstatic.com/5eN1bjq8AAUYm2zgoY3K/r/www/cache/static/protocol/https/global/js/all_async_search_8d20902.js is written into the HTML through document.write() and then loaded. Chrome may block the request for such a script, in which case it is never executed; that is what the console message above reports.
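
To see whether a given page triggers this message, the browser console log can be scanned from Python. This is a minimal sketch, assuming a recent Selenium release and Chrome with console logging enabled through the `goog:loggingPrefs` capability; the helper names `document_write_warnings` and `collect_warnings` are made up for this example.

```python
# Sketch: pick out Chrome's document.write messages from console log entries.
MARKER = "is invoked via document.write"


def document_write_warnings(entries):
    """Return the messages of log entries that mention document.write.

    `entries` is a list of dicts with a "message" key, which is the shape
    returned by driver.get_log("browser") for Chrome.
    """
    return [e["message"] for e in entries if MARKER in e.get("message", "")]


def collect_warnings(url):
    """Open `url` in headless Chrome and return its document.write messages."""
    from selenium import webdriver  # imported lazily; needs selenium and Chrome

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    # Ask ChromeDriver to keep browser console messages of every level.
    options.set_capability("goog:loggingPrefs", {"browser": "ALL"})
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return document_write_warnings(driver.get_log("browser"))
    finally:
        driver.quit()
```

If `collect_warnings('https://www.baidu.com')` returns a non-empty list, the page relies on scripts injected with document.write() and is a candidate for the problem described here.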

Firefox doesn't have this problem, so I recommend using Firefox's headless mode, or PhantomJS, another headless browser.
Firefox example:

from selenium import webdriver

# Run Firefox without a visible window
option = webdriver.FirefoxOptions()
option.add_argument('--headless')
driver = webdriver.Firefox(options=option)
driver.get('https://www.qichacha.com')
# ...

Of course, you need to install Firefox before using it.
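
A quick way to check the installation before starting the driver is sketched below. It assumes the executables are named `firefox` and `geckodriver` (the driver Selenium uses to control Firefox) and that they are expected on PATH; newer Selenium releases may fetch the driver themselves, so a missing `geckodriver` is not always fatal. The helper name `missing_executables` is made up for this example.

```python
import shutil


def missing_executables(names, which=shutil.which):
    """Return the entries of `names` that cannot be found on PATH."""
    return [name for name in names if which(name) is None]


def firefox_ready():
    """Return True if both assumed executables are found on PATH."""
    # The executable names are assumptions; they differ on some systems.
    return not missing_executables(["firefox", "geckodriver"])
```

`missing_executables(['firefox', 'geckodriver'])` returns the names that still need to be installed, or an empty list if both are found.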

