Problems encountered in crawling articles

I ran into problems when fetching headline articles.
Why does the article content differ depending on how the page is retrieved?

I retrieved the same page four ways: fetching it with curl and saving the DOM to a file, fetching it with PHPSPIDER and saving the DOM to a file, right-clicking the page in Google Chrome and inspecting the element, and viewing the page source. To my surprise, the DOMs are all different. The browser case is easy to understand: the DOM presumably changes once the page's JS has run.

Any guidance is welcome. What I am trying to do is crawl the headline article at a user-specified link.

Php

Nov.23,2021

The content is probably generated dynamically by JS. If you want the content, parse the string in the page's JS script directly.
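A minimal sketch of that approach, assuming the article body is stored as a double-quoted string inside an inline script. The variable name `articleInfo`, the `content` key, and the sample HTML are all hypothetical; the real page's structure has to be read from its raw source (what curl returns).

```python
import html
import json
import re

# Hypothetical raw HTML as returned by curl: the visible <div> is empty and
# the article text only exists inside an inline script.
RAW_HTML = """
<html><body>
<div id="article"></div>
<script>
var articleInfo = {
  title: 'Example',
  content: "&lt;p&gt;Hello, \\u4e16\\u754c&lt;/p&gt;"
};
</script>
</body></html>
"""

# Match the double-quoted string literal assigned to the (assumed) `content` key,
# allowing backslash escapes inside the literal.
CONTENT_RE = re.compile(r'content:\s*("(?:[^"\\]|\\.)*")')


def extract_content(raw_html):
    """Return the decoded article HTML, or None if the pattern is absent."""
    match = CONTENT_RE.search(raw_html)
    if match is None:
        return None
    # A double-quoted JS string that uses only JSON-compatible escapes
    # can be decoded with the JSON parser.
    js_string = json.loads(match.group(1))
    # The sample stores the HTML entity-escaped, so unescape it.
    return html.unescape(js_string)


if __name__ == "__main__":
    print(extract_content(RAW_HTML))
```

The regular expression is tied to one page layout, so it needs adjusting whenever the site changes its script. The upside is that no browser is required.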

If you would rather crawl based on the rendered DOM, you can use Google's Headless Chromium.
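A rough sketch of the headless route, assuming a Chromium or Chrome executable is installed and on `PATH`. The `--headless` and `--dump-dom` flags print the DOM after the page's scripts have run. The candidate executable names and the helper function names are assumptions for illustration, and the timeout value is arbitrary.

```python
import shutil
import subprocess
import sys

# Executable names to try; which one exists depends on the system (assumption).
CANDIDATE_BINARIES = ("chromium", "chromium-browser", "google-chrome", "chrome")


def build_dump_dom_command(url, binary="chromium"):
    """Build the command line that prints the DOM after scripts have run."""
    return [binary, "--headless", "--disable-gpu", "--dump-dom", url]


def find_browser():
    """Return the path of the first candidate browser found on PATH, or None."""
    for name in CANDIDATE_BINARIES:
        path = shutil.which(name)
        if path:
            return path
    return None


def dump_rendered_dom(url, timeout=30):
    """Run the headless browser and return the serialized, rendered DOM."""
    binary = find_browser()
    if binary is None:
        raise RuntimeError("no Chromium/Chrome executable found on PATH")
    result = subprocess.run(
        build_dump_dom_command(url, binary),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return result.stdout


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python dump_dom.py <article-url>
    print(dump_rendered_dom(sys.argv[1]))
```

The output should be close to what the browser's element inspector shows, which is why it differs from the curl result. Content that loads only after scrolling or other interaction may still be missing.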


