What should I do if the data fetched by a Node.js crawler is a template string similar to EJS?

This is a practice project: I want to scrape the artist information from NetEase Cloud Music.
My code looks roughly like this:

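const request = require("superagent");
const cheerio = require("cheerio");

request
    // Note: everything after "#" is a URL fragment and is never sent to the server.
    .get("http://music.163.com/#/discover/artist/cat")
    .accept("text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8")
    .query({
        id: 1001,
        initial: 65
    })
    .then((res) => {
        // Load the returned HTML into cheerio and print it for inspection.
        const $ = cheerio.load(res.text);
        console.log($.html());
    })
    .catch(console.error);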

But the data I get back is a template string:

<li class="j-item" data-userId=${user.userId} data-username=${user.nickname} data-index=${user_index}><a href="#"><img src=${user.avatarUrl}>${user.nickname}</a></li>

What should I do? Thank you!


First of all, getting ${user.userId} back means the page is not rendered on the server, so the data is filled in by JavaScript on the front end.
That leaves you two ways to get the data.
The first is to inspect the HTTP requests the page makes, including resource files and XHR calls, to find which request returns the data. This may not give you exactly what the page displays, because the front end may concatenate, split, or filter the response before rendering it, so you may have to read the page's code.
The second is to load the page in a headless browser such as puppeteer, let the JavaScript finish executing, and then read the values from the rendered elements.
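
The headless-browser route could look roughly like the sketch below. It is not a tested recipe for this site: it assumes puppeteer is installed, that the list items match "li.j-item" with the data-* attributes from the template in the question, and that they live in the main document. The function names extractItems and crawl are made up for illustration.

```javascript
// Sketch only: assumes `npm install puppeteer`. The selector "li.j-item" and
// the data-* attribute names come from the template in the question.

// Reads the rendered list items from an already-loaded page.
async function extractItems(page, selector) {
  // Wait until the front-end code has inserted at least one matching element.
  await page.waitForSelector(selector);
  // $$eval runs the callback inside the browser against all matching elements.
  return page.$$eval(selector, (nodes) =>
    nodes.map((li) => ({
      userId: li.getAttribute("data-userid"),
      nickname: li.getAttribute("data-username"),
    }))
  );
}

// Hypothetical entry point: launches a headless browser and returns the items.
async function crawl(url) {
  const puppeteer = require("puppeteer"); // loaded lazily, only when crawling
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // "networkidle2" waits until the page has mostly stopped making requests.
    await page.goto(url, { waitUntil: "networkidle2" });
    return await extractItems(page, "li.j-item");
  } finally {
    await browser.close();
  }
}
```

You would call it as crawl("http://music.163.com/#/discover/artist/cat").then(console.log). If the site renders the list inside an iframe, the same extraction would have to run against the matching entry of page.frames() instead of the page itself.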


Obviously you should look for the user object in a script tag in the HTML instead of staring at the template. Of course, you can also load the page with phantom.js and then scrape it.
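
As a rough illustration of pulling data out of a script tag, the sketch below assumes the page embeds its data as a valid JSON literal assigned to a variable, for example `var users = [...]`. Both that layout and the helper name extractEmbeddedJson are assumptions; the actual page may store the data differently, so check the HTML source first.

```javascript
// Hypothetical helper: finds `<varName> = <JSON literal>` in the raw HTML and
// parses the literal. Only works when the literal is strict JSON (quoted keys).
function extractEmbeddedJson(html, varName) {
  // Match the assignment and stop right before the opening "[" or "{".
  const marker = new RegExp("\\b" + varName + "\\s*=\\s*(?=[\\[{])");
  const match = marker.exec(html);
  if (!match) return null;

  const start = match.index + match[0].length;
  let depth = 0;
  let inString = false;
  // Scan forward to the bracket that closes the literal, ignoring brackets
  // that appear inside double-quoted strings.
  for (let i = start; i < html.length; i++) {
    const ch = html[i];
    if (inString) {
      if (ch === "\\") i++; // skip the escaped character
      else if (ch === '"') inString = false;
    } else if (ch === '"') {
      inString = true;
    } else if (ch === "[" || ch === "{") {
      depth++;
    } else if (ch === "]" || ch === "}") {
      depth--;
      if (depth === 0) return JSON.parse(html.slice(start, i + 1));
    }
  }
  return null; // the literal was never closed
}
```

With the superagent code from the question, this would be called on res.text, for example extractEmbeddedJson(res.text, "users"), where "users" stands in for whatever variable name the page really uses.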


This may be what you need: http://music.163.com/discover
