BeautifulSoup cannot extract the number shown on the page: the page displays "8080", but the scraped value is different every time.

Question: the page displays the port as "8080", but the value I get after scraping is a different number on every run.

1. Page content

2. Program
import requests
from bs4 import BeautifulSoup

User_Agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.79 Safari/537.36"
headers = {"User-Agent": User_Agent}

proxy = []
url = "https://proxy.coderbusy.com/"
res = requests.get(url, headers=headers)
soup = BeautifulSoup(res.text, "lxml")
ips = soup.findAll("tr")

# Start at 1 to skip the first <tr>
for x in range(1, len(ips)):
    ip = ips[x]
    ip_temp = soup.select("#site-app > div > div > div > div > table > tbody > tr > td.port-box")
    aa = ip_temp[0].attrs.get("data-ip")
    aaa = ip_temp[0].string

    print(ip_temp[0])
    print(aa)
    print(aaa)

3. Running result
<td class="port-box" data-i="8450" data-ip="62.33.159.116">17981</td>
62.33.159.116
17981


The real port is written into the page by JavaScript after the page loads. Inspecting the page elements reveals an obfuscated main.js:

eval(function (p, a, c, k, e, d) { e = function (c) { return (c < a ? "" : e(parseInt(c / a))) + ((c = c % a) > 35 ? String.fromCharCode(c + 29) : c.toString(36)) }; if (!''.replace(/^/, String)) { while (c--) d[e(c)] = k[c] || e(c); k = [function (e) { return d[e] }]; e = function () { return '\\w+' }; c = 1; }; while (c--) if (k[c]) p = p.replace(new RegExp('\\b' + e(c) + '\\b', 'g'), k[c]); return p; }('$(e(){$(\'\\f\\3\\g\\8\\1\\r\\p\\g\\k\')["\\4\\2\\q\\o"](e(u,h){5 7=$(h);5 j=7["\\i\\2\\1\\2"](\'\\a\\3\');5 9=l["\\3\\2\\8\\d\\4\\m\\b\\1"](7["\\i\\2\\1\\2"](\'\\a\'));5 c=j["\\d\\3\\n\\a\\1"](\'\\f\');t(5 6=0;6<c["\\n\\4\\b\\s\\1\\o"];6++){9-=l["\\3\\2\\8\\d\\4\\m\\b\\1"](c[6])}7["\\1\\4\\k\\1"](9)})})', 31, 31, '|x74|x61|x70|x65|var|d7|ClpoEy3|x72|TO5|x69|x6e|tVF6|x73|function|x2e|x6f|fnDKXroKU2|x64|jgemfCG4|x78|window|x49|x6c|x68|x62|x63|x2d|x67|for|wssP1'.split('|'), 0, {}))

Unpacking it with an online tool gives:

$(function() {
    $('\x2e\x70\x6f\x72\x74\x2d\x62\x6f\x78')["\x65\x61\x63\x68"](function(wssP1, fnDKXroKU2) {
        var ClpoEy3 = $(fnDKXroKU2);
        var jgemfCG4 = ClpoEy3["\x64\x61\x74\x61"]('\x69\x70');
        var TO5 = window["\x70\x61\x72\x73\x65\x49\x6e\x74"](ClpoEy3["\x64\x61\x74\x61"]('\x69'));
        var tVF6 = jgemfCG4["\x73\x70\x6c\x69\x74"]('\x2e');
        for (var d7 = 0; d7 < tVF6["\x6c\x65\x6e\x67\x74\x68"]; d7++) {
            TO5 -= window["\x70\x61\x72\x73\x65\x49\x6e\x74"](tVF6[d7])
        }
        ClpoEy3["\x74\x65\x78\x74"](TO5)
    })
})
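
The \xNN sequences in the unpacked script are ordinary hexadecimal character codes, so they can be decoded mechanically. A minimal Python sketch of that step follows; the helper name decode_hex_escapes is made up for illustration and is not part of the original page or of any library:

```python
import re


def decode_hex_escapes(source: str) -> str:
    """Replace every literal \\xNN escape with the character it encodes."""
    return re.sub(
        r"\\x([0-9a-fA-F]{2})",
        lambda m: chr(int(m.group(1), 16)),
        source,
    )


# Two of the escaped strings taken from the unpacked script
print(decode_hex_escapes(r"\x2e\x70\x6f\x72\x74\x2d\x62\x6f\x78"))  # .port-box
print(decode_hex_escapes(r"\x65\x61\x63\x68"))                      # each
```

Running the same function over the whole unpacked script should turn every escaped string into readable text.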

After the hex escapes are converted to strings, you get:

$(function() {
    $('.port-box')["each"](function(wssP1, fnDKXroKU2) {
        var ClpoEy3 = $(fnDKXroKU2);
        var jgemfCG4 = ClpoEy3["data"]('ip');
        var TO5 = window["parseInt"](ClpoEy3["data"]('i'));
        var tVF6 = jgemfCG4["split"]('.');
        for (var d7 = 0; d7 < tVF6["length"]; d7++) {
            TO5 -= window["parseInt"](tVF6[d7])
        }
        ClpoEy3["text"](TO5)
    })
})

As you can see from the code, the real port is the value of the data-i attribute of the port-box cell minus the sum of the four octets of the IP in data-ip. For the row in the running result: 8450 - (62 + 33 + 159 + 116) = 8080.
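
The same rule can be applied on the Python side instead of relying on the text inside the cell. This is only a sketch: the function name real_port is hypothetical, and it assumes the page keeps using the data-i and data-ip attributes exactly as the script above reads them.

```python
def real_port(data_i: str, data_ip: str) -> int:
    """Mirror the page script: data-i minus the sum of the IP's octets."""
    octets = data_ip.split(".")
    return int(data_i) - sum(int(octet) for octet in octets)


# Attribute values taken from the running result in the question
print(real_port("8450", "62.33.159.116"))  # 8080
```

With BeautifulSoup, the two arguments would come from the cell's attributes, for example td["data-i"] and td["data-ip"] for each td selected with "td.port-box".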


Thank you very much! As you said, I checked the page's JS and it really is there. But if nothing is visible from the outside, how can I tell whether JS is involved?

"As you can see from the code, the real port is the value of the data-i attribute of the port-box cell minus the sum of the four octets of the IP."
That should be a fixed value, but the value changes every time the page is updated. Am I misunderstanding something?
