Problem parsing with XPath

  • I am crawling the movie pages on Douban. Each movie's li tag is parsed into a list, but when I traverse the list, every element gives the same result.

    movies = selector.xpath('//*[@id="content"]/div/div[1]/ol/li')  # all li nodes in the HTML
    for movie in movies:
        print(movie.xpath('//span[@class="title"][1]/text()'))  # intended: the title inside this li

  • Mar.04,2021

    import requests
    from pyquery import PyQuery as Q

    r = requests.get('https://movie.douban.com/')
    # Each .ui-slide-item is one movie card; print the text of its .title node
    for item in Q(r.text).find('.ui-slide-item').items():
        print(item.find('.title').text())

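    Put a . before //span. An XPath that begins with // is evaluated from the document root even when it is called on a child node, so every li returns the same titles. Use .//span[@class="title"][1]/text() to search only inside the current li.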
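    The difference between //span and .//span can be reproduced offline with a small sketch. It assumes the selector in the question comes from lxml (the question does not say which library is used), and the HTML snippet and the helper name titles are made up for illustration; they are not Douban's real markup.

```python
from lxml import etree

# Hypothetical stand-in for the Douban list markup (not the real page).
HTML = """
<div id="content"><ol>
  <li><span class="title">Movie A</span><span class="title">Alias A</span></li>
  <li><span class="title">Movie B</span><span class="title">Alias B</span></li>
</ol></div>
"""

selector = etree.HTML(HTML)
movies = selector.xpath('//*[@id="content"]/ol/li')


def titles(xpath_expr):
    """Run the same XPath on every li node and collect the results."""
    return [movie.xpath(xpath_expr) for movie in movies]


# Starts with //: evaluated from the document root, so each li
# returns the first title of every movie on the page.
absolute = titles('//span[@class="title"][1]/text()')

# Starts with .//: evaluated relative to the current li only.
relative = titles('.//span[@class="title"][1]/text()')

print(absolute)
print(relative)
```

    With the leading dot, each li yields only its own first title, which is what the loop in the question was meant to do.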


    Can you give me the URL you crawled? I can't find the page you scraped.


    This is written with Selenium, but the XPath is the same. The original poster can adapt it.

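    from selenium import webdriver
    from selenium.webdriver.common.by import By

    # The original used webdriver.PhantomJS(), which Selenium 4 no longer ships; Chrome is swapped in here
    driver = webdriver.Chrome()
    driver.get("https://movie.douban.com/top250")
    print(driver.find_elements(By.XPATH, '//tbody/tr/td[2]/div/p'))
    moves = driver.find_elements(By.XPATH, ".//*[@id='content']/div/div[1]/ol/li/div/div[2]/div[2]/p")
    for move in moves:
        print(move.text)
    driver.quit()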