With my Scrapy crawler, the scraped content is shown in the execution log in cmd, but when I hide the log at run time (by adding --nolog to the command), there is no output at all.

crawler file:
import scrapy
from xtzx.items import XtzxItem

# from scrapy.http import Request

class LessonSpider(scrapy.Spider):

    name = "lesson"
    allowed_domains = ["xuetangx.com"]
    start_urls = ["http://www.xuetangx.com/courses?credential=0&page_type=0&cid=118&process=0&org=0&course_mode=0&page=2"]

    def parse(self, response):
        item = XtzxItem()
        item["title"] = response.xpath("//div[@class='fl list_inner_right cf']/div[@class='coursename']/a/h2[@class='coursetitle']/text()").extract()
        # item["school"] = response.xpath("//div[@class='fl name']/ul/li/span/text()").extract()
        # item["stu"] = response.xpath("//div[@class='fl name']/li/span[@class='ri-tag fl']/text()").extract()
        yield item
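Incidentally, as posted the XPath strings nest double quotes inside an already double-quoted Python string, which is a syntax error, so the file would not even import (the quoting may have been mangled when pasting). A minimal sketch of the clash and the single-quote fix, using a hypothetical one-line stand-in for the spider code:

```python
# Inner double quotes terminate the outer string early: Python sees the
# string "//div[@class=" followed by a bare name, which is a SyntaxError.
bad = 'response.xpath("//div[@class="coursename"]/text()")'
try:
    compile(bad, "<snippet>", "eval")
    print("parsed")
except SyntaxError:
    print("SyntaxError")  # this branch is taken

# Single quotes inside the XPath avoid the clash entirely.
good = "response.xpath(\"//div[@class='coursename']/text()\")"
compile(good, "<snippet>", "eval")  # compiles cleanly
print("fixed quoting parses")
```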
    
---

pipelines:

class XtzxPipeline(object):

    def process_item(self, item, spider):
        print(item["title"][0])
        # print(item["school"][0])
        # print(item["stu"][0])
        return item
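Note that the log below shows "Enabled item pipelines: []", which suggests this pipeline was never registered; if so, process_item() (and its print calls) never runs, and with --nolog there is no log output either, leaving nothing on screen. A sketch of the registration in settings.py, assuming the project package is named xtzx (as the BOT_NAME in the log suggests) and the class above lives in xtzx/pipelines.py:

```python
# settings.py -- register the pipeline so process_item() actually runs.
ITEM_PIPELINES = {
    "xtzx.pipelines.XtzxPipeline": 300,  # value 0-1000; lower runs earlier
}
```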

---

2018-04-28 14:43:09 [scrapy.utils.log] INFO: Scrapy 1.5.0 started (bot: xtzx)
2018-04-28 14:43:09 [scrapy.utils.log] INFO: Versions: lxml 4.2.1.0, libxml2 2.9.7, cssselect 1.0.3, parsel 1.4.0, w3lib 1.19.0, Twisted 17.9.0, Python 3.5.4 (v3.5.4:3f56838, Aug 8 2017, 02:17:05) [MSC v.1900 64 bit (AMD64)], pyOpenSSL 17.5.0 (OpenSSL 1.1.0h 27 Mar 2018), cryptography 2.2.2, Platform Windows-10-10.0.16299-SP0
2018-04-28 14:43:09 [scrapy.crawler] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'xtzx.spiders', 'SPIDER_MODULES': ['xtzx.spiders'], 'BOT_NAME': 'xtzx'}
2018-04-28 14:43:09 [scrapy.middleware] INFO: Enabled extensions:
[" scrapy.extensions.telnet.TelnetConsole",
"scrapy.extensions.corestats.CoreStats",
" scrapy.extensions.logstats.LogStats"]
2018-04-28 14:43:09 [scrapy.middleware] INFO: Enabled downloadermiddlewares:
["scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware",
" scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware",
"scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware",
" scrapy.downloadermiddlewares.useragent.UserAgentMiddleware",
"scrapy.downloadermiddlewares.retry.RetryMiddleware",
" scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware",
"scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware",
" scrapy.downloadermiddlewares.redirect.RedirectMiddleware",
"scrapy.downloadermiddlewares.cookies.CookiesMiddleware",
" scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware",
"scrapy.downloadermiddlewares.stats.DownloaderStats"]
2018-04-28 14:43:09 [scrapy.middleware] INFO: Enabled spidermiddlewares:
[" scrapy.spidermiddlewares.httperror.HttpErrorMiddleware",
"scrapy.spidermiddlewares.offsite.OffsiteMiddleware",
" scrapy.spidermiddlewares.referer.RefererMiddleware",
"scrapy.spidermiddlewares.urllength.UrlLengthMiddleware",
" scrapy.spidermiddlewares.depth.DepthMiddleware"]
2018-04-28 14:43:09 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2018-04-28 14:43:09 [scrapy.core.engine] INFO: Spider opened
2018-04-28 14:43:09 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), Scraped 0 items (at 0 items/min)
2018-04-28 14:43:09 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2018-04-28 14:43:10 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://www.xuetangx.com/cours...page_type=0&cid=118&process=0&org=0&course_mode=0&page=2> (referer: None)
2018-04-28 14:43:10 [scrapy.core.scraper] DEBUG: Scraped from <200 http://www.xuetangx.com/cours...page_type=0&cid=118&process=0&org=0&course_mode=0&page=2>
This is the content to be crawled, and it was successfully obtained:

---

{"title": [" Accounting principles (Spring 2018)",

       "",
       "2018",
       "()",
       "2018",
       "102:",
       "102:2018",
       """ 2018",
       "",
       "2018"]}

---

2018-04-28 14:43:10 [scrapy.core.engine] INFO: Closing spider (finished)
2018-04-28 14:43:10 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{"downloader/request_bytes": 292,
" downloader/request_count": 1,
"downloader/request_method_count/GET": 1,
" downloader/response_bytes": 27101,
"downloader/response_count": 1,
" downloader/response_status_count/200": 1,
"finish_reason":" finished",
"finish_time": datetime.datetime (2018, 4, 28, 6, 43, 10, 470916),
" item_scraped_count": 1,
"log_count/DEBUG": 3,
" log_count/INFO": 7,
"response_received_count": 1,
" scheduler/dequeued": 1,
"scheduler/dequeued/memory": 1,
" scheduler/enqueued": 1,
"scheduler/enqueued/memory": 1,
start_time": datetime.datetime (2018, 4, 28, 6, 43, 9, 860924)}
2018-04-28 14:43:10 [scrapy.core.engine] INFO: Spider closed (finished)

Running scrapy crawl lesson in cmd gives the results above,
but running scrapy crawl lesson --nolog produces no output.
Besides, title should be a list, right?
In addition, I asked another question just now and a teacher answered it, but I wasn't able to respond to it. Is there a limit on accepting answers?

Mar.06,2021

Does the OP have a solution? I have the same problem.
