The Scrapy spider failed while requesting 'china.chinadaily.com.cn'. The likely causes are a network connectivity problem or a fault on the target site.

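The log also contains SSL certificate verification warnings. Each one reports a DNSMismatch, which means the certificate presented by the server is not valid for the hostname 'china.chinadaily.com.cn'. Nothing in the log indicates that the certificate has expired. These entries are logged at WARNING level. The request itself was abandoned after three attempts, each ending in ConnectionLost.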
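To make the hostname mismatch concrete, here is a minimal sketch of how a requested hostname is compared against the DNS names listed in a certificate. The function names (`dns_name_matches`, `cert_covers`, `fetch_san_dns_names`) are hypothetical and not part of Scrapy or Twisted. The matching rule is a simplification that only handles a whole-label wildcard in the leftmost position. It is an illustration of the idea, not the exact check performed by the library that emitted the warning.

```python
import socket
import ssl
from typing import List


def dns_name_matches(pattern: str, hostname: str) -> bool:
    """Simplified match: exact labels, or '*' standing for the leftmost label only."""
    pattern_labels = pattern.lower().rstrip(".").split(".")
    host_labels = hostname.lower().rstrip(".").split(".")
    if len(pattern_labels) != len(host_labels):
        return False
    if pattern_labels[0] == "*":
        # The wildcard covers exactly one label; the rest must be identical.
        return pattern_labels[1:] == host_labels[1:]
    return pattern_labels == host_labels


def cert_covers(san_dns_names: List[str], hostname: str) -> bool:
    """True if any DNS name from the certificate matches the hostname."""
    return any(dns_name_matches(name, hostname) for name in san_dns_names)


def fetch_san_dns_names(host: str, port: int = 443, timeout: float = 10.0) -> List[str]:
    """Fetch the DNS names a live server's certificate lists (needs network access).

    Hostname checking is turned off so the handshake can complete despite a
    mismatch. The certificate chain is still verified, so an untrusted chain
    raises ssl.SSLCertVerificationError.
    """
    context = ssl.create_default_context()
    context.check_hostname = False
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    return [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]
```

Calling `cert_covers(fetch_san_dns_names("china.chinadaily.com.cn"), "china.chinadaily.com.cn")` from a machine with network access would show whether the served certificate lists that hostname. The server in the log dropped connections, so the fetch may also fail with a connection error.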

The full error log follows:

[23/06/09 00:32:45 open_spider(engine.py:272) INFO ] Spider opened
[23/06/09 00:32:45 log(logstats.py:48) INFO ] Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
[23/06/09 00:32:45 start_listening(telnet.py:69) INFO ] Telnet console listening on 127.0.0.1:6027
[23/06/09 00:32:45 _identityVerifyingInfoCallback(tls.py:69) WARNING ] Remote certificate is not valid for hostname 'china.chinadaily.com.cn'; VerificationError(errors=[DNSMismatch(mismatched_id=DNS_ID(hostname=b'china.chinadaily.com.cn'))])
[23/06/09 00:32:46 _identityVerifyingInfoCallback(tls.py:69) WARNING ] Remote certificate is not valid for hostname 'china.chinadaily.com.cn'; VerificationError(errors=[DNSMismatch(mismatched_id=DNS_ID(hostname=b'china.chinadaily.com.cn'))])
[23/06/09 00:32:46 _identityVerifyingInfoCallback(tls.py:69) WARNING ] Remote certificate is not valid for hostname 'china.chinadaily.com.cn'; VerificationError(errors=[DNSMismatch(mismatched_id=DNS_ID(hostname=b'china.chinadaily.com.cn'))])
[23/06/09 00:32:46 _retry(retry.py:97) ERROR ] Gave up retrying <GET https://china.chinadaily.com.cn/5bd5639ca3101a87ca8ff636/> (failed 3 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
[23/06/09 00:32:46 _log_download_errors(scraper.py:219) ERROR ] Error downloading <GET https://china.chinadaily.com.cn/5bd5639ca3101a87ca8ff636/>
Traceback (most recent call last):
  File "/usr/local/python3.7/lib/python3.7/site-packages/scrapy/core/downloader/middleware.py", line 45, in process_request
    return (yield download_func(request=request, spider=spider))
twisted.web._newclient.ResponseNeverReceived: [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
[23/06/09 00:32:46 close_spider(engine.py:309) INFO ] Closing spider (finished)
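When a log like this arrives as a single run of text, it can help to split it back into entries and count them by level. The sketch below assumes every entry starts with a header shaped like `[YY/MM/DD HH:MM:SS func(file.py:NN) LEVEL ]`, as in the log above. The names `HEADER`, `split_entries` and `count_levels` are hypothetical helpers, not Scrapy APIs, and a differently configured log format would need a different pattern.

```python
import re
from collections import Counter
from typing import Dict, List

# Header shape assumed from the log above:
# "[YY/MM/DD HH:MM:SS func(file.py:NN) LEVEL ]"
HEADER = re.compile(
    r"\[(?P<time>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<func>\w+)\((?P<file>[\w.]+):(?P<line>\d+)\) "
    r"(?P<level>[A-Z]+)\s*\]"
)


def split_entries(raw: str) -> List[Dict[str, str]]:
    """Split a flattened log into entries, one dict per header found."""
    matches = list(HEADER.finditer(raw))
    entries = []
    for index, match in enumerate(matches):
        # An entry's message runs until the next header or the end of the text.
        end = matches[index + 1].start() if index + 1 < len(matches) else len(raw)
        entry = match.groupdict()
        entry["message"] = raw[match.end():end].strip()
        entries.append(entry)
    return entries


def count_levels(entries: List[Dict[str, str]]) -> Counter:
    """Count entries per log level (INFO, WARNING, ERROR, ...)."""
    return Counter(entry["level"] for entry in entries)
```

Applied to the log above, this separates the three certificate warnings from the two download errors, which makes it easier to see that the warnings and the connection failure are reported by different components (`tls.py` versus `retry.py` and `scraper.py`).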

Scrapy spider error: unable to connect to china.chinadaily.com.cn

