#2
cpxuvs2016-10-11 22:02
I'm a beginner working hard to learn Python, and I have some questions I can't figure out that I'd like to ask everyone:
I've been teaching myself how to install and use PyCharm. While running a Scrapy image-crawling project, I hit the following problem and just can't work out what is going on.
Run output:
C:\Python27\python.exe C:/first/main.py
2016-10-09 23:19:48 [scrapy] INFO: Scrapy 1.2.0 started (bot: first)
2016-10-09 23:19:48 [scrapy] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'first.spiders', 'SPIDER_MODULES': ['first.spiders'], 'ROBOTSTXT_OBEY': True, 'USER_AGENT': 'Mozilla/5.0 (Windows NT 6.3; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0', 'BOT_NAME': 'first'}
2016-10-09 23:19:48 [scrapy] INFO: Enabled extensions:
['scrapy.extensions.logstats.LogStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.corestats.CoreStats']
2016-10-09 23:19:49 [scrapy] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware',
'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy.downloadermiddlewares.chunked.ChunkedTransferMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2016-10-09 23:19:49 [scrapy] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
2016-10-09 23:19:49 [scrapy] INFO: Enabled item pipelines:
[]
2016-10-09 23:19:49 [scrapy] INFO: Spider opened
2016-10-09 23:19:49 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2016-10-09 23:19:49 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
2016-10-09 23:19:49 [scrapy] ERROR: Error downloading <GET https:///robots.txt>: Empty domain
Traceback (most recent call last):
File "C:\Python27\Lib\site-packages\twisted\internet\defer.py", line 1105, in _inlineCallbacks
result = result.throwExceptionIntoGenerator(g)
File "C:\Python27\Lib\site-packages\twisted\python\failure.py", line 389, in throwExceptionIntoGenerator
return g.throw(self.type, self.value, self.tb)
File "C:\Python27\lib\site-packages\scrapy-1.2.0-py2.7.egg\scrapy\core\downloader\middleware.py", line 43, in process_request
defer.returnValue((yield download_func(request=request,spider=spider)))
File "C:\Python27\lib\site-packages\scrapy-1.2.0-py2.7.egg\scrapy\utils\defer.py", line 45, in mustbe_deferred
result = f(*args, **kw)
File "C:\Python27\lib\site-packages\scrapy-1.2.0-py2.7.egg\scrapy\core\downloader\handlers\__init__.py", line 65, in download_request
return handler.download_request(request, spider)
File "C:\Python27\lib\site-packages\scrapy-1.2.0-py2.7.egg\scrapy\core\downloader\handlers\http11.py", line 60, in download_request
return agent.download_request(request)
File "C:\Python27\lib\site-packages\scrapy-1.2.0-py2.7.egg\scrapy\core\downloader\handlers\http11.py", line 285, in download_request
method, to_bytes(url, encoding='ascii'), headers, bodyproducer)
File "C:\Python27\Lib\site-packages\twisted\web\client.py", line 1470, in request
parsedURI.port)
File "C:\Python27\Lib\site-packages\twisted\web\client.py", line 1450, in _getEndpoint
tlsPolicy = self._policyForHTTPS.creatorForNetloc(host, port)
File "C:\Python27\lib\site-packages\scrapy-1.2.0-py2.7.egg\scrapy\core\downloader\contextfactory.py", line 57, in creatorForNetloc
return ScrapyClientTLSOptions(hostname.decode("ascii"), self.getContext())
File "C:\Python27\Lib\site-packages\twisted\internet\_sslverify.py", line 1059, in __init__
self._hostnameBytes = _idnaBytes(hostname)
File "C:\Python27\Lib\site-packages\twisted\internet\_sslverify.py", line 86, in _idnaBytes
return idna.encode(text).encode("ascii")
File "C:\Python27\Lib\site-packages\idna\core.py", line 350, in encode
raise IDNAError('Empty domain')
IDNAError: Empty domain
2016-10-09 23:19:49 [scrapy] ERROR: Error downloading <GET https:///%20//%20www. (most recent call last):
File "C:\Python27\Lib\site-packages\twisted\internet\defer.py", line 1105, in _inlineCallbacks
result = result.throwExceptionIntoGenerator(g)
File "C:\Python27\Lib\site-packages\twisted\python\failure.py", line 389, in throwExceptionIntoGenerator
return g.throw(self.type, self.value, self.tb)
File "C:\Python27\lib\site-packages\scrapy-1.2.0-py2.7.egg\scrapy\core\downloader\middleware.py", line 43, in process_request
defer.returnValue((yield download_func(request=request,spider=spider)))
File "C:\Python27\lib\site-packages\scrapy-1.2.0-py2.7.egg\scrapy\utils\defer.py", line 45, in mustbe_deferred
result = f(*args, **kw)
File "C:\Python27\lib\site-packages\scrapy-1.2.0-py2.7.egg\scrapy\core\downloader\handlers\__init__.py", line 65, in download_request
return handler.download_request(request, spider)
File "C:\Python27\lib\site-packages\scrapy-1.2.0-py2.7.egg\scrapy\core\downloader\handlers\http11.py", line 60, in download_request
return agent.download_request(request)
File "C:\Python27\lib\site-packages\scrapy-1.2.0-py2.7.egg\scrapy\core\downloader\handlers\http11.py", line 285, in download_request
method, to_bytes(url, encoding='ascii'), headers, bodyproducer)
File "C:\Python27\Lib\site-packages\twisted\web\client.py", line 1470, in request
parsedURI.port)
File "C:\Python27\Lib\site-packages\twisted\web\client.py", line 1450, in _getEndpoint
tlsPolicy = self._policyForHTTPS.creatorForNetloc(host, port)
File "C:\Python27\lib\site-packages\scrapy-1.2.0-py2.7.egg\scrapy\core\downloader\contextfactory.py", line 57, in creatorForNetloc
return ScrapyClientTLSOptions(hostname.decode("ascii"), self.getContext())
File "C:\Python27\Lib\site-packages\twisted\internet\_sslverify.py", line 1059, in __init__
self._hostnameBytes = _idnaBytes(hostname)
File "C:\Python27\Lib\site-packages\twisted\internet\_sslverify.py", line 86, in _idnaBytes
return idna.encode(text).encode("ascii")
File "C:\Python27\Lib\site-packages\idna\core.py", line 350, in encode
raise IDNAError('Empty domain')
IDNAError: Empty domain
2016-10-09 23:19:49 [scrapy] INFO: Closing spider (finished)
2016-10-09 23:19:49 [scrapy] INFO: Dumping Scrapy stats:
{'downloader/exception_count': 2,
'downloader/exception_type_count/idna.core.IDNAError': 2,
'downloader/request_bytes': 539,
'downloader/request_count': 2,
'downloader/request_method_count/GET': 2,
'finish_reason': 'finished',
'finish_time': datetime.datetime(2016, 10, 9, 15, 19, 49, 706000),
'log_count/DEBUG': 1,
'log_count/ERROR': 2,
'log_count/INFO': 7,
'scheduler/dequeued': 1,
'scheduler/dequeued/memory': 1,
'scheduler/enqueued': 1,
'scheduler/enqueued/memory': 1,
'start_time': datetime.datetime(2016, 10, 9, 15, 19, 49, 256000)}
2016-10-09 23:19:49 [scrapy] INFO: Spider closed (finished)
What exactly is this output in PyCharm telling me? I've spent two days on it and still can't make sense of it; it's driving me crazy. Could anyone please help?
The code link is at http://www.
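One clue worth checking in the log above: the request `<GET https:///robots.txt>` has no hostname at all, which usually means an entry in the spider's `start_urls` is malformed, for example extra slashes or stray spaces pasted into the URL (consistent with the `https:///%20//%20www.` request also shown in the log). This is only a guess from the log, and `www.example.com` below is a hypothetical stand-in for the real site; the snippet uses Python 3's `urllib.parse` (on the Python 2.7 setup in the log, the same function lives in the `urlparse` module):

```python
from urllib.parse import urlparse  # on Python 2.7: from urlparse import urlparse

# A URL with an extra slash (or stray spaces) after "https://" parses with an
# empty network location, so there is no hostname for TLS/IDNA to encode --
# which is exactly what raises "IDNAError: Empty domain".
bad = "https:///www.example.com"   # hypothetical malformed start_urls entry
good = "https://www.example.com"   # hypothetical corrected entry

print(repr(urlparse(bad).netloc))   # '' -> empty domain, request fails
print(repr(urlparse(good).netloc))  # 'www.example.com'
```

If that matches your spider, cleaning up the URL strings in `start_urls` (no spaces, exactly two slashes after the scheme) should clear both errors.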