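Python crawler: UnicodeEncodeError while printing the results
I keep running into this kind of UnicodeEncodeError. Could someone who understands it explain it in detail, or recommend a few related posts to read? Thanks! Source code: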
import random
import urllib.request
import re

# Pool of User-Agent strings; one is picked at random for every request.
uapools = ['Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36 OPR/26.0.1656.60',
           'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:34.0) Gecko/20100101 Firefox/34.0',
           'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36']

def ua(uapools):
    # Install a global opener that sends a randomly chosen User-Agent.
    thisua = random.choice(uapools)
    print(thisua)
    headers = ('User-Agent', thisua)
    opener = urllib.request.build_opener()
    opener.addheaders = [headers]
    urllib.request.install_opener(opener)

for i in range(0, 10):
    ua(uapools)
    # The site address is cut off in this post; only the page number is left.
    thisurl = 'https://www.' + str(i+1) + '/'
    thispage = urllib.request.urlopen(thisurl).read().decode('utf-8', 'ignore')
    pat = '<div class="content">.*?<span>(.*?)</span>.*?</div>'
    rst = re.compile(pat, re.S).findall(thispage)
    for j in range(0, len(rst)):
        print(rst[j])
        print('-------------------------------')
Output: part of the scraped content is printed, then the run fails with this error:
Traceback (most recent call last):
  File "D:/python/ex/ex_uapools.py", line 22, in <module>
    print(rst[j])
  File "D:\python\lib\idlelib\PyShell.py", line 1344, in write
    return self.shell.write(s, self.tags)
UnicodeEncodeError: 'UCS-2' codec can't encode characters in position 53-53: Non-BMP character not supported in Tk
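
Reading the last line of the traceback: the failure is raised inside idlelib while writing to the IDLE shell, and the message says Tk cannot display a non-BMP character, that is, a code point above U+FFFF (most emoji fall in this range). So the scraped text itself was most likely decoded fine, and only the IDLE output window rejected it. One possible workaround, as a minimal sketch, is to replace such characters before printing. The helper name to_bmp and the '?' placeholder below are made up for illustration and are not part of the original script.

```python
import re

# Matches any code point outside the Basic Multilingual Plane (above U+FFFF).
# Most emoji live in this range.
NON_BMP = re.compile('[\U00010000-\U0010FFFF]')

def to_bmp(text, placeholder='?'):
    """Return text with every non-BMP character replaced by placeholder.

    Hypothetical helper for illustration; not part of the original script.
    """
    return NON_BMP.sub(placeholder, text)

# Sample string with one emoji (U+1F602) standing in for scraped content.
sample = 'funny post \U0001F602 end'
print(to_bmp(sample))
```

In the script above this would mean writing print(to_bmp(rst[j])) instead of print(rst[j]). Other options that may also help are running the script from a regular terminal instead of IDLE, or writing the results to a file opened with encoding='utf-8'. Newer IDLE releases are reported to handle these characters, but I have not checked that for this setup.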