Crawler question: cannot get the src of an audio tag
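I am using Python 3.6.4 to scrape Kugou Music and ran into a problem: the src attribute of the audio tag comes back empty. The code is as follows:
import requests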
from bs4 import BeautifulSoup
import bs4


def getHTMText(url):
    """Download the page and return its text, or "" if the request fails."""
    try:
        r = requests.get(url)
        r.raise_for_status()
        r.encoding = r.apparent_encoding
        return r.text
    except requests.RequestException:
        return ""


def get_song_url(url):
    song_url = ""
    html = getHTMText(url)
    if html:
        # I also tried other parsers; the result is the same.
        soup = BeautifulSoup(html, "html.parser")
        song_url = soup.find(id="myAudio")
        if isinstance(song_url, bs4.element.Tag):
            print(type(song_url))
            print(song_url)
    else:
        return song_url


def main():
    url = "http://www.
    get_song_url(url)


main()
'''
Output:
<class 'bs4.element.Tag'>
<audio class="music" id="myAudio" src="">
<!-- <p class="myAudiohide">你的浏览器不支持<code>audio</code>标签.</p> -->
</audio>
The audio tag as shown in the browser's developer tools:
<audio class="music" id="myAudio" src="http://fs.w. preload="auto">
<!-- <p class="myAudiohide">你的浏览器不支持<code>audio</code>标签.</p> -->
</audio>
I want to get the src of the audio tag.
'''
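
To make the difference between the two results concrete, here is a minimal sketch that reads the src attribute from both kinds of markup. It uses the standard library's html.parser instead of BeautifulSoup so it runs without extra packages. The names AudioSrcParser and audio_src are hypothetical, and the URL in rendered_html is a placeholder, not the real address from the developer tools. The sketch only shows that the parsing step works once src is present, so the empty value appears to come from the HTML that requests receives rather than from the parser.

```python
from html.parser import HTMLParser


class AudioSrcParser(HTMLParser):
    """Records the src attribute of the <audio id="myAudio"> element."""

    def __init__(self):
        super().__init__()
        self.src = None  # stays None if the element is never seen

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "audio" and attr_map.get("id") == "myAudio":
            self.src = attr_map.get("src")


def audio_src(html):
    """Return the src of the myAudio element, or None if it is absent."""
    parser = AudioSrcParser()
    parser.feed(html)
    return parser.src


# Markup shaped like what requests.get returned: src is an empty string.
static_html = '<audio class="music" id="myAudio" src=""></audio>'

# Markup shaped like the developer-tools view; the URL is a placeholder.
rendered_html = (
    '<audio class="music" id="myAudio" '
    'src="http://example.invalid/song.mp3" preload="auto"></audio>'
)

print(repr(audio_src(static_html)))
print(repr(audio_src(rendered_html)))
```

With BeautifulSoup the equivalent lookup on the tag found above would be song_url.get("src"), which likewise returns an empty string for the downloaded page.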
Thanks!