How to solve: Python web scraping of FlashScore
from requests_html import AsyncHTMLSession
from collections import defaultdict
import pandas as pd

url = 'https://www.flashscore.com/football/netherlands/eredivisie/results/'
asession = AsyncHTMLSession()

async def get_scores():
    r = await asession.get(url)
    # Render the page in headless Chromium so the JavaScript-built rows exist
    await r.html.arender()
    return r

# run() returns a list with one result per coroutine passed in
results = asession.run(get_scores)
results = results[0]

times = results.html.find("div.event__time")
home_teams = results.html.find("div.event__participant.event__participant--home")
scores = results.html.find("div.event__scores.fontBold")
away_teams = results.html.find("div.event__participant.event__participant--away")
event_part = results.html.find("div.event__part")

# Collect the text of each element column by column
dict_res = defaultdict(list)
for ind in range(len(times)):
    dict_res['times'].append(times[ind].text)
    dict_res['home_teams'].append(home_teams[ind].text)
    dict_res['scores'].append(scores[ind].text)
    dict_res['away_teams'].append(away_teams[ind].text)
    dict_res['event_part'].append(event_part[ind].text)

df_res = pd.DataFrame(dict_res)
print(df_res)
The result is as follows:
            times        home_teams  scores  away_teams  event_part
0    22.01. 20:00         Willem II   1 - 3      Zwolle     (1 - 0)
1    17.01. 16:45              Ajax   1 - 0   Feyenoord     (1 - 0)
2    17.01. 14:30         Groningen   2 - 2      Twente     (0 - 2)
3    17.01. 14:30             Venlo   1 - 1  Heerenveen     (0 - 0)
4    17.01. 12:15          Waalwijk   1 - 1   Willem II     (1 - 0)
..            ...               ...     ...         ...         ...
101  25.10. 20:00          Den Haag   2 - 2  AZ Alkmaar     (0 - 1)
102  25.10. 16:45          Waalwijk   2 - 2   Feyenoord     (0 - 0)
103  25.10. 14:30  Sparta Rotterdam   1 - 1    Heracles     (0 - 0)
104  25.10. 14:30           Vitesse   2 - 1         PSV     (1 - 0)
105  25.10. 12:15           Sittard   1 - 3   Groningen     (0 - 2)

[106 rows x 5 columns]
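However, whenever I visit https://www.flashscore.com/football/netherlands/eredivisie/results/, the page shows a "Show more matches" button at the bottom. The output contains only the first batch of matches, not the additional ones that appear after clicking "Show more matches". Is it possible to extract this additional information as well?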
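One possible direction, offered as a rough sketch rather than a confirmed fix: `arender()` accepts a `script` argument that is evaluated inside the rendered page, so a small JavaScript loop could click the button until it disappears before the HTML is read back. The selector `a.event__more`, the helper `build_click_script`, the coroutine `get_all_scores`, and the `max_clicks` / `pause_ms` values are all assumptions made for illustration. None of them come from the post above, and the selector in particular has to be checked against the live page.

```python
import json

# ASSUMPTION: CSS selector of the "Show more matches" link. It is a guess,
# not taken from the original post; inspect the live page and adjust it.
MORE_SELECTOR = "a.event__more"


def build_click_script(selector, max_clicks=20, pause_ms=1500):
    """Return JavaScript source that clicks `selector` until it is gone.

    max_clicks caps the loop so it cannot run forever; pause_ms is a
    guessed delay that gives the page time to append the new rows.
    """
    js_selector = json.dumps(selector)  # safely quoted JS string literal
    return f"""
    async () => {{
        let clicks = 0;
        while (clicks < {max_clicks}) {{
            const button = document.querySelector({js_selector});
            if (!button) break;
            button.click();
            clicks += 1;
            await new Promise(resolve => setTimeout(resolve, {pause_ms}));
        }}
        return clicks;
    }}
    """


async def get_all_scores(url):
    """Hypothetical variant of get_scores() that expands the list first."""
    # Imported here so build_click_script() works without requests_html.
    from requests_html import AsyncHTMLSession

    asession = AsyncHTMLSession()
    r = await asession.get(url)
    # With `script` set, arender() returns the value the script returned.
    clicks = await r.html.arender(script=build_click_script(MORE_SELECTOR), sleep=1)
    await asession.close()
    return r, clicks
```

If this works against the live site, `asyncio.run(get_all_scores(url))` would return the response together with the number of clicks performed, and the same `r.html.find(...)` calls as in the question could then be applied to the expanded page. If the button is implemented differently, or the site blocks headless browsers, a browser automation tool with explicit waits may be the more dependable route.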