Scraping movie details with BS4


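Scrape "http://fresco-movies.surge.sh/", get the movie details, and append them to data.csv.

I have to scrape data from the website, namely the movie name, duration, genre, rating, description, director, and votes, and save it to data.csv.

Please help me write the code. This is what I have so far: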

from bs4 import BeautifulSoup
import requests

url = "http://fresco-movies.surge.sh/"
req = requests.get(url)
soup = BeautifulSoup(req.content, 'html.parser')
print(soup)

Solution

Navigate the HTML document directly:

from bs4 import BeautifulSoup
import pandas as pd
import requests

url = "http://fresco-movies.surge.sh/"
req = requests.get(url)
soup = BeautifulSoup(req.content, 'html.parser')

names = []
for m in soup.find_all("div", class_="row"):
    ratings_bar = m.find("div", class_="ratings-bar")
    names.append({
        "name": m.find("a").text,
        "director": ratings_bar.find("a").text,
        "votes": ratings_bar.find("p", class_="sort-num_votes-visible").find_all("span")[1].text,
        # Stored as Tag objects rather than strings, so pandas shows them in brackets.
        "certificate": m.find("span", class_="certificate"),
        "runtime": m.find("span", class_="runtime"),
    })
print(pd.DataFrame(names).head(5).to_string(index=False))

Output

                                          name              director    votes certificate    runtime
                      The Shawshank Redemption        Frank Darabont  2033239       [9.3]  [142 min]
                                 The Godfather  Francis Ford Coppola  1394179       [9.2]  [175 min]
                               The Dark Knight     Christopher Nolan  2001026       [9.0]  [152 min]
                        The Godfather: Part II  Francis Ford Coppola   966187       [9.0]  [202 min]
 The Lord of the Rings: The Return of the King         Peter Jackson  1447736       [8.9]  [201 min]
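
The bracketed values in the certificate and runtime columns most likely appear because the code stores the Tag objects themselves rather than their text. Below is a minimal sketch of pulling out plain strings while tolerating missing elements. The HTML fragment, the `genre` class, and the helper `text_or_none` are hypothetical and only illustrate the pattern; the real page's markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment for illustration; the real page's markup may differ.
SAMPLE_HTML = """
<div class="row">
  <a href="#">Example Movie</a>
  <span class="certificate">9.3</span>
  <span class="runtime">142 min</span>
</div>
"""


def text_or_none(tag):
    """Return the stripped text of a Tag, or None if find() found nothing."""
    return tag.get_text(strip=True) if tag is not None else None


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
row = soup.find("div", class_="row")

record = {
    "name": text_or_none(row.find("a")),
    "certificate": text_or_none(row.find("span", class_="certificate")),
    "runtime": text_or_none(row.find("span", class_="runtime")),
    # "genre" is an assumed class name; it is absent from the sample on purpose.
    "genre": text_or_none(row.find("span", class_="genre")),
}
print(record)
```

Routing every lookup through one helper keeps a single missing element from raising an AttributeError partway through the loop.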

I have deliberately left some of the work for you, but this should be enough to get you started.


If you can't figure out the rest, consult the BeautifulSoup documentation.
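
The question also asks for the results to be appended to data.csv, which the answer does not cover. One possible sketch uses only the standard library `csv` module. The function name `append_rows` and the `FIELDS` list are assumptions chosen to match the columns shown in the output; extend the list once the remaining fields are scraped.

```python
import csv
import os

# Assumed column order, taken from the columns in the output above.
FIELDS = ["name", "director", "votes", "certificate", "runtime"]


def append_rows(path, rows, fieldnames=FIELDS):
    """Append dict rows to a CSV file, writing the header only if the file is new or empty."""
    write_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if write_header:
            writer.writeheader()
        writer.writerows(rows)
```

Calling `append_rows("data.csv", names)` with a list of plain-string dictionaries adds the rows without duplicating the header on later runs. With pandas, `DataFrame.to_csv("data.csv", mode="a", index=False)` is an alternative, though the header then has to be suppressed manually on subsequent appends.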
