
Scraping Google Scholar search results leads to error: Cannot Fetch from Google Scholar

I am trying to retrieve Google Scholar search results with the scholarly package (docs: https://scholarly.readthedocs.io/en/latest/ProxyGenerator.html#module-scholarly._proxy_generator, PyPI: https://pypi.org/project/scholarly/). My search returns about 2000 papers, and I only want the title, journal, and year of each result, saved to a CSV file. I am new to Python, and even after reading the documentation I can't work out how to implement this.

from scholarly import scholarly, ProxyGenerator
import pandas as pd
import numpy as np
from fp.fp import FreeProxy

# Route requests through a randomly chosen free proxy
pg = ProxyGenerator()
proxy = FreeProxy(rand=True, timeout=1, country_id=['BR', 'KR', 'US']).get()
pg.SingleProxy(http=proxy, https=proxy)

# Then also configure an external Tor proxy on the same generator
pg.Tor_External(tor_sock_port=9050, tor_control_port=9051, tor_password="scholarly_password")

scholarly.use_proxy(pg)

search_query = scholarly.search_pubs('gait AND "machine learning" AND insole', year_low=2018)

def removekey(d, key):
    # Return a copy of d without the given key
    r = dict(d)
    del r[key]
    return r


def summary(generator):
    # Collect every search result from the generator
    info = []
    for i in generator:
        info.append(i)

    # Keep each result's 'bib' entry, minus the author list
    entire = []
    for i, v in enumerate(info):
        new = removekey(info[i]['bib'], 'author')
        entire.append(new)

    total = pd.DataFrame(entire)
    return total


summary(search_query)

This results in the error: Cannot Fetch from Google Scholar

You would be saving my life and my mental health by helping me! Thanks.
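For the CSV part of the goal (title, journal, year only), a minimal sketch might look like the following. It assumes each result is a dict with a 'bib' entry that uses the keys 'title', 'venue', and 'pub_year'; those key names are an assumption about the shape of the results and should be checked against one real result. The helper names bib_rows and save_csv are hypothetical, and only the standard library is used.

```python
import csv
from itertools import islice

# Assumed key names inside each result's 'bib' dict; verify against a real result.
FIELDS = ("title", "venue", "pub_year")


def bib_rows(publications, limit=None):
    """Yield one dict per publication holding only the wanted bib fields.

    Missing fields become empty strings, so an incomplete record
    does not raise a KeyError.
    """
    for pub in islice(publications, limit):
        bib = pub.get("bib", {})
        yield {field: bib.get(field, "") for field in FIELDS}


def save_csv(publications, path, limit=None):
    """Write the selected fields to a CSV file and return the row count."""
    rows = list(bib_rows(publications, limit))
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

With the code from the question, this could be called as save_csv(search_query, "results.csv", limit=50). Passing a small limit first keeps a trial run short, since only that many results are pulled from the generator. This sketch only shapes and saves results; it does nothing about the fetch error itself.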
