
TheGuardian API - script crashes


import json
import requests
from os import makedirs
from os.path import join,exists
from datetime import date,timedelta

ARTICLES_DIR = join('tempdata','articles')
makedirs(ARTICLES_DIR,exist_ok=True)

API_ENDPOINT = 'http://content.guardianapis.com/search'
my_params = {
    'q': 'coronavirus,stock,covid',
    'sectionID': 'business',
    'from-date': "2019-01-01",
    'to-date': "2020-09-30",
    'order-by': "newest",
    'show-fields': 'all',
    'page-size': 300,
    'api-key': '### my cryptic key ###'
}


# day iteration from here:
# http://stackoverflow.com/questions/7274267/print-all-day-dates-between-two-dates
start_date = date(2019,1,1)
end_date = date(2020,9,30)
dayrange = range((end_date - start_date).days + 1)
for daycount in dayrange:
    dt = start_date + timedelta(days=daycount)
    datestr = dt.strftime('%Y-%m-%d')
    fname = join(ARTICLES_DIR,datestr + '.json')
    if not exists(fname):
        # then let's download it
        print("Downloading",datestr)
        all_results = []
        my_params['from-date'] = datestr
        my_params['to-date'] = datestr
        current_page = 1
        total_pages = 1
        while current_page <= total_pages:
            print("...page",current_page)
            my_params['page'] = current_page
            resp = requests.get(API_ENDPOINT,my_params)
            data = resp.json()
            all_results.extend(data['response']['results'])
            # if there is more than one page
            current_page += 1
            total_pages = data['response']['pages']

        with open(fname,'w') as f:
            print("Writing to",fname)

            # re-serialize it for pretty indentation
            f.write(json.dumps(all_results,indent=2))

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-18-f04b4f0fe9ed> in <module>
     49             resp = requests.get(API_ENDPOINT,my_params)
     50             data = resp.json()
---> 51             all_results.extend(data['response']['results'])
     52             # if there is more than one page
     53             current_page += 1

KeyError: 'results'

The same error occurs for 'pages'.

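At first there was no problem and I was able to run it. After 2020-03-24 the download failed, and since then I have not been able to run the code again.

I am referring to lines 51 and 54. At least, that is the point where the code crashes. I don't know how to get rid of this problem. Any ideas?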

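Solution

Understanding the error message would be the first step: it complains about a missing key. Check whether data['response']['results'] exists (hint: it does not), and check the exact structure of data['response'].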
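One way to make that check part of the script is a small helper that refuses to continue when the key is missing and shows what the API sent instead. This is only a sketch: the name extract_results is made up for illustration, and the assumption that an error payload carries its text under a 'message' key is not confirmed by the question.

```python
def extract_results(data):
    """Return the result list from a decoded API payload.

    Raises RuntimeError, quoting what the API sent back, when the
    payload has no 'results' key.
    """
    response = data.get('response', {})
    if 'results' not in response:
        # Assumption: error payloads keep a readable text under 'message'.
        # If that key is absent, fall back to showing the whole payload.
        detail = response.get('message', data)
        raise RuntimeError(f"API returned no results: {detail}")
    return response['results']
```

In the loop from the question, all_results.extend(extract_results(data)) would then stop with the API's own explanation rather than a bare KeyError.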

Luckily, the API accepts 'test' as the api-key value, so we can help you using that key:

my_params = {
    'q': 'coronavirus,stock,covid',
    'sectionID': 'business',
    'from-date': "2019-01-01",
    'to-date': "2020-09-30",
    'order-by': "newest",
    'show-fields': 'all',
    'page-size': 300,
    'api-key': 'test'    # test key for that API
}

Running it, I get the same exception. Inspecting data['response'] gives:

number too big

Let's have a look at which parameters were given, shall we?

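my_params = {
    'q': 'coronavirus,stock,covid',
    'sectionID': 'business',
    'from-date': "2019-01-01",
    'to-date': "2020-09-30",
    'order-by': "newest",
    'show-fields': 'all',
    'page-size': 300,    # TOO BIG
    'api-key': 'test'
}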

Fix it to 200 and you will get:

Downloading 2019-01-01
...page 1
Writing to tempdata\articles\2019-01-01.json
Downloading 2019-01-02
...page 1
Writing to tempdata\articles\2019-01-02.json
Downloading 2019-01-03
...page 1
Writing to tempdata\articles\2019-01-03.json
Downloading 2019-01-04
...page 1
Writing to tempdata\articles\2019-01-04.json
Downloading 2019-01-05
[snipp]
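To keep the same mistake from coming back, the paging loop could cap the page size itself and fail loudly on an unexpected response. The following is a hedged sketch rather than the original poster's code: fetch_all_pages, MAX_PAGE_SIZE and the injected get_json callable are hypothetical names, and the limit of 200 is simply the value used in the fix above.

```python
MAX_PAGE_SIZE = 200  # value taken from the fix above


def fetch_all_pages(params, get_json):
    """Collect the results of every page for one query.

    get_json is any callable that takes the parameter dict and returns
    the decoded JSON payload; it is injected so the loop can be tried
    without network access.
    """
    params = dict(params)  # work on a copy, leave the caller's dict alone
    if 'page-size' in params:
        params['page-size'] = min(int(params['page-size']), MAX_PAGE_SIZE)

    results = []
    page, total_pages = 1, 1
    while page <= total_pages:
        params['page'] = page
        response = get_json(params).get('response', {})
        if 'results' not in response or 'pages' not in response:
            raise RuntimeError(
                f"Unexpected API response on page {page}: {response}")
        results.extend(response['results'])
        total_pages = response['pages']
        page += 1
    return results
```

With the names from the question, a call might look like fetch_all_pages(my_params, lambda p: requests.get(API_ENDPOINT, p).json()).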
