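How to understand tweepy's Status object

I have been working with the tweepy streaming API and noticed something. The status object in the response is not pure JSON; it carries some additional information. Here is my question:

If I need the text of a tweet, I have to use status.text, which lives in the _json dictionary/JSON object. But if I need the media or the full text, I have to use status.entities['media'][0]['media_url_https'] or status.extended_tweet['full_text'], even though entities and extended_tweet are both dictionaries nested inside the _json object.

Why do we have to use dot (.) notation to access the outer dictionary in _json, but [] notation to access the values in the inner dictionaries? I understand the [] notation, but what does the dot notation mean?

Status object: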
Status(_api=<tweepy.api.API object at 0x7f6851708190>,_json={'created_at': 'Mon Feb 01 10:58:31 +0000 2021','id': 1356195217687392256,'id_str': '1356195217687392256','text': 'hiii IWD1FKPH0JEFS2PNH7KBSPXQ2EAZVORAWCE2580MWFW4N0OAFM63WI06CAZ4OYBLMPATC4VL9OAMFH86K660EXVMP53M36PN0FTU1ETKBIIP7D…,'display_text_range': [0,140],'source': '<a href="https://mobile.twitter.com" rel="nofollow">Twitter Web App</a>','truncated': True,'in_reply_to_status_id': None,'in_reply_to_status_id_str': None,'in_reply_to_user_id': None,'in_reply_to_user_id_str': None,'in_reply_to_screen_name': None,'user': {'id': 1355079285812453379,'id_str': '1355079285812453379','name': 'db1-testing','screen_name': 'Db1Testing','location': None,'url': None,'description': None,'translator_type': 'none','protected': False,'verified': False,'followers_count': 0,'friends_count': 0,'listed_count': 0,'favourites_count': 0,'statuses_count': 29,'created_at': 'Fri Jan 29 09:04:24 +0000 2021','utc_offset': None,'time_zone': None,'geo_enabled': False,'lang': None,'contributors_enabled': False,'is_translator': False,'profile_background_color': 'F5F8FA','profile_background_image_url': '','profile_background_image_url_https': '','profile_background_tile': False,'profile_link_color': '1DA1F2','profile_sidebar_border_color': 'C0DEED','profile_sidebar_fill_color': 'DDEEF6','profile_text_color': '333333','profile_use_background_image': True,'profile_image_url': 'http://abs.twimg.com/sticky/default_profile_images/default_profile_normal.png','profile_image_url_https': 'https://abs.twimg.com/sticky/default_profile_images/default_profile_normal.png','default_profile': True,'default_profile_image': False,'following': None,'follow_request_sent': None,'notifications': None},'geo': None,'coordinates': None,'place': None,'contributors': None,'is_quote_status': False,'extended_tweet': {'full_text': 'hiii 
IWD1FKPH0JEFS2PNH7KBSPXQ2EAZVORAWCE2580MWFW4N0OAFM63WI06CAZ4OYBLMPATC4VL9OAMFH86K660EXVMP53M36PN0FTU1ETKBIIP7DMBJ3XCQN2XXXA1KA6VSCW292X86SJGHWEH3L1J2HVLV42SHPV8LCVZYY6S762GJ2MOBF3J6IH0,189],'entities': {'hashtags': [],'urls': [],'user_mentions': [],'symbols': [],'media': [{'id': 1356193076398731264,'id_str': '1356193076398731264','indices': [190,213],'media_url': 'http://pbs.twimg.com/media/EtIqauWXMAAZsTn.jpg','media_url_https': 'https://pbs.twimg.com/media/EtIqauWXMAAZsTn.jpg','display_url': 'pic.twitter.com/BWiI8Lh6tW','expanded_url': 'https://twitter.com/Db1Testing/status/1356195217687392256/photo/1','type': 'photo','sizes': {'thumb': {'w': 150,'h': 150,'resize': 'crop'},'small': {'w': 680,'h': 665,'resize': 'fit'},'medium': {'w': 827,'h': 809,'large': {'w': 827,'resize': 'fit'}}}]},'extended_entities': {'media': [{'id': 1356193076398731264,'resize': 'fit'}}}]}},'quote_count': 0,'reply_count': 0,'retweet_count': 0,'favorite_count': 0,'urls': [{ 'expanded_url': 'https://twitter.com/i/web/status/1356195217687392256','display_url': 'twitter.com/i/web/status/1…','indices': [117,140]}],'symbols': []},'favorited': False,'retweeted': False,'possibly_sensitive': False,'filter_level': 'low','lang': 'ht','timestamp_ms': '1612177111442'},created_at=datetime.datetime(2021,2,1,10,58,31),id=1356195217687392256,id_str='1356195217687392256',text='hiii IWD1FKPH0JEFS2PNH7KBSPXQ2EAZVORAWCE2580MWFW4N0OAFM63WI06CAZ4OYBLMPATC4VL9OAMFH86K660EXVMP53M36PN0FTU1ETKBIIP7D…,display_text_range=[0,source='Twitter Web App',source_url='https://mobile.twitter.com',truncated=True,in_reply_to_status_id=None,in_reply_to_status_id_str=None,in_reply_to_user_id=None,in_reply_to_user_id_str=None,in_reply_to_screen_name=None,author=User(_api=<tweepy.api.API object at 0x7f6851708190>,_json={'id': 
1355079285812453379,id=1355079285812453379,id_str='1355079285812453379',name='db1-testing',screen_name='Db1Testing',location=None,url=None,description=None,translator_type='none',protected=False,verified=False,followers_count=0,friends_count=0,listed_count=0,favourites_count=0,statuses_count=29,29,9,4,24),utc_offset=None,time_zone=None,geo_enabled=False,lang=None,contributors_enabled=False,is_translator=False,profile_background_color='F5F8FA',profile_background_image_url='',profile_background_image_url_https='',profile_background_tile=False,profile_link_color='1DA1F2',profile_sidebar_border_color='C0DEED',profile_sidebar_fill_color='DDEEF6',profile_text_color='333333',profile_use_background_image=True,profile_image_url='http://abs.twimg.com/sticky/default_profile_images/default_profile_normal.png',profile_image_url_https='https://abs.twimg.com/sticky/default_profile_images/default_profile_normal.png',default_profile=True,default_profile_image=False,following=False,follow_request_sent=None,notifications=None),user=User(_api=<tweepy.api.API object at 0x7f6851708190>,'geo_enabled':False,geo=None,coordinates=None,place=None,contributors=None,is_quote_status=False,extended_tweet={
'full_text': 'hiii IWD1FKPH0JEFS2PNH7KBSPXQ2EAZVORAWCE2580MWFW4N0OAFM63WI06CAZ4OYBLMPATC4VL9OAMFH86K660EXVMP53M36PN0FTU1ETKBIIP7DMBJ3XCQN2XXXA1KA6VSCW292X86SJGHWEH3L1J2HVLV42SHPV8LCVZYY6S762GJ2MOBF3J6IH0,quote_count=0,reply_count=0,retweet_count=0,favorite_count=0,entities={'hashtags': [],'urls': [{'expanded_url': 'https://twitter.com/i/web/status/1356195217687392256',favorited=False,retweeted=False,possibly_sensitive=False,filter_level='low',lang='ht',timestamp_ms='1612177111442')
A Stack Overflow answer here suggests using status.extended_tweet.full_text, but that did not work until I changed it to status.extended_tweet['full_text'].
Solution
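The main difference is that Tweepy has a few predefined objects (Status, User) but otherwise relies heavily on dictionaries (which may be more flexible as the data model evolves). Dot notation is ordinary Python attribute access: the top-level keys of the JSON payload are exposed as attributes of the Status object, as the repr in the question shows. Nested values that are not converted into a model object stay plain dictionaries, so they are indexed with [].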
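To make the distinction concrete, here is a minimal sketch of how a model class could expose top-level JSON keys as attributes while leaving nested values as dictionaries. SimpleStatus and SimpleUser are hypothetical stand-ins written for this illustration, not Tweepy's actual classes, and the payload is a trimmed-down example. The sketch only assumes that each top-level key is copied onto the object, which is consistent with the repr shown in the question.

```python
class SimpleUser:
    """Hypothetical stand-in for a model class such as a User."""

    @classmethod
    def parse(cls, data):
        user = cls()
        user._json = data  # keep the raw payload alongside the attributes
        for key, value in data.items():
            setattr(user, key, value)
        return user


class SimpleStatus:
    """Hypothetical stand-in for a model class such as a Status."""

    @classmethod
    def parse(cls, data):
        status = cls()
        status._json = data  # keep the raw payload alongside the attributes
        for key, value in data.items():
            if key == "user":
                # A few keys are turned into model objects of their own.
                status.user = SimpleUser.parse(value)
            else:
                # Everything else is stored unchanged, so nested dicts stay dicts.
                setattr(status, key, value)
        return status


payload = {
    "text": "hiii",
    "user": {"screen_name": "Db1Testing"},
    "extended_tweet": {"full_text": "hiii and more"},
}
status = SimpleStatus.parse(payload)

print(status.text)                         # attribute of the object
print(status.user.screen_name)             # nested model object, so dots again
print(status.extended_tweet["full_text"])  # plain dict, so brackets
```

Under this assumption, status.extended_tweet.full_text fails simply because a dict has keys rather than attributes.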
status.user is a User object, for example:
print(type(status.user)) # <class 'tweepy.models.User'>
print(status.user.screen_name) # beppecatanese
extended_tweet (when it exists) holds a dictionary of attributes, for example:
print(type(status.extended_tweet)) # <class 'dict'>
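Because extended_tweet is only present on some statuses, reading the full text usually needs a fallback. The helper below is a small sketch of that idea; get_full_text is a hypothetical name, and the SimpleNamespace objects are stand-ins for real Status objects so the example runs without Tweepy or network access.

```python
from types import SimpleNamespace


def get_full_text(status):
    """Return the extended full text if available, else the plain text."""
    extended = getattr(status, "extended_tweet", None)
    if isinstance(extended, dict) and "full_text" in extended:
        return extended["full_text"]  # dict, so use brackets
    return getattr(status, "text", "")  # attribute, so use a dot


# Stand-ins for a truncated status and a short one (illustrative values only).
long_status = SimpleNamespace(
    text="hiii (truncated)",
    truncated=True,
    extended_tweet={"full_text": "hiii and the rest of the tweet"},
)
short_status = SimpleNamespace(text="hello", truncated=False)

print(get_full_text(long_status))
print(get_full_text(short_status))
```

Using getattr with a default avoids an AttributeError on statuses that carry no extended_tweet attribute at all.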
entities holds arrays of various objects (hashtags, media, URLs, etc.), again wrapped in a dictionary:
print(type(status.entities)) # <class 'dict'>
print(status.entities['urls'][0]) # First url
print(type(status.entities['urls'][0])) # <class 'dict'>
print(status.entities['urls'][0]['expanded_url']) # expanded url of first url
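Indexing such as entities['media'][0] raises a KeyError or IndexError when a tweet has no media, so a defensive lookup can help. The sketch below uses a hypothetical helper, first_media_url, and SimpleNamespace stand-ins instead of real Status objects. It also assumes, based on the dump in the question, that for truncated statuses the media list may sit under extended_tweet['entities'] rather than under the top-level entities.

```python
from types import SimpleNamespace


def first_media_url(status):
    """Return the HTTPS URL of the first media item, or None if there is none."""
    extended = getattr(status, "extended_tweet", None) or {}
    # Check the extended entities first, then the top-level ones.
    candidates = [extended.get("entities"), getattr(status, "entities", None)]
    for entities in candidates:
        media = (entities or {}).get("media") or []
        if media:
            return media[0].get("media_url_https")
    return None


# Stand-ins for statuses with and without media (shapes follow the dump above).
with_media = SimpleNamespace(
    entities={"hashtags": [], "urls": []},
    extended_tweet={
        "full_text": "hiii",
        "entities": {
            "media": [
                {"media_url_https": "https://pbs.twimg.com/media/EtIqauWXMAAZsTn.jpg"}
            ]
        },
    },
)
without_media = SimpleNamespace(entities={"hashtags": [], "urls": []})

print(first_media_url(with_media))
print(first_media_url(without_media))
```

The helper returns None rather than raising, so callers can skip media-free tweets with a simple truthiness check.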