How do I store the links found for each searched item in a list?

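The code runs a simple Google search for each short code and prints the links it finds, up to 10 per search. How can I store the links found for each short code in a list or dictionary keyed by the short code that was searched?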

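import time

import numpy as np
import pandas as pd
import requests
from tqdm import tqdm
from urllib.error import HTTPError

try:
    import googlesearch
    from googlesearch import search
except ImportError:
    print("No module named 'google' found")

# Placeholder request headers (an assumption; the original post does not show them)
headers = {"User-Agent": "Mozilla/5.0"}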

with open('UnkNown.xlsx',"rb") as f:
    df = pd.read_excel(f)  # can also index sheet by name or fetch all sheets
    shortcode_list = df['Short Code'].tolist()

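def stopwatch(sec):
    # Count down from `sec` seconds, printing the remaining time as MM:SS
    while sec:
        minn,secs = divmod(sec,60)  # keep `sec` intact so the countdown runs its full length
        timeformat = '{:02d}:{:02d}'.format(minn,secs)
        print(timeformat,end='\r')
        time.sleep(1)
        sec -= 1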

pauses = np.arange(2,8,1).tolist()
pause = np.random.choice(pauses)

delays = np.arange(1,60,1).tolist()
delay = np.random.choice(delays)

for i in tqdm(range(len(shortcode_list))):
    try:
        shortcode = shortcode_list[i]
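        # np.arange(1,1) and np.arange(2,1) are empty, so np.random.choice raises ValueError;
        # the bounds below are an assumption
        delays = np.arange(1,5,1).tolist()
        delay = np.random.choice(delays)
        pauses = np.arange(2,8,1).tolist()
        pause = np.random.choice(pauses)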
        stopwatch(delay)
        string = "text * to " + '"' + str(shortcode) + '"'
        query = string
        url = ('https://www.google.com?q=' + query)
        res = requests.get(url,headers=headers)
        print("\nThe query will be " + query + " " + str(res))
        for k in search(query,tld="co.in",num=10,stop=10,pause=pause,country='US',user_agent=googlesearch.get_random_user_agent(),verify_ssl=True):
            print(k)
    except HTTPError as exception:
        if exception.code == 429:
            print(exception)
            print("Waiting for 8 minutes and Continue")
            stopwatch(480)
            continue

Solution

You can use a dictionary with each short code as a key and a list of links as its value.
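As a minimal sketch of that structure, independent of any search library: `fake_search` below is a hypothetical stand-in for the real search call (an assumption, not part of googlesearch), used so the pattern can be run without network access.

```python
def fake_search(query):
    # Hypothetical stand-in for a real search call; yields made-up links
    for n in range(3):
        yield "https://example.com/{}/{}".format(query, n)

shortcodes = ["abc", "SO"]

results = {}
for shortcode in shortcodes:
    # One list of links per short code
    results[shortcode] = list(fake_search(shortcode))

print(results["SO"])
```

Looking up `results["SO"]` then returns only the links found for that short code.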

With this approach, your code would look like this:

import numpy as np
from tqdm import tqdm
import time
import requests

try:
    from googlesearch import search
except ImportError:
    print("No module named 'google' found")

shortcode_list = ["abc","SO"]
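def stopwatch(sec):
    # Count down from `sec` seconds, printing the remaining time as MM:SS
    while sec:
        minn,secs = divmod(sec,60)  # keep `sec` intact so the countdown runs its full length
        timeformat = '{:02d}:{:02d}'.format(minn,secs)
        print(timeformat,end='\r')
        time.sleep(1)
        sec -= 1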

pauses = np.arange(2,8,1).tolist()
pause = np.random.choice(pauses)

delays = np.arange(1,60,1).tolist()
delay = np.random.choice(delays)

results = {}  # Create an empty dict 
for i in tqdm(range(len(shortcode_list))):
    try:
        shortcode = shortcode_list[i]
        delays = np.arange(1,5,1).tolist()
        delay = np.random.choice(delays)
        pauses = np.arange(2,8,1).tolist()  # np.arange(2,1) is empty, so np.random.choice would raise ValueError
        pause = np.random.choice(pauses)
        stopwatch(delay)
        string = "text * to " + '"' + str(shortcode) + '"'
        query = string
        url = ('https://www.google.com?q=' + query)
        res = requests.get(url)
        print("\nThe query will be " + query + " " + str(res))
        cur_res = []  # Links found for this short code
        # num_results is the keyword used by the googlesearch-python package;
        # the older `google` package takes num=/stop= instead
        for k in search(query,num_results=10):
            print(k)
            cur_res.append(k)  # Add one link to the list
        results[shortcode] = cur_res  # Map the short code to its list of links

    except Exception as exception:
        print(exception)
        print("Waiting for 8 minutes and Continue")
        stopwatch(480)
        continue
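Note that in the loop above, a short code whose search raises is skipped after the wait, so it never gets an entry in `results`. One possible way to handle that, sketched under assumptions: `collect_links`, `search_fn`, `sleep_fn`, and `flaky_search` are all hypothetical names introduced here for illustration, and the search call is passed in as a function so the example runs without network access.

```python
import time

def collect_links(shortcodes, search_fn, max_links=10, retries=1,
                  wait_seconds=480, sleep_fn=time.sleep):
    """Return {shortcode: [links]}, retrying a short code after a failed search."""
    results = {}
    for shortcode in shortcodes:
        query = 'text * to "{}"'.format(shortcode)
        links = []
        for attempt in range(retries + 1):
            try:
                for link in search_fn(query):
                    links.append(link)
                    if len(links) >= max_links:
                        break
                break  # search succeeded, stop retrying
            except Exception as exception:
                print(exception)
                links = []  # discard partial results from the failed attempt
                if attempt < retries:
                    sleep_fn(wait_seconds)  # back off before trying again
        # A short code whose attempts all fail maps to an empty list
        results[shortcode] = links
    return results

calls = {"n": 0}

def flaky_search(query):
    # Hypothetical stub: fails on the first call, then yields two links
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("HTTP 429 (simulated)")
    yield query + " #1"
    yield query + " #2"

# sleep_fn is replaced with a no-op so the demo does not actually wait
results = collect_links(["abc", "SO"], flaky_search, sleep_fn=lambda s: None)
print(results)
```

With the real library, the call might look like `collect_links(shortcode_list, lambda q: search(q, num_results=10))`, assuming the googlesearch-python package is the one installed.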

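As a suggestion for your next question: remove or replace anything that others cannot reproduce (such as the unknown .xls file), and make sure your snippet is ready to debug, including all imports, so that people who want to help can run it.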
