
How to skip to the next URL when an element is not found or a TimeoutException occurs in a Selenium wait


I am trying to scrape the daily observation tables from weather stations. I have the following code to fetch a specific table:

from itertools import product

import pandas as pd
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# Iterate over every combination of weather station, month and year.
# product() needs one iterable per loop variable; `months` and `years` are
# assumed here to be the lists of months and years to fetch.
for station, month, year in product(weather_station, months, years):

    areacode = weather_station[station]['areacode']

    # Build the link for the data needed
    driver.get('https://www.wunderground.com/history/monthly/'+countrycode+'/'+station+'/'+areacode+'/date/'+str(year)+'-'+str(month))

    # Wait for the webpage to fully load the necessary tables
    wait = WebDriverWait(driver, 15)

    # Update the XPath in case the webpage's HTML format changes
    xpath_html_loc = '//*[@id="inner-content"]/div[2]/div[1]/div[5]/div[1]/div/lib-city-history-observation/div/div[2]/table'
    tables = wait.until(EC.presence_of_all_elements_located((By.XPATH, xpath_html_loc)))

    # Save only the necessary table from the loaded webpage
    for table in tables:
        histo_table = pd.read_html(table.get_attribute('outerHTML'))
        histo_weather = histo_table[2].fillna('')

    print("Weather observations for ", str(month), "-", str(year), " from station", station, "is ready \n")

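The code iterates over all the required pages on the site and works fine when the specific table is available, but when the table does not exist on a page or the link is unavailable, it raises a TimeoutException.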
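The iteration over stations, months and years described above can be sketched without a browser. This is only an illustration: the helper name build_monthly_urls, the placeholder station entry, and the separate months and years lists are assumptions and not part of the original code; only the URL pattern is taken from the question.

```python
from itertools import product

BASE_URL = "https://www.wunderground.com/history/monthly"


def build_monthly_urls(countrycode, weather_station, months, years):
    """Return one history URL per (station, month, year) combination."""
    urls = []
    # product() yields every combination; iterating a dict yields its keys.
    for station, month, year in product(weather_station, months, years):
        areacode = weather_station[station]["areacode"]
        urls.append(
            f"{BASE_URL}/{countrycode}/{station}/{areacode}/date/{year}-{month}"
        )
    return urls


# Placeholder data, for illustration only (not a real station).
stations = {"some-city": {"areacode": "XXXX"}}
links = build_monthly_urls("us", stations, months=[8, 9], years=[2020])
```

Building the list of links up front keeps the URL logic separate from the browser logic, so a failing page can be skipped by moving to the next entry in the list.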

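I read about the try and except option, but I can't seem to make it work in this case. Could you suggest a better solution? The code below, with try and except, still raises the TimeoutException. If the table element does not exist or the link is unavailable, I would like the code to skip the current URL and move on to the next one (i.e., go back to the start of the for loop and iterate to the next URL).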

try:
    # Build the link for the data needed
    driver.get('https://www.wunderground.com/history/monthly/'+countrycode+'/'+station+'/'+areacode+'/date/'+str(year)+'-'+str(month))

    # Wait for the webpage to fully load the necessary tables
    wait = WebDriverWait(driver, 15)

    # Update the XPath in case the webpage's HTML format changes
    xpath_html_loc = '//*[@id="inner-content"]/div[2]/div[1]/div[5]/div[1]/div/lib-city-history-observation/div/div[2]/table'
    tables = driver.find_elements(By.XPATH, xpath_html_loc)
    print(tables)
except TimeoutException as exception:
    # Re-raising the exception here means the error still propagates
    raise exception

Solution

You can achieve the same thing with the following approach.

if len(driver.find_elements(By.XPATH, xpath_html_loc)) > 0:
    pass  # Table found: do something
else:
    pass  # Table missing: do something else
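The check above works because find_elements returns an empty list when nothing matches, rather than raising an exception. A minimal sketch of wrapping that in a helper follows. The name table_exists and the FakeDriver stand-in are hypothetical, and the string "xpath" is used in place of the By.XPATH constant (which has that value) so the sketch runs without Selenium installed.

```python
def table_exists(driver, xpath):
    """Return True if at least one element matches the XPath."""
    # "xpath" is the value of Selenium's By.XPATH constant.
    return len(driver.find_elements("xpath", xpath)) > 0


class FakeDriver:
    """Hypothetical stand-in for a WebDriver, for illustration only."""

    def __init__(self, elements_by_xpath):
        self._elements = elements_by_xpath

    def find_elements(self, by, value):
        # Like Selenium, return an empty list when nothing matches.
        return self._elements.get(value, [])


driver = FakeDriver({"//table": ["<table 1>", "<table 2>"]})
for xpath in ("//table", "//missing"):
    if table_exists(driver, xpath):
        print(f"{xpath}: table found")
    else:
        print(f"{xpath}: no table, skipping")
```

Note that this check does not wait for the page to finish loading, so with a real browser it would normally follow an explicit wait.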

Updated code with the complete solution:

from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# Build the link for the data needed
driver.get('https://www.wunderground.com/weather/us/pa/indiana/date/2020-09')

# Wait for the webpage to fully load the necessary tables
wait = WebDriverWait(driver, 15)
wait.until(EC.element_to_be_clickable((By.XPATH, "//lib-city-header//lib-subnav//div[@class='subnav-contain']//span[contains(text(),'History')]")))
#driver.find_element_by_xpath("//lib-city-header//lib-subnav//div[@class='subnav-contain']//span[contains(text(),'History')]").click()

# Start with an empty list so the check below still works after a timeout
tables = []
try:
    xpath_html_loc = '//lib-city-history-observation//table'
    wait.until(EC.element_to_be_clickable((By.XPATH, xpath_html_loc)))
    tables = driver.find_elements(By.XPATH, xpath_html_loc)
    print(len(tables))
except TimeoutException:
    pass

# Compare the number of tables found, not the list itself
if len(tables) > 0:
    print('IF')
else:
    print('Else')

I was able to use the following workaround:

for link in links:
    try:
        print("Trying for ", link)
        # Load the link for the data needed
        driver.get(link)
        # Wait for the webpage to fully load the necessary tables
        wait = WebDriverWait(driver, 15)

        # Update the XPath in case the webpage's HTML format changes
        xpath_html_loc = '//*[@id="inner-content"]/div[2]/div[1]/div[5]/div[1]/div/lib-city-history-observation/div/div[2]/table'
        wait.until(EC.presence_of_all_elements_located((By.XPATH, xpath_html_loc)))
        tables = driver.find_elements(By.XPATH, xpath_html_loc)
    except Exception:
        # If loading took too long or the link failed, print a message and skip
        print("Loading took too long! Data unavailable")
        continue

    if len(tables) > 0:
        pass  # Do code here
    else:
        print("data is unavailable")
        continue

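Because of the try and except code, the loop keeps going even when a link is unavailable or the table fails to load (this avoids the TimeoutException). I used wait.until with an expected condition (the required page and table fully loaded) and find_elements (to locate the specific table). If the table cannot be found on the page, or is still unavailable after the page has loaded, the if-else check suggested by @Dilip lets the for loop continue.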
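The skip-and-continue pattern described above could also be factored into a small reusable function. The following is a sketch under stated assumptions, not the code from either answer: the names scrape_links and make_selenium_loader are hypothetical, and the fallback TimeoutException class exists only so the sketch can run where Selenium is not installed. It catches TimeoutException specifically rather than every exception.

```python
try:
    from selenium.common.exceptions import TimeoutException
except ImportError:
    # Stand-in so the sketch still runs where Selenium is not installed.
    class TimeoutException(Exception):
        pass


def make_selenium_loader(driver, xpath, timeout=15):
    """Return a function that loads a link and waits for the tables."""
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    def load_tables(link):
        driver.get(link)
        # until() raises TimeoutException if no table appears in time.
        return WebDriverWait(driver, timeout).until(
            EC.presence_of_all_elements_located((By.XPATH, xpath))
        )

    return load_tables


def scrape_links(links, load_tables):
    """Collect tables per link, skipping links that time out or are empty."""
    results, skipped = {}, []
    for link in links:
        try:
            tables = load_tables(link)
        except TimeoutException:
            skipped.append(link)  # page or table did not load in time
            continue
        if not tables:
            skipped.append(link)  # page loaded but held no table
            continue
        results[link] = tables
    return results, skipped
```

With a real driver this might be called as results, skipped = scrape_links(links, make_selenium_loader(driver, xpath_html_loc)). Keeping the list of skipped links makes it possible to retry or report them after the loop finishes.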

Thanks for all your help!
