How to skip to the next URL when an element is not found or a TimeoutException occurs during a Selenium wait
I am trying to scrape daily observation tables from weather stations. I have the following code for fetching a specific table:
# Iterate the request over each weather station and date
# (product needs one iterable per loop variable)
for station, month, year in product(weather_station, months, years):
    areacode = weather_station[station]['areacode']
    # Set link according to data need
    driver.get('https://www.wunderground.com/history/monthly/' + countrycode + '/' + station + '/' + areacode + '/date/' + str(year) + '-' + str(month))
    # Wait for the webpage to fully load the necessary tables
    wait = WebDriverWait(driver, 15)
    # Update the XPath in case the webpage HTML format changes
    xpath_html_loc = '//*[@id="inner-content"]/div[2]/div[1]/div[5]/div[1]/div/lib-city-history-observation/div/div[2]/table'
    tables = wait.until(EC.presence_of_all_elements_located((By.XPATH, xpath_html_loc)))
    # Save only the necessary table from the loaded webpage
    for table in tables:
        histo_table = pd.read_html(table.get_attribute('outerHTML'))
        histo_weather = histo_table[2].fillna('')
        print("Weather observations for", str(month), "-", str(year), "from station", station, "are ready\n")
This code iterates over all the necessary pages on the site and works fine when the desired table is present, but when the table does not exist on a page, or the link is unavailable, it raises this error: TimeoutException
I read about the try/except option, but I cannot seem to make it work in this case. Can you suggest a better solution? The code below with try and except still raises the TimeoutException. If the table element does not exist or the link is unavailable, I would like the code to skip the current URL and move on to the next one (i.e. return to the top of the for loop to iterate the next URL).
try:
    # Set link according to data need
    driver.get('https://www.wunderground.com/history/monthly/' + countrycode + '/' + station + '/' + areacode + '/date/' + str(year) + '-' + str(month))
    # Wait for the webpage to fully load the necessary tables
    wait = WebDriverWait(driver, 15)
    # Update the XPath in case the webpage HTML format changes
    xpath_html_loc = '//*[@id="inner-content"]/div[2]/div[1]/div[5]/div[1]/div/lib-city-history-observation/div/div[2]/table'
    tables = driver.find_elements(By.XPATH, xpath_html_loc)
    print(tables)
except TimeoutException as exception:
    raise exception
Solution
You can achieve the same thing with the following approach.
if len(driver.find_elements(By.XPATH, xpath_html_loc)) > 0:
    # Do something
else:
    # Do something
Updated code with the complete solution.
# Set link according to data need
driver.get('https://www.wunderground.com/weather/us/pa/indiana/date/2020-09')
# Wait for the webpage to fully load the necessary tables
wait = WebDriverWait(driver, 15)
wait.until(EC.element_to_be_clickable((By.XPATH, "//lib-city-header//lib-subnav//div[@class='subnav-contain']//span[contains(text(),'History')]")))
# driver.find_element_by_xpath("//lib-city-header//lib-subnav//div[@class='subnav-contain']//span[contains(text(),'History')]").click()
tables = []
try:
    xpath_html_loc = '//lib-city-history-observation//table'
    wait.until(EC.element_to_be_clickable((By.XPATH, xpath_html_loc)))
    tables = driver.find_elements(By.XPATH, xpath_html_loc)
    print(len(tables))
except TimeoutException:
    pass
if len(tables) > 0:
    print('IF')
else:
    print('Else')
I was able to use the following workaround:
for link in links:
    try:
        print("Trying for", link)
        # Set link according to data need
        driver.get(link)
        # Wait for the webpage to fully load the necessary tables
        wait = WebDriverWait(driver, 15)
        # Update the XPath in case the webpage HTML format changes
        xpath_html_loc = '//*[@id="inner-content"]/div[2]/div[1]/div[5]/div[1]/div/lib-city-history-observation/div/div[2]/table'
        wait.until(EC.presence_of_all_elements_located((By.XPATH, xpath_html_loc)))
        tables = driver.find_elements(By.XPATH, xpath_html_loc)
    except TimeoutException:
        # If the loading took too long, print a message and skip to the next link
        print("Loading took too long! Data unavailable")
        continue
    if len(tables) > 0:
        # Process the tables here
        pass
    else:
        print("data is unavailable")
        continue
Even if a link is unavailable or the tables fail to load, the loop still continues thanks to the try/except block (which avoids the timeout exception). I used wait.until with an expected condition (the required webpage and tables fully loaded) and find_elements (to locate the specific table). If the table cannot be found on the page, or is still unavailable even after the page loads, the if-else code suggested by @Dilip continues the for loop.
Thanks for all the help!
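The skip-and-continue pattern described above does not depend on Selenium itself, so it can be sketched without a browser. In this minimal sketch, fetch_tables is a hypothetical stand-in for the driver.get plus wait.until step: it raises TimeoutError for a dead link, just as WebDriverWait raises TimeoutException when the tables never appear.

```python
# Sketch of the skip-on-failure loop, with a hypothetical fetch_tables
# standing in for driver.get + wait.until + find_elements.

def fetch_tables(link):
    # Fake page data: one good page, one page whose table is missing.
    data = {
        "link-ok": ["table-1", "table-2"],
        "link-empty": [],
    }
    if link not in data:
        # Unknown link: behaves like a Selenium timeout
        raise TimeoutError("loading took too long")
    return data[link]

def scrape(links):
    results = {}
    for link in links:
        try:
            tables = fetch_tables(link)
        except TimeoutError:
            # Link unavailable: skip this URL and move to the next one
            continue
        if len(tables) > 0:
            results[link] = tables
        else:
            # Page loaded but the table is missing: also skip
            continue
    return results

print(scrape(["link-ok", "link-empty", "link-dead"]))
# → {'link-ok': ['table-1', 'table-2']}
```

Both failure modes (timeout and missing table) fall through to continue, so one bad URL never stops the whole run.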