How to solve: RSelenium - scraping from multiple URLs
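I want to loop over a set of URLs, scrape each one, and bind the results into a single data frame. My problem is that my for loop only ends up with the data from the last URL.
It would also be appreciated if the data frame included a column with the URL of the page each row was scraped from, so the rows can be traced easily. Thanks for the help. The code is below: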
library(tidyverse)
library(dplyr)
library(magrittr)
driver <- RSelenium::rsDriver(
  browser = "chrome",
  chromever =
    system2(command = "wmic",
            args = 'datafile where name="C:\\\\Program Files (x86)\\\\Google\\\\Chrome\\\\Application\\\\chrome.exe" get Version /value',
            stdout = TRUE,
            stderr = TRUE) %>%
    stringr::str_extract(pattern = "(?<=Version=)\\d+\\.\\d+\\.\\d+\\.") %>%
    magrittr::extract(!is.na(.)) %>%
    stringr::str_replace_all(pattern = "\\.", replacement = "\\\\.") %>%
    paste0("^", .) %>%
    stringr::str_subset(string =
                          binman::list_versions(appname = "chromedriver") %>%
                          dplyr::last()) %>%
    as.numeric_version() %>%
    max() %>%
    as.character())
url <- c("https://shopee.ph/shop/57465664/search","https://shopee.ph/shop/29990515/search")
remote_driver <- driver[["client"]]
for (i in 1:(length(url))){
  remote_driver$navigate(paste0(url[[i]]))
  Sys.sleep(1)
  # product names on the current page
  name <- remote_driver$findElements(using = 'class',value = 'PFM7lj')
  name <- lapply(name,function(x) x$getElementText())
  name <- unlist(name)
  # product prices on the current page
  price <- remote_driver$findElements(using = 'class',value = '_29R_un')
  price <- lapply(price,function(x) x$getElementText())
  price <- unlist(price)
  #shopee <- cbind(data.frame(name),data.frame(price))
  shopee<- rbind(name,price)
  # final is reassigned on every pass, so after the loop it holds only the last URL's rows
  final <- cbind(data.frame(name),data.frame(price))
}
final
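
Judging from the loop above, a likely cause is that `final` is assigned inside the loop, so each iteration replaces the previous result. One possible way around it, as a minimal sketch: build one data frame per URL, add the source URL as a column, store each in a list, and bind the list once after the loop. `scrape_page` is a hypothetical stand-in that returns made-up placeholder rows so the sketch runs without a browser. In the real script its body would be the `navigate` / `findElements` / `getElementText` calls from the question.

```r
# Hypothetical stand-in for the Selenium calls: returns name/price vectors
# for one page. The values are placeholders, not real scraped data.
scrape_page <- function(page_url) {
  fake_pages <- list(
    "https://shopee.ph/shop/57465664/search" =
      list(name = c("item A", "item B"), price = c("100", "250")),
    "https://shopee.ph/shop/29990515/search" =
      list(name = "item C", price = "75")
  )
  fake_pages[[page_url]]
}

url <- c("https://shopee.ph/shop/57465664/search",
         "https://shopee.ph/shop/29990515/search")

# One list slot per URL, filled inside the loop.
results <- vector("list", length(url))
for (i in seq_along(url)) {
  page <- scrape_page(url[[i]])
  results[[i]] <- data.frame(
    url   = rep(url[[i]], length(page$name)),  # source page for every row
    name  = page$name,
    price = page$price,
    stringsAsFactors = FALSE
  )
}

# Bind once after the loop so no iteration overwrites another.
final <- do.call(rbind, results)
final
```

This assumes each page yields the same number of names and prices. If the two vectors differ in length, `data.frame()` stops with an error, which is arguably preferable to silently misaligned rows. `dplyr::bind_rows(results)` could be used in place of `do.call(rbind, results)` since dplyr is already loaded in the question.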