R: Problems removing "#NAME?" from an Excel import in a data frame

I have a .csv file exported from Excel that contains leftover formula errors I want to remove. A simplified version of the data is below.

library(tidyverse)
df <- data.frame(
  species = letters[1:5],
  param1 = c("Place", "creek", "river", "#VALUE!", "desert"),
  param2 = c(-23.8, 43.23, "#NAME?", 45, 0.23),
  param3 = c(2.4, 2, 5.7, 0.00003, -2.5),
  stringsAsFactors = FALSE
) # This is a simplified version of the excel .csv import

df[df == "#VALUE!"] <- ""     # Removes excel cells where the formula left "#VALUE!"
df[df == "#NAME\\?"] <- ""   # This does not work

ndf <- df  # This is an attempt to reassign the columns to numeric
ndf
class(ndf$param2)
class(ndf$param3)

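The main problem is that param2 is left as a character column when it needs to be numeric. Converting it to numeric does not work, and neither do the functions I have to run on it.

I have tried many different things, but I never seem to be able to match the cell. How can I remove "#NAME?" throughout df?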

Solution

You are doing an exact match (not a regex match), so you don't need to escape special characters such as ? or !. Try:

df[df == "#VALUE!"] <- ""  
df[df == "#NAME?"] <- NA
df <- type.convert(df,as.is = TRUE)
df
#  species param1 param2   param3
#1       a  Place -23.80  2.40000
#2       b  creek  43.23  2.00000
#3       c  river     NA  5.70000
#4       d         45.00  0.00003
#5       e desert   0.23 -2.50000

str(df)
#'data.frame':  5 obs. of  4 variables:
# $ species: chr  "a" "b" "c" "d" ...
# $ param1 : chr  "Place" "creek" "river" "" ...
# $ param2 : num  -23.8 43.23 NA 45 0.23
# $ param3 : num  2.4 2 5.7 0.00003 -2.5
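
To illustrate the point about exact versus regex matching, here is a minimal sketch on a standalone vector. The vector `x` and the result names are made up for this example. `==` compares whole strings literally, so the escaped pattern never matches, while `grepl()` treats its pattern as a regular expression unless `fixed = TRUE` is set.

```r
# Hypothetical stand-in for the param2 column as it arrives from the .csv
x <- c("-23.8", "43.23", "#NAME?", "45", "0.23")

# `==` is a literal, whole-string comparison: "?" needs no escaping
exact_hits <- x == "#NAME?"

# The escaped form is a different literal string (it contains a backslash),
# so it matches nothing. This is why the attempt in the question failed.
escaped_hits <- x == "#NAME\\?"

# Escaping is only needed when the pattern is interpreted as a regex
regex_hits <- grepl("#NAME\\?", x)

# fixed = TRUE switches regex interpretation off again
fixed_hits <- grepl("#NAME?", x, fixed = TRUE)

which(exact_hits)   # position of the "#NAME?" cell
any(escaped_hits)   # FALSE
```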

Here is a dplyr solution using sub that replaces the unwanted values in one go:

df %>%
  mutate(across(matches("\\d"),~sub("#.*","NA",.)))
  species param1 param2 param3
1       a  Place  -23.8    2.4
2       b  creek  43.23      2
3       c  river     NA    5.7
4       d     NA     45  3e-05
5       e desert   0.23   -2.5
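
One thing worth noting about the sub approach: it writes the text "NA" into the cells rather than a real missing value, and the columns stay character. The sketch below shows the difference on a standalone vector and one possible way to get true NA values with base R. It assumes every Excel error code starts with "#". The names `param2`, `as_text` and `as_missing` are illustrative only.

```r
# Hypothetical stand-in for one imported column
param2 <- c("-23.8", "43.23", "#NAME?", "45", "0.23")

# sub() returns a character vector; the replacement is the string "NA"
as_text <- sub("#.*", "NA", param2)
is.na(as_text)        # all FALSE: "NA" is ordinary text here

# replace() with NA stores a real missing value instead
# (assumption: error codes always begin with "#")
as_missing <- replace(param2, grepl("^#", param2), NA)
is.na(as_missing)     # TRUE only for the former "#NAME?" cell

# With a real NA in place, the conversion to numeric is straightforward
param2_num <- as.numeric(as_missing)
```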

If you don't know which columns the unwanted values occur in, this solution is helpful:

library(stringr)
df %>% 
  mutate(across(where(~any(str_detect(., "#"))), ~sub("#.*", "NA", .)))

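This third solution both replaces the unwanted values wherever they occur and converts the columns to the correct type (thanks to @Ronak for the inspiration):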

df %>% 
  mutate(across(where(~any(str_detect(., "#"))), ~sub("#.*", "NA", .)),
         across(everything(), ~type.convert(., as.is = TRUE)))
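
For comparison, the same clean-then-convert idea can be sketched in base R as a small helper. The function name `clean_excel_errors` and the data frame `raw` are hypothetical. The sketch again assumes that every error code starts with "#" and that only character columns can contain one.

```r
# Hypothetical helper: blank out Excel error codes, then re-infer column types
clean_excel_errors <- function(data) {
  data[] <- lapply(data, function(col) {
    if (!is.character(col)) return(col)   # numeric columns cannot hold "#..." text
    col[grepl("^#", col)] <- NA           # assumption: errors start with "#"
    type.convert(col, as.is = TRUE)       # character -> numeric where possible
  })
  data
}

# Same shape as the simplified data in the question
raw <- data.frame(
  species = letters[1:5],
  param1 = c("Place", "creek", "river", "#VALUE!", "desert"),
  param2 = c(-23.8, 43.23, "#NAME?", 45, 0.23),
  param3 = c(2.4, 2, 5.7, 0.00003, -2.5),
  stringsAsFactors = FALSE
)

cleaned <- clean_excel_errors(raw)
sapply(cleaned, class)   # param2 is now numeric, param1 stays character
```

Unlike the exact-match answer, this leaves NA rather than an empty string in param1, which may or may not be what you want.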
