
How to account for NA values in R dplyr

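Below are the packages I am loading, some sample data, and my script, followed by the schema. You will notice that two of the employment values are greater than 500 and therefore do not fit the schema. The desired result should consider only the firms that fit the schema (fewer than 500 employees). When I run this on a larger dataset (not the sample data below), I get a result like the one shown at the bottom. In short, how do I modify the script so that it ignores entries greater than 500 and therefore does not return the fifth row with NA?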

library(dplyr)
library(data.table)
library(odbc)
library(DBI)
library(stringr)

firm <- c("firm1","firm2","firm3","firm4","firm5","firm6","firm7","firm8","firm9","firm10","firm11")
employment <- c(1,50,90,249,499,115,145,261,210,874,1140)
# size class per the schema below; NA marks firms with 500 or more employees
small <- c(1,1,1,3,4,2,2,4,3,NA,NA)

smbtest <- data.frame(firm,employment,small)

smbsummary2 <- smbtest %>%
  select(employment,small) %>%
  group_by(small) %>%
  summarise(employment = sum(employment),worksites = n(),.groups = 'drop') %>%
  mutate(employment = cumsum(employment),worksites = cumsum(worksites))

smb1     >= 0 and <100
smb2     >= 0 and <150
smb3     >= 0 and <250
smb4     >= 0 and <500

smb      employment   worksites
 1           1000         20
 2           1500         22
 3           2500         25
 4           10000        29
 5           25000        NA

Solution

I believe this will help:

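library(dplyr)
library(tidyr)  # drop_na() comes from tidyr

firm <- c("firm1","firm2","firm3","firm4","firm5","firm6","firm7","firm8","firm9","firm10","firm11")
employment <- c(1,50,90,249,499,115,145,261,210,874,1140)
# size class per the schema in the question; NA marks firms with 500 or more employees
small <- c(1,1,1,3,4,2,2,4,3,NA,NA)

smbtest <- data.frame(firm,employment,small)

smbtest %>%
  select(employment,small) %>%
  drop_na() %>%                   # drop firms that have no size class
  filter(employment < 500) %>%    # keep only firms that fit the schema, before any totals are computed
  group_by(small) %>%
  summarise(employment = sum(employment),worksites = n(),.groups = 'drop') %>%
  mutate(employment = cumsum(employment),worksites = cumsum(worksites))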

I just added two calls:

  • drop_na() (from the tidyr package)
  • filter(employment < 500)
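
If it helps, here is a minimal sketch of a variant that needs only dplyr. It derives the size class from employment with base R's cut(), so firms outside the schema get NA automatically, and then removes them before grouping (group_by() keeps NA as a group of its own, which is where the fifth row comes from). The break points and the choice to put each firm in the narrowest band it fits are my assumptions based on the schema in the question; the name smbsummary is only illustrative.

```r
library(dplyr)

firm <- paste0("firm", 1:11)
employment <- c(1, 50, 90, 249, 499, 115, 145, 261, 210, 874, 1140)

# Assumption: each firm belongs to the narrowest schema band it fits.
# right = FALSE gives left-closed bands [0,100), [100,150), [150,250), [250,500);
# values outside those bands (500 or more) become NA.
small <- cut(employment, breaks = c(0, 100, 150, 250, 500),
             labels = FALSE, right = FALSE)

smbtest <- data.frame(firm, employment, small)

smbsummary <- smbtest %>%
  filter(!is.na(small)) %>%   # drop out-of-schema firms before grouping
  group_by(small) %>%
  summarise(employment = sum(employment), worksites = n(), .groups = "drop") %>%
  mutate(employment = cumsum(employment), worksites = cumsum(worksites))

print(smbsummary)
```

On the sample data this should give four rows (size classes 1 to 4) with cumulative employment 141, 401, 860, 1620 and cumulative worksites 3, 5, 7, 9, and no NA row. Filtering before summarise() matters here: a filter on employment placed after cumsum() would compare the running totals, not the individual firms, against 500.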
