How to select a single value from a data frame during a ddply summarise
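I want to use ddply and summarise to get the monthly median across several years of data, and I can do that successfully. However, I also want to create a column that holds the values for one single year. I know other ways to add it, but I would like to do it inside the ddply call. The data is at the bottom.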
If the median across all years is 16 and the 2018 value is 30, the first row of the result would look like this:
Month Median 2018
Apr 16.0 30
Here is what I tried. This works as expected:
Summary<-ddply(df,~Month,summarise,Median = median(Value))
Summary
When I try to add the single-year value, I can't figure out a way to do it:
Summary<-ddply(df,Median = median(Value),SingleYearValue = which(df[,"Year"]==2018));Summary
df<-structure(list(Month = c("Apr","Apr","Aug","Dec","Feb","Jan","Jul","Jun","Mar","May","Nov","Oct","Sep","Sep"),Year = c("1960","1961","1962","1963","1964","1965","1966","1967","1968","1969","1970","1971","1972","2002","2003","2004","2005","2006","2007","2008","2009","2010","2011","2012","2013","2014","2015","2016","2017","2018","2019","1960","2001","1959","2019"),Value = 1:295),row.names = c(NA,-295L),class = "data.frame")
Solution
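You can subset the rows for the specific year, rename the Value column to that year, and then merge the result with the summary. Using all.x = TRUE keeps months that have no row for that year, with NA in the new column: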
year = 2018
data <- subset(df,Year == year,select = -Year)
names(data)[names(data) == 'Value'] <- year
merge(Summary,data,by = 'Month',all.x = TRUE)
# Month Median 2018
#1 Apr 16.0 30
#2 Aug 47.5 62
#3 Dec 70.0 NA
#4 Feb 83.0 NA
#5 Jan 96.0 NA
#6 Jul 118.5 133
#7 Jun 150.5 165
#8 Mar 175.0 182
#9 May 199.0 213
#10 Nov 223.5 232
#11 Oct 248.0 262
#12 Sep 279.5 294
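The subset-then-merge idea can be checked on a small data frame. The following is only a minimal sketch: the toy data frame and its values are made up for illustration, and base R's aggregate stands in for the ddply summary so that the example needs no extra packages.

```r
# Hypothetical toy data (not the data from the question)
toy <- data.frame(
  Month = c("Apr", "Apr", "Apr", "Aug", "Aug"),
  Year  = c("2017", "2018", "2019", "2017", "2019"),
  Value = c(10, 30, 20, 5, 7),
  stringsAsFactors = FALSE
)

# Monthly median over all years (base R stand-in for the ddply summary)
summary_toy <- aggregate(Value ~ Month, data = toy, FUN = median)
names(summary_toy)[names(summary_toy) == "Value"] <- "Median"

# Keep only the 2018 rows, drop Year, and rename Value to the year
one_year <- subset(toy, Year == 2018, select = -Year)
names(one_year)[names(one_year) == "Value"] <- "2018"

# Left join: months without a 2018 row get NA in the new column
out <- merge(summary_toy, one_year, by = "Month", all.x = TRUE)
print(out)
```

In this toy data Aug has no 2018 row, so its entry in the new column is NA, which is the same behaviour as Dec, Feb and Jan in the output above.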
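If we want to do everything within plyr, use plyr::join, which performs a left join on the shared Month column by default. Note that the single-year column keeps the name Value here: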
plyr::join(Summary,subset(df,Year == 2018,select = -Year))
# Month Median Value
#1 Apr 16.0 30
#2 Aug 47.5 62
#3 Dec 70.0 NA
#4 Feb 83.0 NA
#5 Jan 96.0 NA
#6 Jul 118.5 133
#7 Jun 150.5 165
#8 Mar 175.0 182
#9 May 199.0 213
#10 Nov 223.5 232
#11 Oct 248.0 262
#12 Sep 279.5 294
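Or, if we want to do it inside the ddply call itself, select the 2018 value within each month. The trailing [1] returns NA for months that have no 2018 row: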
plyr::ddply(df,~ Month,summarise,Median = median(Value),`2018` = Value[Year == 2018][1])
# Month Median 2018
#1 Apr 16.0 30
#2 Aug 47.5 62
#3 Dec 70.0 NA
#4 Feb 83.0 NA
#5 Jan 96.0 NA
#6 Jul 118.5 133
#7 Jun 150.5 165
#8 Mar 175.0 182
#9 May 199.0 213
#10 Nov 223.5 232
#11 Oct 248.0 262
#12 Sep 279.5 294
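Why the trailing [1] matters can be seen on a small example. This is a sketch with made-up toy data (not the data from the question) and it assumes the plyr package is installed. Indexing an empty vector with [1] gives NA, so every month contributes exactly one value to the new column.

```r
library(plyr)

# Hypothetical toy data: Aug has no 2018 row
toy <- data.frame(
  Month = c("Apr", "Apr", "Apr", "Aug", "Aug"),
  Year  = c("2017", "2018", "2019", "2017", "2019"),
  Value = c(10, 30, 20, 5, 7),
  stringsAsFactors = FALSE
)

# For Aug, Value[Year == 2018] is empty; taking [1] turns it into NA
empty_pick <- toy$Value[toy$Month == "Aug" & toy$Year == 2018][1]
print(empty_pick)

# Median per month plus the 2018 value, all in one ddply call
res <- ddply(toy, ~ Month, summarise,
             Median = median(Value),
             `2018` = Value[Year == 2018][1])
print(res)
```

Without the [1], a month with no 2018 row would produce a zero-length vector next to a length-one median, which summarise is unlikely to combine into a single row. If a month had several 2018 rows, [1] would keep only the first of them.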