如何解决R 根据特定条件对所有列中的数据进行分组并减去行
这是我的数据的一个子集:
structure(list(First.Name = c(5006L,5006L,5007L,5008L,5009L),Session = c("Post","Pre","Post",NA,"Post"),RHR = c(65.2352941176471,60,62.5882352941176,63,63.4,48.6060606060606),HRV = c(79.1470588235294,73.5,91.4117647058823,80.5555555555556,102.4,146.606060606061),Hours.in.bed = c(6.76441176470588,6.325,5.98058823529412,4.86,6.503,5.43787878787879),Hours.of.Sleep = c(5.88058823529412,5.59833333333333,4.89117647058824,3.93666666666667,5.933,5.10484848484848),Sleep.disturbances = c(6.85294117647059,6.66666666666667,4.52941176470588,3.55555555555556,5.2,2.93939393939394),Latency.min = c(6.96558823529412,3.31333333333333,3.77411764705882,2.81333333333333,2.88,2.90424242424242
),Cycles = c(5.73529411764706,5.83333333333333,3.23529411764706,2.22222222222222,5,3.33333333333333),REM.Sleep.hours = c(1.42970588235294,1.55,0.466470588235294,0.413333333333333,1.42,0.698181818181818
),Deep.Sleep.hours = c(0.612058823529412,0.55,1.17352941176471,0.972222222222222,0.68,1.73909090909091),Light.Sleep.hours = c(3.83647058823529,3.49666666666667,3.25058823529412,2.55111111111111,3.835,2.66636363636364),Awake.hours = c(0.881764705882353,0.723333333333333,1.08764705882353,0.92,0.568,0.333030303030303),Missing.Data.hours = c(0,0),Respiratory.Rate = c(NaN,NaN,NaN),Year_Day = c(147.852941176471,127.5,145.117647058824,129.888888888889,130.5,146),Week_Year = c(21.5588235294118,18.6666666666667,21.1764705882353,19,19.1,21.2727272727273
)),row.names = c(NA,-6L),groups = structure(list(First.Name = 5006:5009,.rows = structure(list(1:2,3:4,5L,6L),ptype = integer(0),class = c("vctrs_list_of","vctrs_vctr","list"))),4L),class = c("tbl_df","tbl","data.frame"),.drop = TRUE),class = c("grouped_df","tbl_df","data.frame"))
看起来像:
First.Name Session RHR HRV Hours.in.bed Hours.of.Sleep Sleep.disturbances Latency.min Cycles REM.Sleep.hours Deep.Sleep.hours Light.Sleep.hou~
<int> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 5006 Post 65.2 79.1 6.76 5.88 6.85 6.97 5.74 1.43 0.612 3.84
2 5006 Pre 60 73.5 6.32 5.60 6.67 3.31 5.83 1.55 0.55 3.50
3 5007 Post 62.6 91.4 5.98 4.89 4.53 3.77 3.24 0.466 1.17 3.25
4 5007 Pre 63 80.6 4.86 3.94 3.56 2.81 2.22 0.413 0.972 2.55
5 5008 NA 63.4 102. 6.50 5.93 5.2 2.88 5 1.42 0.68 3.84
6 5009 Post 48.6 147. 5.44 5.10 2.94 2.90 3.33 0.698 1.74 2.67
我正在尝试根据 Session
列减去所有列中的特定行。具体来说,Post
- Pre
跨列每个 First.Name
ID。但是,某些 ID 缺少 Pre
或 Post
值,或两者都缺少。
例如:
主题 RHR
的列 5006
将是 Post - Pre
或 65.2-60
,以此类推。
我尝试过
的变体DF %>%
group_by(First.Name) %>%
summarise(RHR[Session == "Post"] - RHR[Session == "Pre"])
但我确信有一种方法可以使用汇总或应用函数,这些函数不需要因值的差异而改变新列。感谢帮助。
解决方法
也许可以使用汇总来一次获取所有行。
summarize(df%>%group_by(First.Name),across(RHR:Week_Year,function(x) {return(x[1]-x[2])}))
First.Name RHR HRV Hours.in.Bed Hours.of.Sleep Sleep.Disturbances Latency.min Cycles
<int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 5006 5.24 5.65 0.439 0.282 0.186 3.65 -0.0980
2 5007 -0.412 10.9 1.12 0.955 0.974 0.961 1.01
3 5008 NA NA NA NA NA NA NA
4 5009 NA NA NA NA NA NA NA
,
我们也可以使用 diff
library(dplyr)
DF %>%
group_by(First.Name) %>%
summarise(across(RHR:Week_Year,~ -diff(.)[1]))
-输出
# A tibble: 4 x 16
# First.Name RHR HRV Hours.in.Bed Hours.of.Sleep Sleep.Disturban… Latency.min Cycles REM.Sleep.hours Deep.Sleep.hours Light.Sleep.hou…
#* <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 5006 5.24 5.65 0.439 0.282 0.186 3.65 -0.0980 -0.120 0.0621 0.340
#2 5007 -0.412 10.9 1.12 0.955 0.974 0.961 1.01 0.0531 0.201 0.699
#3 5008 NA NA NA NA NA NA NA NA NA NA
#4 5009 NA NA NA NA NA NA NA NA NA NA
# … with 5 more variables: Awake.hours <dbl>,Missing.Data.hours <dbl>,Respiratory.Rate <dbl>,Year_Day <dbl>,Week_Year <dbl>
版权声明:本文内容由互联网用户自发贡献,该文观点与技术仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 dio@foxmail.com 举报,一经查实,本站将立刻删除。