How to dynamically iterate over groups in R, so that the difference is >= 26.5 and the time between timestamps is < 48 hours
I am trying to write a script in RStudio that captures whether a patient develops an acute kidney injury (AKI) and, if so, the first time it occurs. Acute kidney injury is defined as follows:
An increase in creatinine (the name of the lab test) of >= 26.5 umol/L (the unit) within 48 hours.
See the example table below for the structure of the data:
PatientKey  Creatinine  Timestamp            My comments for this question
1           70          2020-04-03 14:10:10
1           90          2020-04-03 17:11:10
1           98          2020-04-03 19:10:10  First time of AKI, i.e. > 26.5 increase and less than 48 hours
1           100         2020-04-03 22:10:10  Not relevant, AKI already
2           140         2019-08-01 00:00:00  Only one value, ignore since no difference can be calculated
3           120         2017-01-06 00:00:05  Came to hospital with a high value
3           70          2017-01-06 10:00:05  Decreases of more than 26.5 don't count --> no AKI
3           80          2017-01-08 10:00:05
4           70          2020-01-08 22:00:05
4           60          2020-01-09 22:00:05  Note it is not always the first test that serves as the comparison value
4           90          2020-01-10 02:00:05  90 - 60 > 26.5 --> AKI
4           110         2020-01-10 06:00:05  Not relevant, AKI already
5           50          2020-01-12 06:00:05
5           70          2020-01-13 08:00:05
5           80          2020-01-14 22:00:05  No AKI, difference > 26.5 but more than 48 hours between tests
The output I would like is as follows:
PatientKey  CreatinineLOW  CreatinineHIGH  TimestampLOW         TimestampHIGH
1           70             98              2020-04-03 14:10:10  2020-04-03 19:10:10
4           60             90              2020-01-09 22:00:05  2020-01-10 02:00:05
Note that only patients 1 and 4 developed an AKI, so only their data should appear in the output.
Is it feasible to do this in R? I have tried using the dplyr/tidyverse packages and doing something like the following (MYDATA is the name of the data frame):
datalist = list()
for (m in MYDATA$PatientKey %>% unique()) {
  x = filter(MYDATA, PatientKey == m) %>% pull(PatientKey)
  table <- MYDATA %>% filter(PatientKey == m) %>% arrange(Timestamp)
  for (i in 1:length(x)) {
    table %>%
      mutate(
        indx_creat = Creatinine[1],
        new_creatinine = Creatinine - indx_creat,
        indx_time = Timestamp[i],
        new_time = as.numeric(difftime(Timestamp, indx_time, units = "hours"))
      ) %>%
      filter(new_creatinine >= 26.5 & new_time <= -48) -> r
    if (nrow(r) == 0) {
      table <- table[-1, ]
    } else if (nrow(r) > 0) {
      datalist[[i]] <- r
      break
    }
  }
}
summary.table = do.call(rbind, datalist)
summary.table <- summary.table %>% group_by(PatientKey) %>% slice(1)
However, this did not work! Does anyone have an idea how to get this done? A program that easily detects acute kidney injury would be very useful for the clinical community!
Solution
Let's do this in two steps, which keeps it more general.
Step 1: find any critical increase, per observation
First, we create a column of nested data frames, each containing any rows that represent a critical increase in Creatinine relative to that observation:
library(dplyr)
library(purrr)
library(tidyr)
library(lubridate)
# load the OP's data (15 rows, as in the example table above)
df <- structure(list(
  PatientKey = c(1L, 1L, 1L, 1L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L),
  Creatinine = c(70L, 90L, 98L, 100L, 140L, 120L, 70L, 80L, 70L, 60L, 90L, 110L, 50L, 70L, 80L),
  Timestamp = c("2020-04-03 14:10:10", "2020-04-03 17:11:10", "2020-04-03 19:10:10",
                "2020-04-03 22:10:10", "2019-08-01 00:00:00", "2017-01-06 00:00:05",
                "2017-01-06 10:00:05", "2017-01-08 10:00:05", "2020-01-08 22:00:05",
                "2020-01-09 22:00:05", "2020-01-10 02:00:05", "2020-01-10 06:00:05",
                "2020-01-12 06:00:05", "2020-01-13 08:00:05", "2020-01-14 22:00:05")
), row.names = c(NA, -15L), class = "data.frame")
# find any critical increase, per observation row
df2 <- df %>%
  mutate(Timestamp = as_datetime(Timestamp)) %>%
  group_by(PatientKey) %>%
  do(
    mutate(., critical_increase = map2(
      Creatinine, Timestamp,
      .f = function(c, t, data) {
        data %>%
          filter(
            Creatinine - c >= 26.5,
            Timestamp > t,
            as.numeric(Timestamp - t, units = "hours") <= 48
          ) %>%
          select(Timestamp, Creatinine)
      },
      data = .
    ))
  ) %>%
  ungroup()
The code above maps over each row's Creatinine and Timestamp, passing in the grouped subset of the corresponding patient's data, which it filters for critical increases. The filtered data frame is stored in the critical_increase column.
Step 2: transform the data into the expected format
Now, to get the format you described, we unnest the previously computed data frames, rename the columns to the ones you want, and select each patient's first row by TimestampHIGH:
# unnest, rename & select first critical increase
df_final <- df2 %>%
  unnest(critical_increase, names_sep = ".") %>%
  transmute(
    PatientKey = PatientKey,
    CreatinineLOW = Creatinine,
    CreatinineHIGH = critical_increase.Creatinine,
    TimestampLOW = Timestamp,
    TimestampHIGH = critical_increase.Timestamp
  ) %>%
  group_by(PatientKey) %>%
  arrange(TimestampHIGH, desc(TimestampLOW)) %>%
  filter(row_number() == 1) %>%
  ungroup()
You can easily vary this, e.g. arrange(desc(CreatinineHIGH - CreatinineLOW)) to find the largest increase observed within 48 hours.
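To make that variation concrete, here is a minimal, self-contained sketch of step 2 with the modified arrange call. It uses a hand-built toy stand-in for df2 (only patient 1, with its critical_increase list-column already filled in), and df2_toy / df_largest are illustrative names, not part of the answer above:

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Toy stand-in for df2 from step 1: patient 1 only, with its
# critical_increase list-column already computed (illustrative data)
df2_toy <- tibble(
  PatientKey = 1L,
  Creatinine = 70L,
  Timestamp = as_datetime("2020-04-03 14:10:10"),
  critical_increase = list(tibble(
    Timestamp = as_datetime(c("2020-04-03 19:10:10", "2020-04-03 22:10:10")),
    Creatinine = c(98L, 100L)
  ))
)

# Variation: keep the largest qualifying increase instead of the earliest;
# here patient 1 yields the 70 -> 100 pair (increase of 30) rather than 70 -> 98
df_largest <- df2_toy %>%
  unnest(critical_increase, names_sep = ".") %>%
  transmute(
    PatientKey,
    CreatinineLOW = Creatinine,
    CreatinineHIGH = critical_increase.Creatinine,
    TimestampLOW = Timestamp,
    TimestampHIGH = critical_increase.Timestamp
  ) %>%
  group_by(PatientKey) %>%
  arrange(desc(CreatinineHIGH - CreatinineLOW)) %>%
  filter(row_number() == 1) %>%
  ungroup()
```

Only the arrange line differs from step 2; filter(row_number() == 1) still picks one row per patient, now ordered by the size of the increase.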
Result
The above gives the expected result:
> df_final
# A tibble: 2 x 5
PatientKey CreatinineLOW CreatinineHIGH TimestampLOW TimestampHIGH
<int> <int> <int> <dttm> <dttm>
1 4 60 90 2020-01-09 22:00:05 2020-01-10 02:00:05
2 1 70 98 2020-04-03 14:10:10 2020-04-03 19:10:10
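As an aside, the same rule can also be expressed without nesting, as a pairwise self-join per patient: join the table to itself on PatientKey, keep pairs where the later value is at least 26.5 higher within 48 hours, and take the earliest qualifying pair. A minimal sketch on a cut-down version of the data (df_small and aki_pairs are illustrative names, not from the answer above):

```r
library(dplyr)
library(lubridate)

# Illustrative subset of the OP's data: patient 1 (develops AKI) and
# patient 5 (increase > 26.5 but more than 48 hours between tests)
df_small <- tibble(
  PatientKey = c(1L, 1L, 1L, 5L, 5L, 5L),
  Creatinine = c(70L, 90L, 98L, 50L, 70L, 80L),
  Timestamp  = as_datetime(c(
    "2020-04-03 14:10:10", "2020-04-03 17:11:10", "2020-04-03 19:10:10",
    "2020-01-12 06:00:05", "2020-01-13 08:00:05", "2020-01-14 22:00:05"
  ))
)

# Self-join every test against every other test of the same patient,
# keep pairs with a >= 26.5 increase within 48 hours, then take the
# earliest qualifying pair per patient
aki_pairs <- df_small %>%
  inner_join(df_small, by = "PatientKey", suffix = c("LOW", "HIGH")) %>%
  filter(
    TimestampHIGH > TimestampLOW,
    as.numeric(TimestampHIGH - TimestampLOW, units = "hours") <= 48,
    CreatinineHIGH - CreatinineLOW >= 26.5
  ) %>%
  group_by(PatientKey) %>%
  arrange(TimestampHIGH, desc(TimestampLOW)) %>%
  slice(1) %>%
  ungroup()
```

On this subset it keeps the patient 1 row (70 -> 98) and correctly drops patient 5, whose increase of 30 spans more than 48 hours. The join is quadratic in the number of tests per patient, which is usually harmless at lab-data volumes; for very large tables, a non-equi join (e.g. in data.table) would avoid materialising all pairs.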