How to reshape complex data to wide format when the input is a mix of long and wide data
I am working with fairly complex data. Here is a simplified snapshot of it as a data frame df.
ID Measures ME1 ME2 X1 X2
53-21 comm - 01 narrate 2 1 NA NA
53-21 comm - overall 1 NA NA NA
53-21 comm - 10 participate NA NA NA NA
43-65 comm - 02 project 2 3 NA NA
43-65 comm - 01 narrate 1 1 NA NA
67-21 comm - 06 action 2 1 NA NA
67-21 comm - 08 plan 1 1 NA 1
43-65 comm - overall 2 NA NA NA
53-21 comm - exhibit 1 1 NA NA
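Here:
ID = unique user ID
For each Measure, a user can be scored on up to four different items: ME1, ME2, X1 and X2.
I want to convert this data to a format with one row per ID and the corresponding measure/item scores in additional columns. The reshaped data frame I need looks like this: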
ID comm-01-narrate-ME1 comm-01-narrate-ME2 comm-01-narrate-X1 comm-01-narrate-X2 comm-overall-ME1 comm-overall-ME2 comm-overall-X1 comm-overall-X2 comm-10-participate-ME1 comm-10-participate-ME2 comm-10-participate-X1 comm-10-participate-X2 comm-exhibit-ME1 comm-exhibit-ME2 comm-exhibit-X1 comm-exhibit-X2 comm-02-project-ME1 comm-02-project-ME2 comm-02-project-X1 comm-02-project-X2 comm-06-action-ME1 comm-06-action-ME2 comm-06-action-X1 comm-06-action-X2 comm-08-plan-ME1 comm-08-plan-ME2 comm-08-plan-X1 comm-08-plan-X2
53-21 2 1 NA NA 1 NA NA NA NA NA NA NA 1 1 NA NA NA NA NA NA NA NA NA NA NA NA NA NA
43-65 1 1 NA NA 2 NA NA NA NA NA NA NA NA NA NA NA 2 3 NA NA NA NA NA NA NA NA NA NA
67-21 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 2 1 NA NA 1 1 NA 1
The dput() of the input df is:
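dput(df)
structure(list(ID = structure(c(2L, 2L, 2L, 1L, 1L, 3L, 3L, 1L, 2L), .Label = c("43-65", "53-21", "67-21"), class = "factor"), Measures = structure(c(1L, 7L, 5L, 2L, 1L, 3L, 4L, 7L, 6L), .Label = c("comm - 01 narrate", "comm - 02 project", "comm - 06 action", "comm - 08 plan", "comm - 10 participate", "comm - exhibit", "comm - overall"), class = "factor"), ME1 = c(2L, 1L, NA, 2L, 1L, 2L, 1L, 2L, 1L), ME2 = c(1L, NA, NA, 3L, 1L, 1L, 1L, NA, 1L), X1 = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_), X2 = c(NA, NA, NA, NA, NA, NA, 1L, NA, NA)), class = "data.frame", row.names = c(NA, -9L))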
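I am struggling even to define the problem, let alone start on it. The data can be seen as long, but because of the multiple item columns it is also wide.
Thank you for taking the time to read this post.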
Edit 1
Following the related post "Convert data from long format to wide format with multiple measure columns", I tried this solution:
library(data.table)
df2 = dcast(setDT(df), ID ~ Measures, value.var = c("ME1", "ME2", "X1", "X2"))
However, I get this warning:
Aggregate function missing, defaulting to 'length'
This means the entries in my data are all changed to 1 or NA. I do not want that to happen.
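dcast falls back to an aggregate function when some ID/Measures pair occurs more than once, so a quick check is to count rows per pair. The following is only a sketch on hypothetical toy data (the data frame toy and its values are made up for illustration, not taken from the real df):

```r
# Toy data (hypothetical): the pair ("a", "m1") appears twice on purpose.
toy <- data.frame(
  ID       = c("a", "a", "a", "b"),
  Measures = c("m1", "m1", "m2", "m1"),
  ME1      = c(2L, 3L, 1L, 1L)
)

# Count rows per ID/Measures pair. A count above 1 means one wide cell
# would have to hold several values, which is when dcast needs to aggregate.
pair_counts <- as.data.frame(table(ID = toy$ID, Measures = toy$Measures))
dup_pairs <- pair_counts[pair_counts$Freq > 1, ]
print(dup_pairs)
```

If dup_pairs comes back empty on the real data, the pairs are unique and the cause of the warning lies elsewhere.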
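Edit 2
When I tested the suggested solutions on my original data, they failed. To explain this better, I am providing a small sample that closely replicates my original data. None of the existing solutions work on it.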
dput(df)
structure(list(
Id = c("39fca07f-d62e-494a-4a86-8dec54836c08","39fca8ee-fe3f-4c85-ab0a-acb3c2db1b9c","39fca8ed-f34c-b7e3-4229-111155aabe35","39fca8e9-1e08-1809-c7a8-d2c8a4bc9b00","39fc6ae5-0de8-4820-eede-343e738e7a4a","39fca8e9-fbf9-a098-cf8c-322810997ce9"),DeliverId = c("39fb74ce-d5e6-69f6-f733-ee5fbc4689e6","39fb74ce-d5e6-69f6-f733-ee5fbc4689e6","39fb74ce-d5e6-69f6-f733-ee5fbc4689e6"),DeliverN = c("1Assess","1Assess","1Assess"),AssessRId = c("39fb74cf-5fb6-4248-6d08-0e36647e190b","39fb74cf-5fb6-4248-6d08-0e36647e190b","39fb74cf-5fb6-4248-6d08-0e36647e190b"),AssessRN = c("P1","P2","P3","P4","P5","P6"),AssesstId = c("1ee2684c99fa","fd2dbea08b43","0e0177a33282","091b8f805553","6e5b9301116d","7a307a90de19"),AssesstN = c("Comm - 09 Narrate","Comm - Prog Level Judge","Comm - O Indi Level Judge","Comm - 02 Int Prj","Comm - 10 Learn Comm Participate","Comm - 05 Exhibit"),S.Time = c("21/05/2020 19:47","23/05/2020 11:06","23/05/2020 11:05","23/05/2020 10:59","11/05/2020 9:58","23/05/2020 11:00"),F.Time = c("24/05/2020 11:02","23/05/2020 11:00","23/05/2020 11:04","23/05/2020 11:03"),Completedindi = c(8L,8L,8L),TotalIndi = c(8L,Progress = c(100L,100L,100L),Build = c("Monice Island","Pink Lasy","",""),Advice = c("Monica","Chandler",TechUserId = c(128L,129L,130L,129L),TechName = c("Barba","Raymond","Raymond"),TechEmail = c("barber@123.com","raymond@123.com","raymond@123.com"),TechLife = c("0 - 2 years","Over 10 years","Over 10 years"),OtherLife = c("0 - 2 years","5 - 10 years","5 - 10 years"),PersonUId = c(470L,455L,455L),PersonDName = c("Tall Tiffany","Sharp Steff","Sharp Steff"),PersonFName = c("Tall","Sharp","Sharp"),PersonLName = c("Tiffany","Steff","Steff"),PersonUID = c("2783-4409","4307-4369","4307-4369"),Gender = c("Female","Female","Female"),PYear = c(2023L,2024L,2024L),Course = c("Undergrad","Grad","Grad"),Special = c("Yes","No","No"),Q1 = c(2L,Q2 = c(1L,Q3 = c(1L,Q4 = c(1L,Q5 = c(1L,Q6 = c(1L,0L,Q7 = c(1L,Q8 = c(2L,Q9 = c(NA,NA),Q10 = c(NA,X = c(NA,X.1 = c(NA,ListDetails = 
c("Missing","Complete","Complete")),-6L))
The desired output is as follows:
Id DeliverId DeliverN AssessRId AssessRN AssesstId S-Time F-Time Completedindi TotalIndi Progress Build Advice TechUserId TechName TechEmail TechLife OtherLife PersonDName PersonFName PersonLName PersonUID Gender PYear Course Special ListDetails PersonUId Q1_Comm - 02 Int Prj Q1_Comm - 05 Exhibit Q1_Comm - 09 Narrate Q1_Comm - 10 Learn Comm Participate Q1_Comm - O Indi Level Judge Q1_Comm - Prog Level Judge Q2_Comm - 02 Int Prj Q2_Comm - 05 Exhibit Q2_Comm - 09 Narrate Q2_Comm - 10 Learn Comm Participate Q2_Comm - O Indi Level Judge Q2_Comm - Prog Level Judge Q3_Comm - 02 Int Prj Q3_Comm - 05 Exhibit Q3_Comm - 09 Narrate Q3_Comm - 10 Learn Comm Participate Q3_Comm - O Indi Level Judge Q3_Comm - Prog Level Judge Q4_Comm - 02 Int Prj Q4_Comm - 05 Exhibit Q4_Comm - 09 Narrate Q4_Comm - 10 Learn Comm Participate Q4_Comm - O Indi Level Judge Q4_Comm - Prog Level Judge Q5_Comm - 02 Int Prj Q5_Comm - 05 Exhibit Q5_Comm - 09 Narrate Q5_Comm - 10 Learn Comm Participate Q5_Comm - O Indi Level Judge Q5_Comm - Prog Level Judge Q6_Comm - 02 Int Prj Q6_Comm - 05 Exhibit Q6_Comm - 09 Narrate Q6_Comm - 10 Learn Comm Participate Q6_Comm - O Indi Level Judge Q6_Comm - Prog Level Judge Q7_Comm - 02 Int Prj Q7_Comm - 05 Exhibit Q7_Comm - 09 Narrate Q7_Comm - 10 Learn Comm Participate Q7_Comm - O Indi Level Judge Q7_Comm - Prog Level Judge Q8_Comm - 02 Int Prj Q8_Comm - 05 Exhibit Q8_Comm - 09 Narrate Q8_Comm - 10 Learn Comm Participate Q8_Comm - O Indi Level Judge Q8_Comm - Prog Level Judge Q9_Comm - 02 Int Prj Q9_Comm - 05 Exhibit Q9_Comm - 09 Narrate Q9_Comm - 10 Learn Comm Participate Q9_Comm - O Indi Level Judge Q9_Comm - Prog Level Judge Q10_Comm - 02 Int Prj Q10_Comm - 05 Exhibit Q10_Comm - 09 Narrate Q10_Comm - 10 Learn Comm Participate Q10_Comm - O Indi Level Judge Q10_Comm - Prog Level Judge X_Comm - 02 Int Prj X_Comm - 05 Exhibit X_Comm - 09 Narrate X_Comm - 10 Learn Comm Participate X_Comm - O Indi Level Judge X_Comm - Prog Level Judge X.1_Comm - 02 Int Prj X.1_Comm - 05 
Exhibit X.1_Comm - 09 Narrate X.1_Comm - 10 Learn Comm Participate X.1_Comm - O Indi Level Judge X.1_Comm - Prog Level Judge
39fca07f-d62e-494a-4a86-8dec54836c08 39fb74ce-d5e6-69f6-f733-ee5fbc4689e6 1Assess 39fb74cf-5fb6-4248-6d08-0e36647e190b P1 1ee2684c99fa 21/05/2020 19:47 24/05/2020 11:02 8 8 100 Monice Island Monica 128 Barba barber@123.com 0 - 2 years 0 - 2 years Tall Tiffany Tall Tiffany 2783-4409 Female 2023 Undergrad Yes Missing 470 NA NA 2 NA NA NA NA NA 1 NA NA NA NA NA 1 NA NA NA NA NA 1 NA NA NA NA NA 1 NA NA NA NA NA 1 NA NA NA NA NA 1 NA NA NA NA NA 2 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
39fca8ee-fe3f-4c85-ab0a-acb3c2db1b9c 39fb74ce-d5e6-69f6-f733-ee5fbc4689e6 1Assess 39fb74cf-5fb6-4248-6d08-0e36647e190b P2 fd2dbea08b43 23/05/2020 11:06 23/05/2020 11:06 1 1 100 Pink Lasy Chandler 129 Raymond raymond@123.com Over 10 years 5 - 10 years Sharp Steff Sharp Steff 4307-4369 Female 2024 Grad No Complete 455 3 2 NA 2 3 1 2 2 NA 1 2 NA 3 2 NA 2 3 NA 3 1 NA 2 3 NA 2 2 NA 1 2 NA 1 1 NA 1 0 NA 1 2 NA 2 2 NA 1 2 NA 2 2 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
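Note: I do not think this question is a duplicate, because none of the previous solutions work for my dataset. I request that you reopen my question and make it visible.
I would be grateful for any help or advice on why these solutions do not work on my dataset.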
Solution
We can use pivot_wider, which accepts multiple values_from columns:
library(dplyr)
library(tidyr)
df %>%
  pivot_wider(names_from = Measures, values_from = ME1:X2)
Output:
# A tibble: 3 x 29
ID `ME1_comm - 01 na… `ME1_comm - overa… `ME1_comm - 10 par… `ME1_comm - 02 p… `ME1_comm - 06 a… `ME1_comm - 08 p… `ME1_comm - exhi…
<fct> <int> <int> <int> <int> <int> <int> <int>
1 53-21 2 1 NA NA NA NA 1
2 43-65 1 2 NA 2 NA NA NA
3 67-21 NA NA NA NA 2 1 NA
# … with 21 more variables: ME2_comm - 01 narrate <int>, ME2_comm - overall <int>, ME2_comm - 10 participate <int>,
#   ME2_comm - 02 project <int>, ME2_comm - 06 action <int>, ME2_comm - 08 plan <int>, ME2_comm - exhibit <int>,
#   X1_comm - 01 narrate <int>, X1_comm - overall <int>, X1_comm - 10 participate <int>, X1_comm - 02 project <int>,
#   X1_comm - 06 action <int>, X1_comm - 08 plan <int>, X1_comm - exhibit <int>, X2_comm - 01 narrate <int>, X2_comm - overall <int>,
#   X2_comm - 10 participate <int>, X2_comm - 02 project <int>, X2_comm - 06 action <int>, X2_comm - 08 plan <int>,
#   X2_comm - exhibit <int>
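The column names above follow pivot_wider's default item_measure pattern, whereas the question asks for names such as comm-01-narrate-ME1. One possible way to get closer to that is to clean up Measures first and pass a names_glue specification. This is a sketch, assuming the data frame can be rebuilt from the snapshot table in the question and that a simple gsub is enough to normalise the measure labels:

```r
library(dplyr)
library(tidyr)

# Data rebuilt from the snapshot table in the question (assumption: the
# table matches the real df row for row).
df <- data.frame(
  ID = c("53-21", "53-21", "53-21", "43-65", "43-65",
         "67-21", "67-21", "43-65", "53-21"),
  Measures = c("comm - 01 narrate", "comm - overall", "comm - 10 participate",
               "comm - 02 project", "comm - 01 narrate", "comm - 06 action",
               "comm - 08 plan", "comm - overall", "comm - exhibit"),
  ME1 = c(2L, 1L, NA, 2L, 1L, 2L, 1L, 2L, 1L),
  ME2 = c(1L, NA, NA, 3L, 1L, 1L, 1L, NA, 1L),
  X1  = rep(NA_integer_, 9),
  X2  = c(NA, NA, NA, NA, NA, NA, 1L, NA, NA)
)

wide <- df %>%
  # "comm - 01 narrate" becomes "comm-01-narrate"
  mutate(Measures = gsub(" - | ", "-", Measures)) %>%
  pivot_wider(
    names_from  = Measures,
    values_from = ME1:X2,
    # .value stands for the name of the values_from column (ME1, ME2, ...)
    names_glue  = "{Measures}-{.value}"
  )

names(wide)
```

This keeps the default column order (all ME1 columns first, then ME2, and so on). Recent tidyr releases (1.2.0 and later) also offer a names_vary argument, and names_vary = "slowest" should group the four item columns of each measure together, as in the desired layout.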