Converting a vector with group_by into a matrix with the groups as row names

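I can't seem to reshape one column of a data frame into the right shape while keeping a unique identifier for each row. I have the following data: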

   id     x        y   indicator
1   1 249.6  1.124985        1 
2   1 250.9  1.124756        1 
3   1 252.2  1.124125        1 
4   1 253.5  1.124598        1 
5   1 254.8  1.127745        1 
6   1 256.1  1.129102        1 
7   2 249.6  2.167348        0   
8   2 250.9  2.165804        0   
9   2 252.2  2.164578        0  
10  2 253.5  2.163828        0  
11  2 254.8  2.164260        0   
12  2 256.1  2.166293        0 
13  3 249.6  0.04647765      0
14  3 250.9  0.04932262      0
15  3 252.2  0.05245448      0
16  3 253.5  0.05692405      0
17  3 254.8  0.06184551      0
18  3 256.1  0.06751989      0

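I want to reshape the y vector into a matrix in which each row holds the y values for one id, with additional columns for id and indicator, and with the value columns labelled by the x values. Like this: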

id indicator  249.6      250.9      252.2      ...
1  1          1.124985   1.124756   1.124125   ...
2  0          2.167348   2.165804   2.164578   ...
3  0          0.04647765 0.04932262 0.05245448 ...

I tried using the reshape function like this:

reshape(df[c('id','x','y')],direction = "wide",idvar = "id",timevar = "x")

Here I simply left out the indicator variable to see whether it would work, but the data frame I get back has only two columns: the first is id and the second is named y.c(249.6,250.9,252.2,253.5,254.8,256.1,257.4,258.7,260,261.3,262.6,263.9,265.2,266.5,267.8,269.1,270.4,271.7,273,274.3,275.6,276.9,278.2,279.5,280.8,282.1,283.4,284.7,286,287.3,288.6,289.9,291.2,292.5,293.8,295.1,[etc].

I also tried the xtabs function, a = xtabs(formula = y ~ id + indicator + x, data = df), but that just returned a table that looks very much like the one I started with.

Solution

Reshape the data from long to wide format.

library(dplyr)
library(tidyr)

# df1 is the data frame shown in the question
df1 %>%
  pivot_wider(
    id_cols = c('id', 'indicator'),
    names_from = 'x',
    values_from = 'y'
  )
# A tibble: 3 x 8
#     id indicator `249.6` `250.9` `252.2` `253.5` `254.8` `256.1`
#  <int>     <int>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
#1     1         1  1.12    1.12    1.12    1.12    1.13    1.13  
#2     2         0  2.17    2.17    2.16    2.16    2.16    2.17  
#3     3         0  0.0465  0.0493  0.0525  0.0569  0.0618  0.0675
 

Regarding the code in the question, indicator should be part of idvar. Also, if df has no columns other than the four listed, df[...] can be shortened to df.

reshape(df[c('id','x','y','indicator')],direction = "wide",idvar = c("id","indicator"),timevar = "x")

giving:

   id indicator    y.249.6    y.250.9    y.252.2    y.253.5    y.254.8    y.256.1
1   1         1 1.12498500 1.12475600 1.12412500 1.12459800 1.12774500 1.12910200
7   2         0 2.16734800 2.16580400 2.16457800 2.16382800 2.16426000 2.16629300
13  3         0 0.04647765 0.04932262 0.05245448 0.05692405 0.06184551 0.06751989
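The title asks for a matrix with the groups as row names, while reshape returns a data frame whose value columns carry a "y." prefix. A minimal sketch of the remaining step in base R follows. It rebuilds the question's data with rep() purely so the example is self-contained, and the names wide, value_cols and m are illustrative, not from the original answer.

```r
# Rebuild the question's data (18 rows: 3 ids x 6 x values).
df <- data.frame(
  id = rep(1:3, each = 6),
  x = rep(c(249.6, 250.9, 252.2, 253.5, 254.8, 256.1), times = 3),
  y = c(1.124985, 1.124756, 1.124125, 1.124598, 1.127745, 1.129102,
        2.167348, 2.165804, 2.164578, 2.163828, 2.16426, 2.166293,
        0.04647765, 0.04932262, 0.05245448, 0.05692405, 0.06184551, 0.06751989),
  indicator = rep(c(1L, 0L, 0L), each = 6)
)

# Long to wide, keeping id and indicator as identifying columns.
wide <- reshape(df, direction = "wide",
                idvar = c("id", "indicator"), timevar = "x")

# Drop the "y." prefix so the columns are labelled by the x values only.
names(wide) <- sub("^y\\.", "", names(wide))

# Keep only the value columns and use id as the row names of the matrix.
value_cols <- setdiff(names(wide), c("id", "indicator"))
m <- as.matrix(wide[, value_cols])
rownames(m) <- wide$id
```

Here m is a numeric 3 x 6 matrix, so a single value can be looked up by group and x label, for example m["2", "250.9"]. The indicator column stays available in wide for anything that still needs it.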

Note

The input in reproducible form:

df <- structure(list(id = c(1L,1L,1L,1L,1L,1L,2L,2L,2L,2L,2L,2L,3L,3L,3L,3L,3L,3L),x = c(249.6,250.9,252.2,253.5,254.8,256.1,249.6,250.9,252.2,253.5,254.8,256.1,249.6,250.9,252.2,253.5,254.8,256.1),y = c(1.124985,1.124756,1.124125,1.124598,1.127745,1.129102,2.167348,2.165804,2.164578,2.163828,2.16426,2.166293,0.04647765,0.04932262,0.05245448,0.05692405,0.06184551,0.06751989),indicator = c(1L,1L,1L,1L,1L,1L,0L,0L,0L,0L,0L,0L,0L,0L,0L,0L,0L,0L)),class = "data.frame",row.names = c("1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","17","18"))

So I quickly figured out the answer with tidyr:

z = pivot_wider(df, id_cols = c("id", "indicator"), names_from = "x", values_from = "y")

Rui Barradas's answer is the same, and it works well.
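As for the xtabs attempt in the question: with three variables on the right-hand side, xtabs builds a three-dimensional table, which prints as one slice per x value rather than as a single matrix. A hedged sketch of a two-way variant is below. It assumes each (id, x) pair occurs exactly once, as in the data shown, so the cell sums that xtabs computes equal the original y values. The data is rebuilt with rep() only to keep the example self-contained, and tab is an illustrative name.

```r
# Rebuild the question's data (18 rows: 3 ids x 6 x values).
df <- data.frame(
  id = rep(1:3, each = 6),
  x = rep(c(249.6, 250.9, 252.2, 253.5, 254.8, 256.1), times = 3),
  y = c(1.124985, 1.124756, 1.124125, 1.124598, 1.127745, 1.129102,
        2.167348, 2.165804, 2.164578, 2.163828, 2.16426, 2.166293,
        0.04647765, 0.04932262, 0.05245448, 0.05692405, 0.06184551, 0.06751989),
  indicator = rep(c(1L, 0L, 0L), each = 6)
)

# Two-way table: rows are ids, columns are x values, cells are sums of y.
# With one y per (id, x) pair, each sum is just that y value.
tab <- xtabs(y ~ id + x, data = df)
```

The result has the ids as row names and the x values as column names. Since indicator is constant within each id here, it can be joined back on afterwards if needed. If an (id, x) pair were repeated, the cell would hold the sum of its y values, so this shortcut only suits data with unique pairs.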
