How to "reverse melt" a data.frame?

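I have a data.frame df1 (see the code below). I would like to convert it so that it looks like df2 (see the code below).

Perhaps this can be done with reshape or cast? But I don't understand these functions. Can anyone help?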

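Edit

Someone suggested that I look at the post How to reshape data from long to wide format. Unfortunately, that does not answer my question. The equivalent code, given after the data below, raises the error shown with it.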

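 # Long format; rows reconstructed from the output tables shown in the answers below
 df1 <- data.frame(
   stringsAsFactors = FALSE,
   sample = c("a","a","a","a","b","c","c","c","c","c","c","c","c","d","d","e","e","e","g","g"),
   LETTER = c("P","R","V","Y","Q","Q","R","S","T","U","W","X","Z","Q","X","Q","V","X","Q","T")
 )

 df2 <- data.frame(
   stringsAsFactors = FALSE,
   sample = c("a","b","c","d","e","f","g"),
   P = c(1L,0L,0L,0L,0L,0L,0L), Q = c(0L,1L,1L,1L,1L,0L,1L), R = c(1L,0L,1L,0L,0L,0L,0L), S = c(0L,0L,1L,0L,0L,0L,0L),
   T = c(0L,0L,1L,0L,0L,0L,1L), U = c(0L,0L,1L,0L,0L,0L,0L), V = c(1L,0L,0L,0L,1L,0L,0L), W = c(0L,0L,1L,0L,0L,0L,0L),
   X = c(0L,0L,1L,1L,1L,0L,0L), Y = c(1L,0L,0L,0L,0L,0L,0L), Z = c(0L,0L,1L,0L,0L,0L,0L)
 )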

 df2 <- reshape(df, idvar = "sample", timevar = "LETTER", direction = "wide")
 Error in data[, timevar] : object of type 'closure' is not subsettable

Adding a third variable does not solve the problem either.

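Note that in my data there is no exact match between the length and the width of the data, unlike in the post mentioned above. Any help is welcome.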

Solution

You can use table() to create a frequency table and then convert the result to a data.frame.

x <- table(df1$sample,df1$LETTER)
df2 <- cbind(data.frame(sample = rownames(x)),as.data.frame.matrix(x))

  sample P Q R S T U V W X Y Z
a      a 1 0 1 0 0 0 1 0 0 1 0
b      b 0 1 0 0 0 0 0 0 0 0 0
c      c 0 1 1 1 1 1 0 1 1 0 1
d      d 0 1 0 0 0 0 0 0 1 0 0
e      e 0 1 0 0 0 0 1 0 1 0 0
g      g 0 1 0 0 1 0 0 0 0 0 0
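
One point worth keeping in mind is that table() counts occurrences, so a sample/LETTER pair that appears more than once in the long data would yield a value greater than 1 rather than a 0/1 flag. The sketch below is one way to collapse counts to presence/absence. The data frame `long` and its values are made up for illustration (with a deliberately duplicated pair) and are not the df1 from the question.

```r
# Toy long-format data with one deliberately duplicated pair (a, P).
# These values are illustrative assumptions, not the question's df1.
long <- data.frame(
  sample = c("a", "a", "a", "b"),
  LETTER = c("P", "P", "R", "Q"),
  stringsAsFactors = FALSE
)

counts <- table(long$sample, long$LETTER)  # the a/P cell is counted twice
presence <- (counts > 0) * 1L              # collapse counts to 0/1 flags

wide <- cbind(
  data.frame(sample = rownames(presence), stringsAsFactors = FALSE),
  as.data.frame.matrix(presence)
)
print(wide)
```

If the long data is known to contain no duplicate pairs, as appears to be the case for df1 here, this extra step is unnecessary.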

If you want the output to include sample = f (which does not occur in df1), you can add the missing value as a factor level to df1$sample before calling table():

df1$sample <- factor(df1$sample,levels = letters[1:7])
x <- table(df1$sample,df1$LETTER)
cbind(data.frame(sample = rownames(x)),as.data.frame.matrix(x))

  sample P Q R S T U V W X Y Z
a      a 1 0 1 0 0 0 1 0 0 1 0
b      b 0 1 0 0 0 0 0 0 0 0 0
c      c 0 1 1 1 1 1 0 1 1 0 1
d      d 0 1 0 0 0 0 0 0 1 0 0
e      e 0 1 0 0 0 0 1 0 1 0 0
f      f 0 0 0 0 0 0 0 0 0 0 0
g      g 0 1 0 0 1 0 0 0 0 0 0
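
When the sample names are not a contiguous run of letters, the levels can be built from the observed values plus the ones to force in, instead of hard-coding letters[1:7]. The following is a minimal sketch of that idea. The names `long`, `extra` and `lv` and the data values are hypothetical and chosen only for illustration.

```r
# Illustrative long-format data; "f" never occurs in it.
long <- data.frame(
  sample = c("a", "a", "b", "g"),
  LETTER = c("P", "R", "Q", "Q"),
  stringsAsFactors = FALSE
)

extra <- c("f")                         # hypothetical samples to force into the output
lv <- sort(union(long$sample, extra))   # observed samples plus the extra ones

long$sample <- factor(long$sample, levels = lv)
x <- table(long$sample, long$LETTER)    # unused levels still get a row of zeros

wide <- cbind(
  data.frame(sample = rownames(x), stringsAsFactors = FALSE),
  as.data.frame.matrix(x)
)
print(wide)
```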

You can create a dummy column and get the data in wide format:

library(dplyr)

df1 %>%
  mutate(n = 1) %>%
  tidyr::pivot_wider(names_from = LETTER,values_from = n,values_fill = 0)

#  sample     P     R     V     Y     Q     S     T     U     W     X     Z
#  <chr>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 a          1     1     1     1     0     0     0     0     0     0     0
#2 b          0     0     0     0     1     0     0     0     0     0     0
#3 c          0     1     0     0     1     1     1     1     1     1     1
#4 d          0     0     0     0     1     0     0     0     0     1     0
#5 e          0     0     1     0     1     0     0     0     0     1     0
#6 g          0     0     0     0     1     0     1     0     0     0     0

Or in data.table:

library(data.table)
setDT(df1)[,n := 1]
dcast(df1,sample~LETTER,value.var = 'n',fill = 0)
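
For completeness, the reshape() attempt from the question can also be made to work in base R. The error reported there most likely arises because `df` was passed instead of `df1`: with no data frame of that name defined, `df` resolves to the built-in function stats::df, which is a closure and cannot be subset. The sketch below applies the same dummy-column idea to reshape(). The data frame `long` and its values are illustrative assumptions, not the question's df1.

```r
# Small long-format example (illustrative values, not the question's df1).
long <- data.frame(
  sample = c("a", "a", "b", "c", "c"),
  LETTER = c("P", "R", "Q", "Q", "R"),
  stringsAsFactors = FALSE
)

long$n <- 1L   # dummy value column for reshape() to spread

wide <- reshape(long, idvar = "sample", timevar = "LETTER", direction = "wide")

wide[is.na(wide)] <- 0L                        # absent pairs come back as NA
names(wide) <- sub("^n\\.", "", names(wide))   # n.P -> P, n.R -> R, ...
rownames(wide) <- NULL
print(wide)
```

Unlike table(), this approach gives columns in order of first appearance rather than alphabetical order, and it does not create rows for samples that are absent from the long data.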
