
How do I convert a long-format data frame into a properly formatted list?

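I have the following data frame in long format:

[Image 1: the long-format data frame]

I need to convert it into a list that looks like this:

[Image 2: the desired list]

Each top-level element of the list should be an "Instance No.", and its sub-elements should hold all of the corresponding parameter/value pairs, listed one by one in the form "Parameter X" = "abc", as shown in the second image.

Is there an existing function that does this? I couldn't find anything. Any help would be much appreciated.

Thanks.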

Solution

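library(data.table)
your_dt <- as.data.table(your_df)

# melt() reshapes wide data to long: one row per (instance, variable) pair.
# If your data already has Parameter and Value columns, skip this step and
# paste those two columns together instead.
dt_long <- melt(your_dt, id.vars = "Instance No.")
class(dt_long) # for debugging
# build the "variable=value" strings
dt_long[, strVal := paste(variable, value, sep = "=")]

# collect the strings for each instance into a named list
result_list <- list()

for (i in unique(dt_long[["Instance No."]])) {
  result_list[[as.character(i)]] <- dt_long[`Instance No.` == i, strVal]
}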
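
If the data frame is already in long format, with one Parameter column and one Value column as described in the question, the melt step is probably not needed. A minimal sketch under that assumption follows; the column names and the three sample rows are made up for illustration and are not taken from the original data.

```r
library(data.table)

# Hypothetical long-format input: one row per (instance, parameter) pair.
dt <- data.table(
  `Instance No.` = c(3, 3, 5),
  Parameter      = c("age", "workclass", "age"),
  Value          = c("Senior", "Private", "Middle-aged")
)

# Build the "Parameter=Value" strings in place.
dt[, strVal := paste(Parameter, Value, sep = "=")]

# split() groups the strings by instance and names each element after it.
result_list <- split(dt$strVal, dt[["Instance No."]])
str(result_list)
```

Using split() on the character vector avoids the explicit loop; the list elements come back ordered by instance number rather than by first appearance.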

dplyr solution

library(dplyr)
df_original <- data.frame(
  "Instance No." = c(3, 3, 3, 3, 5, 5, 5, 2, 2, 2, 2),
  "Parameter" = c("age", "workclass", "education", "occupation",
                  "age", "workclass", "education",
                  "age", "workclass", "education", "income"),
  "Value" = c("Senior", "Private", "HS-grad", "Sales",
              "Middle-aged", "Gov", "Hs-grad",
              "Middle-aged", "Private", "Masters", "Large"),
  check.names = FALSE
)

# Param_Value will be the properly formatted vector
df_modified <- mutate(df_original, Param_Value = paste0(Parameter, "=", Value))
# drop the Parameter and Value columns now that the data is contained in Param_Value
df_modified <- select(df_modified, `Instance No.`, Param_Value)

# split() groups rows by a factor; a non-factor grouping variable is coerced to one.
# The result is a list of data frames, one per Instance No.
list_format <- split(df_modified, df_modified$`Instance No.`)

# The Instance No. is still in each data frame. Loop through each and strip the column.
list_simplified <- lapply(list_format, select, -`Instance No.`)

# unlist the remaining Param_Value column and drop the names
list_out <- lapply(list_simplified, unlist, use.names = FALSE)

You should now have a list of vectors formatted as requested.

$`2`
[1] "age=Middle-aged"   "workclass=Private" "education=Masters" "income=Large"     

$`3`
[1] "age=Senior"        "workclass=Private" "education=HS-grad" "occupation=Sales" 

$`5`
[1] "age=Middle-aged"   "workclass=Gov"     "education=Hs-grad"

The data.table solution posted above is faster, but I think this one is easier to understand.


FYI, here is a base R one-liner that does the job. df is your data frame.

l <- lapply(split(df, df[["Instance No."]]), function(x) paste0(x$Parameter, "=", x$Value))
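
To try the base R approach end to end, here is a small self-contained sketch. The four-row data frame is invented for illustration, and the variant shown pastes the pairs first and then splits the resulting vector, which is one possible simplification rather than the answer's exact code.

```r
# Hypothetical long-format input; check.names = FALSE keeps the
# space and dot in the "Instance No." column name.
df <- data.frame(
  "Instance No." = c(3, 3, 5, 2),
  Parameter = c("age", "workclass", "age", "income"),
  Value = c("Senior", "Private", "Middle-aged", "Large"),
  check.names = FALSE
)

# Build one "Parameter=Value" string per row.
pairs <- paste0(df$Parameter, "=", df$Value)

# Group the strings by instance; element names are the instance numbers.
l <- split(pairs, df[["Instance No."]])
str(l)
```

Individual instances can then be looked up by name, for example l[["3"]].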
