Extracting a list element from a data frame column in R

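I have a data frame whose third column is a list of lists. I want to add a column to the existing data frame that contains, for each row, only the value string of the list element whose key is wb_id. Previously I assumed this was always the 14th element of the list. I was wrong: its position seems to move around, but it is always identified by key = wb_id.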

So in the example below, a new column wb_id would be added to df, which has 2 rows:

> df[[3]][[1]][[14]][["value"]]
[1] "test1_secret_ID"

> df[[3]][[2]][[14]][["value"]]
[1] "test2_secret_ID"

Here is the data frame:

df <- structure(list(email = list("test1@example.com","test2@example.com"),type = list("active","active"),fields = list(list(list(
                       key = "name",value = "",type = "TEXT"),list(key = "email",value = "test1@example.com",list(key = "company",list(key = "country",list(key = "city",list(key = "phone",list(
                         key = "state",list(key = "zip",list(key = "last_name",list(key = "notify_pref",value = "new_leader",list(key = "your_message",list(key = "selected",value = "Canadian Tire Bank,Bridgewater Bank,Motive Financial",list(key = "confirmed_email",list(key = "wb_id",value = "test1_secret_ID",type = "TEXT")),list(list(key = "name",value = "test2@example.com",list(
                                                                                                                                                                                                                                                                                         key = "city",list(key = "state",list(
                                                                                                                                                                                                                                                                                         key = "notify_pref",value = "test2_secret_ID",type = "TEXT"))),date_created = list("2020-10-24 01:57:10","2020-10-24 01:57:23")),row.names = 1:2,class = "data.frame")
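In case the dput() output above does not parse as pasted, here is a reduced stand-in with the same shape that can be used to try the approaches. It is only a sketch: the helper make_field and the shortened field lists are assumptions made for illustration, and email and type are plain character columns here rather than list columns. The wb_id entry is deliberately placed at a different position in each row.

```r
# Hypothetical stand-in data: same shape as the question's df, fewer fields.
make_field <- function(key, value) list(key = key, value = value, type = "TEXT")

df <- data.frame(
  email = c("test1@example.com", "test2@example.com"),
  type = c("active", "active")
)

# Third column: one list of key/value records per row.
# wb_id sits at position 3 in row 1 and at position 2 in row 2.
df$fields <- list(
  list(make_field("name", ""),
       make_field("email", "test1@example.com"),
       make_field("wb_id", "test1_secret_ID")),
  list(make_field("name", ""),
       make_field("wb_id", "test2_secret_ID"),
       make_field("email", "test2@example.com"))
)

# List the keys per row to see that the position of wb_id differs.
lapply(df[[3]], function(x) vapply(x, function(y) y$key, character(1)))
```

With this stand-in, any lookup by fixed position returns different fields for the two rows, which is the situation described in the question.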

Solution

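If we need a loop (using the lambda shorthand available from R 4.1.0), loop over the third column with sapply and extract the 'value' component from the 14th element: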

df$new_column <- sapply(df[[3]],\(x) x[[14]]$value)
df$new_column
#[1] "test1_secret_ID" "test2_secret_ID"

If we want to extract by 'key' instead of by position:

sapply(df[[3]],function(x) 
       x[sapply(x,function(y) y$key == 'wb_id')][[1]]$value)
#[1] "test1_secret_ID" "test2_secret_ID"

Or use Filter:

sapply(df[[3]],\(x) Filter(\(y) y$key == "wb_id",x)[[1]]$value)
#[1] "test1_secret_ID" "test2_secret_ID"
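The key-based versions above assume every row has a wb_id entry; if one does not, indexing the empty result with [[1]] raises an error. A minimal sketch of a more defensive variant follows. The helper name get_field_value and the two-row input are hypothetical, chosen only for illustration.

```r
# Hypothetical helper: return the value stored under `key`,
# or NA if no element of `fields` has that key.
get_field_value <- function(fields, key) {
  hit <- Filter(function(f) identical(f$key, key), fields)
  if (length(hit) == 0) NA_character_ else hit[[1]]$value
}

# Small illustrative input: the second row has no wb_id entry.
rows <- list(
  list(list(key = "name", value = "a"),
       list(key = "wb_id", value = "test1_secret_ID")),
  list(list(key = "name", value = "b"))
)

# vapply enforces that each result is a single character string.
ids <- vapply(rows, get_field_value, character(1), key = "wb_id")
ids
```

Applied to the question's data, this would be called as vapply(df[[3]], get_field_value, character(1), key = "wb_id"). Using vapply rather than sapply keeps the result a character vector even when the input has zero rows.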

According to the R NEWS,

R now provides a shorthand notation for creating functions, e.g. \(x) x + 1 is parsed as function(x) x + 1.

Or, for earlier versions of R, use function(x):

df$new_column <- sapply(df[[3]],function(x) x[[14]]$value)

Or use map from purrr:

library(dplyr)
library(purrr)
df <- df %>% 
     mutate(new_column = map_chr(fields,~keep(.x,~ .x$key == 'wb_id') %>% 
          pluck(1,'value')))
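As a possible variation on the purrr approach, detect returns the first matching element directly, and pluck accepts a .default for rows where nothing matches. This is a sketch on a hypothetical two-row input, not part of the original answer.

```r
library(purrr)

# Hypothetical input: wb_id is at a different position in each row.
rows <- list(
  list(list(key = "name", value = "a"),
       list(key = "wb_id", value = "test1_secret_ID")),
  list(list(key = "wb_id", value = "test2_secret_ID"),
       list(key = "name", value = "b"))
)

ids <- map_chr(rows, function(x) {
  # detect() returns the first element whose key is wb_id, or NULL if none.
  hit <- detect(x, function(y) identical(y$key, "wb_id"))
  # pluck() falls back to .default when `hit` is NULL or has no value.
  pluck(hit, "value", .default = NA_character_)
})
ids
```

Inside mutate, the same function body could be passed to map_chr over the fields column.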
