Separating list-col values into singular values relative to a condition

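Brief description

Convert the data from long to wide, filling missing values with 17 for 2019 and 16 for 2010. For landcover values in 2010 that match those in 2019, subtract their pland values (i.e. 2019 - 2010). If there is no value in 2019 and it is filled with 17, give that pland value a negative sign. If instead a missing value in 2010 is filled with 16, keep the pland value as is, positive.

The result should resemble Table 2.

Table 1: Example of the long-format data frame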

# A tibble: 10 x 4
   year  locality_id landcover  pland
   <chr> <chr>           <int>  <dbl>
 1 2010  L452817             8 0.0968
 2 2010  L452817             9 0.0323
 3 2010  L452817            12 0.613 
 4 2010  L452817            13 0.194 
 5 2010  L452817            14 0.0645
 6 2019  L452817             8 0.0645
 7 2019  L452817             9 0.0645
 8 2019  L452817            12 0.516 
 9 2019  L452817            13 0.194 
10 2019  L452817            14 0.161

Table 2: Expected output format

   locality_id X2010 X2019       pland
1      L452817     8     8 -0.03225806
2      L452817     9     9  0.03225807
3      L452817    12    12 -0.09677420
4      L452817    13    13  0.00000000
5      L452817    14    14  0.09677419
6      L910180     0    17 -0.43750000
7      L910180     8    17 -0.34375000
8      L910180     9    17 -0.03125000
9      L910180    10    17 -0.03125000
10     L910180    11    17 -0.09375000
11     L910180    13    17 -0.06250000

What I have tried:

library(dplyr)
library(tidyr)

# copy t into another variable
y <- t
# drop the pland column (column 4) from the copy
y <- y[, -4]

# pivot from long to wide, then add the pland differences from t as another column
y %>%
    group_by(year) %>%
    mutate(row = row_number()) %>%
    tidyr::pivot_wider(names_from = year, values_from = landcover) %>%
    select(-row) %>%
    mutate(across(`2010`:`2019`,
                  ~ if (cur_column() == "2019") replace_na(.x, 17) else replace_na(.x, 16))) %>%
    mutate(t[t$year %in% 2019, ]$pland - t[t$year %in% 2010, ]$pland)

# A tibble: 11 x 4
   locality_id `2010` `2019` `t[t$year %in% 2019,]$pland`
   <chr>        <dbl>  <dbl>                                                       <dbl>
 1 L452817          8      8                                                    -0.0323 
 2 L452817          9      9                                                     0.0323 
 3 L452817         12     12                                                    -0.0968 
 4 L452817         13     13                                                     0      
 5 L452817         14     14                                                     0.0968 
 6 L910180          0     17                                                    -0.373  
 7 L910180          8     17                                                    -0.279  
 8 L910180          9     17                                                     0.485  
 9 L910180         10     17                                                     0.162  
10 L910180         11     17                                                     0.0675 
11 L910180         13     17                                                     0.00202

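The problem with my code above is that it always computes the difference. It should not compute a difference for values that were introduced because of missing data, that is, when there is a 16 or a 17 on either side.

Resources I have tried: One and two.

Reproducible code:

Solution

Instead of using a dummy variable to identify missing values, I used a different approach based on complete(). Here df is your original data structure.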

library(dplyr)
library(tidyr)

df %>%
  # add the missing year/landcover rows (pland = 0) so the difference can be computed in long format
  complete(year, nesting(locality_id, landcover), fill = list(pland = 0)) %>%
  arrange(desc(year)) %>%
  group_by(locality_id, landcover) %>%
  summarize(
    # a pland of 0 marks a row that complete() filled in, so flag it with 16 (2010) or 17 (2019)
    X2010 = if_else(pland[year == 2010] == 0, 16L, first(landcover)),
    X2019 = if_else(pland[year == 2019] == 0, 17L, first(landcover)),
    pland = pland[year == 2019] - pland[year == 2010]
  ) %>%
  arrange(locality_id, landcover)

Here is the output:

   locality_id landcover X2010 X2019   pland
   <chr>           <int> <int> <int>   <dbl>
 1 L452817             8     8     8 -0.0323
 2 L452817             9     9     9  0.0323
 3 L452817            12    12    12 -0.0968
 4 L452817            13    13    13  0     
 5 L452817            14    14    14  0.0968
 6 L910180             0     0    17 -0.438 
 7 L910180             8     8    17 -0.344 
 8 L910180             9     9    17 -0.0312
 9 L910180            10    10    17 -0.0312
10 L910180            11    11    17 -0.0938
11 L910180            13    13    17 -0.0625
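
To make the complete() step above easier to follow, here is a minimal sketch on a made-up data frame. The name toy and all of its values are hypothetical and are not taken from the original data; the sketch only illustrates how complete() with nesting() adds the year/landcover rows that are absent.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: landcover 9 is observed only in 2010,
# landcover 12 only in 2019.
toy <- tibble(
  year        = c("2010", "2010", "2019", "2019"),
  locality_id = "L1",
  landcover   = c(8L, 9L, 8L, 12L),
  pland       = c(0.25, 0.75, 0.5, 0.5)
)

# complete() crosses every year with every observed
# (locality_id, landcover) pair and sets pland to 0 in the rows it adds.
filled <- toy %>%
  complete(year, nesting(locality_id, landcover), fill = list(pland = 0))

print(filled)
```

Because the summarize() step treats pland == 0 as the marker for a filled-in row, a genuinely observed pland of exactly 0 would be flagged as missing too. If that can occur in the real data, a separate indicator column may be a safer choice.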

I managed to figure it out, although better suggestions are welcome, especially ones that run without warnings!

library(dplyr)
library(tidyr)

# copy t into another variable
y <- t
# drop the pland column (column 4) from the copy
y <- y[, -4]

# pivot from long to wide, then add the pland differences from t as another column
y %>%
    group_by(year) %>%
    mutate(row = row_number()) %>%
    tidyr::pivot_wider(names_from = year, values_from = landcover) %>%
    select(-row) %>%
    mutate(across(`2010`:`2019`,
                  ~ if (cur_column() == "2019") replace_na(.x, 17) else replace_na(.x, 16))) %>%
    mutate(ifelse(`2019` == `2010`,
                  t[t$year %in% 2019, ]$pland - t[t$year %in% 2010, ]$pland,
                  -t$pland))

Warning messages:
1: Problem with `mutate()` input `..1`.
i longer object length is not a multiple of shorter object length
i Input `..1` is `ifelse(...)`.
2: In t[t$year %in% 2019,]$pland :
  longer object length is not a multiple of shorter object length

# A tibble: 11 x 4
   locality_id `2010` `2019` `ifelse(...)`
   <chr>        <dbl>  <dbl>         <dbl>
 1 L452817          8      8       -0.0323
 2 L452817          9      9        0.0323
 3 L452817         12     12       -0.0968
 4 L452817         13     13        0     
 5 L452817         14     14        0.0968
 6 L910180          0     17       -0.438 
 7 L910180          8     17       -0.344 
 8 L910180          9     17       -0.0312
 9 L910180         10     17       -0.0312
10 L910180         11     17       -0.0938
11 L910180         13     17       -0.0625
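
The warning text above is what R emits when an arithmetic operation recycles a shorter vector against a longer one whose length is not a multiple of it. A small sketch with hypothetical vectors (x2019 and x2010 are illustrative names, not objects from the original code) shows the same behaviour in isolation:

```r
# Hypothetical vectors of unequal length, standing in for the 2019 and
# 2010 pland subsets.
x2019 <- c(0.5, 0.25)
x2010 <- c(0.25, 0.25, 0.75)

warned <- FALSE
diffs <- withCallingHandlers(
  x2019 - x2010,                      # x2019 is recycled to length 3
  warning = function(w) {
    warned <<- TRUE                   # record that R raised a warning
    invokeRestart("muffleWarning")
  }
)

print(diffs)
print(warned)
```

The result is still computed, but the third element pairs x2010[3] with the recycled x2019[1], so values end up subtracted from rows they do not belong to whenever the two years do not line up one to one.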

Breakdown:

Using the code suggestion from here

This creates an id column relative to the grouping column, restarting for each unique value in group_by().
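
As a rough illustration of that step, the sketch below applies group_by(year) and row_number() to a hypothetical data frame (toy and its values are made up for this example):

```r
library(dplyr)

# Hypothetical toy data: two rows for 2010, three rows for 2019.
toy <- tibble(
  year      = c("2010", "2010", "2019", "2019", "2019"),
  landcover = c(8L, 9L, 8L, 9L, 12L)
)

# row_number() counts rows within each year, so the counter restarts
# at 1 for every group.
with_row <- toy %>%
  group_by(year) %>%
  mutate(row = row_number()) %>%
  ungroup()

print(with_row)
```

Note that this id is purely positional: after pivot_wider(), the first 2010 row sits next to the first 2019 row whatever their landcover values are, so the pairing is only meaningful when both years list the same classes in the same order.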

Then using the next piece of code, from here

This replaces the NAs in 2010 with 16 and the NAs in 2019 with 17.
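
The same per-column replacement can also be written with the data-frame method of tidyr::replace_na(), which takes a named list of fill values. This is an alternative spelling rather than the code used above, and the wide data frame below is hypothetical:

```r
library(dplyr)
library(tidyr)

# Hypothetical wide data with a gap in each year column.
wide <- tibble(
  locality_id = c("L1", "L1", "L2"),
  `2010`      = c(8, NA, 9),
  `2019`      = c(8, 12, NA)
)

# Each list entry names a column and the value that replaces its NAs.
filled_wide <- wide %>%
  replace_na(list(`2010` = 16, `2019` = 17))

print(filled_wide)
```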

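Finally, the ifelse() statement. I was hung up on a thread, thinking it would work, and it did!

It selects the landcover values that are equal in 2010 and 2019, then computes their difference by subtracting the pland values. Finally, the values that do not match are filled with the remaining pland value, negated.

However, I have not yet worked out how to handle the values when 16 appears in 2010 so that the pland value stays positive, given that it is currently always set to negative!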
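
One possible way to cover that remaining case without relying on vector recycling is to join the two years on locality_id and landcover, then treat a missing side as 0. The sketch below is only an illustration under that assumption: toy, y2010, y2019 and the pland_2010 / pland_2019 column names are hypothetical, and the data is made up.

```r
library(dplyr)

# Hypothetical toy data: landcover 9 exists only in 2010,
# landcover 12 only in 2019.
toy <- tibble(
  year        = c("2010", "2010", "2019", "2019"),
  locality_id = "L1",
  landcover   = c(8L, 9L, 8L, 12L),
  pland       = c(0.25, 0.75, 0.5, 0.5)
)

y2010 <- toy %>%
  filter(year == "2010") %>%
  select(locality_id, landcover, pland_2010 = pland)
y2019 <- toy %>%
  filter(year == "2019") %>%
  select(locality_id, landcover, pland_2019 = pland)

result <- full_join(y2010, y2019, by = c("locality_id", "landcover")) %>%
  mutate(
    # NA after the join means the class was absent in that year
    X2010 = if_else(is.na(pland_2010), 16L, landcover),
    X2019 = if_else(is.na(pland_2019), 17L, landcover),
    # missing 2019 gives -pland_2010; missing 2010 keeps +pland_2019
    pland = coalesce(pland_2019, 0) - coalesce(pland_2010, 0)
  ) %>%
  select(locality_id, X2010, X2019, pland) %>%
  arrange(locality_id, X2010, X2019)

print(result)
```

Detecting absence with is.na() after the join, instead of testing pland == 0, keeps a genuinely observed pland of 0 from being mistaken for a filled-in row.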
