
How can I create a network if I only have the edge names?

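I am trying to connect authors who are cited in the same process. My nodes are the authors and my edges are the processes, but I can't figure out how to create the edge list.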

What I have now ('Doutrina' is the author, 'Numero' is the process number):

image of data

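I would like something like this (here "N" is the number of times the connection occurs, i.e. how many times the two authors were cited together):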

image of desired output


Example data:

library(dplyr)

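df <- tribble(
  ~Doutrina,          ~Numero,
  "MILARE,2014",      "1009526-53.2015.8.26.0032",
  "SEGUIN,2000",      "0054387-89.2011.8.26.0224",
  "SILVA,2009",       "0054387-89.2011.8.26.0224",
  "MILARE,2015",      "0000351-14.2013.8.26.0326",
  "SILVA,2011",       "0000351-14.2013.8.26.0326",
  "MAXIMILIANO,1961", "0000351-14.2013.8.26.0326",
  "SILVA,2009",       "0000431-26.2013.8.26.0698",
  "SEGUIN,2000",      "0000431-26.2013.8.26.0698",
  "SILVA,2009",       "0054391-29.2011.8.26.0224",
  "SEGUIN,2000",      "0054391-29.2011.8.26.0224",
  "MAXIMILIANO,2015", "0012360-28.2010.8.26.0224",
  "MILARE,2015",      "0012360-28.2010.8.26.0224"
)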

df
#> # A tibble: 12 x 2
#>    Doutrina          Numero                   
#>    <chr>             <chr>                    
#>  1 MILARE,2014      1009526-53.2015.8.26.0032
#>  2 SEGUIN,2000      0054387-89.2011.8.26.0224
#>  3 SILVA,2009       0054387-89.2011.8.26.0224
#>  4 MILARE,2015      0000351-14.2013.8.26.0326
#>  5 SILVA,2011       0000351-14.2013.8.26.0326
#>  6 MAXIMILIANO,1961 0000351-14.2013.8.26.0326
#>  7 SILVA,2009       0000431-26.2013.8.26.0698
#>  8 SEGUIN,2000      0000431-26.2013.8.26.0698
#>  9 SILVA,2009       0054391-29.2011.8.26.0224
#> 10 SEGUIN,2000      0054391-29.2011.8.26.0224
#> 11 MAXIMILIANO,2015 0012360-28.2010.8.26.0224
#> 12 MILARE,2015      0012360-28.2010.8.26.0224

Solution

I modified your example data so that the results would be more interesting.

library(dplyr)

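df <- tribble(
  ~Doutrina,          ~Numero,
  "MILARE,2014",      "1009526-53.2015.8.26.0032",
  "SEGUIN,2000",      "0054387-89.2011.8.26.0224",
  "SILVA,2009",       "0054387-89.2011.8.26.0224",
  "MILARE,2015",      "0000351-14.2013.8.26.0326",
  "SILVA,2011",       "0000351-14.2013.8.26.0326",
  "MAXIMILIANO,1961", "0000351-14.2013.8.26.0326",
  "SILVA,2009",       "0000431-26.2013.8.26.0698",
  "SEGUIN,2000",      "0000431-26.2013.8.26.0698",
  "SILVA,2009",       "0054391-29.2011.8.26.0224",
  "SEGUIN,2000",      "0054391-29.2011.8.26.0224",
  "MAXIMILIANO,2015", "0012360-28.2010.8.26.0224",
  "MILARE,2015",      "0012360-28.2010.8.26.0224"
)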

df %>% 
  mutate(Doutrina = sub(",[0-9]{4}", "", Doutrina)) %>%  # remove the year, keeping only the author name
  full_join(x = ., y = ., by = "Numero") %>%  # join the data to itself by Numero to pair up co-cited authors
  select(Doutrina = Doutrina.x, Doutrina2 = Doutrina.y) %>%  # keep only the two name columns
  filter(Doutrina != Doutrina2) %>%  # remove self-reference rows
  filter(Doutrina < Doutrina2) %>%  # keep only one direction of each edge/link
  group_by(Doutrina, Doutrina2) %>% 
  summarise(N = n(), .groups = "drop")  # count how often each pair is cited together
#> # A tibble: 4 x 3
#>   Doutrina    Doutrina2     N
#>   <chr>       <chr>     <int>
#> 1 MAXIMILIANO MILARE        2
#> 2 MAXIMILIANO SILVA         1
#> 3 MILARE      SILVA         1
#> 4 SEGUIN      SILVA         3
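
Since the goal is a network, the edge list above could then be handed to a graph package. The following is only a minimal sketch, assuming the igraph package is installed (igraph is not part of the original answer). The `edges` data frame simply retypes the four rows of the result above, and the object names `edges` and `g` are made up for illustration.

```r
library(igraph)

# Edge list copied from the result above; N is the co-citation count.
edges <- data.frame(
  Doutrina  = c("MAXIMILIANO", "MAXIMILIANO", "MILARE", "SEGUIN"),
  Doutrina2 = c("MILARE", "SILVA", "SILVA", "SILVA"),
  N         = c(2L, 1L, 1L, 3L)
)

# The first two columns are read as the edge endpoints; any further
# columns (here N) are stored as edge attributes.
g <- graph_from_data_frame(edges, directed = FALSE)

gorder(g)  # number of nodes (authors)
gsize(g)   # number of edges (author pairs)
E(g)$N     # co-citation counts, one per edge

# Total co-citations per author, using N as the edge weight.
strength(g, weights = E(g)$N)
```

The graph is undirected because the answer keeps only one direction of each pair, so "cited together" is treated as a symmetric relation.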
