
Most efficient way to compute Euclidean distances in a large matrix

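I want to find the most memory- and time-efficient way to compute Euclidean distances on a large matrix. Below I compare some packages I know of: parallelDist, geodist, fields and stats. I also considered combining Rcpp and bigmemory with a customized function. Here are the results I found (reprex below), but I would like to know whether there are other efficient packages/solutions for this task: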

Results

benchmrk
#>   package   time        alloc
#>1: pardist  0.298 5.369186e-04
#>2:  fields  1.079 9.486198e-03
#>3:    rcpp 54.422 2.161113e+00
#>4:   stats  0.770 5.788603e+01
#>5: geodist  2.513 1.157635e+02

# plot
ggplot(benchmrk,aes(x=alloc,y=time,color= package,label=package)) +
  geom_label(alpha=.5) +
  coord_trans(x="log10",y="log10") +
  theme(legend.position = "none")
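
To put the alloc column in context, the size of the result itself can be estimated before running anything. The sketch below is an illustration and not part of the original benchmark: `dist_size_gb` is a hypothetical helper that assumes 8-byte doubles and distinguishes a `dist` object as returned by `stats::dist` (lower triangle only) from a full n x n matrix.

```r
# Hypothetical helper (not from the original post): rough size of a
# distance result for n points, assuming 8-byte doubles.
dist_size_gb <- function(n, full = FALSE) {
  # a "dist" object keeps n * (n - 1) / 2 values; a full matrix keeps n^2
  n_values <- if (full) n^2 else n * (n - 1) / 2
  n_values * 8 / 1024^3
}

n <- 10000
dist_size_gb(n)               # lower triangle only
dist_size_gb(n, full = TRUE)  # full square matrix
```

Any method that returns the full matrix has to allocate at least the second figure, regardless of how fast the computation itself is.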

[Figure: time vs. memory allocation (alloc) for each package, both axes on a log10 scale]

Reprex

library(parallelDist)
library(geodist)
library(fields)
library(stats)
library(bigmemory)
library(Rcpp)

library(lineprof)
library(geobr)
library(sf)
library(ggplot2)
library(data.table)


# data input
df <- geobr::read_weighting_area()
gc(reset = TRUE)

# reproject to Web Mercator (EPSG:3857), a projected CRS with coordinates in metres
df <- st_transform(df,crs = 3857)

# get spatial coordinates
coords <- suppressWarnings(st_coordinates( st_centroid(df) ))

# prepare customized rcpp function
sourceCpp("euc_dist.cpp")

bigMatrixEuc <- function(bigMat){
  zeros <- big.matrix(nrow = nrow(bigMat)-1,ncol = nrow(bigMat)-1,init = 0,type = typeof(bigMat))
  BigArmaEuc(bigMat@address,zeros@address)
  return(zeros)
}




### Start tests
perf_fields  <- lineprof(dist_fields <- fields::rdist(coords) )
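# note: geodist is designed for lon/lat input; on projected coordinates its distances are likely not meaningful
perf_geodist <- lineprof(dist_geodist <- geodist::geodist(coords,measure = "cheap") )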
perf_stats   <- lineprof(dist_stats <- stats::dist(coords) )
perf_pardist <- lineprof(dist_pardist <- parallelDist::parDist(coords,method = "euclidean") )
perf_rcpp <- lineprof(dist_rcpp <- bigMatrixEuc( as.big.matrix(coords) ) )

perf_fields$package  <- 'fields'
perf_geodist$package <- 'geodist'
perf_stats$package   <- 'stats'
perf_pardist$package <- 'pardist'
perf_rcpp$package <- 'rcpp'


# gather results
benchmrk <- rbind(perf_fields,perf_geodist,perf_stats,perf_pardist,perf_rcpp)
benchmrk <- setDT(benchmrk)[, .(time = sum(time), alloc = sum(alloc)), by = package][order(alloc)]
benchmrk
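
Before comparing timings, it can help to confirm that the candidate methods agree numerically. The following is a minimal base-R sketch, not part of the original reprex: `euclid_matrix` is a hypothetical helper that builds the full distance matrix from the identity |a - b|^2 = |a|^2 + |b|^2 - 2 a.b and checks it against `stats::dist` on a small random sample instead of the real `coords`.

```r
# Hypothetical helper (not from the original post): full Euclidean distance
# matrix via |a - b|^2 = |a|^2 + |b|^2 - 2 * a.b
euclid_matrix <- function(x) {
  x  <- as.matrix(x)
  sq <- rowSums(x^2)                            # squared norm of each row
  d2 <- outer(sq, sq, "+") - 2 * tcrossprod(x)  # squared pairwise distances
  d2[d2 < 0] <- 0                               # clamp tiny negatives from rounding
  diag(d2) <- 0                                 # distance of a point to itself
  sqrt(d2)
}

set.seed(1)
pts <- matrix(runif(2000), ncol = 2)            # 1000 random 2-D points

d_algebra <- euclid_matrix(pts)
d_stats   <- as.matrix(stats::dist(pts))

# maximum absolute disagreement between the two methods
max(abs(d_algebra - d_stats))
```

This formulation trades some numerical precision for speed, because it subtracts large, nearly equal numbers, and it still allocates the full n x n matrix, so it is a correctness baseline and not a memory-efficient solution.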
