How can I speed up a for loop that contains a Kronecker product?
I want to speed up the following for loop.
% Use random matrices for testing.
% Elapsed time of the following code is around 19 seconds.
% So, when N is large, it's very slow.
n = 800; N = 100; k = 10;
x = rand(n,N); S = rand(k,N); H = 0;
for i = 1:size(x,2)
    X = x(:,i)*x(:,i)';
    DW = diag(S(:,i)) - S(:,i)*S(:,i)';
    H = H + kron(X,DW);
end
My attempt:
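We can rewrite the expression above using kron(x(:,i)*x(:,i)', diag(S(:,i))) and kron(x(:,i)*x(:,i)', S(:,i)*S(:,i)'):
kron(X,DW) = kron(x(:,i)*x(:,i)', diag(S(:,i))) -
             kron(x(:,i)*x(:,i)', S(:,i)*S(:,i)');
(S is nonnegative, so we can take its sqrt, which gives
kron(x(:,i)*x(:,i)', diag(S(:,i))) =
    kron(x(:,i), sqrt(diag(S(:,i)))) *
    kron(x(:,i), sqrt(diag(S(:,i))))';
)
Similarly,
kron(x(:,i)*x(:,i)', S(:,i)*S(:,i)') =
    kron(x(:,i), S(:,i)) *
    kron(x(:,i), S(:,i))'
Therefore, we only need to compute kron(x(:,i), S(:,i)) and kron(x(:,i), sqrt(diag(S(:,i)))).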
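The rewriting rests on the mixed-product property of the Kronecker product. As a sketch for a single column, write x for x(:,i), s for S(:,i) and D for diag(S(:,i)); these symbols are shorthand introduced here and do not appear in the code:

```latex
\begin{aligned}
(x x^{\top}) \otimes (D - s s^{\top})
  &= (x x^{\top}) \otimes D \;-\; (x x^{\top}) \otimes (s s^{\top}), \\
(x x^{\top}) \otimes D
  &= (x \otimes D^{1/2})\,(x^{\top} \otimes D^{1/2})
   = (x \otimes D^{1/2})\,(x \otimes D^{1/2})^{\top}, \\
(x x^{\top}) \otimes (s s^{\top})
  &= (x \otimes s)\,(x^{\top} \otimes s^{\top})
   = (x \otimes s)\,(x \otimes s)^{\top}.
\end{aligned}
```

The last two lines use (A ⊗ B)(C ⊗ E) = (AC) ⊗ (BE), and the square root D^{1/2} is real only because the entries of s are nonnegative. Summing over i and stacking the factors column-wise turns the whole loop into two matrix products.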
Here is the code:
x = rand(n,N);
H1 = 0; K_D = zeros(n*k,k*1,N); K_S = zeros(n*k,N);
% K_D records kron(x(:,i), sqrt(diag(S(:,i)))), K_S records kron(x(:,i), S(:,i));
for i = 1:N
    D_half = sqrt(diag(S(:,i)));
    K_D(:,:,i) = kron(x(:,i), D_half);
    % S(:,i)*x(:,i)' flattened column-wise equals kron(x(:,i), S(:,i))
    K_S(:,i) = reshape(S(:,i)*x(:,i)', [], 1);
end
K_D = reshape(K_D,n*k,[]);
H = K_D*K_D' - K_S*K_S';
The new code saves a lot of time; the elapsed time is around … seconds. But I still want to speed it up.
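Before timing anything, it may be worth confirming that the factored form really reproduces the loop. The following is a minimal sketch on deliberately small sizes; the names H_ref, H_fac and err_fac are made up for this check and are not part of the original code.

```matlab
n = 6; N = 5; k = 3;              % small sizes, chosen only for a quick check
x = rand(n,N); S = rand(k,N);     % rand gives nonnegative S, as sqrt requires

% Reference: the original loop.
H_ref = zeros(n*k);
for i = 1:N
    X  = x(:,i)*x(:,i)';
    DW = diag(S(:,i)) - S(:,i)*S(:,i)';
    H_ref = H_ref + kron(X,DW);
end

% Factored form: H = K_D*K_D' - K_S*K_S'.
K_D = zeros(n*k,k,N); K_S = zeros(n*k,N);
for i = 1:N
    K_D(:,:,i) = kron(x(:,i), sqrt(diag(S(:,i))));
    K_S(:,i)   = kron(x(:,i), S(:,i));
end
K_D   = reshape(K_D,n*k,[]);
H_fac = K_D*K_D' - K_S*K_S';

% Largest entrywise deviation between the two results.
err_fac = max(abs(H_ref(:) - H_fac(:)));
fprintf('max abs difference: %g\n', err_fac);
```

The two results should agree up to floating-point rounding, since the factored form only regroups the same products.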
Can anyone help me speed up the code above (my attempt)? Or does anyone have a new idea or method for speeding up my original problem? Thanks a lot!
Solution
The expression H1 = A(:) .* B(:).' produces the same entries as H = kron(A,B); only the order of the elements differs. The sum can therefore be computed without kron:
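% Column i of X is x(:,i)*x(:,i)' flattened (n^2-by-N).
X = reshape(reshape(x,n,1,N) .* reshape(x,1,n,N),[],N);
% Column i of S1 is -S(:,i)*S(:,i)' flattened (k^2-by-N).
S1 = -reshape(reshape(S,k,1,N) .* reshape(S,1,k,N),[],N);
% Rows 1:k+1:end are the diagonal entries; adding S gives diag(S(:,i)) - S(:,i)*S(:,i)'.
S1(1:k+1:end,:) = reshape(S1(1:k+1:end,:),size(S)) + S;
S1 = reshape(S1,[],N);
% Sum over i of all pairwise products, n^2-by-k^2.
H1 = X * S1.';
% Reorder from (a,b,c,d) to the (c,a,d,b) layout that kron produces.
H = reshape(permute(reshape(H1,n,n,k,k),[3 1 4 2]),n*k,n*k);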
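If you are only interested in the result of the summation, regardless of the order of the elements, computing H1 is sufficient, and it gives a speedup of more than 3X compared with your second approach. You can, however, use reshape and permute to restore the order of the elements, which results in a speedup of nearly 0.3X.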
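The relation between A(:) .* B(:).' and kron(A,B) can be checked on two small matrices. This is a sketch under assumed sizes p, q, r, s (illustrative names only), and it needs implicit expansion, available in MATLAB R2016b and later. The permutation [3 1 4 2] is one index order that matches kron's layout, derived from kron(A,B)((a-1)*r+c, (b-1)*s+d) = A(a,b)*B(c,d).

```matlab
p = 2; q = 3; r = 4; s = 5;       % arbitrary small sizes
A = rand(p,q); B = rand(r,s);

H  = kron(A,B);                   % (p*r)-by-(q*s)
H1 = A(:) .* B(:).';              % (p*q)-by-(r*s): same products, other order

% Viewed as a 4-D array, H1(a,b,c,d) = A(a,b)*B(c,d), while kron stores
% that product at position (c,a,d,b) of a [r,p,s,q] array.
H2 = reshape(permute(reshape(H1,p,q,r,s),[3 1 4 2]),p*r,q*s);

% Same set of values before reordering, same matrix after reordering.
err_sorted  = max(abs(sort(H(:)) - sort(H1(:))));
err_ordered = max(abs(H(:) - H2(:)));
fprintf('sorted: %g, reordered: %g\n', err_sorted, err_ordered);
```

Both differences should be zero up to rounding. The first confirms that only the element order changes; the second confirms that a single 4-D permute is enough to recover it.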
If H is computed inside a loop and its size stays the same across iterations, you can use a precomputed index to rearrange the elements:
% Computed once: idx maps each position of H to the matching entry of H1.
idx = reshape(permute(reshape(1:(n*k)^2,n,n,k,k),[3 1 4 2]),n*k,n*k);
for ....
    X = ...
    ....
    H = H1(idx);
end
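An end-to-end check of the index-based variant against the original loop might look like the sketch below. The sizes are small on purpose, and H_ref and err_idx are names introduced only for this comparison. It assumes the index map is built from a 4-D permute with order [3 1 4 2].

```matlab
n = 6; N = 5; k = 3;              % small sizes for a quick check
x = rand(n,N); S = rand(k,N);

% Reference result from the original loop.
H_ref = zeros(n*k);
for i = 1:N
    H_ref = H_ref + kron(x(:,i)*x(:,i)', diag(S(:,i)) - S(:,i)*S(:,i)');
end

% Index map, computed once: position (c,a,d,b) of H reads H1(a,b,c,d).
idx = reshape(permute(reshape(1:(n*k)^2,n,n,k,k),[3 1 4 2]),n*k,n*k);

% Vectorized sum without kron.
X  = reshape(reshape(x,n,1,N) .* reshape(x,1,n,N),[],N);
S1 = -reshape(reshape(S,k,1,N) .* reshape(S,1,k,N),[],N);
S1(1:k+1:end,:) = S1(1:k+1:end,:) + S;   % add diag(S(:,i)) on the diagonal
H1 = X * S1.';
H  = H1(idx);                            % reorder by plain indexing

err_idx = max(abs(H(:) - H_ref(:)));
fprintf('max abs difference: %g\n', err_idx);
```

Because idx depends only on n and k, it can be built once outside the loop, so each iteration pays only for a single indexing operation.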