Performance comparison: function pointers vs. function objects vs. branching code. Why do function objects perform worst?
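I was hoping to improve performance by passing a function pointer or a function object to a function call inside a nested loop, so as to avoid branching inside the loop. Below are three versions of the code: one with a function object, one with a function pointer, and one with branching. For every compiler optimization option and every problem size, the function pointer and function object versions perform worst. This surprises me: why does the overhead caused by a function pointer or object scale with the problem size? Second question: why does the function object perform worse than the function pointer?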
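Update
Finally, I have also added a lambda expression version of the same code. Brute force wins again: across problem sizes, with or without optimization, the lambda version takes more than twice as long as the corresponding brute-force code.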
The code follows. Run it as ./a.out [SIZE] [function choice]
Function object:
#include <chrono>
#include <cstdlib> // atoi
#include <iostream>
// Abstract interface: every call is dispatched through the virtual table.
class Interpolator
{
public:
    Interpolator() {}
    virtual double operator()(double left, double right) = 0;
};
class FirstOrder : public Interpolator
{
public:
    FirstOrder() {}
    virtual double operator()(double left, double right) { return 2.0 * left * left * left + 3.0 * right; }
};
class SecondOrder : public Interpolator
{
public:
    SecondOrder() {}
    virtual double operator()(double left, double right) { return 2.0 * left * left + 3.0 * right * right; }
};
double kernel(double left, double right, Interpolator *int_func) { return (*int_func)(left, right); }
int main(int argc, char *argv[])
{
    double *a;
    int SIZE = atoi(argv[1]); // array size
    int it = atoi(argv[2]);   // function choice: 1 or 2
    //initialize
    a = new double[SIZE];
    for (int i = 0; i < SIZE; i++)
        a[i] = (double)i;
    std::cout << "Initialized" << std::endl;
    Interpolator *first;
    switch (it)
    {
    case 1:
        first = new FirstOrder();
        break;
    case 2:
        first = new SecondOrder();
        break;
    default: // avoid calling through an uninitialized pointer
        std::cerr << "function choice must be 1 or 2" << std::endl;
        return 1;
    }
    std::cout << "function" << std::endl;
    auto start = std::chrono::high_resolution_clock::now();
    //loop
    double g;
    for (int i = 0; i < SIZE; i++)
    {
        g = 0.0;
        for (int j = 0; j < SIZE; j++)
        {
            g += kernel(a[i], a[j], first);
        }
        a[i] += g;
    }
    auto stop = std::chrono::high_resolution_clock::now();
    auto duration = std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
    // the duration was cast to microseconds, so the unit is us rather than ms
    std::cout << "Finalized in " << duration.count() << " us" << std::endl;
    return 0;
}
Function pointer:
#include <chrono>
#include <cstdlib> // atoi
#include <iostream>
double firstOrder(double left, double right) { return 2.0 * left * left * left + 3.0 * right; }
double secondOrder(double left, double right) { return 2.0 * left * left + 3.0 * right * right; }
// Calls the interpolation function through the pointer f.
double kernel(double left, double right, double (*f)(double, double))
{
    return (*f)(left, right);
}
int main(int argc, char *argv[])
{
    double *a;
    int SIZE = atoi(argv[1]); // array size
    int it = atoi(argv[2]);   // function choice: 1 or 2
    a = new double[SIZE];
    for (int i = 0; i < SIZE; i++)
        a[i] = (double)i; // initialization
    std::cout << "Initialized" << std::endl;
    //Func func(it);
    double (*func)(double, double);
    switch (it)
    {
    case 1:
        func = &firstOrder;
        break;
    case 2:
        func = &secondOrder;
        break;
    default: // avoid calling through an uninitialized pointer
        std::cerr << "function choice must be 1 or 2" << std::endl;
        return 1;
    }
    std::cout << "function" << std::endl;
    auto start = std::chrono::high_resolution_clock::now();
    //loop
    double g;
    for (int i = 0; i < SIZE; i++)
    {
        g = 0.0;
        for (int j = 0; j < SIZE; j++)
        {
            g += kernel(a[i], a[j], func);
        }
        a[i] += g;
    }
    auto stop = std::chrono::high_resolution_clock::now();
    auto duration = std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
    // the duration was cast to microseconds, so the unit is us rather than ms
    std::cout << "Finalized in " << duration.count() << " us" << std::endl;
    return 0;
}
Branching:
#include <chrono>
#include <cstdlib> // atoi
#include <iostream>
// Both functions are called below; the bodies match the function pointer version.
double firstOrder(double left, double right) { return 2.0 * left * left * left + 3.0 * right; }
double secondOrder(double left, double right) { return 2.0 * left * left + 3.0 * right * right; }
int main(int argc, char *argv[])
{
    double *a;
    int SIZE = atoi(argv[1]); // array size
    int it = atoi(argv[2]);   // function choice
    //initialize
    a = new double[SIZE];
    double g;
    for (int i = 0; i < SIZE; i++)
        a[i] = (double)i; // initialization
    std::cout << "Initialized" << std::endl;
    auto start = std::chrono::high_resolution_clock::now();
    //loop
    for (int i = 0; i < SIZE; i++)
    {
        g = 0.0;
        for (int j = 0; j < SIZE; j++)
        {
            // the branch is evaluated inside the inner loop
            if (it == 1)
            {
                g += firstOrder(a[i], a[j]);
            }
            else if (it == 2)
            {
                g += secondOrder(a[i], a[j]);
            }
        }
        a[i] += g;
    }
    auto stop = std::chrono::high_resolution_clock::now();
    auto duration = std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
    // the duration was cast to microseconds, so the unit is us rather than ms
    std::cout << "Finalized in " << duration.count() << " us" << std::endl;
    return 0;
}
Lambda expression:
#include <chrono>
#include <cstdlib> // atoi
#include <functional>
#include <iostream>
// Returns a type-erased callable; the branch on kind is still taken on every call.
std::function<double(double, double)> makeLambda(int kind)
{
    return [kind](double left, double right) {
        if (kind == 0)
            return 2.0 * left * left * left + 3.0 * right;
        else // kind == 1; a plain else ensures every path returns a value
            return 2.0 * left * left + 3.0 * right * right;
    };
}
int main(int argc, char *argv[])
{
    double *a;
    int SIZE = atoi(argv[1]); // array size
    int it = atoi(argv[2]);   // function choice: 1 or 2
    //initialize
    a = new double[SIZE];
    for (int i = 0; i < SIZE; i++)
        a[i] = (double)i;
    std::cout << "Initialized" << std::endl;
    std::function<double(double, double)> interp;
    switch (it)
    {
    case 1:
        interp = makeLambda(0);
        break;
    case 2:
        interp = makeLambda(1);
        break;
    default: // avoid calling an empty std::function
        std::cerr << "function choice must be 1 or 2" << std::endl;
        return 1;
    }
    std::cout << "function" << std::endl;
    auto start = std::chrono::high_resolution_clock::now();
    //loop
    double g;
    for (int i = 0; i < SIZE; i++)
    {
        g = 0.0;
        for (int j = 0; j < SIZE; j++)
        {
            g += interp(a[i], a[j]);
        }
        a[i] += g;
    }
    auto stop = std::chrono::high_resolution_clock::now();
    auto duration = std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
    // the duration was cast to microseconds, so the unit is us rather than ms
    std::cout << "Finalized in " << duration.count() << " us" << std::endl;
    return 0;
}
Solution
Although this question is about performance, I have a few suggestions for improving the code:
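Branching can be more error-prone
Imagine you want to add another interpolation function. You would need to define a new function and add a new case (for the switch) or a new if/else. One solution is to create a vector of lambdas: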
std::vector<std::function<double(double,double)>> Interpolate {
[](double left,double right) {return 2.0*left*left*left + 3.0*right;},//first order
[](double left,double right) {return 2.0*left*left + 3.0*right*right;} //second order
};
Or:
double firstOrder(double left,double right) {return 2.0*left*left*left + 3.0*right;}
double secondOrder(double left,double right) {return 2.0*left*left + 3.0*right*right;}
std::array<double(*)(double,double),2> Interpolate {firstOrder,secondOrder};
With this change, no if or switch statement is needed. You simply write:
g += Interpolate[it-1](a[i],a[j]);
instead of
if (it == 1)
g += firstOrder(a[i],a[j]);
else if (it == 2)
g += secondOrder(a[i],a[j]);
This requires less maintenance, and there is less chance of forgetting an if/else branch.
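The table lookup above can be sketched as a small self-contained unit. This is only an illustration under stated assumptions, not the original poster's code: the helper name `interpolate` and the bounds check through `std::array::at` are hypothetical additions, and `choice` is 1-based like `it` in the question.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

double firstOrder(double left, double right) { return 2.0 * left * left * left + 3.0 * right; }
double secondOrder(double left, double right) { return 2.0 * left * left + 3.0 * right * right; }

// Lookup table of interpolation functions; supporting a new function means adding one entry.
const std::array<double (*)(double, double), 2> Interpolate{firstOrder, secondOrder};

// Hypothetical helper (not in the original answer): maps the 1-based choice to a table entry.
// at() throws std::out_of_range for an invalid choice instead of reading past the table.
double interpolate(int choice, double left, double right)
{
    return Interpolate.at(static_cast<std::size_t>(choice - 1))(left, right);
}
```

In a benchmark loop it would probably be preferable to look the entry up once before the loop and call the stored pointer inside it, so the bounds check is not repeated on every iteration.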
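Prefer avoiding naked new
It is generally recommended to write std::vector<double> a(SIZE); rather than double *a = new double[SIZE];. That way we do not have to release any resource ourselves, and we avoid a potential memory leak in the code.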
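As a minimal sketch of that suggestion, the input array from the question could be built as follows. The helper name `makeInput` is hypothetical, and using `std::iota` for the fill is an assumption; a plain loop works just as well.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: builds the same 0, 1, 2, ... input as the question,
// but owned by a std::vector, so the memory is released automatically.
std::vector<double> makeInput(std::size_t size)
{
    std::vector<double> a(size);
    std::iota(a.begin(), a.end(), 0.0); // a[i] = i
    return a;
}
```

The loops in the question can keep using `a[i]` and `a[j]` unchanged, since `std::vector::operator[]` does no bounds checking.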
Back to the question: I see no reason why lambdas should lead to better performance. In this case in particular, we cannot benefit from the constexpr nature of lambdas.
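To illustrate the constexpr remark, here is a small sketch assuming a C++17 compiler. The name `chooseInterpolator` is hypothetical. A lambda whose identity is known at compile time can be evaluated by the compiler, whereas a callable chosen from a run-time value and stored in `std::function`, as in the question, cannot.

```cpp
#include <functional>

// Known at compile time: since C++17 a suitable lambda is implicitly constexpr,
// so the call in the static_assert is evaluated by the compiler.
constexpr auto secondOrder = [](double left, double right) {
    return 2.0 * left * left + 3.0 * right * right;
};
static_assert(secondOrder(1.0, 2.0) == 14.0, "evaluated at compile time");

// Hypothetical run-time selection, mirroring the question: the target is hidden
// behind std::function, whose call operator is not constexpr.
std::function<double(double, double)> chooseInterpolator(int kind)
{
    if (kind == 0)
        return [](double left, double right) { return 2.0 * left * left * left + 3.0 * right; };
    return secondOrder;
}
```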