
How do I correctly copy an array of C++ objects to the GPU using OpenACC?


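I am trying to copy an array of C++ objects from the host to a GPU device using OpenACC. I searched online but found only very vague documentation on how to do this, so I am wondering whether anyone knows the correct approach. I am using the nvhpc (PGI) compiler with an NVIDIA GPU, so I have access to managed memory. I would like examples that use manual transfers and/or managed memory.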

Suppose I have a simple Assignment class (as in school assignments):

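#include <string>
using namespace std;

class Assignment{
public:
   string name; //name of assignment
   double *grades; // list of grades, one entry per student

   Assignment() {} // default constructor, needed to declare an array of Assignment
   Assignment(string name_,double *grades_) {
      name = name_;
      grades = grades_;
   }
};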


int main(){
   Assignment assignments[2]; // array of 2 assignments

   double grades1[3] = {90.0,95.0,75.0};
   assignments[0] = Assignment("Assignment1",grades1);

   double grades2[3] = {50.0,65.0,55.0};
   assignments[1] = Assignment("Assignment2",grades2);

   //Transfer to GPU below...
   #pragma ...
}

How do I transfer this array of Assignment objects to the GPU using OpenACC manual memory management, and/or using managed memory?

Solution

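Unified Memory (i.e. managed memory) is available for dynamically allocated memory. So if you change the code to allocate the arrays dynamically and compile with the flag "-gpu=managed", the data movement for those arrays will be handled by the driver.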

OpenACC data regions perform a shallow copy. So for aggregate data types with dynamic data members, you need to do a manual deep copy, as in the following:

% cat grades.cpp
#include <iostream>
#include <string>

using namespace std;

class Assignment{
public:
   std::string name; //name of assignment
   double *grades; // list of grades where n is the number of students

   Assignment() {};
   Assignment(string name_,double *grades_) {
      name = name_;
      grades = grades_;
   }
};


int main(){
   Assignment assignments[2]; // array of 2 assignments

   double grades1[3] = {90.0,95.0,75.0};
   assignments[0] = Assignment("Assignment1",grades1);

   double grades2[3] = {50.0,65.0,55.0};
   assignments[1] = Assignment("Assignment2",grades2);

   //Transfer to GPU below...
   #pragma acc enter data copyin(assignments[:2])
   for (int i=0; i < 2; ++i) {
        #pragma acc enter data copyin(assignments[i].grades[:3])
   }
   // changes the grades on the device
   #pragma acc parallel loop present(assignments)
   for (int i=0; i < 2; ++i) {
      for (int j=0; j < 3; ++j) {
         assignments[i].grades[j] += 5.0;
      }
   }

   for (int i=0; i < 2; ++i) {
        #pragma acc update self(assignments[i].grades[:3])
        std::cout << i << " : " << assignments[i].grades[0]
                  << "," <<  assignments[i].grades[1]
                  << "," <<  assignments[i].grades[2]
                  << std::endl;
   }


   for (int i=0; i < 2; ++i) {
        #pragma acc exit data delete(assignments[i].grades)
   }
   #pragma acc exit data delete(assignments)

}
% nvc++ grades.cpp -w -Minfo=accel -acc; a.out
main:
     26,Generating enter data copyin(assignments[:])
     32,Generating enter data copyin(assignments.grades[:3])
         Generating present(assignments[:])
         Generating Tesla code
         35,#pragma acc loop gang /* blockIdx.x */
         36,#pragma acc loop seq
     36,Loop is parallelizable
     43,Generating update self(assignments.grades[:3])
     52,Generating exit data delete(assignments.grades)
     55,Generating exit data delete(assignments[:])
0 : 95,100,80
1 : 55,70,60

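Personally, I prefer to encapsulate the data management in the class itself. It is a bit tricky here because you have an array of classes where each element is a copy of an anonymous object. It may or may not be useful, but you can take a look at "accList", a simple vector-like class that I wrote as an example for my chapter on data management in the book Parallel Programming with OpenACC. See: https://github.com/rmfarber/ParallelProgrammingWithOpenACC/tree/master/Chapter05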
