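How do I correctly copy an array of C++ objects to the GPU using OpenACC?
I am trying to copy an array of C++ objects from the host to a GPU device using OpenACC. I have searched online but found only very vague documentation on how to do this, so I am wondering whether anyone knows the correct approach. I am using the NVHPC (PGI) compiler with an NVIDIA GPU, so I have access to managed memory. I would like an example that uses manual transfers and/or managed memory.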
class Assignment{
public:
    string name; //name of assignment
    double *grades; // list of grades where n is the number of students
    Assignment(string name_,double *grades_) {
        name = name_;
        grades = grades_;
    }
};
int main(){
    Assignment assignments[2]; // array of 2 assignments
    double grades1[3] = {90.0,95.0,75.0};
    assignments[0] = Assignment("Assignment1",grades1);
    double grades2[3] = {50.0,65.0,55.0};
    assignments[1] = Assignment("Assignment2",grades2);
    //Transfer to GPU below...
    #pragma ...
}
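How can I transfer this array of Assignment objects to the GPU using OpenACC manual memory management, and/or using managed memory?
Solution
Unified memory (i.e. managed memory) is available for dynamically allocated memory. So if you change the code to allocate the arrays dynamically and compile with the flag "-gpu=managed", the data movement for those arrays will be handled by the driver.
OpenACC data regions perform a shallow copy. So for aggregate data types with dynamic data members, you need to perform a manual deep copy, like the following: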
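To make the managed-memory route concrete, here is a minimal sketch of what the question's code might look like with heap allocation. The helpers `curve_grades` and `run_managed_example` are hypothetical names introduced for illustration. The sketch assumes it is built with something like `nvc++ -acc -gpu=managed`, and I have not run it on a GPU. Without OpenACC enabled the pragma is ignored and the loop simply runs on the host.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Same layout as the class in the question.
class Assignment {
public:
    std::string name;  // name of assignment (only used on the host here)
    double *grades;    // one grade per student
    Assignment() : grades(nullptr) {}
    Assignment(const std::string &name_, double *grades_)
        : name(name_), grades(grades_) {}
};

// Hypothetical helper: adds `delta` to every grade.
// There are no data clauses: the sketch assumes -gpu=managed, so the
// heap allocations are expected to migrate on demand.
void curve_grades(Assignment *assignments, int n_assign, int n_students,
                  double delta) {
    assert(assignments != nullptr);
    #pragma acc parallel loop
    for (int i = 0; i < n_assign; ++i) {
        for (int j = 0; j < n_students; ++j) {
            assignments[i].grades[j] += delta;
        }
    }
}

// Hypothetical driver: builds the question's data on the heap, curves it,
// and returns the six grades in row-major order.
std::vector<double> run_managed_example() {
    const int n_assign = 2;
    const int n_students = 3;
    const double init[2][3] = {{90.0, 95.0, 75.0}, {50.0, 65.0, 55.0}};
    const char *names[2] = {"Assignment1", "Assignment2"};

    // Heap allocation is the point: stack arrays are not managed memory.
    Assignment *assignments = new Assignment[n_assign];
    for (int i = 0; i < n_assign; ++i) {
        double *g = new double[n_students];
        for (int j = 0; j < n_students; ++j) g[j] = init[i][j];
        assignments[i] = Assignment(names[i], g);
    }

    curve_grades(assignments, n_assign, n_students, 5.0);

    std::vector<double> result;
    for (int i = 0; i < n_assign; ++i)
        for (int j = 0; j < n_students; ++j)
            result.push_back(assignments[i].grades[j]);

    for (int i = 0; i < n_assign; ++i) delete[] assignments[i].grades;
    delete[] assignments;
    return result;
}
```

Calling `run_managed_example()` from `main` and printing the returned vector is enough to try it. The key difference from the question's code is that both the object array and each `grades` array come from `new` rather than from the stack.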
% cat grades.cpp
#include <iostream>
#include <string>
using namespace std;
class Assignment{
public:
    std::string name; //name of assignment
    double *grades; // list of grades where n is the number of students
    Assignment() {};
    Assignment(string name_,double *grades_) {
        name = name_;
        grades = grades_;
    }
};
int main(){
    Assignment assignments[2]; // array of 2 assignments
    double grades1[3] = {90.0,95.0,75.0};
    assignments[0] = Assignment("Assignment1",grades1);
    double grades2[3] = {50.0,65.0,55.0};
    assignments[1] = Assignment("Assignment2",grades2);
    //Transfer to GPU below...
    #pragma acc enter data copyin(assignments[:2])
    for (int i=0; i < 2; ++i) {
        #pragma acc enter data copyin(assignments[i].grades[:3])
    }
    // changes the grades on the device
    #pragma acc parallel loop present(assignments)
    for (int i=0; i < 2; ++i) {
        for (int j=0; j < 3; ++j) {
            assignments[i].grades[j] += 5.0;
        }
    }
    for (int i=0; i < 2; ++i) {
        #pragma acc update self(assignments[i].grades[:3])
        std::cout << i << " : " << assignments[i].grades[0]
                  << "," << assignments[i].grades[1]
                  << "," << assignments[i].grades[2]
                  << std::endl;
    }
    for (int i=0; i < 2; ++i) {
        #pragma acc exit data delete(assignments[i].grades)
    }
    #pragma acc exit data delete(assignments)
}
% nvc++ grades.cpp -w -Minfo=accel -acc; a.out
main:
26,Generating enter data copyin(assignments[:])
32,Generating enter data copyin(assignments.grades[:3])
Generating present(assignments[:])
Generating Tesla code
35,#pragma acc loop gang /* blockIdx.x */
36,#pragma acc loop seq
36,Loop is parallelizable
43,Generating update self(assignments.grades[:3])
52,Generating exit data delete(assignments.grades)
55,Generating exit data delete(assignments[:])
0 : 95,100,80
1 : 55,70,60
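If the sizes are not fixed at 2 and 3, the same deep-copy steps can be wrapped in small functions. The following is only a sketch: `deep_copyin`, `deep_delete`, `update_host_grades`, and `curve_on_device` are hypothetical helper names, not part of OpenACC. It keeps the ordering used in the program above: the parent array goes to the device before the `grades` arrays, and the `grades` arrays are deleted before the parent. Without OpenACC enabled the pragmas are ignored and everything runs on the host.

```cpp
#include <cassert>
#include <string>

// Same layout as the class in the answer above.
class Assignment {
public:
    std::string name;  // name of assignment (only used on the host here)
    double *grades;    // one grade per student
    Assignment() : grades(nullptr) {}
    Assignment(const std::string &name_, double *grades_)
        : name(name_), grades(grades_) {}
};

// Hypothetical helper: shallow-copy the parent array first, then copy
// each grades array so it can be attached to its parent element.
void deep_copyin(Assignment *a, int n_assign, int n_students) {
    assert(a != nullptr && n_assign > 0 && n_students > 0);
    #pragma acc enter data copyin(a[0:n_assign])
    for (int i = 0; i < n_assign; ++i) {
        #pragma acc enter data copyin(a[i].grades[0:n_students])
    }
}

// Hypothetical helper: tear down in the reverse order, children first.
void deep_delete(Assignment *a, int n_assign, int n_students) {
    for (int i = 0; i < n_assign; ++i) {
        #pragma acc exit data delete(a[i].grades[0:n_students])
    }
    #pragma acc exit data delete(a[0:n_assign])
}

// Hypothetical helper: bring the grades back to the host.
void update_host_grades(Assignment *a, int n_assign, int n_students) {
    for (int i = 0; i < n_assign; ++i) {
        #pragma acc update self(a[i].grades[0:n_students])
    }
}

// Hypothetical helper: adds `delta` to every grade on the device.
void curve_on_device(Assignment *a, int n_assign, int n_students,
                     double delta) {
    #pragma acc parallel loop present(a[0:n_assign])
    for (int i = 0; i < n_assign; ++i) {
        for (int j = 0; j < n_students; ++j) {
            a[i].grades[j] += delta;
        }
    }
}
```

The intended call sequence is `deep_copyin`, then `curve_on_device`, then `update_host_grades` before reading the values on the host, and finally `deep_delete`.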
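Personally, I prefer to encapsulate the data management in the class itself. That is a bit tricky here, since you have an array of classes where each element is a copy of an anonymous object. It may or may not be useful, but you can take a look at a simple vector-like class, "accList", which I wrote as an example for the data management chapter of the book Parallel Programming with OpenACC. See: https://github.com/rmfarber/ParallelProgrammingWithOpenACC/tree/master/Chapter05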
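To give a rough idea of what "data management in the class itself" can look like, here is a minimal sketch. It is not the accList code from the repository: `GradeList` and its member names are hypothetical, and I have not run this version on a GPU. The constructor creates the device copy and the destructor removes it, so the lifetime of the device data follows the lifetime of the object. Without OpenACC enabled the pragmas are ignored and the class behaves like a plain host array.

```cpp
#include <cassert>

// Hypothetical class: a fixed-size list of grades that manages its own
// device copy.
class GradeList {
public:
    explicit GradeList(int n) : n_(n), data_(nullptr) {
        assert(n > 0);
        data_ = new double[n];
        // Object first, then its array, so the device copy of data_
        // can point at the device array.
        #pragma acc enter data copyin(this)
        #pragma acc enter data create(data_[0:n_])
    }

    ~GradeList() {
        // Reverse order: array first, then the object.
        #pragma acc exit data delete(data_[0:n_])
        #pragma acc exit data delete(this)
        delete[] data_;
    }

    // Copying is disabled in this sketch: a copied object would share
    // the host pointer but would have no device copy of its own.
    GradeList(const GradeList &) = delete;
    GradeList &operator=(const GradeList &) = delete;

    int size() const { return n_; }
    double &operator[](int i) { return data_[i]; }

    // Push host values to the device.
    void update_device() {
        #pragma acc update device(data_[0:n_])
    }

    // Pull device values back to the host.
    void update_host() {
        #pragma acc update self(data_[0:n_])
    }

    // Adds `delta` to every grade; assumes the data is already present.
    void add(double delta) {
        #pragma acc parallel loop default(present)
        for (int i = 0; i < n_; ++i) {
            data_[i] += delta;
        }
    }

private:
    int n_;
    double *data_;
};
```

Typical use would be to fill the list on the host, call `update_device()`, run `add(5.0)`, and then call `update_host()` before reading the values. Disabling the copy operations sidesteps the copy-of-a-temporary situation in the question's `assignments[0] = Assignment(...)` lines, at the cost of having to construct each object in place.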