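How to resolve a work-group size error when using OpenCL with OpenCV on the Google Coral Dev Board
I am trying to use OpenCL acceleration through OpenCV on the Coral Dev Board. Calling cv2.normalize() on a UMat object produces the following error: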
OpenCL error CL_INVALID_WORK_GROUP_SIZE (-54) during call: clEnqueueNDRangeKernel('minmaxloc',dims=1,globalsize=1024x1x1,localsize=1024x1x1) sync=true
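To compare what OpenCV requested against what the device reports, the sizes can be pulled out of the message above. This is only a minimal sketch: `parse_ndrange_error` is a hypothetical helper name, and the regular expression assumes the exact message layout shown in this post.

```python
import re

# Matches the clEnqueueNDRangeKernel(...) part of the OpenCV error text.
_NDRANGE_RE = re.compile(
    r"clEnqueueNDRangeKernel\('(?P<kernel>[^']+)',"
    r"dims=(?P<dims>\d+),"
    r"globalsize=(?P<global>[\dx]+),"
    r"localsize=(?P<local>[\dx]+)\)"
)


def _sizes(text):
    """Turn a string such as '1024x1x1' into the tuple (1024, 1, 1)."""
    return tuple(int(part) for part in text.split("x"))


def parse_ndrange_error(message):
    """Return the kernel name and launch sizes, or None if nothing matches."""
    match = _NDRANGE_RE.search(message)
    if match is None:
        return None
    local_size = _sizes(match.group("local"))
    # Work-items per work-group is the product of the local sizes.
    items = 1
    for extent in local_size:
        items *= extent
    return {
        "kernel": match.group("kernel"),
        "dims": int(match.group("dims")),
        "global_size": _sizes(match.group("global")),
        "local_size": local_size,
        "work_group_items": items,
    }
```

For the message above this gives a work-group of 1024 work-items for the 'minmaxloc' kernel, which equals the "Max work group size 1024" that clinfo reports later in this post. The device-wide limit is therefore not obviously exceeded. A lower limit for this particular kernel is one possible explanation, but that is an unverified assumption.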
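In addition, any task involving UMat objects runs very slowly, and the CPU seems to be working harder than it should, so I am not sure whether any GPU acceleration is taking effect.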
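One way to check whether the OpenCL path is active at all is to ask OpenCV directly. The sketch below uses cv2.ocl.haveOpenCL(), cv2.ocl.useOpenCL() and cv2.ocl.setUseOpenCL(); the helper names `opencl_status` and `set_opencl` and the optional `cv` parameter are hypothetical and exist only for illustration. It has not been run on the Coral Dev Board.

```python
try:
    import cv2
except ImportError:  # keeps the helpers importable where OpenCV is absent
    cv2 = None


def _resolve(cv):
    """Use the module passed in, falling back to the imported cv2."""
    cv = cv if cv is not None else cv2
    if cv is None:
        raise RuntimeError("OpenCV (cv2) is not installed")
    return cv


def opencl_status(cv=None):
    """Report whether OpenCV found an OpenCL runtime and is set to use it."""
    cv = _resolve(cv)
    return {
        "have_opencl": bool(cv.ocl.haveOpenCL()),
        "use_opencl": bool(cv.ocl.useOpenCL()),
    }


def set_opencl(enabled, cv=None):
    """Switch OpenCV's OpenCL usage on or off and return the resulting state."""
    cv = _resolve(cv)
    cv.ocl.setUseOpenCL(bool(enabled))
    return bool(cv.ocl.useOpenCL())
```

On the board, print(opencl_status()) shows the current state. Calling set_opencl(False) and repeating the same cv2.normalize() call on a UMat gives a CPU-only baseline. If the error disappears and the timing is no worse, that would suggest the OpenCL path was not helping in this setup.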
I installed OpenCV 4.5.1 for Python 3.7 via pip (python3 -m pip install opencv-contrib-python), and running cv2.getBuildInformation() gives the following information about OpenCL:
OpenCL: YES (no extra features)
Include path: /tmp/pip-req-build-qmcu8eer/opencv/3rdparty/include/opencl/1.2
Running clinfo gives me this:
Platform Name Vivante OpenCL Platform
Number of devices 1
Device Name Vivante OpenCL Device GC7000L.6214.0000
Device vendor Vivante Corporation
Device vendor ID 0x564956
Device Version OpenCL 1.2
Driver Version OpenCL 1.2 V6.4.2.256507
Device OpenCL C Version OpenCL C 1.2
Device Type GPU
Device Profile FULL_PROFILE
Device Available Yes
Compiler Available Yes
Linker Available Yes
Max compute units 1
Max clock frequency 800MHz
Device Partition (core)
Max number of sub-devices 0
Supported partition types (n/a)
Supported affinity domains (n/a)
Max work item dimensions 3
Max work item sizes 1024x1024x1024
Max work group size 1024
=== CL_PROGRAM_BUILD_LOG ===
(6:0) : error : Syntax error at 'kernel'
Preferred work group size multiple <getWGsizes:1200: create kernel : error -45>
Preferred / native vector sizes
char 4 / 4
short 4 / 4
int 4 / 4
long 4 / 4
half 0 / 0 (cl_khr_fp16)
float 4 / 4
double 0 / 0 (n/a)
Half-precision Floating-point support <printDeviceInfo:68: get CL_DEVICE_HALF_FP_CONFIG : error -30>
Single-precision Floating-point support (core)
Denormals No
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Double-precision Floating-point support (n/a)
Address bits 32,Little-Endian
Global memory size 268435456 (256MiB)
Error Correction support Yes
Max memory allocation 134217728 (128MiB)
Unified memory for Host and Device Yes
Minimum alignment for any data type 128 bytes
Alignment of base address 2048 bits (256 bytes)
Global Memory cache type Read/Write
Global Memory cache size 8192 (8KiB)
Global Memory cache line size 64 bytes
Image support Yes
Max number of samplers per kernel 16
Max size for 1D images from buffer 65536 pixels
Max 1D or 2D image array size 8192 images
Max 2D image size 8192x8192 pixels
Max 3D image size 8192x8192x8192 pixels
Max number of read image args 128
Max number of write image args 8
Local memory type Global
Local memory size 32768 (32KiB)
Max number of constant args 9
Max constant buffer size 65536 (64KiB)
Max size of kernel argument 1024
Queue properties
Out-of-order execution Yes
Profiling Yes
Prefer user sync for interop Yes
Profiling timer resolution 1000ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
printf() buffer size 1048576 (1024KiB)
Built-in kernels (n/a)
Device Extensions cl_khr_byte_addressable_store cl_khr_gl_sharing cl_khr_fp16 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics
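The clinfo output above contains two numeric error codes of its own (-45 and -30), in addition to the -54 from OpenCV. The sketch below maps those codes to their names from the OpenCL header cl.h and reads the reported maximum work-group size. The helper names are hypothetical, the table covers only the three codes seen in this post, and the parsing assumes the line layout shown above.

```python
import re

# Names from the OpenCL header (cl.h) for the codes that appear in this post.
CL_ERROR_NAMES = {
    -30: "CL_INVALID_VALUE",
    -45: "CL_INVALID_PROGRAM_EXECUTABLE",
    -54: "CL_INVALID_WORK_GROUP_SIZE",
}

# A code is taken to follow the word "error" or an opening parenthesis.
_CODE_RE = re.compile(r"(?:error\s+|\()(-\d+)\b")
_MAX_WG_RE = re.compile(r"^\s*Max work group size\s+(\d+)", re.MULTILINE)


def explain_error_codes(text):
    """Return (code, name) pairs in order of first appearance, without repeats."""
    seen = []
    for raw in _CODE_RE.findall(text):
        code = int(raw)
        pair = (code, CL_ERROR_NAMES.get(code, "unknown"))
        if pair not in seen:
            seen.append(pair)
    return seen


def max_work_group_size(clinfo_text):
    """Return the 'Max work group size' value as an int, or None if absent."""
    match = _MAX_WG_RE.search(clinfo_text)
    return int(match.group(1)) if match else None
```

Applied to the output above, this reports CL_INVALID_PROGRAM_EXECUTABLE for the "create kernel" line and CL_INVALID_VALUE for the half-precision query. Together with the build log's "Syntax error at 'kernel'", this shows that clinfo itself could not build its test kernel on this driver. Whether that is related to the OpenCV failure is not established here.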
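I did not build OpenCL from source or anything like that. Any OpenCL packages that did not ship with the board image were installed via apt while I was preparing to install OpenCV. I am out of my depth here, so any advice is appreciated!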