Wrong information on large arrays processed with OpenCL
  • Time: 2012-05-27 16:46:33
  • Tags: c, opencl

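I have a problem with kernel execution: when I use large arrays (1000x1000), the kernel does not write values to the correct locations, but with small arrays there is no problem and I get back the correct results. For the kernel execution, I use the GPU of an ATI Mobility Radeon HD 4300 Series.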

The C code sample is:

#include <stdio.h>
#include <stdlib.h>

#ifdef __APPLE__
#include <OpenCL/opencl.h>
#else
#include <CL/cl.h>
#endif

#define MAX_SOURCE_SIZE (0x100000)
#define MAX_SIZE 108
#define NCOLS 1000
#define NROWS 10000

int main(void) {
    char* source_name = "mykernel.cl";
    char* source_code;
    size_t source_size;
    cl_platform_id platformId = NULL;
    cl_uint nbplatforms;
    cl_device_id deviceId = NULL;
    cl_uint nbdevices;
    cl_context context = NULL;
    cl_int errcode;
    cl_command_queue commandQueue = NULL;
    cl_program program;
    size_t global_work_size[2];
    size_t local_work_size[2];

    FILE* fh;

    //Retrieving platform information
    errcode = clGetPlatformIDs(1, &platformId, &nbplatforms);

    //Retrieving device (GPU) information
    errcode = clGetDeviceIDs(platformId, CL_DEVICE_TYPE_GPU, 1, &deviceId, &nbdevices);

    //Creation of a working context
    context = clCreateContext(NULL, 1, &deviceId, NULL, NULL, &errcode);

    commandQueue = clCreateCommandQueue(context, deviceId, 0, &errcode);

    //Opening and reading the kernel source file
    if((fh = fopen(source_name, "r")) == NULL){
        fprintf(stderr, "Failed to open the file containing the kernel source !\n");
        exit(EXIT_FAILURE);
    }
    source_code = (char*) malloc (MAX_SOURCE_SIZE * sizeof(char));
    source_size = fread(source_code, sizeof(char), MAX_SOURCE_SIZE, fh);
    fclose(fh);

    program = clCreateProgramWithSource(context, 1, (const char**) &source_code, (const size_t*) &source_size, &errcode);

    //Building kernel
    errcode = clBuildProgram(program, 1, &deviceId, NULL, NULL, NULL);

    //Creation of the kernel program
    cl_kernel kernel = clCreateKernel(program, "mykernel", &errcode);
    unsigned int *op1 = (unsigned int*) malloc (NCOLS * NROWS * sizeof(unsigned int));

    cl_mem op1buff = clCreateBuffer(context, CL_MEM_WRITE_ONLY, NCOLS * NROWS * sizeof(unsigned int), NULL, &errcode);

    clSetKernelArg(kernel, 0, sizeof(cl_mem), (void*) &op1buff);

    global_work_size[0] =  NCOLS;
    global_work_size[1] =  NROWS;
    local_work_size[0] = NCOLS;
    local_work_size[1] = 1;

    clEnqueueNDRangeKernel(commandQueue, kernel, 2, NULL, global_work_size, local_work_size, 0, NULL, NULL);

    errcode = clEnqueueReadBuffer(commandQueue, op1buff, CL_TRUE, 0, NCOLS * NROWS * sizeof(unsigned int), (void*)op1, 0, NULL, NULL);

    for(int i = 0; i < NROWS; i++){
        for(int j = 0; j < NCOLS; j++)
            printf("[index:%d - %u] ", i*NCOLS+j, op1[i*NCOLS+j]);
        printf("\n");
    }

    return EXIT_SUCCESS;
}

The kernel source code is placed in a file named mykernel.cl and reads as follows:

__kernel void mykernel(__global unsigned int* op1buf){
    unsigned int index = get_group_id(1) * get_global_size(0) + get_local_id(0);
    op1buf[index] = index;
}

Executing this program returns unexpected values when reading back the array if I use large arrays. For example:

[index:0 - 16777215] [index:1 - 16777215] [index:2 - 16777215] [index:3 - 16777215] ...
[index:1000 - 3438339071] [index:1001 - 3941660159] [index:1002 - 1650092117] [index:1003 - 2529976771] ...
[index:1000 - 3438339071] [index:1001 - 3941660159] [index:1002 - 1650092117] [index:1003 - 2529976771] ...
[index:3000 - 16777215] [index:3001 - 16777215] [index:3002 - 16777215] [index:3003 - 16777215] ...
[index:4000 - 3438339071] [index:4001 - 3941660159] [index:4002 - 1650092117] [index:4003 - 2529976771] ...
....

What is wrong with my code? Is there something about using the GPU that I am not taking into account?

Thanks in advance.

Best answer

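A work-group size of 1000 (local_work_size[0] = NCOLS) is definitely too large for your device. Use clGetDeviceInfo with CL_DEVICE_MAX_WORK_GROUP_SIZE to determine the maximum value you can use.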
