

2.11 Programming and Calling Kernel

2.11.1 Loading and Compilation of an OpenCL Program

The full-profile OpenCL library must contain a compiler, which is invoked by the host program. Before kernels can be executed, three actions must be taken:

1. Load the program binary or source into memory

2. Build the program

3. Create kernel objects

The application can decide for which devices the program should be built. The source format is a text string that is passed as a parameter to the OpenCL API. The binary representation is implementation-dependent: some vendors use an executable program representation, while others use assembly code as the binary form. The program has to be built before any of its kernels can be used. The OpenCL standard defines a program and a program object as:

An OpenCL program consists of a set of kernels. Programs may also contain auxiliary functions called by kernel functions and constant data[10].

A program object encapsulates the following information:

• A reference to an associated context.

• A program source or binary.

• The latest successfully built program executable, the list of devices for which the program executable is built, the build options used and a build log.

• The number of kernel objects currently attached[10].

Program Object

Load program into memory

An example of loading an OpenCL program from a source can be seen in listings 2.46 and 2.47. In both versions, the source is loaded from a file named fname. The whole file is loaded into memory. This code even allows for very big files, because it dynamically allocates the memory. The C++ version is more compact because of the language features.
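The file-loading step described above can also be separated from the OpenCL calls. The following is a minimal sketch under stated assumptions, not the code of the listings: the helper name read_source_file is hypothetical, and it grows the buffer by doubling instead of by fixed 128-byte steps. It returns a null-terminated buffer together with its length, which is the pair of values that clCreateProgramWithSource expects for a single source fragment.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not part of OpenCL): reads the whole file `fname`
 * into a newly allocated, null-terminated buffer. On success it returns the
 * buffer and stores the number of bytes read in *size_out (if not NULL);
 * on failure it returns NULL. The caller releases the buffer with free(). */
char *read_source_file(const char *fname, size_t *size_out) {
    assert(fname != NULL);
    FILE *f = fopen(fname, "rb");
    if (f == NULL) return NULL;

    size_t capacity = 4096, size = 0;
    char *data = (char *)malloc(capacity);
    if (data == NULL) { fclose(f); return NULL; }

    for (;;) {
        /* Always keep one byte in reserve for the terminating '\0'. */
        size_t got = fread(data + size, 1, capacity - size - 1, f);
        size += got;
        if (size < capacity - 1) break;      /* short read: EOF or error */

        /* The buffer is full, so double its capacity and continue. */
        char *bigger = (char *)realloc(data, capacity * 2);
        if (bigger == NULL) { free(data); fclose(f); return NULL; }
        data = bigger;
        capacity *= 2;
    }
    if (ferror(f)) { free(data); fclose(f); return NULL; }
    fclose(f);

    data[size] = '\0';
    if (size_out != NULL) *size_out = size;
    return data;
}
```

A caller would pass the address of the returned pointer and the address of the stored size as the strings and lengths arguments, and free the buffer once the program object has been created.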

cl_program clCreateProgramWithSource ( cl_context context, cl_uint count,

const char **strings, const size_t *lengths, cl_int *errcode_ret );

Creates a program object for a context and loads the source code specified by the text strings in the strings array into the program object.

clCreateProgramWithSource

cl::Program::Program ( const Context& context, const Sources& sources, cl_int * err = NULL );

This constructor creates an OpenCL program object for a context and loads the source code specified by the text strings in each element of the vector sources into the program object.

cl::Program::Program

In the C version, shown in listing 2.46, the program object is created with the function clCreateProgramWithSource. The program is created in a given context. The OpenCL program source may be spread over multiple memory regions: the third parameter (here, the address of appsource) is an array of pointers to string buffers, and the second parameter (here, 1) gives the number of these program fragments. Because the example loads the source into a single buffer, it passes the address of one pointer to a character array. The fourth parameter is an array of size_t values that gives the sizes of the corresponding string buffers. The last parameter is a pointer to a variable that receives the error code.
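To make the relation between the count, strings and lengths parameters concrete, the sketch below models the fragments as if they were joined in order into one source text. It is an illustration only: the helper name join_fragments is hypothetical, and how an OpenCL implementation stores the fragments internally is not described here. It follows the convention of the OpenCL specification that a NULL lengths array, or a zero entry in it, marks a null-terminated fragment.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Length of fragment i under the clCreateProgramWithSource convention:
 * a NULL `lengths` array or a zero entry means "null-terminated". */
static size_t fragment_length(const char **strings, const size_t *lengths,
                              unsigned i) {
    if (lengths != NULL && lengths[i] != 0) return lengths[i];
    return strlen(strings[i]);
}

/* Hypothetical helper: joins `count` source fragments into one newly
 * allocated, null-terminated string. Returns NULL if allocation fails.
 * The caller releases the result with free(). */
char *join_fragments(unsigned count, const char **strings,
                     const size_t *lengths) {
    assert(count > 0 && strings != NULL);

    size_t total = 0;
    for (unsigned i = 0; i < count; i++)
        total += fragment_length(strings, lengths, i);

    char *joined = (char *)malloc(total + 1);
    if (joined == NULL) return NULL;

    size_t pos = 0;
    for (unsigned i = 0; i < count; i++) {
        size_t n = fragment_length(strings, lengths, i);
        memcpy(joined + pos, strings[i], n);   /* copy exactly n characters */
        pos += n;
    }
    joined[pos] = '\0';
    return joined;
}
```

With explicit lengths, a fragment does not need a terminating null character, which is why listing 2.46 can pass the file buffer together with the number of bytes read.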

The C++ implementation (listing 2.47) also reads the whole file into a memory buffer, but this time the string and its size are stored as a pair and inserted into a sources object of type cl::Program::Sources, which is a vector. The program is created by the cl::Program constructor, which takes the context and the sources. As in the C version, the resulting program object does not yet contain any kernels that can be executed. For loading a binary program representation, please refer to section 2.12.3.

#include <stdio.h>
#include <stdlib.h>
#include <CL/cl.h>

cl_program createProgram(cl_context context, char *fname) {
    cl_int r;
    size_t size = 0, partsize = 128;
    char buffer[4096] = "";  /* zero-filled, so it is always a valid string */
    cl_program program;
    char *appsource = (char *)malloc(128);

    FILE *f = fopen(fname, "r");
    if (f == NULL || appsource == NULL) {
        exit(EXIT_FAILURE);
    }
    /* Read the file in 128-byte parts; a short read marks the end of the
       file. After each part the buffer grows so that the next part fits. */
    while (partsize == 128) {
        partsize = fread(appsource + size, 1, 128, f);
        size += partsize;
        appsource = (char *)realloc(appsource, 128 + size);
        if (appsource == NULL) exit(EXIT_FAILURE);
    }
    fclose(f);

    /* One source fragment: one pointer and one length. */
    program = clCreateProgramWithSource(context, 1,
        (const char **)&appsource, &size, &r);
    if (r != CL_SUCCESS) {
        printf("Error: Could not create OpenCL program (%d)\n", r);
        exit(EXIT_FAILURE);
    }

    /* Build for all devices associated with the program. */
    r = clBuildProgram(program, 0, NULL, NULL, NULL, NULL);
    if (r != CL_SUCCESS) {
        cl_device_id device_id[16];
        clGetContextInfo(context,
            CL_CONTEXT_DEVICES, sizeof(device_id), device_id, NULL);
        clGetProgramBuildInfo(program, device_id[0],
            CL_PROGRAM_BUILD_LOG, sizeof(buffer) - 1, buffer, NULL);
        printf("Error: Could not build OpenCL program (%d)\n%s\n", r, buffer);
        exit(EXIT_FAILURE);
    }
    free(appsource);
    return program;
}

// __CL_ENABLE_EXCEPTIONS must be defined before CL/cl.hpp is included,
// so that failed OpenCL calls throw cl::Error.
#define __CL_ENABLE_EXCEPTIONS
#include <CL/cl.hpp>
#include <fstream>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>

cl::Program createProgram(cl::Context &context, std::string fname,
                          const std::string params = "") {
    cl::Program::Sources sources;
    cl::Program program;
    std::vector<cl::Device> devices = context.getInfo<CL_CONTEXT_DEVICES>();

    // Read the whole file into a string.
    std::ifstream source_file(fname.c_str());
    std::string source_code(
        std::istreambuf_iterator<char>(source_file),
        (std::istreambuf_iterator<char>()));
    sources.insert(sources.end(),
        std::make_pair(source_code.c_str(), source_code.length()));
    program = cl::Program(context, sources);

    try {
        program.build(devices, params.c_str());
    } catch (const cl::Error &e) {
        std::cout << "Compilation build error log:" << std::endl <<
            program.getBuildInfo<CL_PROGRAM_BUILD_LOG>(devices[0]) << std::endl;
        throw;  // rethrow the original exception to the caller
    }

    return program;
}

OpenCL Program Build

cl_int clBuildProgram ( cl_program program, cl_uint num_devices,

const cl_device_id *device_list, const char *options, void (CL_CALLBACK *pfn_notify)(cl_program program, void *user_data), void *user_data);

Builds (compiles and links) a program executable from the program source or binary.

clBuildProgram

Consider the source code in listing 2.46 for C or 2.47 for C++. The program is built by the function clBuildProgram in C and by the method cl::Program::build in C++. These calls instruct the OpenCL implementation to run its compiler and prepare an executable program representation. The error handling checks whether the program has been built successfully; if not, it retrieves the build log for the program and displays it. The build log is intended for the user: the standard does not specify its content. Example build logs produced for the same erroneous program are in listings 2.48 and 2.49.
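The error handling in the listings prints the status returned by the build call as a bare number. A small lookup table can turn that number into the symbolic name used in the OpenCL headers. This is a sketch under stated assumptions: the helper name build_status_name is hypothetical, and the table lists only a few codes commonly associated with program creation and building. The numeric values are those defined in CL/cl.h, repeated here so that the sketch compiles without the OpenCL headers.

```c
#include <stddef.h>

/* A status code and its symbolic name from CL/cl.h. */
struct status_name {
    int code;
    const char *name;
};

/* Partial table: only a selection of codes, not the full list. */
static const struct status_name build_statuses[] = {
    {   0, "CL_SUCCESS" },
    {  -3, "CL_COMPILER_NOT_AVAILABLE" },
    {  -5, "CL_OUT_OF_RESOURCES" },
    {  -6, "CL_OUT_OF_HOST_MEMORY" },
    { -11, "CL_BUILD_PROGRAM_FAILURE" },
    { -30, "CL_INVALID_VALUE" },
    { -33, "CL_INVALID_DEVICE" },
    { -42, "CL_INVALID_BINARY" },
    { -43, "CL_INVALID_BUILD_OPTIONS" },
    { -44, "CL_INVALID_PROGRAM" },
    { -59, "CL_INVALID_OPERATION" },
};

/* Hypothetical helper: returns the symbolic name of a status code, or a
 * fixed fallback string for codes that are not in the table above. */
const char *build_status_name(int code) {
    size_t n = sizeof(build_statuses) / sizeof(build_statuses[0]);
    for (size_t i = 0; i < n; i++) {
        if (build_statuses[i].code == code) return build_statuses[i].name;
    }
    return "unknown OpenCL status";
}
```

The name can then be printed next to the build log, for example by passing build_status_name(r) to the printf call that reports the failure.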

cl_int cl::Program::build ( const VECTOR_CLASS<Device> devices,

const char *options = NULL, void (CL_CALLBACK *pfn_notify) (cl_program,

void *user_data) = NULL, void *data = NULL );

This method builds (compiles and links) a program executable from the program source or binary for all devices or (a) specific device(s) in the OpenCL context associated with the program.

cl::Program::build

ptxas application ptx input, line 26; error : Label expected for argument 0 of instruction 'call'

ptxas application ptx input, line 26; error : Call target not recognized

ptxas application ptx input, line 26; error : Function 'get_local_sizes' not declared in this scope

ptxas application ptx input, line 26; error : Call target not recognized

ptxas application ptx input, line 27; error : Unknown symbol 'get_local_sizes'

ptxas application ptx input, line 27; error : Label expected for forward reference of 'get_local_sizes'

ptxas fatal : Ptx assembly aborted due to errors

Listing 2.48: The build log generated by an NVIDIA compiler

/tmp/OCLoQsHmP.cl(21): error: function "get_local_sizes" declared implicitly

    const size_t local_size = get_local_sizes(0);

                              ^

1 error detected in the compilation of "/tmp/OCLoQsHmP.cl".

cl_kernel kernel_x;

kernel_x = clCreateKernel(program, "kernel_x", NULL);

Listing 2.50: Usage of clCreateKernel to create kernel object – C code

cl::Kernel kernel_x;

kernel_x = cl::Kernel(program, "kernel_x");

Listing 2.51: Usage of cl::Kernel to create kernel object – C++ code

cl_uint i;
cl_uint num_kernels = 32;       /* capacity of the kernels array */
cl_uint num_kernels_ret = 0;    /* number of kernels in the program */
cl_kernel kernels[32];
char value_s[128];
cl_uint value_i;

/* Fails with CL_INVALID_VALUE if the program has more than 32 kernels. */
clCreateKernelsInProgram(program, num_kernels, kernels, &num_kernels_ret);
printf("The program contains %u kernels:\n", num_kernels_ret);
for (i = 0; i < num_kernels_ret; i++) {
    clGetKernelInfo(kernels[i], CL_KERNEL_FUNCTION_NAME,
        sizeof(value_s), value_s, NULL);
    clGetKernelInfo(kernels[i], CL_KERNEL_NUM_ARGS,
        sizeof(cl_uint), &value_i, NULL);
    printf(" - %s, %u args\n", value_s, value_i);
}

Listing 2.52: Usage of clCreateKernelsInProgram and clGetKernelInfo for listing available kernels – C code

Create Kernel Object

Listings 2.50 and 2.51 show examples of kernel object creation. This is the most common way of performing this task: by giving the name of a kernel located in the program. The program must already be built.

The other method of kernel object creation is to use the function clCreateKernelsInProgram or the method cl::Program::createKernels. These create kernel objects for all kernels in the program at once and store them in an array or a vector. This is very useful for programs that need an automatic means of kernel creation. The sample code fragment that lists the kernels available in a given program can be seen in listings 2.52 and 2.53. This code first creates multiple kernel objects at once, and then it checks the kernel names and the number of arguments. The kernel information is obtained using the functions clGetKernelInfo and cl::Kernel::getInfo for C and C++ respectively.

std::vector<cl::Kernel> kernels;
cl_int ret = program.createKernels(&kernels);  // ret holds the status code
std::cout << "The program contains " << kernels.size() << " kernels:" << std::endl;
for (unsigned i = 0; i < kernels.size(); i++) {
    std::cout << " - " <<
        kernels[i].getInfo<CL_KERNEL_FUNCTION_NAME>() <<
        ", " << kernels[i].getInfo<CL_KERNEL_NUM_ARGS>() <<
        " args" << std::endl;
}

Listing 2.53: Usage of cl::Program::createKernels and cl::Kernel::getInfo for listing available kernels – C++ code

cl::Kernel::Kernel ( const Program& program, const char *name, cl_int *err = NULL );

This constructor will create a kernel object.

cl::Kernel::Kernel

cl_kernel clCreateKernel ( cl_program program, const char *kernel_name, cl_int *errcode_ret );

Creates a kernel object.

clCreateKernel

cl_int cl::Program::createKernels ( VECTOR_CLASS<Kernel> *kernels );

This method creates kernel objects (objects of type cl::Kernel) for all kernels in the program.

cl::Program::createKernels

cl_int clCreateKernelsInProgram ( cl_program program, cl_uint num_kernels, cl_kernel *kernels, cl_uint *num_kernels_ret );

Creates kernel objects for all kernel functions in a program object.

clCreateKernelsInProgram

template <cl_int name> typename detail::param_traits<detail::cl_kernel_info, name>::param_type cl::Kernel::getInfo ( void );

The method gets specific information about the OpenCL kernel.

cl::Kernel::getInfo

cl_int clGetKernelInfo ( cl_kernel kernel, cl_kernel_info param_name,

size_t param_value_size, void *param_value, size_t *param_value_size_ret );

Returns information about the kernel object.

kernel void kernelparams(constant int *c, global float *data, local float *cache,
                         int value) {
    // ...
}

Listing 2.54: Kernel with parameters from different memory pools.