[/===========================================================================
 Copyright (c) 2013-2015 Kyle Lutz <kyle.r.lutz@gmail.com>

 Distributed under the Boost Software License, Version 1.0
 See accompanying file LICENSE_1_0.txt or copy at
 http://www.boost.org/LICENSE_1_0.txt
=============================================================================/]

[section Advanced Topics]

The following topics show advanced features of the Boost Compute library.

[section Vector Data Types]

In addition to the built-in scalar types (e.g. `int` and `float`), OpenCL
also provides vector data types (e.g. `int2` and `float4`). These can be
used with the Boost Compute library on both the host and device.

Boost.Compute provides typedefs for these types which take the form:
`boost::compute::scalarN_` where `scalar` is a scalar data type (e.g. `int`,
`float`, `char`) and `N` is the size of the vector. Supported vector sizes
are: 2, 4, 8, and 16.

The following example shows how to transfer a set of 3D points stored as an
array of `float`s on the host to the device and then calculate the sum of the
point coordinates using the [funcref boost::compute::accumulate accumulate()]
function. The sum is then transferred back to the host, and the centroid is
computed by dividing by the total number of points.

Note that even though the points are in 3D, they are stored as `float4` due to
OpenCL's alignment requirements.

[import ../example/point_centroid.cpp]
[point_centroid_example]

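For readers who want to check the arithmetic without a device, the computation
described above can be sketched in plain C++. This is a host-only illustration
that makes no Boost.Compute calls. `Float4` and `centroid()` are hypothetical
names, and the fourth float of each point is assumed to be unused padding.
Calling `centroid()` on the points (1, 2, 3) and (3, 4, 5), each followed by
one padding float, gives (2, 3, 4).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 4-component float vector such as float4.
struct Float4
{
    float x, y, z, w;
};

// Computes the centroid of 3D points stored four floats per point
// (x, y, z, padding). Returns all zeros for an empty input.
Float4 centroid(const std::vector<float>& padded_points)
{
    Float4 sum = { 0.0f, 0.0f, 0.0f, 0.0f };
    const std::size_t count = padded_points.size() / 4;

    // accumulate the coordinates; the padding slot is skipped
    for (std::size_t i = 0; i < count; ++i) {
        sum.x += padded_points[4 * i + 0];
        sum.y += padded_points[4 * i + 1];
        sum.z += padded_points[4 * i + 2];
    }

    // divide the sum by the number of points
    if (count > 0) {
        const float n = static_cast<float>(count);
        sum.x /= n;
        sum.y /= n;
        sum.z /= n;
    }
    return sum;
}
```
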
[endsect] [/ vector data types]

[section Custom Functions]

The OpenCL runtime and the Boost Compute library provide a number of built-in
functions such as `sqrt()` and `dot()`, but these are often not sufficient for
solving the problem at hand.

The Boost Compute library provides a few different ways to create custom
functions that can be passed to the provided algorithms such as
[funcref boost::compute::transform transform()] and
[funcref boost::compute::reduce reduce()].

The most basic method is to provide the raw source code for a function:

``
boost::compute::function<int (int)> add_four =
    boost::compute::make_function_from_source<int (int)>(
        "add_four",
        "int add_four(int x) { return x + 4; }"
    );

boost::compute::transform(input.begin(), input.end(), output.begin(), add_four, queue);
``

This can also be done more succinctly using the [macroref BOOST_COMPUTE_FUNCTION
BOOST_COMPUTE_FUNCTION()] macro:

``
BOOST_COMPUTE_FUNCTION(int, add_four, (int x),
{
    return x + 4;
});

boost::compute::transform(input.begin(), input.end(), output.begin(), add_four, queue);
``

Also see [@http://kylelutz.blogspot.com/2014/03/custom-opencl-functions-in-c-with.html
"Custom OpenCL functions in C++ with Boost.Compute"] for more details.

[endsect] [/ custom functions]

[section Custom Types]

Boost.Compute provides the [macroref BOOST_COMPUTE_ADAPT_STRUCT
BOOST_COMPUTE_ADAPT_STRUCT()] macro which allows a C++ struct/class to be
wrapped and used in OpenCL.

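As a minimal sketch of how the macro might be used, the example below adapts a
struct and copies a few values to the device and back. The `Particle` struct
and its members are hypothetical, and the sketch assumes a working OpenCL
device is available. The macro is given the C++ type, the name to use for the
type in OpenCL source, and a parenthesized list of the members to expose. The
fixed-size `cl_float` and `cl_int` typedefs are an assumption made here to
keep member sizes the same on the host and the device.

```cpp
#include <vector>

#include <boost/compute/core.hpp>
#include <boost/compute/algorithm/copy.hpp>
#include <boost/compute/container/vector.hpp>
#include <boost/compute/types/struct.hpp>

namespace compute = boost::compute;

// hypothetical host-side struct used only for illustration
struct Particle
{
    cl_float mass;
    cl_int id;
};

// adapt "Particle" for OpenCL: C++ type, OpenCL name, members
BOOST_COMPUTE_ADAPT_STRUCT(Particle, Particle, (mass, id))

int main()
{
    // set up the default device, a context and a command queue
    compute::device device = compute::system::default_device();
    compute::context context(device);
    compute::command_queue queue(context, device);

    // create two particles on the host
    std::vector<Particle> host(2);
    host[0].mass = 1.5f; host[0].id = 1;
    host[1].mass = 2.5f; host[1].id = 2;

    // copy the particles to the device
    compute::vector<Particle> particles(host.size(), context);
    compute::copy(host.begin(), host.end(), particles.begin(), queue);

    // copy the particles back to the host
    std::vector<Particle> result(host.size());
    compute::copy(particles.begin(), particles.end(), result.begin(), queue);

    return 0;
}
```
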
[endsect] [/ custom types]

[section Complex Values]

While OpenCL itself doesn't natively support complex data types, the Boost
Compute library provides them.

To use complex values, first include the following header:

``
#include <boost/compute/types/complex.hpp>
``

A vector of complex values can be created as follows:

``
// create vector on device
boost::compute::vector<std::complex<float> > vector;

// insert two complex values
vector.push_back(std::complex<float>(1.0f, 3.0f));
vector.push_back(std::complex<float>(2.0f, 4.0f));
``

[endsect] [/ complex values]

[section Lambda Expressions]

The lambda expression framework allows for functions and predicates to be
defined at the call-site of an algorithm.

Lambda expressions use the placeholders `_1` and `_2` to indicate the
arguments. The following declarations will bring the lambda placeholders into
the current scope:

``
using boost::compute::lambda::_1;
using boost::compute::lambda::_2;
``

The following examples show how to use lambda expressions along with the
Boost.Compute algorithms to perform more complex operations on the device.

To count the number of odd values in a vector:

``
boost::compute::count_if(vector.begin(), vector.end(), _1 % 2 == 1, queue);
``

To multiply each value in a vector by three and subtract four:

``
boost::compute::transform(vector.begin(), vector.end(), vector.begin(), _1 * 3 - 4, queue);
``

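For reference, the two expressions above compute the same results as the
following host-only C++, which makes no Boost.Compute calls and uses
hypothetical function names. One detail worth noting: in both C++ and OpenCL C
the result of `%` takes the sign of the dividend, so for signed integers the
predicate `x % 2 == 1` does not count negative odd values. The sketch includes
a variant that tests `x % 2 != 0` instead.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Host-side equivalent of the predicate _1 % 2 == 1.
// Negative odd values are not counted because -3 % 2 is -1.
std::ptrdiff_t count_odd(const std::vector<int>& v)
{
    return std::count_if(v.begin(), v.end(),
                         [](int x) { return x % 2 == 1; });
}

// Variant that counts odd values of either sign.
std::ptrdiff_t count_odd_any_sign(const std::vector<int>& v)
{
    return std::count_if(v.begin(), v.end(),
                         [](int x) { return x % 2 != 0; });
}

// Host-side equivalent of transforming in place with _1 * 3 - 4.
void times_three_minus_four(std::vector<int>& v)
{
    std::transform(v.begin(), v.end(), v.begin(),
                   [](int x) { return x * 3 - 4; });
}
```
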
Lambda expressions can also be used to create `function<>` objects:

``
boost::compute::function<int(int)> add_four = _1 + 4;
``

[endsect] [/ lambda expressions]

[section Asynchronous Operations]

A major performance bottleneck in GPGPU applications is memory transfer. This
can be alleviated by overlapping memory transfer with computation. The Boost
Compute library provides the [funcref boost::compute::copy_async copy_async()]
function which performs asynchronous memory transfers between the host and
the device.

For example, to initiate a copy from the host to the device and then perform
other actions:

``
// data on the host
std::vector<float> host_vector = ...

// create a vector on the device
boost::compute::vector<float> device_vector(host_vector.size(), context);

// copy data to the device asynchronously
boost::compute::future<void> f = boost::compute::copy_async(
    host_vector.begin(), host_vector.end(), device_vector.begin(), queue
);

// perform other work on the host or device
// ...

// ensure the copy is completed
f.wait();

// use data on the device (e.g. sort)
boost::compute::sort(device_vector.begin(), device_vector.end(), queue);
``

[endsect] [/ asynchronous operations]

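[section Performance Timing]

The time taken by an operation on the device can be measured with OpenCL's
event profiling: create the command queue with profiling enabled and query
the duration from the event associated with the operation.

For example, to measure the time to copy a vector of data from the host to the
device:

[import ../example/time_copy.cpp]
[time_copy_example]

[endsect] [/ performance timing]
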
[section OpenCL API Interoperability]

The Boost Compute library is designed to easily interoperate with the OpenCL
API. All of the wrapped classes have conversion operators to their underlying
OpenCL types which allows them to be passed directly to the OpenCL functions.

For example:

``
// create context object
boost::compute::context ctx = boost::compute::default_context();

// query number of devices using the OpenCL API
cl_uint num_devices;
clGetContextInfo(ctx, CL_CONTEXT_NUM_DEVICES, sizeof(cl_uint), &num_devices, 0);
std::cout << "num_devices: " << num_devices << std::endl;
``

[endsect] [/ opencl api interoperability]

[endsect] [/ advanced topics]