1<html>
2<head>
3<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
4<title>Advanced Topics</title>
5<link rel="stylesheet" href="../../../../../doc/src/boostbook.css" type="text/css">
6<meta name="generator" content="DocBook XSL Stylesheets V1.79.1">
7<link rel="home" href="../index.html" title="Chapter 1. Boost.Compute">
8<link rel="up" href="../index.html" title="Chapter 1. Boost.Compute">
9<link rel="prev" href="tutorial.html" title="Tutorial">
10<link rel="next" href="interop.html" title="Interoperability">
11</head>
12<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
13<table cellpadding="2" width="100%"><tr>
14<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../boost.png"></td>
15<td align="center"><a href="../../../../../index.html">Home</a></td>
16<td align="center"><a href="../../../../../libs/libraries.htm">Libraries</a></td>
17<td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
18<td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
19<td align="center"><a href="../../../../../more/index.htm">More</a></td>
20</tr></table>
21<hr>
22<div class="spirit-nav">
23<a accesskey="p" href="tutorial.html"><img src="../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../index.html"><img src="../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="interop.html"><img src="../../../../../doc/src/images/next.png" alt="Next"></a>
24</div>
25<div class="section">
26<div class="titlepage"><div><div><h2 class="title" style="clear: both">
27<a name="boost_compute.advanced_topics"></a><a class="link" href="advanced_topics.html" title="Advanced Topics">Advanced Topics</a>
28</h2></div></div></div>
29<div class="toc"><dl class="toc">
30<dt><span class="section"><a href="advanced_topics.html#boost_compute.advanced_topics.vector_data_types">Vector
31      Data Types</a></span></dt>
32<dt><span class="section"><a href="advanced_topics.html#boost_compute.advanced_topics.custom_functions">Custom
33      Functions</a></span></dt>
34<dt><span class="section"><a href="advanced_topics.html#boost_compute.advanced_topics.custom_types">Custom Types</a></span></dt>
35<dt><span class="section"><a href="advanced_topics.html#boost_compute.advanced_topics.complex_values">Complex
36      Values</a></span></dt>
37<dt><span class="section"><a href="advanced_topics.html#boost_compute.advanced_topics.lambda_expressions">Lambda
38      Expressions</a></span></dt>
39<dt><span class="section"><a href="advanced_topics.html#boost_compute.advanced_topics.asynchronous_operations">Asynchronous
40      Operations</a></span></dt>
41<dt><span class="section"><a href="advanced_topics.html#boost_compute.advanced_topics.performance_timing">Performance
42      Timing</a></span></dt>
43<dt><span class="section"><a href="advanced_topics.html#boost_compute.advanced_topics.opencl_api_interoperability">OpenCL
44      API Interoperability</a></span></dt>
45</dl></div>
46<p>
47      The following topics show advanced features of the Boost Compute library.
48    </p>
49<div class="section">
50<div class="titlepage"><div><div><h3 class="title">
51<a name="boost_compute.advanced_topics.vector_data_types"></a><a class="link" href="advanced_topics.html#boost_compute.advanced_topics.vector_data_types" title="Vector Data Types">Vector
52      Data Types</a>
53</h3></div></div></div>
54<p>
        In addition to the built-in scalar types (e.g. <code class="computeroutput"><span class="keyword">int</span></code>
        and <code class="computeroutput"><span class="keyword">float</span></code>), OpenCL also provides
        vector data types (e.g. <code class="computeroutput"><span class="identifier">int2</span></code>
        and <code class="computeroutput"><span class="identifier">float4</span></code>). These can be
        used with the Boost Compute library on both the host and device.
60      </p>
61<p>
62        Boost.Compute provides typedefs for these types which take the form: <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">scalarN_</span></code> where <code class="computeroutput"><span class="identifier">scalar</span></code>
63        is a scalar data type (e.g. <code class="computeroutput"><span class="keyword">int</span></code>,
64        <code class="computeroutput"><span class="keyword">float</span></code>, <code class="computeroutput"><span class="keyword">char</span></code>)
65        and <code class="computeroutput"><span class="identifier">N</span></code> is the size of the
66        vector. Supported vector sizes are: 2, 4, 8, and 16.
67      </p>
68<p>
        The following example shows how to transfer a set of 3D points stored as
        an array of <code class="computeroutput"><span class="keyword">float</span></code>s on the host
        to the device and then calculate the sum of the point coordinates using the
        <code class="computeroutput"><a class="link" href="../boost/compute/accumulate.html" title="Function accumulate">accumulate()</a></code>
        function. The sum is then transferred back to the host, where the centroid
        is computed by dividing by the number of points.
75      </p>
76<p>
77        Note that even though the points are in 3D, they are stored as <code class="computeroutput"><span class="identifier">float4</span></code> due to OpenCL's alignment requirements.
78      </p>
79<p>
80</p>
81<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
82
83<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">algorithm</span><span class="special">/</span><span class="identifier">copy</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
84<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">algorithm</span><span class="special">/</span><span class="identifier">accumulate</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
85<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">container</span><span class="special">/</span><span class="identifier">vector</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
86<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">types</span><span class="special">/</span><span class="identifier">fundamental</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
87
88<span class="keyword">namespace</span> <span class="identifier">compute</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">;</span>
89
90<span class="comment">// the point centroid example calculates and displays the</span>
91<span class="comment">// centroid of a set of 3D points stored as float4's</span>
92<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
93<span class="special">{</span>
94    <span class="keyword">using</span> <span class="identifier">compute</span><span class="special">::</span><span class="identifier">float4_</span><span class="special">;</span>
95
96    <span class="comment">// get default device and setup context</span>
97    <span class="identifier">compute</span><span class="special">::</span><span class="identifier">device</span> <span class="identifier">device</span> <span class="special">=</span> <span class="identifier">compute</span><span class="special">::</span><span class="identifier">system</span><span class="special">::</span><span class="identifier">default_device</span><span class="special">();</span>
98    <span class="identifier">compute</span><span class="special">::</span><span class="identifier">context</span> <span class="identifier">context</span><span class="special">(</span><span class="identifier">device</span><span class="special">);</span>
99    <span class="identifier">compute</span><span class="special">::</span><span class="identifier">command_queue</span> <span class="identifier">queue</span><span class="special">(</span><span class="identifier">context</span><span class="special">,</span> <span class="identifier">device</span><span class="special">);</span>
100
101    <span class="comment">// point coordinates</span>
102    <span class="keyword">float</span> <span class="identifier">points</span><span class="special">[]</span> <span class="special">=</span> <span class="special">{</span> <span class="number">1.0f</span><span class="special">,</span> <span class="number">2.0f</span><span class="special">,</span> <span class="number">3.0f</span><span class="special">,</span> <span class="number">0.0f</span><span class="special">,</span>
103                       <span class="special">-</span><span class="number">2.0f</span><span class="special">,</span> <span class="special">-</span><span class="number">3.0f</span><span class="special">,</span> <span class="number">4.0f</span><span class="special">,</span> <span class="number">0.0f</span><span class="special">,</span>
104                       <span class="number">1.0f</span><span class="special">,</span> <span class="special">-</span><span class="number">2.0f</span><span class="special">,</span> <span class="number">2.5f</span><span class="special">,</span> <span class="number">0.0f</span><span class="special">,</span>
105                       <span class="special">-</span><span class="number">7.0f</span><span class="special">,</span> <span class="special">-</span><span class="number">3.0f</span><span class="special">,</span> <span class="special">-</span><span class="number">2.0f</span><span class="special">,</span> <span class="number">0.0f</span><span class="special">,</span>
106                       <span class="number">3.0f</span><span class="special">,</span> <span class="number">4.0f</span><span class="special">,</span> <span class="special">-</span><span class="number">5.0f</span><span class="special">,</span> <span class="number">0.0f</span> <span class="special">};</span>
107
108    <span class="comment">// create vector for five points</span>
109    <span class="identifier">compute</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="identifier">float4_</span><span class="special">&gt;</span> <span class="identifier">vector</span><span class="special">(</span><span class="number">5</span><span class="special">,</span> <span class="identifier">context</span><span class="special">);</span>
110
111    <span class="comment">// copy point data to the device</span>
112    <span class="identifier">compute</span><span class="special">::</span><span class="identifier">copy</span><span class="special">(</span>
113        <span class="keyword">reinterpret_cast</span><span class="special">&lt;</span><span class="identifier">float4_</span> <span class="special">*&gt;(</span><span class="identifier">points</span><span class="special">),</span>
114        <span class="keyword">reinterpret_cast</span><span class="special">&lt;</span><span class="identifier">float4_</span> <span class="special">*&gt;(</span><span class="identifier">points</span><span class="special">)</span> <span class="special">+</span> <span class="number">5</span><span class="special">,</span>
115        <span class="identifier">vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span>
116        <span class="identifier">queue</span>
117    <span class="special">);</span>
118
119    <span class="comment">// calculate sum</span>
120    <span class="identifier">float4_</span> <span class="identifier">sum</span> <span class="special">=</span> <span class="identifier">compute</span><span class="special">::</span><span class="identifier">accumulate</span><span class="special">(</span>
121        <span class="identifier">vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">vector</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">float4_</span><span class="special">(</span><span class="number">0</span><span class="special">,</span> <span class="number">0</span><span class="special">,</span> <span class="number">0</span><span class="special">,</span> <span class="number">0</span><span class="special">),</span> <span class="identifier">queue</span>
122    <span class="special">);</span>
123
124    <span class="comment">// calculate centroid</span>
125    <span class="identifier">float4_</span> <span class="identifier">centroid</span><span class="special">;</span>
126    <span class="keyword">for</span><span class="special">(</span><span class="identifier">size_t</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span> <span class="identifier">i</span> <span class="special">&lt;</span> <span class="number">3</span><span class="special">;</span> <span class="identifier">i</span><span class="special">++){</span>
127        <span class="identifier">centroid</span><span class="special">[</span><span class="identifier">i</span><span class="special">]</span> <span class="special">=</span> <span class="identifier">sum</span><span class="special">[</span><span class="identifier">i</span><span class="special">]</span> <span class="special">/</span> <span class="number">5.0f</span><span class="special">;</span>
128    <span class="special">}</span>
129
130    <span class="comment">// print centroid</span>
131    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="string">"centroid: "</span> <span class="special">&lt;&lt;</span> <span class="identifier">centroid</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
132
133    <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
134<span class="special">}</span>
135</pre>
136<p>
137      </p>
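<p>
        To check the device result, the same centroid can be computed on the host.
        The following is a plain C++ sketch that does not use Boost.Compute; it assumes
        the same layout as above, with four consecutive floats per point and the fourth
        used as padding. The function name <code class="computeroutput">host_centroid</code>
        is hypothetical.
</p>

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Host-only reference: averages the x, y and z components of points stored as
// four consecutive floats (x, y, z, padding), mirroring the float4 layout.
std::array<float, 4> host_centroid(const std::vector<float> &flat)
{
    // the input must hold a whole, non-zero number of 4-float points
    assert(!flat.empty() && flat.size() % 4 == 0);
    const std::size_t count = flat.size() / 4;

    std::array<float, 4> result = {{0.0f, 0.0f, 0.0f, 0.0f}};

    // sum the first three components of every point
    for (std::size_t p = 0; p < count; ++p) {
        for (std::size_t c = 0; c < 3; ++c) {
            result[c] += flat[4 * p + c];
        }
    }

    // divide by the number of points; the padding component stays 0
    for (std::size_t c = 0; c < 3; ++c) {
        result[c] /= static_cast<float>(count);
    }
    return result;
}
```

<p>
        For the five points in the example above, the coordinate sums are
        (-4, -2, 2.5), so the centroid is approximately (-0.8, -0.4, 0.5).
</p>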
138</div>
139<div class="section">
140<div class="titlepage"><div><div><h3 class="title">
141<a name="boost_compute.advanced_topics.custom_functions"></a><a class="link" href="advanced_topics.html#boost_compute.advanced_topics.custom_functions" title="Custom Functions">Custom
142      Functions</a>
143</h3></div></div></div>
144<p>
        The OpenCL runtime and the Boost Compute library provide a number of built-in
        functions such as sqrt() and dot(), but these are often not sufficient
        for solving the problem at hand.
148      </p>
149<p>
150        The Boost Compute library provides a few different ways to create custom
151        functions that can be passed to the provided algorithms such as <code class="computeroutput"><a class="link" href="../boost/compute/transform.html" title="Function transform">transform()</a></code> and <code class="computeroutput"><a class="link" href="../boost/compute/reduce.html" title="Function reduce">reduce()</a></code>.
152      </p>
153<p>
154        The most basic method is to provide the raw source code for a function:
155      </p>
156<p>
157</p>
158<pre class="programlisting"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">function</span><span class="special">&lt;</span><span class="keyword">int</span> <span class="special">(</span><span class="keyword">int</span><span class="special">)&gt;</span> <span class="identifier">add_four</span> <span class="special">=</span>
159    <span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">make_function_from_source</span><span class="special">&lt;</span><span class="keyword">int</span> <span class="special">(</span><span class="keyword">int</span><span class="special">)&gt;(</span>
160        <span class="string">"add_four"</span><span class="special">,</span>
161        <span class="string">"int add_four(int x) { return x + 4; }"</span>
162    <span class="special">);</span>
163
164<span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">transform</span><span class="special">(</span><span class="identifier">input</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">output</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">add_four</span><span class="special">,</span> <span class="identifier">queue</span><span class="special">);</span>
165</pre>
166<p>
167      </p>
168<p>
169        This can also be done more succinctly using the <code class="computeroutput">BOOST_COMPUTE_FUNCTION</code>
170        macro:
171</p>
172<pre class="programlisting"><span class="identifier">BOOST_COMPUTE_FUNCTION</span><span class="special">(</span><span class="keyword">int</span><span class="special">,</span> <span class="identifier">add_four</span><span class="special">,</span> <span class="special">(</span><span class="keyword">int</span> <span class="identifier">x</span><span class="special">),</span>
173<span class="special">{</span>
174    <span class="keyword">return</span> <span class="identifier">x</span> <span class="special">+</span> <span class="number">4</span><span class="special">;</span>
175<span class="special">});</span>
176
177<span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">transform</span><span class="special">(</span><span class="identifier">input</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">output</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">add_four</span><span class="special">,</span> <span class="identifier">queue</span><span class="special">);</span>
178</pre>
179<p>
180      </p>
181<p>
182        Also see <a href="http://kylelutz.blogspot.com/2014/03/custom-opencl-functions-in-c-with.html" target="_top">"Custom
183        OpenCL functions in C++ with Boost.Compute"</a> for more details.
184      </p>
185</div>
186<div class="section">
187<div class="titlepage"><div><div><h3 class="title">
188<a name="boost_compute.advanced_topics.custom_types"></a><a class="link" href="advanced_topics.html#boost_compute.advanced_topics.custom_types" title="Custom Types">Custom Types</a>
189</h3></div></div></div>
190<p>
191        Boost.Compute provides the <code class="computeroutput">BOOST_COMPUTE_ADAPT_STRUCT</code>
192        macro which allows a C++ struct/class to be wrapped and used in OpenCL.
193      </p>
194</div>
195<div class="section">
196<div class="titlepage"><div><div><h3 class="title">
197<a name="boost_compute.advanced_topics.complex_values"></a><a class="link" href="advanced_topics.html#boost_compute.advanced_topics.complex_values" title="Complex Values">Complex
198      Values</a>
199</h3></div></div></div>
200<p>
201        While OpenCL itself doesn't natively support complex data types, the Boost
202        Compute library provides them.
203      </p>
204<p>
        To use complex values, first include the following header:
206      </p>
207<p>
208</p>
209<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">types</span><span class="special">/</span><span class="identifier">complex</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
210</pre>
211<p>
212      </p>
213<p>
214        A vector of complex values can be created like so:
215      </p>
216<p>
217</p>
218<pre class="programlisting"><span class="comment">// create vector on device</span>
219<span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">complex</span><span class="special">&lt;</span><span class="keyword">float</span><span class="special">&gt;</span> <span class="special">&gt;</span> <span class="identifier">vector</span><span class="special">;</span>
220
221<span class="comment">// insert two complex values</span>
222<span class="identifier">vector</span><span class="special">.</span><span class="identifier">push_back</span><span class="special">(</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">complex</span><span class="special">&lt;</span><span class="keyword">float</span><span class="special">&gt;(</span><span class="number">1.0f</span><span class="special">,</span> <span class="number">3.0f</span><span class="special">));</span>
223<span class="identifier">vector</span><span class="special">.</span><span class="identifier">push_back</span><span class="special">(</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">complex</span><span class="special">&lt;</span><span class="keyword">float</span><span class="special">&gt;(</span><span class="number">2.0f</span><span class="special">,</span> <span class="number">4.0f</span><span class="special">));</span>
224</pre>
225<p>
226      </p>
227</div>
228<div class="section">
229<div class="titlepage"><div><div><h3 class="title">
230<a name="boost_compute.advanced_topics.lambda_expressions"></a><a class="link" href="advanced_topics.html#boost_compute.advanced_topics.lambda_expressions" title="Lambda Expressions">Lambda
231      Expressions</a>
232</h3></div></div></div>
233<p>
234        The lambda expression framework allows for functions and predicates to be
235        defined at the call-site of an algorithm.
236      </p>
237<p>
238        Lambda expressions use the placeholders <code class="computeroutput"><span class="identifier">_1</span></code>
239        and <code class="computeroutput"><span class="identifier">_2</span></code> to indicate the arguments.
240        The following declarations will bring the lambda placeholders into the current
241        scope:
242      </p>
243<p>
244</p>
245<pre class="programlisting"><span class="keyword">using</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">lambda</span><span class="special">::</span><span class="identifier">_1</span><span class="special">;</span>
246<span class="keyword">using</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">lambda</span><span class="special">::</span><span class="identifier">_2</span><span class="special">;</span>
247</pre>
248<p>
249      </p>
250<p>
251        The following examples show how to use lambda expressions along with the
252        Boost.Compute algorithms to perform more complex operations on the device.
253      </p>
254<p>
255        To count the number of odd values in a vector:
256      </p>
257<p>
258</p>
259<pre class="programlisting"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">count_if</span><span class="special">(</span><span class="identifier">vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">vector</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">_1</span> <span class="special">%</span> <span class="number">2</span> <span class="special">==</span> <span class="number">1</span><span class="special">,</span> <span class="identifier">queue</span><span class="special">);</span>
260</pre>
261<p>
262      </p>
263<p>
264        To multiply each value in a vector by three and subtract four:
265      </p>
266<p>
267</p>
268<pre class="programlisting"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">transform</span><span class="special">(</span><span class="identifier">vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">vector</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">_1</span> <span class="special">*</span> <span class="number">3</span> <span class="special">-</span> <span class="number">4</span><span class="special">,</span> <span class="identifier">queue</span><span class="special">);</span>
269</pre>
270<p>
271      </p>
272<p>
273        Lambda expressions can also be used to create function&lt;&gt; objects:
274      </p>
275<p>
276</p>
277<pre class="programlisting"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">function</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">(</span><span class="keyword">int</span><span class="special">)&gt;</span> <span class="identifier">add_four</span> <span class="special">=</span> <span class="identifier">_1</span> <span class="special">+</span> <span class="number">4</span><span class="special">;</span>
278</pre>
279<p>
280      </p>
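<p>
        For comparison, the counting and transforming expressions above can be written
        on the host with standard-library algorithms and C++11 lambdas. This is a plain
        C++ sketch rather than Boost.Compute code, and the function names are
        hypothetical. It also makes one subtlety visible: in C++ the remainder of a
        negative odd integer divided by two is -1, so the test
        <code class="computeroutput">x % 2 == 1</code> does not count negative odd values.
        OpenCL C is assumed here to follow the same C99 rule for signed integers.
</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side counterpart of count_if(..., _1 % 2 == 1, queue).
// For negative odd inputs, x % 2 is -1 in C++, so they are not counted.
std::ptrdiff_t count_odd(const std::vector<int> &values)
{
    return std::count_if(values.begin(), values.end(),
                         [](int x) { return x % 2 == 1; });
}

// Variant that also counts negative odd values.
std::ptrdiff_t count_odd_signed(const std::vector<int> &values)
{
    return std::count_if(values.begin(), values.end(),
                         [](int x) { return x % 2 != 0; });
}

// Host-side counterpart of transform(..., _1 * 3 - 4, queue), applied in place.
void times_three_minus_four(std::vector<int> &values)
{
    std::transform(values.begin(), values.end(), values.begin(),
                   [](int x) { return x * 3 - 4; });
}
```

<p>
        If the input may contain negative values, a predicate such as
        <code class="computeroutput">_1 % 2 != 0</code> expresses "odd" without
        depending on the sign of the remainder.
</p>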
281</div>
282<div class="section">
283<div class="titlepage"><div><div><h3 class="title">
284<a name="boost_compute.advanced_topics.asynchronous_operations"></a><a class="link" href="advanced_topics.html#boost_compute.advanced_topics.asynchronous_operations" title="Asynchronous Operations">Asynchronous
285      Operations</a>
286</h3></div></div></div>
287<p>
        A major performance bottleneck in GPGPU applications is memory transfer.
        This can be alleviated by overlapping memory transfer with computation. The
        Boost Compute library provides the <code class="computeroutput"><a class="link" href="../boost/compute/copy_async.html" title="Function template copy_async">copy_async()</a></code>
        function, which performs asynchronous memory transfers between the host
        and the device.
293      </p>
294<p>
295        For example, to initiate a copy from the host to the device and then perform
296        other actions:
297      </p>
298<p>
299</p>
300<pre class="programlisting"><span class="comment">// data on the host</span>
301<span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="keyword">float</span><span class="special">&gt;</span> <span class="identifier">host_vector</span> <span class="special">=</span> <span class="special">...</span>
302
303<span class="comment">// create a vector on the device</span>
304<span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="keyword">float</span><span class="special">&gt;</span> <span class="identifier">device_vector</span><span class="special">(</span><span class="identifier">host_vector</span><span class="special">.</span><span class="identifier">size</span><span class="special">(),</span> <span class="identifier">context</span><span class="special">);</span>
305
306<span class="comment">// copy data to the device asynchronously</span>
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">future</span><span class="special">&lt;</span><span class="keyword">void</span><span class="special">&gt;</span> <span class="identifier">f</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">copy_async</span><span class="special">(</span>
    <span class="identifier">host_vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">host_vector</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">device_vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">queue</span>
<span class="special">);</span>

<span class="comment">// perform other work on the host or device</span>
<span class="comment">// ...</span>

<span class="comment">// ensure the copy is completed</span>
<span class="identifier">f</span><span class="special">.</span><span class="identifier">wait</span><span class="special">();</span>

<span class="comment">// use data on the device (e.g. sort)</span>
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">sort</span><span class="special">(</span><span class="identifier">device_vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">device_vector</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">queue</span><span class="special">);</span>
</pre>
<p>
      </p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_compute.advanced_topics.performance_timing"></a><a class="link" href="advanced_topics.html#boost_compute.advanced_topics.performance_timing" title="Performance Timing">Performance
      Timing</a>
</h3></div></div></div>
<p>
        To measure how long an operation takes on the device, create the command
        queue with profiling enabled and query the duration of the event associated
        with the operation. For example, to measure the time to copy a vector of
        data from the host to the device:
      </p>
<p>
</p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">vector</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">cstdlib</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">algorithm</span><span class="special">&gt;</span>

<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">event</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">system</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">algorithm</span><span class="special">/</span><span class="identifier">copy</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">async</span><span class="special">/</span><span class="identifier">future</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">container</span><span class="special">/</span><span class="identifier">vector</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>

<span class="keyword">namespace</span> <span class="identifier">compute</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">;</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="comment">// get the default device</span>
    <span class="identifier">compute</span><span class="special">::</span><span class="identifier">device</span> <span class="identifier">gpu</span> <span class="special">=</span> <span class="identifier">compute</span><span class="special">::</span><span class="identifier">system</span><span class="special">::</span><span class="identifier">default_device</span><span class="special">();</span>

    <span class="comment">// create context for default device</span>
    <span class="identifier">compute</span><span class="special">::</span><span class="identifier">context</span> <span class="identifier">context</span><span class="special">(</span><span class="identifier">gpu</span><span class="special">);</span>

    <span class="comment">// create command queue with profiling enabled</span>
    <span class="identifier">compute</span><span class="special">::</span><span class="identifier">command_queue</span> <span class="identifier">queue</span><span class="special">(</span>
        <span class="identifier">context</span><span class="special">,</span> <span class="identifier">gpu</span><span class="special">,</span> <span class="identifier">compute</span><span class="special">::</span><span class="identifier">command_queue</span><span class="special">::</span><span class="identifier">enable_profiling</span>
    <span class="special">);</span>

    <span class="comment">// generate random data on the host</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;</span> <span class="identifier">host_vector</span><span class="special">(</span><span class="number">16000000</span><span class="special">);</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">generate</span><span class="special">(</span><span class="identifier">host_vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">host_vector</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">rand</span><span class="special">);</span>

    <span class="comment">// create a vector on the device</span>
    <span class="identifier">compute</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;</span> <span class="identifier">device_vector</span><span class="special">(</span><span class="identifier">host_vector</span><span class="special">.</span><span class="identifier">size</span><span class="special">(),</span> <span class="identifier">context</span><span class="special">);</span>

    <span class="comment">// copy data from the host to the device</span>
    <span class="identifier">compute</span><span class="special">::</span><span class="identifier">future</span><span class="special">&lt;</span><span class="keyword">void</span><span class="special">&gt;</span> <span class="identifier">future</span> <span class="special">=</span> <span class="identifier">compute</span><span class="special">::</span><span class="identifier">copy_async</span><span class="special">(</span>
        <span class="identifier">host_vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">host_vector</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">device_vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">queue</span>
    <span class="special">);</span>

    <span class="comment">// wait for copy to finish</span>
    <span class="identifier">future</span><span class="special">.</span><span class="identifier">wait</span><span class="special">();</span>

    <span class="comment">// get elapsed time from event profiling information</span>
    <span class="identifier">boost</span><span class="special">::</span><span class="identifier">chrono</span><span class="special">::</span><span class="identifier">milliseconds</span> <span class="identifier">duration</span> <span class="special">=</span>
        <span class="identifier">future</span><span class="special">.</span><span class="identifier">get_event</span><span class="special">().</span><span class="identifier">duration</span><span class="special">&lt;</span><span class="identifier">boost</span><span class="special">::</span><span class="identifier">chrono</span><span class="special">::</span><span class="identifier">milliseconds</span><span class="special">&gt;();</span>

    <span class="comment">// print elapsed time in milliseconds</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="string">"time: "</span> <span class="special">&lt;&lt;</span> <span class="identifier">duration</span><span class="special">.</span><span class="identifier">count</span><span class="special">()</span> <span class="special">&lt;&lt;</span> <span class="string">" ms"</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>

    <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
      </p>
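<p>
        The millisecond count printed by the program above is an integer, so a
        transfer that takes less than one millisecond is reported as 0 ms. The
        sketch below is not part of Boost.Compute. It assumes that the start and
        end timestamps behind an event are nanosecond counters, as OpenCL profiling
        reports them, and uses two hypothetical helpers, elapsed_ms and whole_ms,
        to contrast a fractional conversion with a truncating one using std::chrono.
      </p>

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper (not a Boost.Compute function): convert a pair of
// nanosecond timestamps into fractional milliseconds.
double elapsed_ms(std::uint64_t start_ns, std::uint64_t end_ns)
{
    std::chrono::nanoseconds elapsed(
        static_cast<std::chrono::nanoseconds::rep>(end_ns - start_ns));

    // a floating-point representation keeps the sub-millisecond part
    return std::chrono::duration<double, std::milli>(elapsed).count();
}

// Hypothetical helper: the same interval as whole milliseconds,
// truncated toward zero by duration_cast.
long long whole_ms(std::uint64_t start_ns, std::uint64_t end_ns)
{
    std::chrono::nanoseconds elapsed(
        static_cast<std::chrono::nanoseconds::rep>(end_ns - start_ns));

    return std::chrono::duration_cast<std::chrono::milliseconds>(elapsed).count();
}
```

<p>
        For timestamps 1000000 and 3500000, elapsed_ms yields 2.5 while whole_ms
        yields 2, which is why short operations may be better reported in
        microseconds or as a floating-point value.
      </p>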
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_compute.advanced_topics.opencl_api_interoperability"></a><a class="link" href="advanced_topics.html#boost_compute.advanced_topics.opencl_api_interoperability" title="OpenCL API Interoperability">OpenCL
      API Interoperability</a>
</h3></div></div></div>
<p>
        Boost.Compute is designed to interoperate easily with the OpenCL API. Every
        wrapper class has a conversion operator to its underlying OpenCL type, so
        wrapper objects can be passed directly to OpenCL functions.
      </p>
<p>
        For example, a context object can be passed directly to clGetContextInfo():
</p>
<pre class="programlisting"><span class="comment">// create context object</span>
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">context</span> <span class="identifier">ctx</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">system</span><span class="special">::</span><span class="identifier">default_context</span><span class="special">();</span>

<span class="comment">// query number of devices using the OpenCL API</span>
<span class="identifier">cl_uint</span> <span class="identifier">num_devices</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span>
<span class="identifier">clGetContextInfo</span><span class="special">(</span><span class="identifier">ctx</span><span class="special">,</span> <span class="identifier">CL_CONTEXT_NUM_DEVICES</span><span class="special">,</span> <span class="keyword">sizeof</span><span class="special">(</span><span class="identifier">cl_uint</span><span class="special">),</span> <span class="special">&amp;</span><span class="identifier">num_devices</span><span class="special">,</span> <span class="number">0</span><span class="special">);</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="string">"num_devices: "</span> <span class="special">&lt;&lt;</span> <span class="identifier">num_devices</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
</pre>
<p>
      </p>
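<p>
        The conversion-operator mechanism described in this section is ordinary
        C++. The following is a simplified model, not Boost.Compute's actual
        implementation: raw_handle, raw_query_device_count and handle_wrapper are
        hypothetical stand-ins for an OpenCL handle type, a C API function and a
        wrapper class.
      </p>

```cpp
// Hypothetical stand-in for an opaque C handle such as cl_context.
struct raw_handle_t { int device_count; };
typedef raw_handle_t* raw_handle;

// Hypothetical stand-in for a C API function that takes the raw handle.
int raw_query_device_count(raw_handle h)
{
    return h->device_count;
}

// Simplified wrapper class (an assumption for illustration only).
class handle_wrapper
{
public:
    explicit handle_wrapper(raw_handle h) : m_handle(h) {}

    // explicit access to the underlying handle
    raw_handle get() const { return m_handle; }

    // implicit conversion to the underlying handle type
    operator raw_handle() const { return m_handle; }

private:
    raw_handle m_handle;
};

// The wrapper is accepted where the C function expects the raw handle
// because the conversion operator above is applied implicitly.
int count_devices(const handle_wrapper& w)
{
    return raw_query_device_count(w);
}
```

<p>
        The wrapper does not need to know which C functions exist; any function
        that accepts the raw handle also accepts the wrapper.
      </p>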
</div>
</div>
<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
<td align="left"></td>
<td align="right"><div class="copyright-footer">Copyright © 2013, 2014 Kyle Lutz<p>
        Distributed under the Boost Software License, Version 1.0. (See accompanying
        file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
      </p>
</div></td>
</tr></table>
</body>
</html>