<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Advanced Topics</title>
<link rel="stylesheet" href="../../../../../doc/src/boostbook.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets V1.79.1">
<link rel="home" href="../index.html" title="Chapter 1. Boost.Compute">
<link rel="up" href="../index.html" title="Chapter 1. Boost.Compute">
<link rel="prev" href="tutorial.html" title="Tutorial">
<link rel="next" href="interop.html" title="Interoperability">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<div class="section">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="boost_compute.advanced_topics"></a><a class="link" href="advanced_topics.html" title="Advanced Topics">Advanced Topics</a>
</h2></div></div></div>
<div class="toc"><dl class="toc">
<dt><span class="section"><a
href="advanced_topics.html#boost_compute.advanced_topics.vector_data_types">Vector Data Types</a></span></dt>
<dt><span class="section"><a href="advanced_topics.html#boost_compute.advanced_topics.custom_functions">Custom Functions</a></span></dt>
<dt><span class="section"><a href="advanced_topics.html#boost_compute.advanced_topics.custom_types">Custom Types</a></span></dt>
<dt><span class="section"><a href="advanced_topics.html#boost_compute.advanced_topics.complex_values">Complex Values</a></span></dt>
<dt><span class="section"><a href="advanced_topics.html#boost_compute.advanced_topics.lambda_expressions">Lambda Expressions</a></span></dt>
<dt><span class="section"><a href="advanced_topics.html#boost_compute.advanced_topics.asynchronous_operations">Asynchronous Operations</a></span></dt>
<dt><span class="section"><a href="advanced_topics.html#boost_compute.advanced_topics.performance_timing">Performance Timing</a></span></dt>
<dt><span class="section"><a href="advanced_topics.html#boost_compute.advanced_topics.opencl_api_interoperability">OpenCL API Interoperability</a></span></dt>
</dl></div>
<p>
    The following topics show advanced features of the Boost Compute library.
  </p>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_compute.advanced_topics.vector_data_types"></a><a class="link" href="advanced_topics.html#boost_compute.advanced_topics.vector_data_types" title="Vector Data Types">Vector Data Types</a>
</h3></div></div></div>
<p>
      In addition to the built-in scalar types (e.g. <code class="computeroutput"><span class="keyword">int</span></code>
      and <code class="computeroutput"><span class="keyword">float</span></code>), OpenCL also provides
      vector data types (e.g. <code class="computeroutput"><span class="identifier">int2</span></code>
      and <code class="computeroutput"><span class="identifier">float4</span></code>).
      These can be
      used with the Boost Compute library on both the host and device.
    </p>
<p>
      Boost.Compute provides typedefs for these types which take the form: <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">scalarN_</span></code> where <code class="computeroutput"><span class="identifier">scalar</span></code>
      is a scalar data type (e.g. <code class="computeroutput"><span class="keyword">int</span></code>,
      <code class="computeroutput"><span class="keyword">float</span></code>, <code class="computeroutput"><span class="keyword">char</span></code>)
      and <code class="computeroutput"><span class="identifier">N</span></code> is the size of the
      vector. Supported vector sizes are: 2, 4, 8, and 16.
    </p>
<p>
      The following example shows how to transfer a set of 3D points stored as
      an array of <code class="computeroutput"><span class="keyword">float</span></code>s on the host to
      the device and then calculate the sum of the point coordinates using the
      <code class="computeroutput"><a class="link" href="../boost/compute/accumulate.html" title="Function accumulate">accumulate()</a></code>
      function. The sum is transferred to the host and the centroid is computed by
      dividing by the total number of points.
    </p>
<p>
      Note that even though the points are in 3D, they are stored as <code class="computeroutput"><span class="identifier">float4</span></code> due to OpenCL's alignment requirements.
    </p>
<p>
</p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>

<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">algorithm</span><span class="special">/</span><span class="identifier">copy</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">algorithm</span><span class="special">/</span><span class="identifier">accumulate</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">container</span><span class="special">/</span><span class="identifier">vector</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">types</span><span class="special">/</span><span class="identifier">fundamental</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>

<span class="keyword">namespace</span> <span class="identifier">compute</span> <span class="special">=</span> <span
class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">;</span>

<span class="comment">// the point centroid example calculates and displays the</span>
<span class="comment">// centroid of a set of 3D points stored as float4's</span>
<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="keyword">using</span> <span class="identifier">compute</span><span class="special">::</span><span class="identifier">float4_</span><span class="special">;</span>

    <span class="comment">// get default device and setup context</span>
    <span class="identifier">compute</span><span class="special">::</span><span class="identifier">device</span> <span class="identifier">device</span> <span class="special">=</span> <span class="identifier">compute</span><span class="special">::</span><span class="identifier">system</span><span class="special">::</span><span class="identifier">default_device</span><span class="special">();</span>
    <span class="identifier">compute</span><span class="special">::</span><span class="identifier">context</span> <span class="identifier">context</span><span class="special">(</span><span class="identifier">device</span><span class="special">);</span>
    <span class="identifier">compute</span><span class="special">::</span><span class="identifier">command_queue</span> <span class="identifier">queue</span><span class="special">(</span><span class="identifier">context</span><span class="special">,</span> <span class="identifier">device</span><span class="special">);</span>

    <span class="comment">// point coordinates</span>
    <span class="keyword">float</span> <span class="identifier">points</span><span class="special">[]</span> <span class="special">=</span> <span class="special">{</span> <span class="number">1.0f</span><span class="special">,</span> <span
class="number">2.0f</span><span class="special">,</span> <span class="number">3.0f</span><span class="special">,</span> <span class="number">0.0f</span><span class="special">,</span>
                       <span class="special">-</span><span class="number">2.0f</span><span class="special">,</span> <span class="special">-</span><span class="number">3.0f</span><span class="special">,</span> <span class="number">4.0f</span><span class="special">,</span> <span class="number">0.0f</span><span class="special">,</span>
                       <span class="number">1.0f</span><span class="special">,</span> <span class="special">-</span><span class="number">2.0f</span><span class="special">,</span> <span class="number">2.5f</span><span class="special">,</span> <span class="number">0.0f</span><span class="special">,</span>
                       <span class="special">-</span><span class="number">7.0f</span><span class="special">,</span> <span class="special">-</span><span class="number">3.0f</span><span class="special">,</span> <span class="special">-</span><span class="number">2.0f</span><span class="special">,</span> <span class="number">0.0f</span><span class="special">,</span>
                       <span class="number">3.0f</span><span class="special">,</span> <span class="number">4.0f</span><span class="special">,</span> <span class="special">-</span><span class="number">5.0f</span><span class="special">,</span> <span class="number">0.0f</span> <span class="special">};</span>

    <span class="comment">// create vector for five points</span>
    <span class="identifier">compute</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="identifier">float4_</span><span class="special">&gt;</span> <span class="identifier">vector</span><span class="special">(</span><span class="number">5</span><span class="special">,</span> <span class="identifier">context</span><span class="special">);</span>

    <span class="comment">// copy point data to the device</span>
    <span
class="identifier">compute</span><span class="special">::</span><span class="identifier">copy</span><span class="special">(</span>
        <span class="keyword">reinterpret_cast</span><span class="special">&lt;</span><span class="identifier">float4_</span> <span class="special">*&gt;(</span><span class="identifier">points</span><span class="special">),</span>
        <span class="keyword">reinterpret_cast</span><span class="special">&lt;</span><span class="identifier">float4_</span> <span class="special">*&gt;(</span><span class="identifier">points</span><span class="special">)</span> <span class="special">+</span> <span class="number">5</span><span class="special">,</span>
        <span class="identifier">vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span>
        <span class="identifier">queue</span>
    <span class="special">);</span>

    <span class="comment">// calculate sum</span>
    <span class="identifier">float4_</span> <span class="identifier">sum</span> <span class="special">=</span> <span class="identifier">compute</span><span class="special">::</span><span class="identifier">accumulate</span><span class="special">(</span>
        <span class="identifier">vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">vector</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">float4_</span><span class="special">(</span><span class="number">0</span><span class="special">,</span> <span class="number">0</span><span class="special">,</span> <span class="number">0</span><span class="special">,</span> <span class="number">0</span><span class="special">),</span> <span class="identifier">queue</span>
    <span class="special">);</span>

    <span class="comment">// calculate centroid</span>
    <span class="identifier">float4_</span> <span
class="identifier">centroid</span><span class="special">;</span>
    <span class="keyword">for</span><span class="special">(</span><span class="identifier">size_t</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span> <span class="identifier">i</span> <span class="special">&lt;</span> <span class="number">3</span><span class="special">;</span> <span class="identifier">i</span><span class="special">++){</span>
        <span class="identifier">centroid</span><span class="special">[</span><span class="identifier">i</span><span class="special">]</span> <span class="special">=</span> <span class="identifier">sum</span><span class="special">[</span><span class="identifier">i</span><span class="special">]</span> <span class="special">/</span> <span class="number">5.0f</span><span class="special">;</span>
    <span class="special">}</span>

    <span class="comment">// print centroid</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="string">"centroid: "</span> <span class="special">&lt;&lt;</span> <span class="identifier">centroid</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>

    <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
    </p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_compute.advanced_topics.custom_functions"></a><a class="link" href="advanced_topics.html#boost_compute.advanced_topics.custom_functions" title="Custom Functions">Custom Functions</a>
</h3></div></div></div>
<p>
      The OpenCL runtime and the Boost Compute library provide a number of built-in
      functions such as sqrt() and
      dot(), but these are often not sufficient
      for solving the problem at hand.
    </p>
<p>
      The Boost Compute library provides a few different ways to create custom
      functions that can be passed to the provided algorithms such as <code class="computeroutput"><a class="link" href="../boost/compute/transform.html" title="Function transform">transform()</a></code> and <code class="computeroutput"><a class="link" href="../boost/compute/reduce.html" title="Function reduce">reduce()</a></code>.
    </p>
<p>
      The most basic method is to provide the raw source code for a function:
    </p>
<p>
</p>
<pre class="programlisting"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">function</span><span class="special">&lt;</span><span class="keyword">int</span> <span class="special">(</span><span class="keyword">int</span><span class="special">)&gt;</span> <span class="identifier">add_four</span> <span class="special">=</span>
    <span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">make_function_from_source</span><span class="special">&lt;</span><span class="keyword">int</span> <span class="special">(</span><span class="keyword">int</span><span class="special">)&gt;(</span>
        <span class="string">"add_four"</span><span class="special">,</span>
        <span class="string">"int add_four(int x) { return x + 4; }"</span>
    <span class="special">);</span>

<span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">transform</span><span class="special">(</span><span class="identifier">input</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">input</span><span
class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">output</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">add_four</span><span class="special">,</span> <span class="identifier">queue</span><span class="special">);</span>
</pre>
<p>
    </p>
<p>
      This can also be done more succinctly using the <code class="computeroutput">BOOST_COMPUTE_FUNCTION</code>
      macro:
</p>
<pre class="programlisting"><span class="identifier">BOOST_COMPUTE_FUNCTION</span><span class="special">(</span><span class="keyword">int</span><span class="special">,</span> <span class="identifier">add_four</span><span class="special">,</span> <span class="special">(</span><span class="keyword">int</span> <span class="identifier">x</span><span class="special">),</span>
<span class="special">{</span>
    <span class="keyword">return</span> <span class="identifier">x</span> <span class="special">+</span> <span class="number">4</span><span class="special">;</span>
<span class="special">});</span>

<span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">transform</span><span class="special">(</span><span class="identifier">input</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">output</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">add_four</span><span class="special">,</span> <span class="identifier">queue</span><span class="special">);</span>
</pre>
<p>
    </p>
<p>
      Also see <a
href="http://kylelutz.blogspot.com/2014/03/custom-opencl-functions-in-c-with.html" target="_top">"Custom
      OpenCL functions in C++ with Boost.Compute"</a> for more details.
    </p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_compute.advanced_topics.custom_types"></a><a class="link" href="advanced_topics.html#boost_compute.advanced_topics.custom_types" title="Custom Types">Custom Types</a>
</h3></div></div></div>
<p>
      Boost.Compute provides the <code class="computeroutput">BOOST_COMPUTE_ADAPT_STRUCT</code>
      macro which allows a C++ struct/class to be wrapped and used in OpenCL.
    </p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_compute.advanced_topics.complex_values"></a><a class="link" href="advanced_topics.html#boost_compute.advanced_topics.complex_values" title="Complex Values">Complex Values</a>
</h3></div></div></div>
<p>
      While OpenCL itself doesn't natively support complex data types, the Boost
      Compute library provides them.
    </p>
<p>
      To use complex values, first include the following header:
    </p>
<p>
</p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">types</span><span class="special">/</span><span class="identifier">complex</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
</pre>
<p>
    </p>
<p>
      A vector of complex values can be created like so:
    </p>
<p>
</p>
<pre class="programlisting"><span class="comment">// create vector on device</span>
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">complex</span><span class="special">&lt;</span><span class="keyword">float</span><span class="special">&gt;</span> <span class="special">&gt;</span> <span class="identifier">vector</span><span class="special">;</span>

<span class="comment">// insert two complex values</span>
<span class="identifier">vector</span><span class="special">.</span><span class="identifier">push_back</span><span class="special">(</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">complex</span><span class="special">&lt;</span><span class="keyword">float</span><span class="special">&gt;(</span><span class="number">1.0f</span><span class="special">,</span> <span class="number">3.0f</span><span class="special">));</span>
<span class="identifier">vector</span><span class="special">.</span><span class="identifier">push_back</span><span class="special">(</span><span class="identifier">std</span><span
class="special">::</span><span class="identifier">complex</span><span class="special">&lt;</span><span class="keyword">float</span><span class="special">&gt;(</span><span class="number">2.0f</span><span class="special">,</span> <span class="number">4.0f</span><span class="special">));</span>
</pre>
<p>
    </p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_compute.advanced_topics.lambda_expressions"></a><a class="link" href="advanced_topics.html#boost_compute.advanced_topics.lambda_expressions" title="Lambda Expressions">Lambda Expressions</a>
</h3></div></div></div>
<p>
      The lambda expression framework allows for functions and predicates to be
      defined at the call-site of an algorithm.
    </p>
<p>
      Lambda expressions use the placeholders <code class="computeroutput"><span class="identifier">_1</span></code>
      and <code class="computeroutput"><span class="identifier">_2</span></code> to indicate the arguments.
      The following declarations will bring the lambda placeholders into the current
      scope:
    </p>
<p>
</p>
<pre class="programlisting"><span class="keyword">using</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">lambda</span><span class="special">::</span><span class="identifier">_1</span><span class="special">;</span>
<span class="keyword">using</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">lambda</span><span class="special">::</span><span class="identifier">_2</span><span class="special">;</span>
</pre>
<p>
    </p>
<p>
      The following examples show how to use lambda expressions along with the
      Boost.Compute algorithms to perform more complex operations on the device.
    </p>
<p>
      To count the number of odd values in a vector:
    </p>
<p>
</p>
<pre class="programlisting"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">count_if</span><span class="special">(</span><span class="identifier">vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">vector</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">_1</span> <span class="special">%</span> <span class="number">2</span> <span class="special">==</span> <span class="number">1</span><span class="special">,</span> <span class="identifier">queue</span><span class="special">);</span>
</pre>
<p>
    </p>
<p>
      To multiply each value in a vector by three and subtract four:
    </p>
<p>
</p>
<pre class="programlisting"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">transform</span><span class="special">(</span><span class="identifier">vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">vector</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">_1</span> <span class="special">*</span> <span class="number">3</span> <span class="special">-</span> <span class="number">4</span><span class="special">,</span> <span class="identifier">queue</span><span class="special">);</span>
</pre>
<p>
    </p>
<p>
      Lambda expressions can also be used to create function&lt;&gt; objects:
    </p>
<p>
</p>
<pre class="programlisting"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">function</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">(</span><span class="keyword">int</span><span class="special">)&gt;</span> <span class="identifier">add_four</span> <span class="special">=</span> <span class="identifier">_1</span> <span class="special">+</span> <span class="number">4</span><span class="special">;</span>
</pre>
<p>
    </p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_compute.advanced_topics.asynchronous_operations"></a><a class="link" href="advanced_topics.html#boost_compute.advanced_topics.asynchronous_operations" title="Asynchronous Operations">Asynchronous Operations</a>
</h3></div></div></div>
<p>
      A major performance bottleneck in GPGPU applications is memory transfer.
      This can be alleviated by overlapping memory transfer with computation. The
      Boost Compute library provides the <code class="computeroutput"><a class="link" href="../boost/compute/copy_async.html" title="Function template copy_async">copy_async()</a></code>
      function, which performs asynchronous memory transfers between the host
      and the device.
    </p>
<p>
      For example, to initiate a copy from the host to the device and then perform
      other actions:
    </p>
<p>
</p>
<pre class="programlisting"><span class="comment">// data on the host</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="keyword">float</span><span class="special">&gt;</span> <span class="identifier">host_vector</span> <span class="special">=</span> <span class="special">...</span>

<span class="comment">// create a vector on the device</span>
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="keyword">float</span><span class="special">&gt;</span> <span class="identifier">device_vector</span><span class="special">(</span><span class="identifier">host_vector</span><span class="special">.</span><span class="identifier">size</span><span class="special">(),</span> <span class="identifier">context</span><span class="special">);</span>

<span class="comment">// copy data to the device asynchronously</span>
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">future</span><span class="special">&lt;</span><span class="keyword">void</span><span class="special">&gt;</span> <span class="identifier">f</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">copy_async</span><span class="special">(</span>
    <span class="identifier">host_vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">host_vector</span><span
class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">device_vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">queue</span>
<span class="special">);</span>

<span class="comment">// perform other work on the host or device</span>
<span class="comment">// ...</span>

<span class="comment">// ensure the copy is completed</span>
<span class="identifier">f</span><span class="special">.</span><span class="identifier">wait</span><span class="special">();</span>

<span class="comment">// use data on the device (e.g. sort)</span>
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">sort</span><span class="special">(</span><span class="identifier">device_vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">device_vector</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">queue</span><span class="special">);</span>
</pre>
<p>
    </p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_compute.advanced_topics.performance_timing"></a><a class="link" href="advanced_topics.html#boost_compute.advanced_topics.performance_timing" title="Performance Timing">Performance Timing</a>
</h3></div></div></div>
<p>
      To time operations on the device, create the command queue with profiling
      enabled. For example, to measure the time to copy a vector of data from the host to
      the device:
    </p>
<p>
</p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">vector</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span
class="special">&lt;</span><span class="identifier">cstdlib</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">algorithm</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>

<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">event</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">system</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">algorithm</span><span class="special">/</span><span class="identifier">copy</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span class="identifier">async</span><span class="special">/</span><span class="identifier">future</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">compute</span><span class="special">/</span><span
class="identifier">container</span><span class="special">/</span><span class="identifier">vector</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>

<span class="keyword">namespace</span> <span class="identifier">compute</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">;</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="comment">// get the default device</span>
    <span class="identifier">compute</span><span class="special">::</span><span class="identifier">device</span> <span class="identifier">gpu</span> <span class="special">=</span> <span class="identifier">compute</span><span class="special">::</span><span class="identifier">system</span><span class="special">::</span><span class="identifier">default_device</span><span class="special">();</span>

    <span class="comment">// create context for default device</span>
    <span class="identifier">compute</span><span class="special">::</span><span class="identifier">context</span> <span class="identifier">context</span><span class="special">(</span><span class="identifier">gpu</span><span class="special">);</span>

    <span class="comment">// create command queue with profiling enabled</span>
    <span class="identifier">compute</span><span class="special">::</span><span class="identifier">command_queue</span> <span class="identifier">queue</span><span class="special">(</span>
        <span class="identifier">context</span><span class="special">,</span> <span class="identifier">gpu</span><span class="special">,</span> <span class="identifier">compute</span><span class="special">::</span><span class="identifier">command_queue</span><span class="special">::</span><span class="identifier">enable_profiling</span>
    <span
class="special">);</span>

    <span class="comment">// generate random data on the host</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;</span> <span class="identifier">host_vector</span><span class="special">(</span><span class="number">16000000</span><span class="special">);</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">generate</span><span class="special">(</span><span class="identifier">host_vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">host_vector</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">rand</span><span class="special">);</span>

    <span class="comment">// create a vector on the device</span>
    <span class="identifier">compute</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;</span> <span class="identifier">device_vector</span><span class="special">(</span><span class="identifier">host_vector</span><span class="special">.</span><span class="identifier">size</span><span class="special">(),</span> <span class="identifier">context</span><span class="special">);</span>

    <span class="comment">// copy data from the host to the device</span>
    <span class="identifier">compute</span><span class="special">::</span><span class="identifier">future</span><span class="special">&lt;</span><span class="keyword">void</span><span class="special">&gt;</span> <span class="identifier">future</span> <span class="special">=</span> <span class="identifier">compute</span><span class="special">::</span><span class="identifier">copy_async</span><span class="special">(</span>
        <span class="identifier">host_vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">host_vector</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">device_vector</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">queue</span>
    <span class="special">);</span>

    <span class="comment">// wait for copy to finish</span>
    <span class="identifier">future</span><span class="special">.</span><span class="identifier">wait</span><span class="special">();</span>

    <span class="comment">// get elapsed time from event profiling information</span>
    <span class="identifier">boost</span><span class="special">::</span><span class="identifier">chrono</span><span class="special">::</span><span class="identifier">milliseconds</span> <span class="identifier">duration</span> <span class="special">=</span>
        <span class="identifier">future</span><span class="special">.</span><span class="identifier">get_event</span><span class="special">().</span><span class="identifier">duration</span><span class="special">&lt;</span><span class="identifier">boost</span><span class="special">::</span><span class="identifier">chrono</span><span class="special">::</span><span class="identifier">milliseconds</span><span class="special">&gt;();</span>

    <span class="comment">// print elapsed time in milliseconds</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="string">"time: "</span> <span class="special">&lt;&lt;</span> <span class="identifier">duration</span><span class="special">.</span><span class="identifier">count</span><span class="special">()</span> <span class="special">&lt;&lt;</span> <span class="string">" ms"</span> <span
class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>

    <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
    </p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_compute.advanced_topics.opencl_api_interoperability"></a><a class="link" href="advanced_topics.html#boost_compute.advanced_topics.opencl_api_interoperability" title="OpenCL API Interoperability">OpenCL
      API Interoperability</a>
</h3></div></div></div>
<p>
      The Boost Compute library is designed to easily interoperate with the OpenCL
      API. All of the wrapped classes have conversion operators to their underlying
      OpenCL types, which allows them to be passed directly to OpenCL functions.
    </p>
<p>
      For example, a <code class="computeroutput"><span class="identifier">context</span></code>
      object can be passed directly to <code class="computeroutput"><span class="identifier">clGetContextInfo</span><span class="special">()</span></code>
      to query the number of devices:
</p>
<pre class="programlisting"><span class="comment">// create context object</span>
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">context</span> <span class="identifier">ctx</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">compute</span><span class="special">::</span><span class="identifier">system</span><span class="special">::</span><span class="identifier">default_context</span><span class="special">();</span>

<span class="comment">// query number of devices using the OpenCL API</span>
<span class="identifier">cl_uint</span> <span class="identifier">num_devices</span><span class="special">;</span>
<span class="identifier">clGetContextInfo</span><span class="special">(</span><span class="identifier">ctx</span><span class="special">,</span> <span class="identifier">CL_CONTEXT_NUM_DEVICES</span><span class="special">,</span> <span
class="keyword">sizeof</span><span class="special">(</span><span class="identifier">cl_uint</span><span class="special">),</span> <span class="special">&amp;</span><span class="identifier">num_devices</span><span class="special">,</span> <span class="number">0</span><span class="special">);</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="string">"num_devices: "</span> <span class="special">&lt;&lt;</span> <span class="identifier">num_devices</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
</pre>
<p>
    </p>
</div>
</div>
<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
<td align="left"></td>
<td align="right"><div class="copyright-footer">Copyright © 2013, 2014 Kyle Lutz<p>
        Distributed under the Boost Software License, Version 1.0. (See accompanying
        file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
      </p>
</div></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="tutorial.html"><img src="../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../index.html"><img src="../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="interop.html"><img src="../../../../../doc/src/images/next.png" alt="Next"></a>
</div>
</body>
</html>