• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1Tutorial: Image Gradient
2========================
3
4.. contents::
5   :local:
6   :depth: 1
7
8This comprehensive (and long) tutorial will walk you through an example of
9using GIL to compute the image gradients.
10
11We will start with some very simple and non-generic code and make it more
12generic as we go along.  Let us start with a horizontal gradient and use the
13simplest possible approximation to a gradient - central difference.
14
15The gradient at pixel x can be approximated with the half-difference of its
16two neighboring pixels::
17
18  D[x] = (I[x-1] - I[x+1]) / 2
19
20For simplicity, we will also ignore the boundary cases - the pixels along the
21edges of the image for which one of the neighbors is not defined. The focus of
22this document is how to use GIL, not how to create a good gradient generation
23algorithm.
24
25Interface and Glue Code
26-----------------------
27
28Let us first start with 8-bit unsigned grayscale image as the input and 8-bit
29signed grayscale image as the output.
30
31Here is how the interface to our algorithm looks like:
32
33.. code-block:: cpp
34
35  #include <boost/gil.hpp>
36  using namespace boost::gil;
37
38  void x_gradient(gray8c_view_t const& src, gray8s_view_t const& dst)
39  {
40    assert(src.dimensions() == dst.dimensions());
41    ...    // compute the gradient
42  }
43
44``gray8c_view_t`` is the type of the source image view - an 8-bit grayscale
45view, whose pixels are read-only (denoted by the "c").
46
47The output is a grayscale view with a 8-bit signed (denoted by the "s")
48integer channel type. See Appendix 1 for the complete convention GIL uses to
49name concrete types.
50
51GIL makes a distinction between an image and an image view.
52A GIL **image view**, is a shallow, lightweight view of a rectangular grid of
53pixels. It provides access to the pixels but does not own the pixels.
54Copy-constructing a view does not deep-copy the pixels. Image views do not
55propagate their constness to the pixels and should always be taken by a const
56reference. Whether a view is mutable or read-only (immutable) is a property of
57the view type.
58
59A GIL `image`, on the other hand, is a view with associated ownership.
60It is a container of pixels; its constructor/destructor allocates/deallocates
61the pixels, its copy-constructor performs deep-copy of the pixels and its
62``operator==`` performs deep-compare of the pixels. Images also propagate
63their constness to their pixels - a constant reference to an image will not
64allow for modifying its pixels.
65
66Most GIL algorithms operate on image views; images are rarely
67needed. GIL's design is very similar to that of the STL. The STL
68equivalent of GIL's image is a container, like ``std::vector``,
69whereas GIL's image view corresponds to STL range, which is often
70represented with a pair of iterators. STL algorithms operate on
71ranges, just like GIL algorithms operate on image views.
72
73GIL's image views can be constructed from raw data - the dimensions,
74the number of bytes per row and the pixels, which for chunky views are
75represented with one pointer. Here is how to provide the glue between
76your code and GIL:
77
78.. code-block:: cpp
79
80  void ComputeXGradientGray8(
81      unsigned char const* src_pixels, ptrdiff_t src_row_bytes,
82      int w, int h,
83      signed char* dst_pixels, ptrdiff_t dst_row_bytes)
84  {
85    gray8c_view_t src = interleaved_view(w, h, (gray8_pixel_t const*)src_pixels, src_row_bytes);
86    gray8s_view_t dst = interleaved_view(w, h, (gray8s_pixel_t*)dst_pixels, dst_row_bytes);
87    x_gradient(src, dst);
88  }
89
90This glue code is very fast and views are lightweight - in the above example
91the views have a size of 16 bytes. They consist of a pointer to the top left
92pixel and three integers - the width, height, and number of bytes per row.
93
94First Implementation
95--------------------
96
97Focusing on simplicity at the expense of speed, we can compute the horizontal
98gradient like this:
99
100.. code-block:: cpp
101
102  void x_gradient(gray8c_view_t const& src, gray8s_view_t const& dst)
103  {
104    for (int y = 0; y < src.height(); ++y)
105        for (int x = 1; x < src.width() - 1; ++x)
106            dst(x, y) = (src(x-1, y) - src(x+1, y)) / 2;
107  }
108
109We use image view's ``operator(x,y)`` to get a reference to the pixel at a
110given location and we set it to the half-difference of its left and right
111neighbors.  ``operator()`` returns a reference to a grayscale pixel.
112A grayscale pixel is convertible to its channel type (``unsigned char`` for
113``src``) and it can be copy-constructed from a channel.
114(This is only true for grayscale pixels).
115
116While the above code is easy to read, it is not very fast, because the binary
117``operator()`` computes the location of the pixel in a 2D grid, which involves
118addition and multiplication. Here is a faster version of the above:
119
120.. code-block:: cpp
121
122  void x_gradient(gray8c_view_t const& src, gray8s_view_t const& dst)
123  {
124    for (int y = 0; y < src.height(); ++y)
125    {
126        gray8c_view_t::x_iterator src_it = src.row_begin(y);
127        gray8s_view_t::x_iterator dst_it = dst.row_begin(y);
128
129        for (int x=1; x < src.width() - 1; ++x)
130            dst_it[x] = (src_it[x-1] - src_it[x+1]) / 2;
131    }
132  }
133
134We use pixel iterators initialized at the beginning of each row. GIL's
135iterators are Random Access Traversal iterators. If you are not
136familiar with random access iterators, think of them as if they were
137pointers. In fact, in the above example the two iterator types are raw
138C pointers and their ``operator[]`` is a fast pointer indexing
139operator.
140
141The code to compute gradient in the vertical direction is very
142similar:
143
144.. code-block: cpp
145
146  void y_gradient(gray8c_view_t const& src, gray8s_view_t const& dst)
147  {
148    for (int x = 0; x < src.width(); ++x)
149    {
150        gray8c_view_t::y_iterator src_it = src.col_begin(x);
151        gray8s_view_t::y_iterator dst_it = dst.col_begin(x);
152
153        for (int y = 1; y < src.height() - 1; ++y)
154            dst_it[y] = (src_it[y-1] - src_it[y+1]) / 2;
155    }
156  }
157
158Instead of looping over the rows, we loop over each column and create a
159``y_iterator``, an iterator moving vertically. In this case a simple pointer
160cannot be used because the distance between two adjacent pixels equals the
161number of bytes in each row of the image. GIL uses here a special step
162iterator class whose size is 8 bytes - it contains a raw C pointer and a step.
163Its ``operator[]`` multiplies the index by its step.
164
165The above version of ``y_gradient``, however, is much slower (easily an order
166of magnitude slower) than ``x_gradient`` because of the memory access pattern;
167traversing an image vertically results in lots of cache misses. A much more
168efficient and cache-friendly version will iterate over the columns in the inner
169loop:
170
171.. code-block:: cpp
172
173  void y_gradient(gray8c_view_t const& src, gray8s_view_t const& dst)
174  {
175    for (int y = 1; y < src.height() - 1; ++y)
176    {
177        gray8c_view_t::x_iterator src1_it = src.row_begin(y-1);
178        gray8c_view_t::x_iterator src2_it = src.row_begin(y+1);
179        gray8s_view_t::x_iterator dst_it = dst.row_begin(y);
180
181        for (int x = 0; x < src.width(); ++x)
182        {
183            *dst_it = ((*src1_it) - (*src2_it)) / 2;
184            ++dst_it;
185            ++src1_it;
186            ++src2_it;
187        }
188    }
189  }
190
191This sample code also shows an alternative way of using pixel iterators -
192instead of ``operator[]`` one could use increments and dereferences.
193
194Using Locators
195--------------
196
197Unfortunately this cache-friendly version requires the extra hassle of
198maintaining two separate iterators in the source view. For every pixel, we
199want to access its neighbors above and below it. Such relative access can be
200done with GIL locators:
201
202.. code-block:: cpp
203
204  void y_gradient(gray8c_view_t const& src, gray8s_view_t const& dst)
205  {
206    gray8c_view_t::xy_locator src_loc = src.xy_at(0,1);
207    for (int y = 1; y < src.height() - 1; ++y)
208    {
209        gray8s_view_t::x_iterator dst_it  = dst.row_begin(y);
210
211        for (int x = 0; x < src.width(); ++x)
212    {
213            (*dst_it) = (src_loc(0,-1) - src_loc(0,1)) / 2;
214            ++dst_it;
215            ++src_loc.x(); // each dimension can be advanced separately
216        }
217        src_loc+=point<std::ptrdiff_t>(-src.width(), 1); // carriage return
218    }
219  }
220
221The first line creates a locator pointing to the first pixel of the
222second row of the source view. A GIL pixel locator is very similar to
223an iterator, except that it can move both horizontally and
224vertically. ``src_loc.x()`` and ``src_loc.y()`` return references to a
225horizontal and a vertical iterator respectively, which can be used to
226move the locator along the desired dimension, as shown
227above. Additionally, the locator can be advanced in both dimensions
228simultaneously using its ``operator+=`` and ``operator-=``. Similar to
229image views, locators provide binary ``operator()`` which returns a
230reference to a pixel with a relative offset to the current locator
231position. For example, ``src_loc(0,1)`` returns a reference to the
232neighbor below the current pixel.  Locators are very lightweight
233objects - in the above example the locator has a size of 8 bytes - it
234consists of a raw pointer to the current pixel and an int indicating
235the number of bytes from one row to the next (which is the step when
236moving vertically). The call to ``++src_loc.x()`` corresponds to a
237single C pointer increment.  However, the example above performs more
238computations than necessary. The code ``src_loc(0,1)`` has to compute
239the offset of the pixel in two dimensions, which is slow.  Notice
240though that the offset of the two neighbors is the same, regardless of
241the pixel location. To improve the performance, GIL can cache and
242reuse this offset::
243
244  void y_gradient(gray8c_view_t const& src, gray8s_view_t const& dst)
245  {
246    gray8c_view_t::xy_locator src_loc = src.xy_at(0,1);
247    gray8c_view_t::xy_locator::cached_location_t above = src_loc.cache_location(0,-1);
248    gray8c_view_t::xy_locator::cached_location_t below = src_loc.cache_location(0, 1);
249
250    for (int y = 1; y < src.height() - 1; ++y)
251    {
252        gray8s_view_t::x_iterator dst_it = dst.row_begin(y);
253
254        for (int x = 0; x < src.width(); ++x)
255    {
256            (*dst_it) = (src_loc[above] - src_loc[below]) / 2;
257            ++dst_it;
258            ++src_loc.x();
259        }
260        src_loc+=point<std::ptrdiff_t>(-src.width(), 1);
261    }
262  }
263
264In this example ``src_loc[above]`` corresponds to a fast pointer indexing
265operation and the code is efficient.
266
267Creating a Generic Version of GIL Algorithms
268--------------------------------------------
269
270Let us make our ``x_gradient`` more generic. It should work with any image
271views, as long as they have the same number of channels. The gradient
272operation is to be computed for each channel independently.
273
274Here is how the new interface looks like:
275
276.. code-block:: cpp
277
278  template <typename SrcView, typename DstView>
279  void x_gradient(const SrcView& src, const DstView& dst)
280  {
281    gil_function_requires<ImageViewConcept<SrcView> >();
282    gil_function_requires<MutableImageViewConcept<DstView> >();
283    gil_function_requires
284    <
285      ColorSpacesCompatibleConcept
286      <
287        typename color_space_type<SrcView>::type,
288        typename color_space_type<DstView>::type
289      >
290    >();
291
292    ... // compute the gradient
293  }
294
295The new algorithm now takes the types of the input and output image
296views as template parameters.  That allows using both built-in GIL
297image views, as well as any user-defined image view classes.  The
298first three lines are optional; they use ``boost::concept_check`` to
299ensure that the two arguments are valid GIL image views, that the
300second one is mutable and that their color spaces are compatible
301(i.e. have the same set of channels).
302
303GIL does not require using its own built-in constructs. You are free
304to use your own channels, color spaces, iterators, locators, views and
305images.  However, to work with the rest of GIL they have to satisfy a
306set of requirements; in other words, they have to \e model the
307corresponding GIL _concept_.  GIL's concepts are defined in the user
308guide.
309
310One of the biggest drawbacks of using templates and generic
311programming in C++ is that compile errors can be very difficult to
312comprehend.  This is a side-effect of the lack of early type
313checking - a generic argument may not satisfy the requirements of a
314function, but the incompatibility may be triggered deep into a nested
315call, in code unfamiliar and hardly related to the problem.  GIL uses
316``boost::concept_check`` to mitigate this problem. The above three
317lines of code check whether the template parameters are valid models
318of their corresponding concepts.  If a model is incorrect, the compile
319error will be inside ``gil_function_requires``, which is much closer
320to the problem and easier to track. Furthermore, such checks get
321compiled out and have zero performance overhead. The disadvantage of
322using concept checks is the sometimes severe impact they have on
323compile time. This is why GIL performs concept checks only in debug
324mode, and only if ``BOOST_GIL_USE_CONCEPT_CHECK`` is defined (off by
325default).
326
327The body of the generic function is very similar to that of the
328concrete one. The biggest difference is that we need to loop over the
329channels of the pixel and compute the gradient for each channel:
330
331.. code-block:: cpp
332
333  template <typename SrcView, typename DstView>
334  void x_gradient(const SrcView& src, const DstView& dst)
335  {
336    for (int y=0; y < src.height(); ++y)
337    {
338        typename SrcView::x_iterator src_it = src.row_begin(y);
339        typename DstView::x_iterator dst_it = dst.row_begin(y);
340
341        for (int x = 1; x < src.width() - 1; ++x)
342            for (int c = 0; c < num_channels<SrcView>::value; ++c)
343                dst_it[x][c] = (src_it[x-1][c]- src_it[x+1][c]) / 2;
344    }
345  }
346
347Having an explicit loop for each channel could be a performance problem.
348GIL allows us to abstract out such per-channel operations:
349
350.. code-block:: cpp
351
352  template <typename Out>
353  struct halfdiff_cast_channels
354  {
355    template <typename T> Out operator()(T const& in1, T const& in2) const
356    {
357        return Out((in1 - in2) / 2);
358    }
359  };
360
361  template <typename SrcView, typename DstView>
362  void x_gradient(const SrcView& src, const DstView& dst)
363  {
364    typedef typename channel_type<DstView>::type dst_channel_t;
365
366    for (int y=0; y < src.height(); ++y)
367    {
368        typename SrcView::x_iterator src_it = src.row_begin(y);
369        typename DstView::x_iterator dst_it = dst.row_begin(y);
370
371        for (int x=1; x < src.width() - 1; ++x)
372        {
373            static_transform(src_it[x-1], src_it[x+1], dst_it[x],
374                halfdiff_cast_channels<dst_channel_t>());
375        }
376    }
377  }
378
379The ``static_transform`` is an example of a channel-level GIL algorithm.
380Other such algorithms are ``static_generate``, ``static_fill`` and
381``static_for_each``. They are the channel-level equivalents of STL
382``generate``, ``transform``, ``fill`` and ``for_each`` respectively.
383GIL channel algorithms use static recursion to unroll the loops; they never
384loop over the channels explicitly.
385
386Note that sometimes modern compilers (at least Visual Studio 8) already unroll
387channel-level loops, such as the one above. However, another advantage of
388using GIL's channel-level algorithms is that they pair the channels
389semantically, not based on their order in memory. For example, the above
390example will properly match an RGB source with a BGR destination.
391
392Here is how we can use our generic version with images of different types:
393
394.. code-block:: cpp
395
396  // Calling with 16-bit grayscale data
397  void XGradientGray16_Gray32(
398      unsigned short const* src_pixels, ptrdiff_t src_row_bytes,
399      int w, int h,
400      signed int* dst_pixels, ptrdiff_t dst_row_bytes)
401  {
402    gray16c_view_t src=interleaved_view(w, h, (gray16_pixel_t const*)src_pixels, src_row_bytes);
403    gray32s_view_t dst=interleaved_view(w, h, (gray32s_pixel_t*)dst_pixels, dst_row_bytes);
404    x_gradient(src,dst);
405  }
406
407  // Calling with 8-bit RGB data into 16-bit BGR
408  void XGradientRGB8_BGR16(
409      unsigned char const* src_pixels, ptrdiff_t src_row_bytes,
410      int w, int h,
411      signed short* dst_pixels, ptrdiff_t dst_row_bytes)
412  {
413    rgb8c_view_t  src = interleaved_view(w, h, (rgb8_pixel_t const*)src_pixels, src_row_bytes);
414    bgr16s_view_t dst = interleaved_view(w, h, (bgr16s_pixel_t*)dst_pixels, dst_row_bytes);
415    x_gradient(src, dst);
416  }
417
418  // Either or both the source and the destination could be planar - the gradient code does not change
419  void XGradientPlanarRGB8_RGB32(
420      unsigned short const* src_r, unsigned short const* src_g, unsigned short const* src_b,
421      ptrdiff_t src_row_bytes, int w, int h,
422      signed int* dst_pixels, ptrdiff_t dst_row_bytes)
423  {
424    rgb16c_planar_view_t src = planar_rgb_view (w, h, src_r, src_g, src_b,        src_row_bytes);
425    rgb32s_view_t        dst = interleaved_view(w, h,(rgb32s_pixel_t*)dst_pixels, dst_row_bytes);
426    x_gradient(src,dst);
427  }
428
429As these examples illustrate, both the source and the destination can be
430interleaved or planar, of any channel depth (assuming the destination channel
431is assignable to the source), and of any compatible color spaces.
432
433GIL 2.1 can also natively represent images whose channels are not
434byte-aligned, such as 6-bit RGB222 image or a 1-bit Gray1 image.
435GIL algorithms apply to these images natively. See the design guide or sample
436files for more on using such images.
437
438Image View Transformations
439--------------------------
440
441One way to compute the y-gradient is to rotate the image by 90 degrees,
442compute the x-gradient and rotate the result back.
443Here is how to do this in GIL:
444
445.. code-block:: cpp
446
447  template <typename SrcView, typename DstView>
448  void y_gradient(const SrcView& src, const DstView& dst)
449  {
450    x_gradient(rotated90ccw_view(src), rotated90ccw_view(dst));
451  }
452
453``rotated90ccw_view`` takes an image view and returns an image view
454representing 90-degrees counter-clockwise rotation of its input. It is
455an example of a GIL view transformation function. GIL provides a
456variety of transformation functions that can perform any axis-aligned
457rotation, transpose the view, flip it vertically or horizontally,
458extract a rectangular subimage, perform color conversion, subsample
459view, etc. The view transformation functions are fast and shallow -
460they don't copy the pixels, they just change the "coordinate system"
461of accessing the pixels. ``rotated90cw_view``, for example, returns a
462view whose horizontal iterators are the vertical iterators of the
463original view. The above code to compute ``y_gradient`` is slow
464because of the memory access pattern; using ``rotated90cw_view`` does
465not make it any slower.
466
467Another example: suppose we want to compute the gradient of the N-th
468channel of a color image. Here is how to do that:
469
470.. code-block:: cpp
471
472  template <typename SrcView, typename DstView>
473  void nth_channel_x_gradient(const SrcView& src, int n, const DstView& dst)
474  {
475    x_gradient(nth_channel_view(src, n), dst);
476  }
477
478``nth_channel_view`` is a view transformation function that takes any
479view and returns a single-channel (grayscale) view of its N-th
480channel.  For interleaved RGB view, for example, the returned view is
481a step view - a view whose horizontal iterator skips over two channels
482when incremented.  If applied on a planar RGB view, the returned type
483is a simple grayscale view whose horizontal iterator is a C pointer.
484Image view transformation functions can be piped together. For
485example, to compute the y gradient of the second channel of the even
486pixels in the view, use:
487
488.. code-block:: cpp
489
490  y_gradient(subsampled_view(nth_channel_view(src, 1), 2,2), dst);
491
492GIL can sometimes simplify piped views. For example, two nested
493subsampled views (views that skip over pixels in X and in Y) can be
494represented as a single subsampled view whose step is the product of
495the steps of the two views.
496
4971D pixel iterators
498------------------
499
500Let's go back to ``x_gradient`` one more time.  Many image view
501algorithms apply the same operation for each pixel and GIL provides an
502abstraction to handle them. However, our algorithm has an unusual
503access pattern, as it skips the first and the last column. It would be
504nice and instructional to see how we can rewrite it in canonical
505form. The way to do that in GIL is to write a version that works for
506every pixel, but apply it only on the subimage that excludes the first
507and last column:
508
509.. code-block:: cpp
510
511  void x_gradient_unguarded(gray8c_view_t const& src, gray8s_view_t const& dst)
512  {
513    for (int y=0; y < src.height(); ++y)
514    {
515        gray8c_view_t::x_iterator src_it = src.row_begin(y);
516        gray8s_view_t::x_iterator dst_it = dst.row_begin(y);
517
518        for (int x = 0; x < src.width(); ++x)
519            dst_it[x] = (src_it[x-1] - src_it[x+1]) / 2;
520    }
521  }
522
523  void x_gradient(gray8c_view_t const& src, gray8s_view_t const& dst)
524  {
525    assert(src.width()>=2);
526    x_gradient_unguarded(subimage_view(src, 1, 0, src.width()-2, src.height()),
527                         subimage_view(dst, 1, 0, src.width()-2, src.height()));
528  }
529
530``subimage_view`` is another example of a GIL view transformation
531function. It takes a source view and a rectangular region (in this
532case, defined as x_min,y_min,width,height) and returns a view
533operating on that region of the source view. The above implementation
534has no measurable performance degradation from the version that
535operates on the original views.
536
537Now that ``x_gradient_unguarded`` operates on every pixel, we can
538rewrite it more compactly:
539
540.. code-block:: cpp
541
542  void x_gradient_unguarded(gray8c_view_t const& src, gray8s_view_t const& dst)
543  {
544    gray8c_view_t::iterator src_it = src.begin();
545    for (gray8s_view_t::iterator dst_it = dst.begin(); dst_it!=dst.end(); ++dst_it, ++src_it)
546        *dst_it = (src_it.x()[-1] - src_it.x()[1]) / 2;
547  }
548
549GIL image views provide ``begin()`` and ``end()`` methods that return
550one dimensional pixel iterators which iterate over each pixel in the
551view, left to right and top to bottom. They do a proper "carriage
552return" - they skip any unused bytes at the end of a row. As such,
553they are slightly suboptimal, because they need to keep track of their
554current position with respect to the end of the row. Their increment
555operator performs one extra check (are we at the end of the row?), a
556check that is avoided if two nested loops are used instead. These
557iterators have a method ``x()`` which returns the more lightweight
558horizontal iterator that we used previously. Horizontal iterators have
559no notion of the end of rows. In this case, the horizontal iterators
560are raw C pointers. In our example, we must use the horizontal
561iterators to access the two neighbors properly, since they could
562reside outside the image view.
563
564STL Equivalent Algorithms
565-------------------------
566
567GIL provides STL equivalents of many algorithms. For example,
568``std::transform`` is an STL algorithm that sets each element in a
569destination range the result of a generic function taking the
570corresponding element of the source range. In our example, we want to
571assign to each destination pixel the value of the half-difference of
572the horizontal neighbors of the corresponding source pixel.  If we
573abstract that operation in a function object, we can use GIL's
574``transform_pixel_positions`` to do that:
575
576.. code-block:: cpp
577
578  struct half_x_difference
579  {
580    int operator()(const gray8c_loc_t& src_loc) const
581    {
582        return (src_loc.x()[-1] - src_loc.x()[1]) / 2;
583    }
584  };
585
586  void x_gradient_unguarded(gray8c_view_t const& src, gray8s_view_t const& dst)
587  {
588    transform_pixel_positions(src, dst, half_x_difference());
589  }
590
591GIL provides the algorithms ``for_each_pixel`` and
592``transform_pixels`` which are image view equivalents of STL
593``std::for_each`` and ``std::transform``. It also provides
594``for_each_pixel_position`` and ``transform_pixel_positions``, which
595instead of references to pixels, pass to the generic function pixel
596locators. This allows for more powerful functions that can use the
597pixel neighbors through the passed locators.  GIL algorithms iterate
598through the pixels using the more efficient two nested loops (as
599opposed to the single loop using 1-D iterators)
600
601Color Conversion
602----------------
603
604Instead of computing the gradient of each color plane of an image, we
605often want to compute the gradient of the luminosity. In other words,
606we want to convert the color image to grayscale and compute the
607gradient of the result. Here how to compute the luminosity gradient of
608a 32-bit float RGB image:
609
610.. code-block:: cpp
611
612  void x_gradient_rgb_luminosity(rgb32fc_view_t const& src, gray8s_view_t const& dst)
613  {
614    x_gradient(color_converted_view<gray8_pixel_t>(src), dst);
615  }
616
617``color_converted_view`` is a GIL view transformation function that
618takes any image view and returns a view in a target color space and
619channel depth (specified as template parameters). In our example, it
620constructs an 8-bit integer grayscale view over 32-bit float RGB
621pixels. Like all other view transformation functions,
622``color_converted_view`` is very fast and shallow. It doesn't copy the
623data or perform any color conversion. Instead it returns a view that
624performs color conversion every time its pixels are accessed.
625
626In the generic version of this algorithm we might like to convert the
627color space to grayscale, but keep the channel depth the same. We do
628that by constructing the type of a GIL grayscale pixel with the same
629channel as the source, and color convert to that pixel type:
630
631.. code-block:: cpp
632
633  template <typename SrcView, typename DstView>
634  void x_luminosity_gradient(SrcView const& src, DstView const& dst)
635  {
636    using gray_pixel_t = pixel<typename channel_type<SrcView>::type, gray_layout_t>;
637    x_gradient(color_converted_view<gray_pixel_t>(src), dst);
638  }
639
640When the destination color space and channel type happens to be the
641same as the source one, color conversion is unnecessary. GIL detects
642this case and avoids calling the color conversion code at all -
643i.e. ``color_converted_view`` returns back the source view unchanged.
644
645Image
646-----
647
648The above example has a performance problem - ``x_gradient``
649dereferences most source pixels twice, which will cause the above code
650to perform color conversion twice.  Sometimes it may be more efficient
651to copy the color converted image into a temporary buffer and use it
652to compute the gradient - that way color conversion is invoked once
653per pixel. Using our non-generic version we can do it like this:
654
655.. code-block:: cpp
656
657  void x_luminosity_gradient(rgb32fc_view_t const& src, gray8s_view_t const& dst)
658  {
659    gray8_image_t ccv_image(src.dimensions());
660    copy_pixels(color_converted_view<gray8_pixel_t>(src), view(ccv_image));
661
662    x_gradient(const_view(ccv_image), dst);
663  }
664
665First we construct an 8-bit grayscale image with the same dimensions
666as our source. Then we copy a color-converted view of the source into
667the temporary image.  Finally we use a read-only view of the temporary
668image in our ``x_gradient algorithm``. As the example shows, GIL
669provides global functions ``view`` and ``const_view`` that take an
670image and return a mutable or an immutable view of its pixels.
671
672Creating a generic version of the above is a bit trickier:
673
674.. code-block:: cpp
675
676  template <typename SrcView, typename DstView>
677  void x_luminosity_gradient(const SrcView& src, const DstView& dst)
678  {
679    using d_channel_t = typename channel_type<DstView>::type;
680    using channel_t = typename channel_convert_to_unsigned<d_channel_t>::type;
681    using gray_pixel_t = pixel<channel_t, gray_layout_t>;
682    using gray_image_t = image<gray_pixel_t, false>;
683
684    gray_image_t ccv_image(src.dimensions());
685    copy_pixels(color_converted_view<gray_pixel_t>(src), view(ccv_image));
686    x_gradient(const_view(ccv_image), dst);
687  }
688
689First we use the ``channel_type`` metafunction to get the channel type
690of the destination view. A metafunction is a function operating on
691types. In GIL metafunctions are class templates (declared with
692``struct`` type specifier) which take their parameters as template
693parameters and return their result in a nested typedef called
694``type``. In this case, ``channel_type`` is a unary metafunction which
695in this example is called with the type of an image view and returns
696the type of the channel associated with that image view.
697
698GIL constructs that have an associated pixel type, such as pixels,
699pixel iterators, locators, views and images, all model
700``PixelBasedConcept``, which means that they provide a set of
701metafunctions to query the pixel properties, such as ``channel_type``,
702``color_space_type``, ``channel_mapping_type``, and ``num_channels``.
703
704After we get the channel type of the destination view, we use another
705metafunction to remove its sign (if it is a signed integral type) and
706then use it to generate the type of a grayscale pixel. From the pixel
707type we create the image type. GIL's image class is specialized over
708the pixel type and a boolean indicating whether the image should be
709planar or interleaved.  Single-channel (grayscale) images in GIL must
710always be interleaved. There are multiple ways of constructing types
711in GIL. Instead of instantiating the classes directly we could have
712used type factory metafunctions. The following code is equivalent:
713
714.. code-block:: cpp
715
716  template <typename SrcView, typename DstView>
717  void x_luminosity_gradient(SrcView const& src, DstView const& dst)
718  {
719    typedef typename channel_type<DstView>::type d_channel_t;
720    typedef typename channel_convert_to_unsigned<d_channel_t>::type channel_t;
721    typedef typename image_type<channel_t, gray_layout_t>::type gray_image_t;
722    typedef typename gray_image_t::value_type gray_pixel_t;
723
724    gray_image_t ccv_image(src.dimensions());
725    copy_and_convert_pixels(src, view(ccv_image));
726    x_gradient(const_view(ccv_image), dst);
727  }
728
729GIL provides a set of metafunctions that generate GIL types -
730``image_type`` is one such meta-function that constructs the type of
731an image from a given channel type, color layout, and
732planar/interleaved option (the default is interleaved). There are also
733similar meta-functions to construct the types of pixel references,
734iterators, locators and image views. GIL also has metafunctions
735``derived_pixel_reference_type``, ``derived_iterator_type``,
736``derived_view_type`` and ``derived_image_type`` that construct the
737type of a GIL construct from a given source one by changing one or
738more properties of the type and keeping the rest.
739
740From the image type we can use the nested typedef ``value_type`` to
741obtain the type of a pixel. GIL images, image views and locators have
742nested typedefs ``value_type`` and ``reference`` to obtain the type of
743the pixel and a reference to the pixel. If you have a pixel iterator,
744you can get these types from its ``iterator_traits``. Note also the
745algorithm ``copy_and_convert_pixels``, which is an abbreviated version
746of ``copy_pixels`` with a color converted source view.
747
748Virtual Image Views
749-------------------
750
751So far we have been dealing with images that have pixels stored in
752memory. GIL allows you to create an image view of an arbitrary image,
753including a synthetic function. To demonstrate this, let us create a
754view of the Mandelbrot set.  First, we need to create a function
755object that computes the value of the Mandelbrot set at a given
756location (x,y) in the image:
757
758.. code-block:: cpp
759
760  // models PixelDereferenceAdaptorConcept
761  struct mandelbrot_fn
762  {
763    typedef point<ptrdiff_t>   point_t;
764
765    typedef mandelbrot_fn       const_t;
766    typedef gray8_pixel_t       value_type;
767    typedef value_type          reference;
768    typedef value_type          const_reference;
769    typedef point_t             argument_type;
770    typedef reference           result_type;
771    static bool constexpr is_mutable = false;
772
773    mandelbrot_fn() {}
774    mandelbrot_fn(const point_t& sz) : _img_size(sz) {}
775
776    result_type operator()(const point_t& p) const
777    {
778        // normalize the coords to (-2..1, -1.5..1.5)
779        double t=get_num_iter(point<double>(p.x/(double)_img_size.x*3-2, p.y/(double)_img_size.y*3-1.5f));
780        return value_type((bits8)(pow(t,0.2)*255));   // raise to power suitable for viewing
781    }
782  private:
783    point_t _img_size;
784
785    double get_num_iter(const point<double>& p) const
786    {
787        point<double> Z(0,0);
788        for (int i=0; i<100; ++i)  // 100 iterations
789    {
790            Z = point<double>(Z.x*Z.x - Z.y*Z.y + p.x, 2*Z.x*Z.y + p.y);
791            if (Z.x*Z.x + Z.y*Z.y > 4)
792                return i/(double)100;
793        }
794        return 0;
795    }
796  };
797
798We can now use GIL's ``virtual_2d_locator`` with this function object
799to construct a Mandelbrot view of size 200x200 pixels:
800
801.. code-block:: cpp
802
803  typedef mandelbrot_fn::point_t point_t;
804  typedef virtual_2d_locator<mandelbrot_fn,false> locator_t;
805  typedef image_view<locator_t> my_virt_view_t;
806
807  point_t dims(200,200);
808
809  // Construct a Mandelbrot view with a locator, taking top-left corner (0,0) and step (1,1)
810  my_virt_view_t mandel(dims, locator_t(point_t(0,0), point_t(1,1), mandelbrot_fn(dims)));
811
812We can treat the synthetic view just like a real one. For example,
813let's invoke our ``x_gradient`` algorithm to compute the gradient of
814the 90-degree rotated view of the Mandelbrot set and save the original
815and the result:
816
817.. code-block:: cpp
818
819  gray8s_image_t img(dims);
820  x_gradient(rotated90cw_view(mandel), view(img));
821
822  // Save the Mandelbrot set and its 90-degree rotated gradient (jpeg cannot save signed char; must convert to unsigned char)
823  jpeg_write_view("mandel.jpg",mandel);
824  jpeg_write_view("mandel_grad.jpg",color_converted_view<gray8_pixel_t>(const_view(img)));
825
826Here is what the two files look like:
827
828.. image:: ../images/mandel.jpg
829
830Run-Time Specified Images and Image Views
831-----------------------------------------
832
833So far we have created a generic function that computes the image
834gradient of an image view template specialization.  Sometimes,
835however, the properties of an image view, such as its color space and
836channel depth, may not be available at compile time.  GIL's
837``dynamic_image`` extension allows for working with GIL constructs
838that are specified at run time, also called _variants_. GIL provides
839models of a run-time instantiated image, ``any_image``, and a run-time
840instantiated image view, ``any_image_view``. The mechanisms are in
841place to create other variants, such as ``any_pixel``,
842``any_pixel_iterator``, etc.  Most of GIL's algorithms and all of the
843view transformation functions also work with run-time instantiated
844image views and binary algorithms, such as ``copy_pixels`` can have
845either or both arguments be variants.
846
847Lets make our ``x_luminosity_gradient`` algorithm take a variant image
848view. For simplicity, let's assume that only the source view can be a
849variant.  (As an example of using multiple variants, see GIL's image
850view algorithm overloads taking multiple variants.)
851
852First, we need to make a function object that contains the templated
853destination view and has an application operator taking a templated
854source view:
855
856.. code-block:: cpp
857
858  #include <boost/gil/extension/dynamic_image/dynamic_image_all.hpp>
859
860  template <typename DstView>
861  struct x_gradient_obj
862  {
863    typedef void result_type;        // required typedef
864
865    const DstView& _dst;
866    x_gradient_obj(const DstView& dst) : _dst(dst) {}
867
868    template <typename SrcView>
869    void operator()(const SrcView& src) const { x_luminosity_gradient(src, _dst); }
870  };
871
872The second step is to provide an overload of ``x_luminosity_gradient`` that
873takes image view variant and calls GIL's ``apply_operation`` passing it the
874function object:
875
876.. code-block:: cpp
877
878  template <typename SrcViews, typename DstView>
879  void x_luminosity_gradient(const any_image_view<SrcViews>& src, const DstView& dst)
880  {
881    apply_operation(src, x_gradient_obj<DstView>(dst));
882  }
883
884``any_image_view<SrcViews>`` is the image view variant. It is
885templated over ``SrcViews``, an enumeration of all possible view types
886the variant can take.  ``src`` contains inside an index of the
887currently instantiated type, as well as a block of memory containing
888the instance.  ``apply_operation`` goes through a switch statement
889over the index, each case of which casts the memory to the correct
890view type and invokes the function object with it. Invoking an
891algorithm on a variant has the overhead of one switch
892statement. Algorithms that perform an operation for each pixel in an
893image view have practically no performance degradation when used with
894a variant.
895
896Here is how we can construct a variant and invoke the algorithm:
897
898.. code-block:: cpp
899
900  #include <boost/mp11.hpp>
901  #include <boost/gil/extension/io/jpeg/old.hpp>
902
903  typedef mp11::mp_list<gray8_image_t, gray16_image_t, rgb8_image_t, rgb16_image_t> my_img_types;
904  any_image<my_img_types> runtime_image;
905  jpeg_read_image("input.jpg", runtime_image);
906
907  gray8s_image_t gradient(runtime_image.dimensions());
908  x_luminosity_gradient(const_view(runtime_image), view(gradient));
909  jpeg_write_view("x_gradient.jpg", color_converted_view<gray8_pixel_t>(const_view(gradient)));
910
911In this example, we create an image variant that could be 8-bit or
91216-bit RGB or grayscale image. We then use GIL's I/O extension to load
913the image from file in its native color space and channel depth. If
914none of the allowed image types matches the image on disk, an
915exception will be thrown.  We then construct a 8 bit signed
916(i.e. ``char``) image to store the gradient and invoke ``x_gradient``
917on it. Finally we save the result into another file.  We save the view
918converted to 8-bit unsigned, because JPEG I/O does not support signed
919char.
920
921Note how free functions and methods such as ``jpeg_read_image``,
922``dimensions``, ``view`` and ``const_view`` work on both templated and
923variant types.  For templated images ``view(img)`` returns a templated
924view, whereas for image variants it returns a view variant.  For
925example, the return type of ``view(runtime_image)`` is
926``any_image_view<Views>`` where ``Views`` enumerates four views
927corresponding to the four image types.  ``const_view(runtime_image)``
928returns a ``any_image_view`` of the four read-only view types, etc.
929
930A warning about using variants: instantiating an algorithm with a
931variant effectively instantiates it with every possible type the
932variant can take.  For binary algorithms, the algorithm is
933instantiated with every possible combination of the two input types!
934This can take a toll on both the compile time and the executable size.
935
936Conclusion
937----------
938
939This tutorial provides a glimpse at the challenges associated with
940writing generic and efficient image processing algorithms in GIL.  We
941have taken a simple algorithm and shown how to make it work with image
942representations that vary in bit depth, color space, ordering of the
943channels, and planar/interleaved structure. We have demonstrated that
944the algorithm can work with fully abstracted virtual images, and even
945images whose type is specified at run time. The associated video
946presentation also demonstrates that even for complex scenarios the
947generated assembly is comparable to that of a C version of the
948algorithm, hand-written for the specific image types.
949
950Yet, even for such a simple algorithm, we are far from making a fully
951generic and optimized code. In particular, the presented algorithms
952work on homogeneous images, i.e. images whose pixels have channels
953that are all of the same type. There are examples of images, such as a
954packed 565 RGB format, which contain channels of different
955types. While GIL provides concepts and algorithms operating on
956heterogeneous pixels, we leave the task of extending x_gradient as an
957exercise for the reader.  Second, after computing the value of the
958gradient we are simply casting it to the destination channel
959type. This may not always be the desired operation. For example, if
960the source channel is a float with range [0..1] and the destination is
961unsigned char, casting the half-difference to unsigned char will
962result in either 0 or 1. Instead, what we might want to do is scale
963the result into the range of the destination channel. GIL's
964channel-level algorithms might be useful in such cases. For example,
965\p channel_convert converts between channels by linearly scaling the
966source channel value into the range of the destination channel.
967
968There is a lot to be done in improving the performance as
969well. Channel-level operations, such as the half-difference, could be
970abstracted out into atomic channel-level algorithms and performance
971overloads could be provided for concrete channel
972types. Processor-specific operations could be used, for example, to
973perform the operation over an entire row of pixels simultaneously, or
974the data could be pre-fetched. All of these optimizations can be
975realized as performance specializations of the generic
976algorithm. Finally, compilers, while getting better over time, are
977still failing to fully optimize generic code in some cases, such as
978failing to inline some functions or put some variables into
979registers. If performance is an issue, it might be worth trying your
980code with different compilers.
981