.. default-domain:: cpp

.. cpp:namespace:: ceres

.. _`chapter-modeling`:

========
Modeling
========

Ceres Solver consists of two distinct parts: a modeling API, which
provides a rich set of tools to construct an optimization problem one
term at a time, and a solver API that controls the minimization
algorithm. This chapter is devoted to the task of modeling
optimization problems using Ceres. :ref:`chapter-solving` discusses
the various ways in which an optimization problem can be solved using
Ceres.

Ceres solves robustified bounds constrained non-linear least squares
problems of the form:

.. math::
   :label: ceresproblem

   \min_{\mathbf{x}} &\quad \frac{1}{2}\sum_{i}
   \rho_i\left(\left\|f_i\left(x_{i_1},
   ... ,x_{i_k}\right)\right\|^2\right)  \\
   \text{s.t.} &\quad l_j \le x_j \le u_j

In Ceres parlance, the expression
:math:`\rho_i\left(\left\|f_i\left(x_{i_1},...,x_{i_k}\right)\right\|^2\right)`
is known as a **residual block**, where :math:`f_i(\cdot)` is a
:class:`CostFunction` that depends on the **parameter blocks**
:math:`\left\{x_{i_1},... , x_{i_k}\right\}`.

In most optimization problems small groups of scalars occur together,
for example the three components of a translation vector and the four
components of the quaternion that define the pose of a camera. We
refer to such a group of scalars as a **parameter block**. Of course a
parameter block can be just a single scalar too.

:math:`\rho_i` is a :class:`LossFunction`. A :class:`LossFunction` is
a scalar valued function that is used to reduce the influence of
outliers on the solution of non-linear least squares problems.

:math:`l_j` and :math:`u_j` are lower and upper bounds on the
parameter block :math:`x_j`.

As a special case, when :math:`\rho_i(x) = x`, i.e., the identity
function, and :math:`l_j = -\infty` and :math:`u_j = \infty`, we get
the more familiar unconstrained `non-linear least squares problem
<http://en.wikipedia.org/wiki/Non-linear_least_squares>`_:

.. math::
   :label: ceresproblemunconstrained

   \frac{1}{2}\sum_{i} \left\|f_i\left(x_{i_1}, ... ,x_{i_k}\right)\right\|^2.

:class:`CostFunction`
---------------------

For each term in the objective function, a :class:`CostFunction` is
responsible for computing a vector of residuals and, if asked, a
vector of Jacobian matrices, i.e., given :math:`\left[x_{i_1}, ... ,
x_{i_k}\right]`, compute the vector
:math:`f_i\left(x_{i_1},...,x_{i_k}\right)` and the matrices

.. math:: J_{ij} = \frac{\partial}{\partial
          x_{i_j}}f_i\left(x_{i_1},...,x_{i_k}\right),\quad \forall j
          \in \{1, \ldots, k\}

.. class:: CostFunction

   .. code-block:: c++

    class CostFunction {
     public:
      virtual bool Evaluate(double const* const* parameters,
                            double* residuals,
                            double** jacobians) const = 0;
      const vector<int32>& parameter_block_sizes();
      int num_residuals() const;

     protected:
      vector<int32>* mutable_parameter_block_sizes();
      void set_num_residuals(int num_residuals);
    };


The signature of the :class:`CostFunction` (number and sizes of input
parameter blocks and number of outputs) is stored in
:member:`CostFunction::parameter_block_sizes_` and
:member:`CostFunction::num_residuals_` respectively. User code
inheriting from this class is expected to set these two members with
the corresponding accessors. This information will be verified by the
:class:`Problem` when added with :func:`Problem::AddResidualBlock`.

.. function:: bool CostFunction::Evaluate(double const* const* parameters, double* residuals, double** jacobians)

   Compute the residual vector and the Jacobian matrices.

   ``parameters`` is an array of pointers to arrays containing the
   various parameter blocks. ``parameters`` has the same number of
   elements as :member:`CostFunction::parameter_block_sizes_` and the
   parameter blocks are in the same order as
   :member:`CostFunction::parameter_block_sizes_`.

   ``residuals`` is an array of size ``num_residuals_``.

   ``jacobians`` is an array with one entry per parameter block,
   containing pointers to storage for the Jacobian matrix
   corresponding to each parameter block. The Jacobian matrices are in
   the same order as :member:`CostFunction::parameter_block_sizes_`.
   ``jacobians[i]`` is an array that contains
   :member:`CostFunction::num_residuals_` x
   :member:`CostFunction::parameter_block_sizes_` ``[i]``
   elements. Each Jacobian matrix is stored in row-major order, i.e.,
   ``jacobians[i][r * parameter_block_sizes_[i] + c]`` =
   :math:`\frac{\partial residual[r]}{\partial parameters[i][c]}`

   If ``jacobians`` is ``NULL``, then no derivatives are returned;
   this is the case when computing cost only. If ``jacobians[i]`` is
   ``NULL``, then the Jacobian matrix corresponding to the
   :math:`i^{\textrm{th}}` parameter block must not be returned; this
   is the case when a parameter block is marked constant.

   **NOTE** The return value indicates whether the computation of the
   residuals and/or Jacobians was successful or not. This can be used
   to communicate numerical failures in Jacobian computations, for
   instance.
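   To make this calling convention concrete, here is a minimal
   self-contained sketch of an ``Evaluate`` implementation for the
   residual :math:`f(x) = 10 - x`. It uses a stripped-down stand-in
   type rather than the actual Ceres base class, so it compiles on its
   own:

   ```cpp
   #include <cassert>
   #include <cstddef>

   // Stand-in for the Ceres interface: one residual, one parameter
   // block of size 1.
   struct SimpleCostFunction {
     // Mirrors CostFunction::Evaluate. jacobians may be NULL (cost-only
     // evaluation) and jacobians[i] may be NULL (constant block).
     bool Evaluate(double const* const* parameters,
                   double* residuals,
                   double** jacobians) const {
       const double x = parameters[0][0];
       residuals[0] = 10.0 - x;  // f(x) = 10 - x
       if (jacobians != NULL && jacobians[0] != NULL) {
         // Row-major 1x1 Jacobian: df/dx = -1.
         jacobians[0][0] = -1.0;
       }
       return true;  // Signal a successful evaluation.
     }
   };

   int main() {
     SimpleCostFunction cost;
     double x = 4.0;
     double* blocks[] = {&x};
     double residual = 0.0;
     double jac = 0.0;
     double* jacobian_blocks[] = {&jac};
     assert(cost.Evaluate(blocks, &residual, jacobian_blocks));
     assert(residual == 6.0);
     assert(jac == -1.0);
     // Cost-only evaluation: no Jacobian requested.
     assert(cost.Evaluate(blocks, &residual, NULL));
     return 0;
   }
   ```

   A real implementation would derive from :class:`CostFunction` and
   declare its sizes via the accessors described above.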

:class:`SizedCostFunction`
--------------------------

.. class:: SizedCostFunction

   If the sizes of the parameter blocks and the size of the residual
   vector are known at compile time (this is the common case),
   :class:`SizedCostFunction` can be used where these values can be
   specified as template parameters and the user only needs to
   implement :func:`CostFunction::Evaluate`.

   .. code-block:: c++

    template<int kNumResiduals,
             int N0 = 0, int N1 = 0, int N2 = 0, int N3 = 0, int N4 = 0,
             int N5 = 0, int N6 = 0, int N7 = 0, int N8 = 0, int N9 = 0>
    class SizedCostFunction : public CostFunction {
     public:
      virtual bool Evaluate(double const* const* parameters,
                            double* residuals,
                            double** jacobians) const = 0;
    };


:class:`AutoDiffCostFunction`
-----------------------------

.. class:: AutoDiffCostFunction

   Defining a :class:`CostFunction` or a :class:`SizedCostFunction`
   can be tedious and error prone, especially when computing
   derivatives. To this end Ceres provides `automatic differentiation
   <http://en.wikipedia.org/wiki/Automatic_differentiation>`_.

   .. code-block:: c++

     template <typename CostFunctor,
            int kNumResiduals,  // Number of residuals, or ceres::DYNAMIC.
            int N0,       // Number of parameters in block 0.
            int N1 = 0,   // Number of parameters in block 1.
            int N2 = 0,   // Number of parameters in block 2.
            int N3 = 0,   // Number of parameters in block 3.
            int N4 = 0,   // Number of parameters in block 4.
            int N5 = 0,   // Number of parameters in block 5.
            int N6 = 0,   // Number of parameters in block 6.
            int N7 = 0,   // Number of parameters in block 7.
            int N8 = 0,   // Number of parameters in block 8.
            int N9 = 0>   // Number of parameters in block 9.
     class AutoDiffCostFunction : public
     SizedCostFunction<kNumResiduals, N0, N1, N2, N3, N4, N5, N6, N7, N8, N9> {
      public:
       explicit AutoDiffCostFunction(CostFunctor* functor);
       // Ignore the template parameter kNumResiduals and use
       // num_residuals instead.
       AutoDiffCostFunction(CostFunctor* functor, int num_residuals);
     };

   To get an auto differentiated cost function, you must define a
   class with a templated ``operator()`` (a functor) that computes the
   cost function in terms of the template parameter ``T``. The
   autodiff framework substitutes appropriate ``Jet`` objects for
   ``T`` in order to compute the derivative when necessary, but this
   is hidden, and you should write the function as if ``T`` were a
   scalar type (e.g. a double-precision floating point number).
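   The substitution can be illustrated with a miniature dual number
   type. The following is an illustrative sketch only, not Ceres'
   actual ``Jet`` implementation; it carries a single derivative
   through the same templated-functor pattern described above:

   ```cpp
   #include <cassert>

   // A one-derivative dual number: value a, derivative v with respect
   // to the active variable. Ceres' Jet generalizes this to many
   // simultaneous derivatives.
   struct Dual {
     double a;  // value
     double v;  // derivative
   };

   Dual operator*(Dual x, Dual y) { return {x.a * y.a, x.a * y.v + x.v * y.a}; }
   Dual operator-(Dual x, Dual y) { return {x.a - y.a, x.v - y.v}; }

   // The same functor pattern as MyScalarCostFunctor below: templated
   // on T so it can run on double (values) or Dual (values + derivatives).
   template <typename T>
   T ScalarError(T k, T x0, T x1, T y0, T y1) {
     return k - x0 * y0 - x1 * y1;  // e = k - x'y
   }

   int main() {
     // Evaluate e and de/dx0 at k=1, x=(2,3), y=(4,5) by seeding x0's
     // derivative slot with 1.
     Dual k{1, 0}, x0{2, 1}, x1{3, 0}, y0{4, 0}, y1{5, 0};
     Dual e = ScalarError(k, x0, x1, y0, y1);
     assert(e.a == 1 - 2 * 4 - 3 * 5);  // value: e = -22
     assert(e.v == -4.0);               // derivative: de/dx0 = -y0
     return 0;
   }
   ```

   Seeding each parameter's derivative slot in turn recovers every
   column of the Jacobian, which is what the framework does with
   ``Jet`` behind the scenes.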

   The function must write the computed value in the last argument
   (the only non-``const`` one) and return ``true`` to indicate
   success.

   For example, consider a scalar error :math:`e = k - x^\top y`,
   where both :math:`x` and :math:`y` are two-dimensional vector
   parameters and :math:`k` is a constant. The form of this error,
   which is the difference between a constant and an expression, is a
   common pattern in least squares problems. For example, the value
   :math:`x^\top y` might be the model expectation for a series of
   measurements, where there is an instance of the cost function for
   each measurement :math:`k`.

   The actual cost added to the total problem is :math:`e^2`, or
   :math:`(k - x^\top y)^2`; however, the squaring is implicitly done
   by the optimization framework.

   To write an auto-differentiable cost function for the above model,
   first define the object

   .. code-block:: c++

    class MyScalarCostFunctor {
     public:
      MyScalarCostFunctor(double k): k_(k) {}

      template <typename T>
      bool operator()(const T* const x, const T* const y, T* e) const {
        e[0] = T(k_) - x[0] * y[0] - x[1] * y[1];
        return true;
      }

     private:
      double k_;
    };


   Note that in the declaration of ``operator()`` the input parameters
   ``x`` and ``y`` come first, and are passed as const pointers to arrays
   of ``T``. If there were three input parameters, then the third input
   parameter would come after ``y``. The output is always the last
   parameter, and is also a pointer to an array. In the example above,
   ``e`` is a scalar, so only ``e[0]`` is set.

   Then given this class definition, the auto differentiated cost
   function for it can be constructed as follows.

   .. code-block:: c++

    CostFunction* cost_function
        = new AutoDiffCostFunction<MyScalarCostFunctor, 1, 2, 2>(
            new MyScalarCostFunctor(1.0));              ^  ^  ^
                                                        |  |  |
                            Dimension of residual ------+  |  |
                            Dimension of x ----------------+  |
                            Dimension of y -------------------+


   In this example, there is usually an instance for each measurement
   of ``k``.

   In the instantiation above, the template parameters following
   ``MyScalarCostFunctor``, ``<1, 2, 2>``, describe the functor as
   computing a 1-dimensional output from two arguments, both
   2-dimensional.

   :class:`AutoDiffCostFunction` also supports cost functions with a
   runtime-determined number of residuals. For example:

   .. code-block:: c++

     CostFunction* cost_function
         = new AutoDiffCostFunction<MyScalarCostFunctor, DYNAMIC, 2, 2>(
             new CostFunctorWithDynamicNumResiduals(1.0),   ^     ^  ^
             runtime_number_of_residuals); <----+           |     |  |
                                                |           |     |  |
                                                |           |     |  |
               Actual number of residuals ------+           |     |  |
               Indicate dynamic number of residuals --------+     |  |
               Dimension of x ------------------------------------+  |
               Dimension of y ---------------------------------------+

   The framework can currently accommodate cost functions of up to 10
   independent variables, and there is no limit on the dimensionality
   of each of them.

   **WARNING 1** Since the functor will get instantiated with
   different types for ``T``, you must convert from other numeric
   types to ``T`` before mixing computations with other variables
   of type ``T``. In the example above, this is seen where instead of
   using ``k_`` directly, ``k_`` is wrapped with ``T(k_)``.

   **WARNING 2** A common beginner's error when first using
   :class:`AutoDiffCostFunction` is to get the sizing wrong. In particular,
   there is a tendency to set the template parameters to (dimension of
   residual, number of parameters) instead of passing a dimension
   parameter for *every parameter block*. In the example above, that
   would be ``<MyScalarCostFunctor, 1, 2>``, which is missing the
   ``2`` for the second parameter block as the last template argument.


:class:`DynamicAutoDiffCostFunction`
------------------------------------

.. class:: DynamicAutoDiffCostFunction

   :class:`AutoDiffCostFunction` requires that the number of parameter
   blocks and their sizes be known at compile time. It also has an
   upper limit of 10 parameter blocks. In a number of applications,
   this is not enough, e.g., Bezier curve fitting, neural network
   training, etc.

   .. code-block:: c++

    template <typename CostFunctor, int Stride = 4>
    class DynamicAutoDiffCostFunction : public CostFunction {
    };

   In such cases :class:`DynamicAutoDiffCostFunction` can be
   used. Like :class:`AutoDiffCostFunction` the user must define a
   templated functor, but the signature of the functor differs
   slightly. The expected interface for the cost functors is:

   .. code-block:: c++

     struct MyCostFunctor {
       template<typename T>
       bool operator()(T const* const* parameters, T* residuals) const {
         // Compute residuals from parameters here.
         return true;
       }
     };

   Since the sizing of the parameters is done at runtime, you must
   also specify the sizes after creating the dynamic autodiff cost
   function. For example:

   .. code-block:: c++

     DynamicAutoDiffCostFunction<MyCostFunctor, 4> cost_function(
         new MyCostFunctor());
     cost_function.AddParameterBlock(5);
     cost_function.AddParameterBlock(10);
     cost_function.SetNumResiduals(21);

   Under the hood, the implementation evaluates the cost function
   multiple times, computing a small set of the derivatives (four by
   default, controlled by the ``Stride`` template parameter) with each
   pass. There is a performance tradeoff with the size of the passes;
   smaller strides are more cache efficient but result in a larger
   number of passes, while larger strides can destroy cache locality
   but reduce the number of passes over the cost function. The optimal
   value depends on the number and sizes of the various parameter
   blocks.

   As a rule of thumb, try using :class:`AutoDiffCostFunction` before
   you use :class:`DynamicAutoDiffCostFunction`.

:class:`NumericDiffCostFunction`
--------------------------------

.. class:: NumericDiffCostFunction

   In some cases, it is not possible to define a templated cost
   functor, for example when the evaluation of the residual involves a
   call to a library function that you do not have control over. In
   such a situation, `numerical differentiation
   <http://en.wikipedia.org/wiki/Numerical_differentiation>`_ can be
   used.

   .. code-block:: c++

     template <typename CostFunctor,
               NumericDiffMethod method = CENTRAL,
               int kNumResiduals = 0,  // Number of residuals, or ceres::DYNAMIC.
               int N0 = 0,   // Number of parameters in block 0.
               int N1 = 0,   // Number of parameters in block 1.
               int N2 = 0,   // Number of parameters in block 2.
               int N3 = 0,   // Number of parameters in block 3.
               int N4 = 0,   // Number of parameters in block 4.
               int N5 = 0,   // Number of parameters in block 5.
               int N6 = 0,   // Number of parameters in block 6.
               int N7 = 0,   // Number of parameters in block 7.
               int N8 = 0,   // Number of parameters in block 8.
               int N9 = 0>   // Number of parameters in block 9.
     class NumericDiffCostFunction : public
     SizedCostFunction<kNumResiduals, N0, N1, N2, N3, N4, N5, N6, N7, N8, N9> {
     };

   To get a numerically differentiated :class:`CostFunction`, you must
   define a class with an ``operator()`` (a functor) that computes the
   residuals. The functor must write the computed value in the last
   argument (the only non-``const`` one) and return ``true`` to
   indicate success.  Please see :class:`CostFunction` for details on
   how the return value may be used to impose simple constraints on
   the parameter block, e.g., an object of the form

   .. code-block:: c++

     struct ScalarFunctor {
      public:
       bool operator()(const double* const x1,
                       const double* const x2,
                       double* residuals) const;
     };

   For example, consider a scalar error :math:`e = k - x'y`, where
   both :math:`x` and :math:`y` are two-dimensional column vector
   parameters, the prime sign indicates transposition, and :math:`k`
   is a constant. The form of this error, which is the difference
   between a constant and an expression, is a common pattern in least
   squares problems. For example, the value :math:`x'y` might be the
   model expectation for a series of measurements, where there is an
   instance of the cost function for each measurement :math:`k`.

   To write a numerically differentiated :class:`CostFunction` for the
   above model, first define the object

   .. code-block:: c++

     class MyScalarCostFunctor {
      public:
       MyScalarCostFunctor(double k): k_(k) {}

       bool operator()(const double* const x,
                       const double* const y,
                       double* residuals) const {
         residuals[0] = k_ - x[0] * y[0] - x[1] * y[1];
         return true;
       }

      private:
       double k_;
     };

   Note that in the declaration of ``operator()`` the input parameters
   ``x`` and ``y`` come first, and are passed as const pointers to
   arrays of ``double``\ s. If there were three input parameters, then
   the third input parameter would come after ``y``. The output is
   always the last parameter, and is also a pointer to an array. In
   the example above, the residual is a scalar, so only
   ``residuals[0]`` is set.

   Then given this class definition, the numerically differentiated
   :class:`CostFunction` with central differences used for computing
   the derivative can be constructed as follows.

   .. code-block:: c++

     CostFunction* cost_function
         = new NumericDiffCostFunction<MyScalarCostFunctor, CENTRAL, 1, 2, 2>(
             new MyScalarCostFunctor(1.0));                    ^     ^  ^  ^
                                                               |     |  |  |
                                   Finite Differencing Scheme -+     |  |  |
                                   Dimension of residual ------------+  |  |
                                   Dimension of x ----------------------+  |
                                   Dimension of y -------------------------+

   In this example, there is usually an instance for each measurement
   of ``k``.

   In the instantiation above, the template parameters following
   ``MyScalarCostFunctor``, ``1, 2, 2``, describe the functor as
   computing a 1-dimensional output from two arguments, both
   2-dimensional.

   :class:`NumericDiffCostFunction` also supports cost functions with
   a runtime-determined number of residuals. For example:

   .. code-block:: c++

     CostFunction* cost_function
         = new NumericDiffCostFunction<MyScalarCostFunctor, CENTRAL, DYNAMIC, 2, 2>(
             new CostFunctorWithDynamicNumResiduals(1.0),               ^     ^  ^
             TAKE_OWNERSHIP,                                            |     |  |
             runtime_number_of_residuals); <----+                       |     |  |
                                                |                       |     |  |
                                                |                       |     |  |
               Actual number of residuals ------+                       |     |  |
               Indicate dynamic number of residuals --------------------+     |  |
               Dimension of x ------------------------------------------------+  |
               Dimension of y ---------------------------------------------------+

   The framework can currently accommodate cost functions of up to 10
   independent variables, and there is no limit on the dimensionality
   of each of them.

   The ``CENTRAL`` difference method is considerably more accurate, at
   the cost of twice as many function evaluations as forward
   differences. Consider using central differences to begin with; once
   that works, try forward differences to improve performance.
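   The accuracy difference is easy to observe numerically. The
   following self-contained sketch (plain C++, unrelated to the Ceres
   implementation) compares both schemes on :math:`f(x) = e^x` at
   :math:`x = 0`, where the exact derivative is 1:

   ```cpp
   #include <cassert>
   #include <cmath>

   int main() {
     const double h = 1e-4;
     // f(x) = exp(x); the exact derivative at 0 is f'(0) = 1.
     auto f = [](double x) { return std::exp(x); };
     // Forward difference: one extra evaluation, O(h) error.
     const double forward = (f(0 + h) - f(0)) / h;
     // Central difference: two extra evaluations, O(h^2) error.
     const double central = (f(0 + h) - f(0 - h)) / (2 * h);
     assert(std::abs(central - 1.0) < std::abs(forward - 1.0));
     assert(std::abs(central - 1.0) < 1e-8);  // roughly h^2/6
     assert(std::abs(forward - 1.0) < 1e-3);  // roughly h/2
     return 0;
   }
   ```

   The central estimate is accurate to about :math:`h^2`, the forward
   estimate only to about :math:`h`, at twice the evaluation cost.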

   **WARNING** A common beginner's error when first using
   :class:`NumericDiffCostFunction` is to get the sizing wrong. In
   particular, there is a tendency to set the template parameters to
   (dimension of residual, number of parameters) instead of passing a
   dimension parameter for *every parameter block*. In the example
   above, that would be ``<MyScalarCostFunctor, 1, 2>``, which is
   missing the last ``2`` argument. Please be careful when setting the
   size parameters.


   **Alternate Interface**

   For a variety of reasons, including compatibility with legacy code,
   :class:`NumericDiffCostFunction` can also take
   :class:`CostFunction` objects as input. The following describes
   how.

   To get a numerically differentiated cost function, define a
   subclass of :class:`CostFunction` such that the
   :func:`CostFunction::Evaluate` function ignores the ``jacobians``
   parameter. The numeric differentiation wrapper will fill in the
   ``jacobians`` parameter if necessary by repeatedly calling
   :func:`CostFunction::Evaluate` with small changes to the
   appropriate parameters, and computing the slope. For performance,
   the numeric differentiation wrapper class is templated on the
   concrete cost function, even though it could be implemented only in
   terms of the :class:`CostFunction` interface.
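   The slope computation the wrapper performs can be sketched as
   follows. This is a simplified, stand-alone illustration (a toy
   residual, not the library's implementation): perturb one parameter
   at a time and form each entry of the Jacobian by central
   differences.

   ```cpp
   #include <cassert>
   #include <cmath>

   // A toy scalar residual r(x) = x0 * x1 - 4, whose analytic
   // Jacobian is (x1, x0).
   double Residual(const double* x) { return x[0] * x[1] - 4.0; }

   // Fill jacobian[j] = dr/dx_j by central differences, perturbing one
   // parameter at a time, as the numeric diff wrapper does internally.
   void NumericJacobian(const double* x, int size, double* jacobian) {
     const double h = 1e-6;
     double xp[2], xm[2];  // sketch: fixed storage for 2 parameters
     for (int j = 0; j < size; ++j) {
       for (int k = 0; k < size; ++k) { xp[k] = x[k]; xm[k] = x[k]; }
       xp[j] += h;
       xm[j] -= h;
       jacobian[j] = (Residual(xp) - Residual(xm)) / (2 * h);
     }
   }

   int main() {
     const double x[2] = {2.0, 3.0};
     double jacobian[2];
     NumericJacobian(x, 2, jacobian);
     assert(std::abs(jacobian[0] - 3.0) < 1e-6);  // dr/dx0 = x1
     assert(std::abs(jacobian[1] - 2.0) < 1e-6);  // dr/dx1 = x0
     return 0;
   }
   ```

   Each parameter perturbation costs two residual evaluations with
   central differences, which is why numeric differentiation grows
   expensive for large parameter blocks.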

   The numerically differentiated version of a cost function can be
   constructed as follows:

   .. code-block:: c++

     CostFunction* cost_function
         = new NumericDiffCostFunction<MyCostFunction, CENTRAL, 1, 4, 8>(
             new MyCostFunction(...), TAKE_OWNERSHIP);

   where ``MyCostFunction`` has 1 residual and 2 parameter blocks with
   sizes 4 and 8 respectively. Look at the tests for a more detailed
   example.

:class:`DynamicNumericDiffCostFunction`
---------------------------------------

.. class:: DynamicNumericDiffCostFunction

   Like :class:`AutoDiffCostFunction`, :class:`NumericDiffCostFunction`
   requires that the number of parameter blocks and their sizes be
   known at compile time. It also has an upper limit of 10 parameter
   blocks. In a number of applications, this is not enough.

   .. code-block:: c++

     template <typename CostFunctor, NumericDiffMethod method = CENTRAL>
     class DynamicNumericDiffCostFunction : public CostFunction {
     };

   In such cases, when numeric differentiation is desired,
   :class:`DynamicNumericDiffCostFunction` can be used.

   Like :class:`NumericDiffCostFunction` the user must define a
   functor, but the signature of the functor differs slightly. The
   expected interface for the cost functors is:

   .. code-block:: c++

     struct MyCostFunctor {
       bool operator()(double const* const* parameters, double* residuals) const {
         // Compute residuals from parameters here.
         return true;
       }
     };

   Since the sizing of the parameters is done at runtime, you must
   also specify the sizes after creating the dynamic numeric diff cost
   function. For example:

   .. code-block:: c++

     DynamicNumericDiffCostFunction<MyCostFunctor> cost_function(
         new MyCostFunctor());
     cost_function.AddParameterBlock(5);
     cost_function.AddParameterBlock(10);
     cost_function.SetNumResiduals(21);

   As a rule of thumb, try using :class:`NumericDiffCostFunction` before
   you use :class:`DynamicNumericDiffCostFunction`.


:class:`NumericDiffFunctor`
---------------------------

.. class:: NumericDiffFunctor

   Sometimes parts of a cost function can be differentiated
   automatically or analytically but others require numeric
   differentiation. :class:`NumericDiffFunctor` is a wrapper class
   that takes a variadic functor evaluating a function, numerically
   differentiates it and makes it available as a templated functor so
   that it can be easily used as part of Ceres' automatic
   differentiation framework.

   For example, let us assume that

   .. code-block:: c++

     struct IntrinsicProjection {
       IntrinsicProjection(const double* observations);
       bool operator()(const double* calibration,
                       const double* point,
                       double* residuals);
     };

   is a functor that implements the projection of a point in its local
   coordinate system onto its image plane and subtracts it from the
   observed point projection.

   Now we would like to compose the action of this functor with the
   action of camera extrinsics, i.e., rotation and translation, which
   is given by the following templated function

   .. code-block:: c++

     template<typename T>
     void RotateAndTranslatePoint(const T* rotation,
                                  const T* translation,
                                  const T* point,
                                  T* result);

   To compose the extrinsics and intrinsics, we can construct a
   ``CameraProjection`` functor as follows.

   .. code-block:: c++

    struct CameraProjection {
      typedef NumericDiffFunctor<IntrinsicProjection, CENTRAL, 2, 5, 3>
         IntrinsicProjectionFunctor;

      CameraProjection(double* observation) {
        intrinsic_projection_.reset(
            new IntrinsicProjectionFunctor(observation));
      }

      template <typename T>
      bool operator()(const T* rotation,
                      const T* translation,
                      const T* intrinsics,
                      const T* point,
                      T* residuals) const {
        T transformed_point[3];
        RotateAndTranslatePoint(rotation, translation, point, transformed_point);
        return (*intrinsic_projection_)(intrinsics, transformed_point, residuals);
      }

     private:
      scoped_ptr<IntrinsicProjectionFunctor> intrinsic_projection_;
    };

   Here, we made the choice of using ``CENTRAL`` differences to compute
   the Jacobian of ``IntrinsicProjection``.

   Now, we are ready to construct an automatically differentiated cost
   function as

   .. code-block:: c++

    CostFunction* cost_function =
        new AutoDiffCostFunction<CameraProjection, 2, 3, 3, 5, 3>(
           new CameraProjection(observations));

   ``cost_function`` now seamlessly integrates automatic
   differentiation of ``RotateAndTranslatePoint`` with a numerically
   differentiated version of ``IntrinsicProjection``.


:class:`CostFunctionToFunctor`
------------------------------

.. class:: CostFunctionToFunctor

   Just like :class:`NumericDiffFunctor` allows numeric
   differentiation to be mixed with automatic differentiation,
   :class:`CostFunctionToFunctor` provides an even more general
   mechanism. :class:`CostFunctionToFunctor` is an adapter class that
   allows users to use :class:`CostFunction` objects in templated
   functors which are to be used for automatic differentiation. This
   allows the user to seamlessly mix analytic, numeric and automatic
   differentiation.

   For example, let us assume that

   .. code-block:: c++

     class IntrinsicProjection : public SizedCostFunction<2, 5, 3> {
       public:
         IntrinsicProjection(const double* observations);
         virtual bool Evaluate(double const* const* parameters,
                               double* residuals,
                               double** jacobians) const;
     };

   is a :class:`CostFunction` that implements the projection of a
   point in its local coordinate system onto its image plane and
   subtracts it from the observed point projection. It can compute its
   residuals and, via either analytic or numerical differentiation,
   its Jacobians.

   Now we would like to compose the action of this
   :class:`CostFunction` with the action of camera extrinsics, i.e.,
   rotation and translation. Say we have a templated function

   .. code-block:: c++

      template<typename T>
      void RotateAndTranslatePoint(const T* rotation,
                                   const T* translation,
                                   const T* point,
                                   T* result);

   Then we can now do the following,

   .. code-block:: c++

    struct CameraProjection {
      CameraProjection(double* observation) {
        intrinsic_projection_.reset(
            new CostFunctionToFunctor<2, 5, 3>(new IntrinsicProjection(observation)));
      }
      template <typename T>
      bool operator()(const T* rotation,
                      const T* translation,
                      const T* intrinsics,
                      const T* point,
                      T* residual) const {
        T transformed_point[3];
        RotateAndTranslatePoint(rotation, translation, point, transformed_point);

        // Note that we call intrinsic_projection_, just like it was
        // any other templated functor.
        return (*intrinsic_projection_)(intrinsics, transformed_point, residual);
      }

     private:
      scoped_ptr<CostFunctionToFunctor<2, 5, 3> > intrinsic_projection_;
    };

730
731:class:`ConditionedCostFunction`
732--------------------------------
733
734.. class:: ConditionedCostFunction
735
736   This class allows you to apply different conditioning to the residual
737   values of a wrapped cost function. An example where this is useful is
738   where you have an existing cost function that produces N values, but you
739   want the total cost to be something other than just the sum of these
740   squared values - maybe you want to apply a different scaling to some
741   values, to change their contribution to the cost.
742
743   Usage:
744
745   .. code-block:: c++
746
747       //  my_cost_function produces N residuals
748       CostFunction* my_cost_function = ...
749       CHECK_EQ(N, my_cost_function->num_residuals());
750       vector<CostFunction*> conditioners;
751
752       //  Make N 1x1 cost functions (1 parameter, 1 residual)
753       CostFunction* f_1 = ...
754       conditioners.push_back(f_1);
755
756       CostFunction* f_N = ...
757       conditioners.push_back(f_N);
758       ConditionedCostFunction* ccf =
759         new ConditionedCostFunction(my_cost_function, conditioners);
760
761
   Now ``ccf``'s ``residual[i]`` (i=0..N-1) will be passed through the
   :math:`i^{\text{th}}` conditioner.
764
765   .. code-block:: c++
766
767      ccf_residual[i] = f_i(my_cost_function_residual[i])
768
769   and the Jacobian will be affected appropriately.
770
771
772:class:`NormalPrior`
773--------------------
774
775.. class:: NormalPrior
776
777   .. code-block:: c++
778
779     class NormalPrior: public CostFunction {
780      public:
       // Check that the number of rows in the vector b is the same as
       // the number of columns in the matrix A; crash otherwise.
783       NormalPrior(const Matrix& A, const Vector& b);
784
785       virtual bool Evaluate(double const* const* parameters,
786                             double* residuals,
787                             double** jacobians) const;
788      };
789
790   Implements a cost function of the form
791
792   .. math::  cost(x) = ||A(x - b)||^2
793
   where the matrix :math:`A` and the vector :math:`b` are fixed and
   :math:`x` is the variable. In case the user is interested in
   implementing a cost function of the form
797
   .. math::  cost(x) = (x - \mu)^T S^{-1} (x - \mu)

   where :math:`\mu` is a vector and :math:`S` is a covariance matrix,
   then :math:`A = S^{-1/2}`, i.e., the matrix :math:`A` is the square
   root of the inverse of the covariance, also known as the stiffness
   matrix. There are, however, no restrictions on the shape of
   :math:`A`. It is free to be rectangular, which would be the case if
   the covariance matrix :math:`S` is rank deficient.
806
807
808
809.. _`section-loss_function`:
810
811:class:`LossFunction`
812---------------------
813
814.. class:: LossFunction
815
816   For least squares problems where the minimization may encounter
817   input terms that contain outliers, that is, completely bogus
818   measurements, it is important to use a loss function that reduces
819   their influence.
820
821   Consider a structure from motion problem. The unknowns are 3D
822   points and camera parameters, and the measurements are image
823   coordinates describing the expected reprojected position for a
824   point in a camera. For example, we want to model the geometry of a
825   street scene with fire hydrants and cars, observed by a moving
826   camera with unknown parameters, and the only 3D points we care
827   about are the pointy tippy-tops of the fire hydrants. Our magic
828   image processing algorithm, which is responsible for producing the
829   measurements that are input to Ceres, has found and matched all
   such tippy-tops in all image frames, except that in one of the
   frames it mistook a car's headlight for a hydrant. If we did
   nothing special, the residual for the erroneous measurement would
   pull the entire solution away from the optimum in order to reduce
   the large error that would otherwise be attributed to the wrong
   measurement.
836
837   Using a robust loss function, the cost for large residuals is
838   reduced. In the example above, this leads to outlier terms getting
839   down-weighted so they do not overly influence the final solution.
840
841   .. code-block:: c++
842
843    class LossFunction {
844     public:
845      virtual void Evaluate(double s, double out[3]) const = 0;
846    };
847
848
849   The key method is :func:`LossFunction::Evaluate`, which given a
850   non-negative scalar ``s``, computes
851
852   .. math:: out = \begin{bmatrix}\rho(s), & \rho'(s), & \rho''(s)\end{bmatrix}
853
854   Here the convention is that the contribution of a term to the cost
855   function is given by :math:`\frac{1}{2}\rho(s)`, where :math:`s
856   =\|f_i\|^2`. Calling the method with a negative value of :math:`s`
857   is an error and the implementations are not required to handle that
858   case.
859
860   Most sane choices of :math:`\rho` satisfy:
861
862   .. math::
863
864      \rho(0) &= 0\\
865      \rho'(0) &= 1\\
866      \rho'(s) &< 1 \text{ in the outlier region}\\
867      \rho''(s) &< 0 \text{ in the outlier region}
868
869   so that they mimic the squared cost for small residuals.
870
871   **Scaling**
872
873   Given one robustifier :math:`\rho(s)` one can change the length
874   scale at which robustification takes place, by adding a scale
875   factor :math:`a > 0` which gives us :math:`\rho(s,a) = a^2 \rho(s /
876   a^2)` and the first and second derivatives as :math:`\rho'(s /
877   a^2)` and :math:`(1 / a^2) \rho''(s / a^2)` respectively.
878
879
880   The reason for the appearance of squaring is that :math:`a` is in
881   the units of the residual vector norm whereas :math:`s` is a squared
882   norm. For applications it is more convenient to specify :math:`a` than
883   its square.
884
885Instances
886^^^^^^^^^
887
Ceres includes a number of predefined loss functions. For simplicity,
we describe their unscaled versions. The figure below illustrates
their shapes graphically. More details can be found in
``include/ceres/loss_function.h``.
892
893.. figure:: loss.png
894   :figwidth: 500px
895   :height: 400px
896   :align: center
897
898   Shape of the various common loss functions.
899
900.. class:: TrivialLoss
901
902      .. math:: \rho(s) = s
903
904.. class:: HuberLoss
905
906   .. math:: \rho(s) = \begin{cases} s & s \le 1\\ 2 \sqrt{s} - 1 & s > 1 \end{cases}
907
908.. class:: SoftLOneLoss
909
910   .. math:: \rho(s) = 2 (\sqrt{1+s} - 1)
911
912.. class:: CauchyLoss
913
914   .. math:: \rho(s) = \log(1 + s)
915
916.. class:: ArctanLoss
917
918   .. math:: \rho(s) = \arctan(s)
919
920.. class:: TolerantLoss
921
922   .. math:: \rho(s,a,b) = b \log(1 + e^{(s - a) / b}) - b \log(1 + e^{-a / b})
923
924.. class:: ComposedLoss
925
926   Given two loss functions ``f`` and ``g``, implements the loss
927   function ``h(s) = f(g(s))``.
928
929   .. code-block:: c++
930
931      class ComposedLoss : public LossFunction {
932       public:
933        explicit ComposedLoss(const LossFunction* f,
934                              Ownership ownership_f,
935                              const LossFunction* g,
936                              Ownership ownership_g);
937      };
938
939.. class:: ScaledLoss
940
941   Sometimes you want to simply scale the output value of the
942   robustifier. For example, you might want to weight different error
943   terms differently (e.g., weight pixel reprojection errors
944   differently from terrain errors).
945
946   Given a loss function :math:`\rho(s)` and a scalar :math:`a`, :class:`ScaledLoss`
947   implements the function :math:`a \rho(s)`.
948
   Since we treat a ``NULL`` loss function as the identity loss
   function, :math:`\rho` = ``NULL`` is a valid input and will result
   in the input being scaled by :math:`a`. This provides a simple way
   of implementing a scaled ResidualBlock.
953
954.. class:: LossFunctionWrapper
955
956   Sometimes after the optimization problem has been constructed, we
957   wish to mutate the scale of the loss function. For example, when
958   performing estimation from data which has substantial outliers,
959   convergence can be improved by starting out with a large scale,
960   optimizing the problem and then reducing the scale. This can have
961   better convergence behavior than just using a loss function with a
962   small scale.
963
   This class allows the user to implement a loss function whose scale
   can be mutated after an optimization problem has been constructed,
   e.g.,
967
968   .. code-block:: c++
969
970     Problem problem;
971
972     // Add parameter blocks
973
     CostFunction* cost_function =
         new AutoDiffCostFunction<UW_Camera_Mapper, 2, 9, 3>(
             new UW_Camera_Mapper(feature_x, feature_y));

     LossFunctionWrapper* loss_function =
         new LossFunctionWrapper(new HuberLoss(1.0), TAKE_OWNERSHIP);
     problem.AddResidualBlock(cost_function, loss_function, parameters);
980
981     Solver::Options options;
982     Solver::Summary summary;
983     Solve(options, &problem, &summary);
984
     loss_function->Reset(new HuberLoss(0.5), TAKE_OWNERSHIP);
986     Solve(options, &problem, &summary);
987
988
989Theory
990^^^^^^
991
Let us consider a problem with a single residual block and a single
parameter block.
994
995.. math::
996
997 \min_x \frac{1}{2}\rho(f^2(x))
998
999
1000Then, the robustified gradient and the Gauss-Newton Hessian are
1001
1002.. math::
1003
1004        g(x) &= \rho'J^\top(x)f(x)\\
1005        H(x) &= J^\top(x)\left(\rho' + 2 \rho''f(x)f^\top(x)\right)J(x)
1006
where the terms involving the second derivatives of :math:`f(x)` have
been ignored. Note that :math:`H(x)` is indefinite if
:math:`\rho''f(x)^\top f(x) + \frac{1}{2}\rho' < 0`. If this is not
the case, then it is possible to re-weight the residual and the
Jacobian matrix such that the resulting linear least squares problem
yields the robustified Gauss-Newton step.
1013
1014
1015Let :math:`\alpha` be a root of
1016
1017.. math:: \frac{1}{2}\alpha^2 - \alpha - \frac{\rho''}{\rho'}\|f(x)\|^2 = 0.
1018
1019
1020Then, define the rescaled residual and Jacobian as
1021
1022.. math::
1023
1024        \tilde{f}(x) &= \frac{\sqrt{\rho'}}{1 - \alpha} f(x)\\
1025        \tilde{J}(x) &= \sqrt{\rho'}\left(1 - \alpha
1026                        \frac{f(x)f^\top(x)}{\left\|f(x)\right\|^2} \right)J(x)
1027
1028
1029In the case :math:`2 \rho''\left\|f(x)\right\|^2 + \rho' \lesssim 0`,
1030we limit :math:`\alpha \le 1- \epsilon` for some small
1031:math:`\epsilon`. For more details see [Triggs]_.
1032
With this simple rescaling, one can apply any Jacobian based
non-linear least squares algorithm to robustified non-linear least
squares problems.
1036
1037
1038:class:`LocalParameterization`
1039------------------------------
1040
1041.. class:: LocalParameterization
1042
1043   .. code-block:: c++
1044
1045     class LocalParameterization {
1046      public:
1047       virtual ~LocalParameterization() {}
1048       virtual bool Plus(const double* x,
1049                         const double* delta,
1050                         double* x_plus_delta) const = 0;
1051       virtual bool ComputeJacobian(const double* x, double* jacobian) const = 0;
1052       virtual int GlobalSize() const = 0;
1053       virtual int LocalSize() const = 0;
1054     };
1055
1056   Sometimes the parameters :math:`x` can overparameterize a
1057   problem. In that case it is desirable to choose a parameterization
1058   to remove the null directions of the cost. More generally, if
1059   :math:`x` lies on a manifold of a smaller dimension than the
1060   ambient space that it is embedded in, then it is numerically and
1061   computationally more effective to optimize it using a
1062   parameterization that lives in the tangent space of that manifold
1063   at each point.
1064
1065   For example, a sphere in three dimensions is a two dimensional
1066   manifold, embedded in a three dimensional space. At each point on
1067   the sphere, the plane tangent to it defines a two dimensional
1068   tangent space. For a cost function defined on this sphere, given a
1069   point :math:`x`, moving in the direction normal to the sphere at
   that point is not useful. Thus a better way to parameterize a point
   on a sphere is to optimize over a two dimensional vector
   :math:`\Delta x` in the tangent space at that point and then "move"
   to the point :math:`x + \Delta x`, where the move operation
   involves projecting back onto the sphere. Doing so removes a
   redundant dimension from the optimization, making it numerically
   more robust and efficient.
1077
1078   More generally we can define a function
1079
1080   .. math:: x' = \boxplus(x, \Delta x),
1081
   where :math:`x'` has the same size as :math:`x`, and :math:`\Delta
   x` is of dimension less than or equal to that of :math:`x`. The
   function :math:`\boxplus` generalizes the definition of vector
   addition. Thus it satisfies the identity
1086
1087   .. math:: \boxplus(x, 0) = x,\quad \forall x.
1088
1089   Instances of :class:`LocalParameterization` implement the
1090   :math:`\boxplus` operation and its derivative with respect to
1091   :math:`\Delta x` at :math:`\Delta x = 0`.
1092
1093
1094.. function:: int LocalParameterization::GlobalSize()
1095
1096   The dimension of the ambient space in which the parameter block
1097   :math:`x` lives.
1098
.. function:: int LocalParameterization::LocalSize()

   The size of the tangent space that :math:`\Delta x` lives in.
1103
1104.. function:: bool LocalParameterization::Plus(const double* x, const double* delta, double* x_plus_delta) const
1105
1106    :func:`LocalParameterization::Plus` implements :math:`\boxplus(x,\Delta x)`.
1107
1108.. function:: bool LocalParameterization::ComputeJacobian(const double* x, double* jacobian) const
1109
1110   Computes the Jacobian matrix
1111
1112   .. math:: J = \left . \frac{\partial }{\partial \Delta x} \boxplus(x,\Delta x)\right|_{\Delta x = 0}
1113
1114   in row major form.
1115
1116Instances
1117^^^^^^^^^
1118
1119.. class:: IdentityParameterization
1120
1121   A trivial version of :math:`\boxplus` is when :math:`\Delta x` is
1122   of the same size as :math:`x` and
1123
1124   .. math::  \boxplus(x, \Delta x) = x + \Delta x
1125
1126.. class:: SubsetParameterization
1127
   A more interesting case arises when :math:`x` is a two dimensional
   vector and the user wishes to hold the first coordinate
   constant. Then, :math:`\Delta x` is a scalar and :math:`\boxplus`
   is defined as
1131
1132   .. math::
1133
1134      \boxplus(x, \Delta x) = x + \left[ \begin{array}{c} 0 \\ 1
1135                                  \end{array} \right] \Delta x
1136
1137   :class:`SubsetParameterization` generalizes this construction to
1138   hold any part of a parameter block constant.
1139
1140.. class:: QuaternionParameterization
1141
1142   Another example that occurs commonly in Structure from Motion
1143   problems is when camera rotations are parameterized using a
   quaternion. There, it is useful to only make updates orthogonal to
   the 4-vector defining the quaternion. One way to do this is to let
1146   :math:`\Delta x` be a 3 dimensional vector and define
1147   :math:`\boxplus` to be
1148
1149    .. math:: \boxplus(x, \Delta x) = \left[ \cos(|\Delta x|), \frac{\sin\left(|\Delta x|\right)}{|\Delta x|} \Delta x \right] * x
1150      :label: quaternion
1151
1152   The multiplication between the two 4-vectors on the right hand side
1153   is the standard quaternion
1154   product. :class:`QuaternionParameterization` is an implementation
1155   of :eq:`quaternion`.
1156
1157
1158
1159:class:`AutoDiffLocalParameterization`
1160--------------------------------------
1161
1162.. class:: AutoDiffLocalParameterization
1163
1164  :class:`AutoDiffLocalParameterization` does for
1165  :class:`LocalParameterization` what :class:`AutoDiffCostFunction`
1166  does for :class:`CostFunction`. It allows the user to define a
1167  templated functor that implements the
1168  :func:`LocalParameterization::Plus` operation and it uses automatic
1169  differentiation to implement the computation of the Jacobian.
1170
1171  To get an auto differentiated local parameterization, you must
1172  define a class with a templated operator() (a functor) that computes
1173
1174     .. math:: x' = \boxplus(x, \Delta x),
1175
  For example, quaternions have a three dimensional local
  parameterization. Its plus operation can be implemented as (taken
  from `internal/ceres/autodiff_local_parameterization_test.cc
  <https://ceres-solver.googlesource.com/ceres-solver/+/master/internal/ceres/autodiff_local_parameterization_test.cc>`_):
1181
1182    .. code-block:: c++
1183
1184      struct QuaternionPlus {
1185        template<typename T>
1186        bool operator()(const T* x, const T* delta, T* x_plus_delta) const {
1187          const T squared_norm_delta =
1188              delta[0] * delta[0] + delta[1] * delta[1] + delta[2] * delta[2];
1189
1190          T q_delta[4];
1191          if (squared_norm_delta > T(0.0)) {
1192            T norm_delta = sqrt(squared_norm_delta);
1193            const T sin_delta_by_delta = sin(norm_delta) / norm_delta;
1194            q_delta[0] = cos(norm_delta);
1195            q_delta[1] = sin_delta_by_delta * delta[0];
1196            q_delta[2] = sin_delta_by_delta * delta[1];
1197            q_delta[3] = sin_delta_by_delta * delta[2];
1198          } else {
1199            // We do not just use q_delta = [1,0,0,0] here because that is a
1200            // constant and when used for automatic differentiation will
1201            // lead to a zero derivative. Instead we take a first order
1202            // approximation and evaluate it at zero.
1203            q_delta[0] = T(1.0);
1204            q_delta[1] = delta[0];
1205            q_delta[2] = delta[1];
1206            q_delta[3] = delta[2];
1207          }
1208
          QuaternionProduct(q_delta, x, x_plus_delta);
1210          return true;
1211        }
1212      };
1213
1214  Then given this struct, the auto differentiated local
1215  parameterization can now be constructed as
1216
1217  .. code-block:: c++
1218
1219     LocalParameterization* local_parameterization =
1220         new AutoDiffLocalParameterization<QuaternionPlus, 4, 3>;
1221                                                           |  |
1222                                Global Size ---------------+  |
1223                                Local Size -------------------+
1224
  **WARNING:** Since the functor will get instantiated with different
  types for ``T``, you must convert from other numeric types to ``T``
  before mixing them in computations with other variables of type
  ``T``. In the example above, this is why the constant ``1.0`` is
  wrapped as ``T(1.0)``.
1230
1231
1232:class:`Problem`
1233----------------
1234
1235.. class:: Problem
1236
1237   :class:`Problem` holds the robustified bounds constrained
1238   non-linear least squares problem :eq:`ceresproblem`. To create a
1239   least squares problem, use the :func:`Problem::AddResidualBlock`
1240   and :func:`Problem::AddParameterBlock` methods.
1241
1242   For example a problem containing 3 parameter blocks of sizes 3, 4
1243   and 5 respectively and two residual blocks of size 2 and 6:
1244
1245   .. code-block:: c++
1246
1247     double x1[] = { 1.0, 2.0, 3.0 };
1248     double x2[] = { 1.0, 2.0, 3.0, 5.0 };
1249     double x3[] = { 1.0, 2.0, 3.0, 6.0, 7.0 };
1250
1251     Problem problem;
     problem.AddResidualBlock(new MyUnaryCostFunction(...), NULL, x1);
     problem.AddResidualBlock(new MyBinaryCostFunction(...), NULL, x2, x3);
1254
   :func:`Problem::AddResidualBlock` as the name implies, adds a
   residual block to the problem. It adds a :class:`CostFunction`, an
   optional :class:`LossFunction` and connects the
   :class:`CostFunction` to a set of parameter blocks.
1259
1260   The cost function carries with it information about the sizes of
1261   the parameter blocks it expects. The function checks that these
1262   match the sizes of the parameter blocks listed in
1263   ``parameter_blocks``. The program aborts if a mismatch is
1264   detected. ``loss_function`` can be ``NULL``, in which case the cost
1265   of the term is just the squared norm of the residuals.
1266
1267   The user has the option of explicitly adding the parameter blocks
1268   using :func:`Problem::AddParameterBlock`. This causes additional
1269   correctness checking; however, :func:`Problem::AddResidualBlock`
1270   implicitly adds the parameter blocks if they are not present, so
1271   calling :func:`Problem::AddParameterBlock` explicitly is not
1272   required.
1273
1274   :func:`Problem::AddParameterBlock` explicitly adds a parameter
1275   block to the :class:`Problem`. Optionally it allows the user to
1276   associate a :class:`LocalParameterization` object with the
1277   parameter block too. Repeated calls with the same arguments are
1278   ignored. Repeated calls with the same double pointer but a
1279   different size results in undefined behavior.
1280
   You can set any parameter block to be constant using
   :func:`Problem::SetParameterBlockConstant` and undo this using
   :func:`Problem::SetParameterBlockVariable`.
1284
   In fact you can set any number of parameter blocks to be constant,
   and Ceres is smart enough to figure out which part of the problem
   depends on the parameter blocks that are free to change, and it
   only spends time solving that part. For example, suppose you
   constructed a problem with a million parameter blocks and 2 million
   residual blocks, then set all but one parameter block to be
   constant, and only 10 residual blocks depend on this one
   non-constant parameter block. Then the computational effort Ceres
   spends in solving this problem will be the same as if you had
   defined a problem with one parameter block and 10 residual blocks.
1295
1296   **Ownership**
1297
1298   :class:`Problem` by default takes ownership of the
1299   ``cost_function``, ``loss_function`` and ``local_parameterization``
1300   pointers. These objects remain live for the life of the
1301   :class:`Problem`. If the user wishes to keep control over the
1302   destruction of these objects, then they can do this by setting the
1303   corresponding enums in the :class:`Problem::Options` struct.
1304
1305   Note that even though the Problem takes ownership of ``cost_function``
1306   and ``loss_function``, it does not preclude the user from re-using
1307   them in another residual block. The destructor takes care to call
1308   delete on each ``cost_function`` or ``loss_function`` pointer only
1309   once, regardless of how many residual blocks refer to them.
1310
1311.. function:: ResidualBlockId Problem::AddResidualBlock(CostFunction* cost_function, LossFunction* loss_function, const vector<double*> parameter_blocks)
1312
1313   Add a residual block to the overall cost function. The cost
1314   function carries with it information about the sizes of the
1315   parameter blocks it expects. The function checks that these match
1316   the sizes of the parameter blocks listed in parameter_blocks. The
1317   program aborts if a mismatch is detected. loss_function can be
1318   NULL, in which case the cost of the term is just the squared norm
1319   of the residuals.
1320
1321   The user has the option of explicitly adding the parameter blocks
1322   using AddParameterBlock. This causes additional correctness
1323   checking; however, AddResidualBlock implicitly adds the parameter
1324   blocks if they are not present, so calling AddParameterBlock
1325   explicitly is not required.
1326
1327   The Problem object by default takes ownership of the
1328   cost_function and loss_function pointers. These objects remain
1329   live for the life of the Problem object. If the user wishes to
1330   keep control over the destruction of these objects, then they can
1331   do this by setting the corresponding enums in the Options struct.
1332
1333   Note: Even though the Problem takes ownership of cost_function
1334   and loss_function, it does not preclude the user from re-using
1335   them in another residual block. The destructor takes care to call
1336   delete on each cost_function or loss_function pointer only once,
1337   regardless of how many residual blocks refer to them.
1338
1339   Example usage:
1340
1341   .. code-block:: c++
1342
1343      double x1[] = {1.0, 2.0, 3.0};
1344      double x2[] = {1.0, 2.0, 5.0, 6.0};
1345      double x3[] = {3.0, 6.0, 2.0, 5.0, 1.0};
1346
1347      Problem problem;
1348
1349      problem.AddResidualBlock(new MyUnaryCostFunction(...), NULL, x1);
1350      problem.AddResidualBlock(new MyBinaryCostFunction(...), NULL, x2, x1);
1351
1352
1353.. function:: void Problem::AddParameterBlock(double* values, int size, LocalParameterization* local_parameterization)
1354
   Add a parameter block with appropriate size and local
   parameterization to the problem. Repeated calls with the same
   arguments are ignored. Repeated calls with the same double pointer
   but a different size results in undefined behavior.
1359
1360.. function:: void Problem::AddParameterBlock(double* values, int size)
1361
   Add a parameter block with appropriate size to the problem.
   Repeated calls with the same arguments are ignored. Repeated calls
   with the same double pointer but a different size results in
   undefined behavior.
1366
1367.. function:: void Problem::RemoveResidualBlock(ResidualBlockId residual_block)
1368
   Remove a residual block from the problem. Any parameters that the
   residual block depends on are not removed. The cost and loss
   functions for the residual block will not get deleted immediately;
   that only happens when the problem itself is deleted. If
   ``Problem::Options::enable_fast_removal`` is true, then the removal
   is fast (almost constant time). Otherwise, removing a residual
   block will incur a scan of the entire :class:`Problem` object to
   verify that the residual_block represents a valid residual in the
   problem.
1376
1377   **WARNING:** Removing a residual or parameter block will destroy
1378   the implicit ordering, rendering the jacobian or residuals returned
1379   from the solver uninterpretable. If you depend on the evaluated
1380   jacobian, do not use remove! This may change in a future release.
1382
1383.. function:: void Problem::RemoveParameterBlock(double* values)
1384
1385   Remove a parameter block from the problem. The parameterization of
1386   the parameter block, if it exists, will persist until the deletion
1387   of the problem (similar to cost/loss functions in residual block
1388   removal). Any residual blocks that depend on the parameter are also
1389   removed, as described above in RemoveResidualBlock().  If
1390   Problem::Options::enable_fast_removal is true, then
1391   the removal is fast (almost constant time). Otherwise, removing a
1392   parameter block will incur a scan of the entire Problem object.
1393
1394   **WARNING:** Removing a residual or parameter block will destroy
1395   the implicit ordering, rendering the jacobian or residuals returned
1396   from the solver uninterpretable. If you depend on the evaluated
1397   jacobian, do not use remove! This may change in a future release.
1398
1399.. function:: void Problem::SetParameterBlockConstant(double* values)
1400
1401   Hold the indicated parameter block constant during optimization.
1402
1403.. function:: void Problem::SetParameterBlockVariable(double* values)
1404
1405   Allow the indicated parameter to vary during optimization.
1406
1407.. function:: void Problem::SetParameterization(double* values, LocalParameterization* local_parameterization)
1408
1409   Set the local parameterization for one of the parameter blocks.
1410   The local_parameterization is owned by the Problem by default. It
1411   is acceptable to set the same parameterization for multiple
1412   parameters; the destructor is careful to delete local
1413   parameterizations only once. The local parameterization can only be
1414   set once per parameter, and cannot be changed once set.
1415
1416.. function:: LocalParameterization* Problem::GetParameterization(double* values) const
1417
   Get the local parameterization object associated with this
   parameter block. If there is no parameterization object associated
   with the parameter block, then ``NULL`` is returned.
1421
1422.. function:: void Problem::SetParameterLowerBound(double* values, int index, double lower_bound)
1423
1424   Set the lower bound for the parameter at position `index` in the
1425   parameter block corresponding to `values`. By default the lower
1426   bound is :math:`-\infty`.
1427
1428.. function:: void Problem::SetParameterUpperBound(double* values, int index, double upper_bound)
1429
1430   Set the upper bound for the parameter at position `index` in the
1431   parameter block corresponding to `values`. By default the value is
1432   :math:`\infty`.
1433
1434.. function:: int Problem::NumParameterBlocks() const
1435
   Number of parameter blocks in the problem. Always equals
   ``parameter_blocks().size()`` and ``parameter_block_sizes().size()``.
1438
1439.. function:: int Problem::NumParameters() const
1440
1441   The size of the parameter vector obtained by summing over the sizes
1442   of all the parameter blocks.
1443
1444.. function:: int Problem::NumResidualBlocks() const
1445
   Number of residual blocks in the problem. Always equals
   ``residual_blocks().size()``.
1448
1449.. function:: int Problem::NumResiduals() const
1450
1451   The size of the residual vector obtained by summing over the sizes
1452   of all of the residual blocks.
1453
1454.. function:: int Problem::ParameterBlockSize(const double* values) const
1455
1456   The size of the parameter block.
1457
1458.. function:: int Problem::ParameterBlockLocalSize(const double* values) const
1459
   The size of the local parameterization for the parameter block. If
   there is no local parameterization associated with this parameter
   block, then ``ParameterBlockLocalSize`` = ``ParameterBlockSize``.
1463
1464.. function:: bool Problem::HasParameterBlock(const double* values) const
1465
1466   Is the given parameter block present in the problem or not?
1467
1468.. function:: void Problem::GetParameterBlocks(vector<double*>* parameter_blocks) const
1469
   Fills the passed ``parameter_blocks`` vector with pointers to the
   parameter blocks currently in the problem. After this call,
   ``parameter_blocks.size() == NumParameterBlocks()``.
1473
1474.. function:: void Problem::GetResidualBlocks(vector<ResidualBlockId>* residual_blocks) const
1475
1476   Fills the passed `residual_blocks` vector with pointers to the
1477   residual blocks currently in the problem. After this call,
1478   `residual_blocks.size() == NumResidualBlocks`.
1479
1480.. function:: void Problem::GetParameterBlocksForResidualBlock(const ResidualBlockId residual_block, vector<double*>* parameter_blocks) const
1481
   Get all the parameter blocks that the given residual block depends
   on.
1484
1485.. function:: void Problem::GetResidualBlocksForParameterBlock(const double* values, vector<ResidualBlockId>* residual_blocks) const
1486
1487   Get all the residual blocks that depend on the given parameter
1488   block.
1489
1490   If `Problem::Options::enable_fast_removal` is
1491   `true`, then getting the residual blocks is fast and depends only
1492   on the number of residual blocks. Otherwise, getting the residual
1493   blocks for a parameter block will incur a scan of the entire
1494   :class:`Problem` object.
1495
1496.. function:: bool Problem::Evaluate(const Problem::EvaluateOptions& options, double* cost, vector<double>* residuals, vector<double>* gradient, CRSMatrix* jacobian)
1497
   Evaluate a :class:`Problem`. Any of the output pointers can be
   ``NULL``. Which residual blocks and parameter blocks are used is
   controlled by the :class:`Problem::EvaluateOptions` struct below.
1501
1502   .. code-block:: c++
1503
1504     Problem problem;
1505     double x = 1;
     problem.AddResidualBlock(new MyCostFunction, NULL, &x);
1507
1508     double cost = 0.0;
1509     problem.Evaluate(Problem::EvaluateOptions(), &cost, NULL, NULL, NULL);
1510
1511   The cost is evaluated at `x = 1`. If you wish to evaluate the
1512   problem at `x = 2`, then
1513
1514   .. code-block:: c++
1515
1516      x = 2;
1517      problem.Evaluate(Problem::EvaluateOptions(), &cost, NULL, NULL, NULL);
1518
1519   is the way to do so.
1520
1521   **NOTE** If no local parameterizations are used, then the size of
1522   the gradient vector is the sum of the sizes of all the parameter
1523   blocks. If a parameter block has a local parameterization, then
1524   it contributes "LocalSize" entries to the gradient vector.
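
   A self-contained sketch (the functor ``MyCost`` below is
   hypothetical) that also retrieves the residuals and the gradient:

   .. code-block:: c++

     #include <vector>
     #include "ceres/ceres.h"

     // Hypothetical residual f(x) = 10 - x, so cost = 0.5 * (10 - x)^2.
     struct MyCost {
       template <typename T>
       bool operator()(const T* const x, T* residual) const {
         residual[0] = T(10.0) - x[0];
         return true;
       }
     };

     int main() {
       double x = 1.0;
       ceres::Problem problem;
       problem.AddResidualBlock(
           new ceres::AutoDiffCostFunction<MyCost, 1, 1>(new MyCost),
           NULL, &x);

       double cost = 0.0;
       std::vector<double> residuals, gradient;
       problem.Evaluate(ceres::Problem::EvaluateOptions(),
                        &cost, &residuals, &gradient, NULL);
       // cost == 0.5 * 9^2 = 40.5, residuals == {9}, gradient == {-9}.
       return 0;
     }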
1525
1526.. class:: Problem::EvaluateOptions
1527
1528   Options struct that is used to control :func:`Problem::Evaluate`.
1529
1530.. member:: vector<double*> Problem::EvaluateOptions::parameter_blocks
1531
1532   The set of parameter blocks for which evaluation should be
1533   performed. This vector determines the order in which parameter
1534   blocks occur in the gradient vector and in the columns of the
   jacobian matrix. If ``parameter_blocks`` is empty, then it is assumed
1536   to be equal to a vector containing ALL the parameter
1537   blocks. Generally speaking the ordering of the parameter blocks in
1538   this case depends on the order in which they were added to the
1539   problem and whether or not the user removed any parameter blocks.
1540
   **NOTE** This vector should contain the same pointers as the ones
   used to add parameter blocks to the Problem. These parameter blocks
   should NOT point to new memory locations. Bad things will happen if
   you do.
1545
1546.. member:: vector<ResidualBlockId> Problem::EvaluateOptions::residual_blocks
1547
   The set of residual blocks for which evaluation should be
   performed. This vector determines the order in which the residuals
   occur, and how the rows of the jacobian are ordered. If
   ``residual_blocks`` is empty, then it is assumed to be equal to the
   vector containing all the residual blocks.
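
   A sketch (the functor ``PairCost`` below is hypothetical) showing how
   these two vectors control the ordering and scope of evaluation:

   .. code-block:: c++

     #include <vector>
     #include "ceres/ceres.h"

     // Hypothetical residual f(x, y) = x + 2 y.
     struct PairCost {
       template <typename T>
       bool operator()(const T* const x, const T* const y,
                       T* residual) const {
         residual[0] = x[0] + T(2.0) * y[0];
         return true;
       }
     };

     int main() {
       double x = 1.0, y = 2.0;
       ceres::Problem problem;
       ceres::ResidualBlockId id = problem.AddResidualBlock(
           new ceres::AutoDiffCostFunction<PairCost, 1, 1, 1>(new PairCost),
           NULL, &x, &y);

       // Put y's column before x's and evaluate only this residual block.
       ceres::Problem::EvaluateOptions options;
       options.parameter_blocks.push_back(&y);
       options.parameter_blocks.push_back(&x);
       options.residual_blocks.push_back(id);

       double cost = 0.0;
       std::vector<double> gradient;
       problem.Evaluate(options, &cost, NULL, &gradient, NULL);
       // With f = x + 2 y = 5: gradient == {d cost/dy, d cost/dx} = {10, 5}.
       return 0;
     }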
1553
1554``rotation.h``
1555--------------
1556
1557Many applications of Ceres Solver involve optimization problems where
some of the variables correspond to rotations. To ease the pain of
working with the various representations of rotations (angle-axis,
1560quaternion and matrix) we provide a handy set of templated
1561functions. These functions are templated so that the user can use them
1562within Ceres Solver's automatic differentiation framework.
1563
1564.. function:: void AngleAxisToQuaternion<T>(T const* angle_axis, T* quaternion)
1565
1566   Convert a value in combined axis-angle representation to a
1567   quaternion.
1568
1569   The value ``angle_axis`` is a triple whose norm is an angle in radians,
1570   and whose direction is aligned with the axis of rotation, and
1571   ``quaternion`` is a 4-tuple that will contain the resulting quaternion.
1572
1573.. function:: void QuaternionToAngleAxis<T>(T const* quaternion, T* angle_axis)
1574
1575   Convert a quaternion to the equivalent combined axis-angle
1576   representation.
1577
1578   The value ``quaternion`` must be a unit quaternion - it is not
1579   normalized first, and ``angle_axis`` will be filled with a value
1580   whose norm is the angle of rotation in radians, and whose direction
1581   is the axis of rotation.
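
   A small round-trip sketch using both conversions (Ceres quaternions
   are stored in ``[w, x, y, z]`` order):

   .. code-block:: c++

     #include <cmath>
     #include "ceres/rotation.h"

     int main() {
       // 90 degree rotation about the z-axis in angle-axis form.
       const double angle_axis[3] = {0.0, 0.0, M_PI / 2.0};

       double quaternion[4];
       ceres::AngleAxisToQuaternion(angle_axis, quaternion);
       // quaternion is approximately {cos(pi/4), 0, 0, sin(pi/4)}.

       double round_trip[3];
       ceres::QuaternionToAngleAxis(quaternion, round_trip);
       // round_trip recovers angle_axis up to floating point error.
       return 0;
     }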
1582
1583.. function:: void RotationMatrixToAngleAxis<T, row_stride, col_stride>(const MatrixAdapter<const T, row_stride, col_stride>& R, T * angle_axis)
1584.. function:: void AngleAxisToRotationMatrix<T, row_stride, col_stride>(T const * angle_axis, const MatrixAdapter<T, row_stride, col_stride>& R)
1585.. function:: void RotationMatrixToAngleAxis<T>(T const * R, T * angle_axis)
1586.. function:: void AngleAxisToRotationMatrix<T>(T const * angle_axis, T * R)
1587
1588   Conversions between 3x3 rotation matrix with given column and row strides and
1589   axis-angle rotation representations. The functions that take a pointer to T instead
1590   of a MatrixAdapter assume a column major representation with unit row stride and a column stride of 3.
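
   As a sketch, both overloads for the same 90 degree z-rotation, using
   the ``RowMajorAdapter3x3`` helper from ``rotation.h`` for the
   adapter form:

   .. code-block:: c++

     #include <cmath>
     #include "ceres/rotation.h"

     int main() {
       // 90 degree rotation about the z-axis in angle-axis form.
       const double angle_axis[3] = {0.0, 0.0, M_PI / 2.0};

       // Pointer overload: column-major output, R[0..2] is column 0.
       double R[9];
       ceres::AngleAxisToRotationMatrix(angle_axis, R);
       // Column 0 is the image of the x-axis: approximately (0, 1, 0).

       // MatrixAdapter overload: here a row-major layout of the same matrix.
       double R_row[9];
       ceres::AngleAxisToRotationMatrix(angle_axis,
                                        ceres::RowMajorAdapter3x3(R_row));
       return 0;
     }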
1591
1592.. function:: void EulerAnglesToRotationMatrix<T, row_stride, col_stride>(const T* euler, const MatrixAdapter<T, row_stride, col_stride>& R)
1593.. function:: void EulerAnglesToRotationMatrix<T>(const T* euler, int row_stride, T* R)
1594
1595   Conversions between 3x3 rotation matrix with given column and row strides and
1596   Euler angle (in degrees) rotation representations.
1597
1598   The {pitch,roll,yaw} Euler angles are rotations around the {x,y,z}
1599   axes, respectively.  They are applied in that same order, so the
1600   total rotation R is Rz * Ry * Rx.
1601
1602   The function that takes a pointer to T as the rotation matrix assumes a row
1603   major representation with unit column stride and a row stride of 3.
   The additional parameter ``row_stride`` is required to be 3.
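
   As a sketch, building the rotation for a pure yaw of 90 degrees
   (angles are in degrees, ``row_stride`` must be 3):

   .. code-block:: c++

     #include <cmath>
     #include "ceres/rotation.h"

     int main() {
       // {pitch, roll, yaw} in degrees; only yaw (about z) is non-zero.
       const double euler[3] = {0.0, 0.0, 90.0};
       double R[9];  // row-major 3x3
       ceres::EulerAnglesToRotationMatrix(euler, 3, R);
       // With R = Rz * Ry * Rx this is the 90 degree z-rotation, so in
       // row-major order R[1] is approximately -1 and R[3] is 1.
       return 0;
     }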
1605
1606.. function:: void QuaternionToScaledRotation<T, row_stride, col_stride>(const T q[4], const MatrixAdapter<T, row_stride, col_stride>& R)
1607.. function:: void QuaternionToScaledRotation<T>(const T q[4], T R[3 * 3])
1608
1609   Convert a 4-vector to a 3x3 scaled rotation matrix.
1610
1611   The choice of rotation is such that the quaternion
1612   :math:`\begin{bmatrix} 1 &0 &0 &0\end{bmatrix}` goes to an identity
1613   matrix and for small :math:`a, b, c` the quaternion
1614   :math:`\begin{bmatrix}1 &a &b &c\end{bmatrix}` goes to the matrix
1615
1616   .. math::
1617
1618     I + 2 \begin{bmatrix} 0 & -c & b \\ c & 0 & -a\\ -b & a & 0
1619           \end{bmatrix} + O(q^2)
1620
1621   which corresponds to a Rodrigues approximation, the last matrix
1622   being the cross-product matrix of :math:`\begin{bmatrix} a& b&
1623   c\end{bmatrix}`. Together with the property that :math:`R(q1 * q2)
1624   = R(q1) * R(q2)` this uniquely defines the mapping from :math:`q` to
1625   :math:`R`.
1626
1627   In the function that accepts a pointer to T instead of a MatrixAdapter,
1628   the rotation matrix ``R`` is a row-major matrix with unit column stride
1629   and a row stride of 3.
1630
1631   No normalization of the quaternion is performed, i.e.
1632   :math:`R = \|q\|^2  Q`, where :math:`Q` is an orthonormal matrix
1633   such that :math:`\det(Q) = 1` and :math:`Q*Q' = I`.
1634
1635
.. function:: void QuaternionToRotation<T, row_stride, col_stride>(const T q[4], const MatrixAdapter<T, row_stride, col_stride>& R)
1637.. function:: void QuaternionToRotation<T>(const T q[4], T R[3 * 3])
1638
1639   Same as above except that the rotation matrix is normalized by the
1640   Frobenius norm, so that :math:`R R' = I` (and :math:`\det(R) = 1`).
1641
1642.. function:: void UnitQuaternionRotatePoint<T>(const T q[4], const T pt[3], T result[3])
1643
   Rotates a point ``pt`` by a quaternion ``q``:
1645
1646   .. math:: \text{result} = R(q)  \text{pt}
1647
1648   Assumes the quaternion is unit norm. If you pass in a quaternion
1649   with :math:`|q|^2 = 2` then you WILL NOT get back 2 times the
1650   result you get for a unit quaternion.
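
   For example, a minimal sketch rotating the x-axis by 90 degrees
   about z:

   .. code-block:: c++

     #include <cmath>
     #include "ceres/rotation.h"

     int main() {
       // Unit quaternion [w, x, y, z] for 90 degrees about the z-axis.
       const double q[4] = {std::sqrt(0.5), 0.0, 0.0, std::sqrt(0.5)};
       const double pt[3] = {1.0, 0.0, 0.0};
       double result[3];
       ceres::UnitQuaternionRotatePoint(q, pt, result);
       // result is approximately (0, 1, 0).
       return 0;
     }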
1651
1652
1653.. function:: void QuaternionRotatePoint<T>(const T q[4], const T pt[3], T result[3])
1654
1655   With this function you do not need to assume that q has unit norm.
1656   It does assume that the norm is non-zero.
1657
1658.. function:: void QuaternionProduct<T>(const T z[4], const T w[4], T zw[4])
1659
1660   .. math:: zw = z * w
1661
   where :math:`*` is the quaternion product between 4-vectors.
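
   As a sketch, composing two 90 degree z-rotations into a 180 degree
   one:

   .. code-block:: c++

     #include <cmath>
     #include "ceres/rotation.h"

     int main() {
       // z: 90 degrees about the z-axis, in [w, x, y, z] order.
       const double z[4] = {std::sqrt(0.5), 0.0, 0.0, std::sqrt(0.5)};
       double zz[4];
       ceres::QuaternionProduct(z, z, zz);
       // Two 90 degree z-rotations compose into 180 degrees about z:
       // zz is approximately {0, 0, 0, 1}.
       return 0;
     }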
1663
1664
1665.. function:: void CrossProduct<T>(const T x[3], const T y[3], T x_cross_y[3])
1666
1667   .. math:: \text{x_cross_y} = x \times y
1668
1669.. function:: void AngleAxisRotatePoint<T>(const T angle_axis[3], const T pt[3], T result[3])
1670
   .. math:: \text{result} = R(\text{angle_axis}) \text{pt}
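
   For example, a small sketch rotating the x-axis by 90 degrees about
   z directly from the angle-axis representation:

   .. code-block:: c++

     #include <cmath>
     #include "ceres/rotation.h"

     int main() {
       // 90 degree rotation about the z-axis in angle-axis form.
       const double angle_axis[3] = {0.0, 0.0, M_PI / 2.0};
       const double pt[3] = {1.0, 0.0, 0.0};
       double result[3];
       ceres::AngleAxisRotatePoint(angle_axis, pt, result);
       // result is approximately (0, 1, 0).
       return 0;
     }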
1672