• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1[/
2Copyright (c) 2019 Nick Thompson
3Use, modification and distribution are subject to the
4Boost Software License, Version 1.0. (See accompanying file
5LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
6]
7
8[section:diff Lanczos Smoothing Derivatives]
9
10[heading Synopsis]
11
12``
13#include <boost/math/differentiation/lanczos_smoothing.hpp>
14
15namespace boost::math::differentiation {
16
17    template <class Real, size_t order=1>
18    class discrete_lanczos_derivative {
19    public:
20        discrete_lanczos_derivative(Real spacing,
21                                    size_t n = 18,
22                                    size_t approximation_order = 3);
23
24        template<class RandomAccessContainer>
25        Real operator()(RandomAccessContainer const & v, size_t i) const;
26
27        template<class RandomAccessContainer>
28        void operator()(RandomAccessContainer const & v, RandomAccessContainer & dvdt) const;
29
30        template<class RandomAccessContainer>
31        RandomAccessContainer operator()(RandomAccessContainer const & v) const;
32
33        Real get_spacing() const;
34    };
35
36} // namespaces
37``
38
39[heading Description]
40
41The `discrete_lanczos_derivative` class calculates a finite-difference approximation to the derivative of a noisy sequence of equally-spaced values /v/.
42A basic usage is
43
44    std::vector<double> v(500);
45    // fill v with noisy data.
46    double spacing = 0.001;
47    using boost::math::differentiation::discrete_lanczos_derivative;
48    auto lanczos = discrete_lanczos_derivative(spacing);
49    // Compute dvdt at index 30:
50    double dvdt30 = lanczos(v, 30);
51    // Compute derivative of entire vector:
52    std::vector<double> dvdt = lanczos(v);
53
54Noise-suppressing second derivatives can also be computed.
55The syntax is as follows:
56
57    std::vector<double> v(500);
58    // fill v with noisy data.
59    auto lanczos = lanczos_derivative<double, 2>(spacing);
60    // evaluate second derivative at a point:
61    double d2vdt2 = lanczos(v, 18);
62    // evaluate second derivative of entire vector:
63    std::vector<double> d2vdt2 = lanczos(v);
64
65For memory conscious programmers, you can reuse the memory space for the derivative as follows:
66
67    std::vector<double> v(500);
68    std::vector<double> dvdt(500);
69    // . . . define spacing, create and instance of discrete_lanczos_derivative
70
71    // populate dvdt, perhaps in a loop:
72    lanczos(v, dvdt);
73
74
75If the data has variance \u03C3[super 2],
76then the variance of the computed derivative is roughly \u03C3[super 2]/p/[super 3] /n/[super -3] \u0394 /t/[super -2],
77i.e., it increases cubically with the approximation order /p/, linearly with the data variance,
78and decreases at the cube of the filter length /n/.
79In addition, we must not forget the discretization error which is /O/(\u0394 /t/[super /p/]).
80You can play around with the approximation order /p/ and the filter length /n/:
81
82    size_t n = 12;
83    size_t p = 2;
84    auto lanczos = lanczos_derivative(spacing, n, p);
85    double dvdt = lanczos(v, i);
86
87If /p=2n/, then the discrete Lanczos derivative is not smoothing:
88It reduces to the standard /2n+1/-point finite-difference formula.
89For /p>2n/, an assertion is hit as the filter is undefined.
90
91In our tests with AWGN, we have found the error decreases monotonically with /n/,
92as is expected from the theory discussed above.
93So the choice of /n/ is simple:
94As high as possible given your speed requirements (larger /n/ implies a longer filter and hence more compute),
95balanced against the danger of overfitting and averaging over non-stationarity.
96
97The choice of approximation order /p/ for a given /n/ is more difficult.
98If your signal is believed to be a polynomial,
99it does not make sense to set /p/ to larger than the polynomial degree-
100though it may be sensible to take /p/ less than this.
101
102For a sinusoidal signal contaminated with AWGN, we ran a few tests showing that for SNR = 1,
103p = n/8 gave the best results,
104for SNR = 10, p = n/7 was the best, and for SNR = 100, p = n/6 was the most reasonable choice.
105For SNR = 0.1, the method appears to be useless.
106The user is urged to use these results with caution: they have no theoretical backing and are extrapolated from a single case.
107
108The filters are (regrettably) computed at runtime-the vast number of combinations of approximation order and filter length makes the number of filters that must be stored excessive for compile-time data.
109The constructor call computes the filters.
110Since each filter has length /2n+1/ and there are /n/ filters, whose element each consist of /p/ summands,
111the complexity of the constructor call is O(/n/[super 2]/p/).
112This is not cheap-though for most cases small /p/ and /n/ not too large (< 20) is desired.
113However, for concreteness, on the author's 2.7GHz Intel laptop CPU, the /n=16/, /p=3/ filter takes 9 microseconds to compute.
114This is far from negligible, and as such the filters may be used with multiple data, and even shared between threads:
115
116
117    std::vector<double> v1(500);
118    std::vector<double> v2(500);
119    // fill v1, v2 with noisy data.
120    auto lanczos = lanczos_derivative(spacing);
121    std::vector<double> dv1dt = lanczos(v1); // threadsafe
122    std::vector<double> dv2dt = lanczos(v2); // threadsafe
123    // need to use a different spacing?
124    lanczos.reset_spacing(0.02); // not threadsafe
125
126
127The implementation follows [@https://doi.org/10.1080/00207160.2012.666348 McDevitt, 2012],
128who vastly expanded the ideas of Lanczos to create a very general framework for numerically differentiating noisy equispaced data.
129
130[heading Example]
131
132We have extracted some data from the [@https://www.gw-openscience.org/data/ LIGO signal] and differentiated it
133using the (/n/, /p/) = (60, 4) Lanczos smoothing derivative, as well as using the (/n/, /p/) = (4, 8) (nonsmoothing) derivative.
134
135[graph ligo_derivative]
136
137The original data is in orange, the smoothing derivative in blue, and the non-smoothing standard finite difference formula is in gray.
138(Each time series has been rescaled to fit in the same graph.)
139We can see that the smoothing derivative tracks the increase and decrease in the trend well, whereas the standard finite difference formula produces nonsense and amplifies noise.
140
141[heading Caveats]
142
143The computation of the filters is ill-conditioned for large /p/.
144In order to mitigate this problem, we have computed the filters in higher precision and cast the results to the desired type.
145However, this simply pushes the problem to larger /p/.
146In practice, this is not a problem, as large /p/ corresponds to less powerful denoising, but keep it in mind.
147
148In addition, the `-ffast-math` flag has a very large effect on the speed of these functions.
149In our benchmarks, they were 50% faster with the flag enabled, which is much larger than the usual performance increases we see by turning on this flag.
150Hence, if the default performance is not sufficient, this flag is available, though it comes with its own problems.
151
152This requires C++17 `if constexpr`.
153
154[heading References]
155
156* Corless, Robert M., and Nicolas Fillion. ['A graduate introduction to numerical methods.] AMC 10 (2013): 12.
157
158* Lanczos, Cornelius. ['Applied analysis.] Courier Corporation, 1988.
159
160* Timothy J. McDevitt (2012): ['Discrete Lanczos derivatives of noisy data], International Journal of Computer Mathematics, 89:7, 916-931
161
162
163[endsect]
164