1[/ 2Copyright (c) 2019 Nick Thompson 3Use, modification and distribution are subject to the 4Boost Software License, Version 1.0. (See accompanying file 5LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) 6] 7 8[section:diff Lanczos Smoothing Derivatives] 9 10[heading Synopsis] 11 12`` 13#include <boost/math/differentiation/lanczos_smoothing.hpp> 14 15namespace boost::math::differentiation { 16 17 template <class Real, size_t order=1> 18 class discrete_lanczos_derivative { 19 public: 20 discrete_lanczos_derivative(Real spacing, 21 size_t n = 18, 22 size_t approximation_order = 3); 23 24 template<class RandomAccessContainer> 25 Real operator()(RandomAccessContainer const & v, size_t i) const; 26 27 template<class RandomAccessContainer> 28 void operator()(RandomAccessContainer const & v, RandomAccessContainer & dvdt) const; 29 30 template<class RandomAccessContainer> 31 RandomAccessContainer operator()(RandomAccessContainer const & v) const; 32 33 Real get_spacing() const; 34 }; 35 36} // namespaces 37`` 38 39[heading Description] 40 41The `discrete_lanczos_derivative` class calculates a finite-difference approximation to the derivative of a noisy sequence of equally-spaced values /v/. 42A basic usage is 43 44 std::vector<double> v(500); 45 // fill v with noisy data. 46 double spacing = 0.001; 47 using boost::math::differentiation::discrete_lanczos_derivative; 48 auto lanczos = discrete_lanczos_derivative(spacing); 49 // Compute dvdt at index 30: 50 double dvdt30 = lanczos(v, 30); 51 // Compute derivative of entire vector: 52 std::vector<double> dvdt = lanczos(v); 53 54Noise-suppressing second derivatives can also be computed. 55The syntax is as follows: 56 57 std::vector<double> v(500); 58 // fill v with noisy data. 59 auto lanczos = lanczos_derivative<double, 2>(spacing); 60 // evaluate second derivative at a point: 61 double d2vdt2 = lanczos(v, 18); 62 // evaluate second derivative of entire vector: 63 std::vector<double> d2vdt2 = lanczos(v); 64 65For memory conscious programmers, you can reuse the memory space for the derivative as follows: 66 67 std::vector<double> v(500); 68 std::vector<double> dvdt(500); 69 // . . . define spacing, create and instance of discrete_lanczos_derivative 70 71 // populate dvdt, perhaps in a loop: 72 lanczos(v, dvdt); 73 74 75If the data has variance \u03C3[super 2], 76then the variance of the computed derivative is roughly \u03C3[super 2]/p/[super 3] /n/[super -3] \u0394 /t/[super -2], 77i.e., it increases cubically with the approximation order /p/, linearly with the data variance, 78and decreases at the cube of the filter length /n/. 79In addition, we must not forget the discretization error which is /O/(\u0394 /t/[super /p/]). 80You can play around with the approximation order /p/ and the filter length /n/: 81 82 size_t n = 12; 83 size_t p = 2; 84 auto lanczos = lanczos_derivative(spacing, n, p); 85 double dvdt = lanczos(v, i); 86 87If /p=2n/, then the discrete Lanczos derivative is not smoothing: 88It reduces to the standard /2n+1/-point finite-difference formula. 89For /p>2n/, an assertion is hit as the filter is undefined. 90 91In our tests with AWGN, we have found the error decreases monotonically with /n/, 92as is expected from the theory discussed above. 93So the choice of /n/ is simple: 94As high as possible given your speed requirements (larger /n/ implies a longer filter and hence more compute), 95balanced against the danger of overfitting and averaging over non-stationarity. 96 97The choice of approximation order /p/ for a given /n/ is more difficult. 98If your signal is believed to be a polynomial, 99it does not make sense to set /p/ to larger than the polynomial degree- 100though it may be sensible to take /p/ less than this. 101 102For a sinusoidal signal contaminated with AWGN, we ran a few tests showing that for SNR = 1, 103p = n/8 gave the best results, 104for SNR = 10, p = n/7 was the best, and for SNR = 100, p = n/6 was the most reasonable choice. 105For SNR = 0.1, the method appears to be useless. 106The user is urged to use these results with caution: they have no theoretical backing and are extrapolated from a single case. 107 108The filters are (regrettably) computed at runtime-the vast number of combinations of approximation order and filter length makes the number of filters that must be stored excessive for compile-time data. 109The constructor call computes the filters. 110Since each filter has length /2n+1/ and there are /n/ filters, whose element each consist of /p/ summands, 111the complexity of the constructor call is O(/n/[super 2]/p/). 112This is not cheap-though for most cases small /p/ and /n/ not too large (< 20) is desired. 113However, for concreteness, on the author's 2.7GHz Intel laptop CPU, the /n=16/, /p=3/ filter takes 9 microseconds to compute. 114This is far from negligible, and as such the filters may be used with multiple data, and even shared between threads: 115 116 117 std::vector<double> v1(500); 118 std::vector<double> v2(500); 119 // fill v1, v2 with noisy data. 120 auto lanczos = lanczos_derivative(spacing); 121 std::vector<double> dv1dt = lanczos(v1); // threadsafe 122 std::vector<double> dv2dt = lanczos(v2); // threadsafe 123 // need to use a different spacing? 124 lanczos.reset_spacing(0.02); // not threadsafe 125 126 127The implementation follows [@https://doi.org/10.1080/00207160.2012.666348 McDevitt, 2012], 128who vastly expanded the ideas of Lanczos to create a very general framework for numerically differentiating noisy equispaced data. 129 130[heading Example] 131 132We have extracted some data from the [@https://www.gw-openscience.org/data/ LIGO signal] and differentiated it 133using the (/n/, /p/) = (60, 4) Lanczos smoothing derivative, as well as using the (/n/, /p/) = (4, 8) (nonsmoothing) derivative. 134 135[graph ligo_derivative] 136 137The original data is in orange, the smoothing derivative in blue, and the non-smoothing standard finite difference formula is in gray. 138(Each time series has been rescaled to fit in the same graph.) 139We can see that the smoothing derivative tracks the increase and decrease in the trend well, whereas the standard finite difference formula produces nonsense and amplifies noise. 140 141[heading Caveats] 142 143The computation of the filters is ill-conditioned for large /p/. 144In order to mitigate this problem, we have computed the filters in higher precision and cast the results to the desired type. 145However, this simply pushes the problem to larger /p/. 146In practice, this is not a problem, as large /p/ corresponds to less powerful denoising, but keep it in mind. 147 148In addition, the `-ffast-math` flag has a very large effect on the speed of these functions. 149In our benchmarks, they were 50% faster with the flag enabled, which is much larger than the usual performance increases we see by turning on this flag. 150Hence, if the default performance is not sufficient, this flag is available, though it comes with its own problems. 151 152This requires C++17 `if constexpr`. 153 154[heading References] 155 156* Corless, Robert M., and Nicolas Fillion. ['A graduate introduction to numerical methods.] AMC 10 (2013): 12. 157 158* Lanczos, Cornelius. ['Applied analysis.] Courier Corporation, 1988. 159 160* Timothy J. McDevitt (2012): ['Discrete Lanczos derivatives of noisy data], International Journal of Computer Mathematics, 89:7, 916-931 161 162 163[endsect] 164