• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1[/
2  Copyright 2018 Nick Thompson
3
4  Distributed under the Boost Software License, Version 1.0.
5  (See accompanying file LICENSE_1_0.txt or copy at
6  http://www.boost.org/LICENSE_1_0.txt).
7]
8
9[section:signal_statistics Signal Statistics]
10
11[heading Synopsis]
12
13``
14#include <boost/math/statistics/signal_statistics.hpp>
15
16namespace boost::math::statistics {
17
18    template<class Container>
19    auto absolute_gini_coefficient(Container & c);
20
21    template<class ForwardIterator>
22    auto absolute_gini_coefficient(ForwardIterator first, ForwardIterator last);
23
24    template<class Container>
25    auto sample_absolute_gini_coefficient(Container & c);
26
27    template<class ForwardIterator>
28    auto sample_absolute_gini_coefficient(ForwardIterator first, ForwardIterator last);
29
30    template<class Container>
31    auto hoyer_sparsity(Container const & c);
32
33    template<class ForwardIterator>
34    auto hoyer_sparsity(ForwardIterator first, ForwardIterator last);
35
36    template<class Container>
37    auto oracle_snr(Container const & signal, Container const & noisy_signal);
38
39    template<class Container>
40    auto oracle_snr_db(Container const & signal, Container const & noisy_signal);
41
42    template<class ForwardIterator>
43    auto m2m4_snr_estimator(ForwardIterator first, ForwardIterator last, decltype(*first) estimated_signal_kurtosis=1, decltype(*first) estimated_noise_kurtosis=3);
44
45    template<class Container>
46    auto m2m4_snr_estimator(Container const & noisy_signal, typename Container::value_type estimated_signal_kurtosis=1, typename Container::value_type estimate_noise_kurtosis=3);
47
48    template<class ForwardIterator>
49    auto m2m4_snr_estimator_db(ForwardIterator first, ForwardIterator last, decltype(*first) estimated_signal_kurtosis=1, decltype(*first) estimated_noise_kurtosis=3);
50
51    template<class Container>
52    auto m2m4_snr_estimator_db(Container const & noisy_signal,typename Container::value_type estimated_signal_kurtosis=1, typename Container::value_type estimate_noise_kurtosis=3);
53
54}
55``
56
57[heading Description]
58
59The file `boost/math/statistics/signal_statistics.hpp` is a set of facilities for computing quantities commonly used in signal analysis.
60
61Our examples use `std::vector<double>` to hold the data, but this not required.
62In general, you can store your data in an Eigen array, and Armadillo vector, `std::array`, and for many of the routines, a `std::forward_list`.
63These routines are usable in float, double, long double, and Boost.Multiprecision precision, as well as their complex extensions whenever the computation is well-defined.
64
65
66[heading Absolute Gini Coefficient]
67
68The Gini coefficient, first used to measure wealth inequality, is also one of the best measures of the sparsity of an expansion in a basis.
69A sparse expansion has most of its norm concentrated in just a few coefficients, making the connection with wealth inequality obvious.
70See [@https://arxiv.org/pdf/0811.4706.pdf Hurley and Rickard] for details.
71However, for measuring sparsity, the phase of the numbers is irrelevant, so we provide the `absolute_gini_coefficient`:
72
73    using boost::math::statistics::sample_absolute_gini_coefficient;
74    using boost::math::statistics::absolute_gini_coefficient;
75    std::vector<std::complex<double>> v{{0,1}, {0,0}, {0,0}, {0,0}};
76    double abs_gini = sample_absolute_gini_coefficient(v);
77    // now abs_gini = 1; maximally unequal
78
79    std::vector<std::complex<double>> w{{0,1}, {1,0}, {0,-1}, {-1,0}};
80    abs_gini = absolute_gini_coefficient(w);
81    // now abs_gini = 0; every element of the vector has equal magnitude
82
83    std::vector<double> u{-1, 1, -1};
84    abs_gini = absolute_gini_coefficient(u);
85    // now abs_gini = 0
86    // Alternative call useful for computing over subset of the input:
87    abs_gini = absolute_gini_coefficient(u.begin(), u.begin() + 1);
88
89
90The sample Gini coefficient returns unity for a vector which has only one nonzero coefficient.
91The population Gini coefficient of a vector with one non-zero element is dependent on the length of the input.
92
93The sample Gini coefficient lacks one desirable property of the population Gini coefficient,
94namely that "cloning" a vector has the same Gini coefficient; though cloning holds to very high accuracy with the sample Gini coefficient and can easily be recovered by a rescaling.
95
96If sorting the input data is too much expense for a sparsity measure (is it going to be perfect anyway?),
97consider calculating the Hoyer sparsity instead.
98
99[heading Hoyer Sparsity]
100
101The Hoyer sparsity measures a normalized ratio of the \u2113[super 1] and \u2113[super 2] norms.
102As the name suggests, it is used to measure the sparsity of an expansion in some basis.
103
104The Hoyer sparsity computes ([radic]/N/ - \u2113[super 1](v)/\u2113[super 2](v))/([radic]N -1).
105For details, see [@http://www.jmlr.org/papers/volume5/hoyer04a/hoyer04a.pdf Hoyer] as well as [@https://arxiv.org/pdf/0811.4706.pdf Hurley and Rickard].
106
107A few special cases will serve to clarify the intended use:
108If /v/ has only one nonzero coefficient, the Hoyer sparsity attains its maxima of 1.
109If the coefficients of /v/ all have the same magnitude, then the Hoyer sparsity attains its minima of zero.
110If the elements of /v/ are uniformly distributed on an interval [0, /b/], then the Hoyer sparsity is approximately 0.133.
111
112
113Usage:
114
115    std::vector<Real> v{1,0,0};
116    Real hs = boost::math::statistics::hoyer_sparsity(v);
117    // hs = 1
118    std::vector<Real> v{1,-1,1};
119    Real hs = boost::math::statistics::hoyer_sparsity(v.begin(), v.end());
120    // hs = 0
121
122The container must be forward iterable and the contents are not modified.
123Accepts real, complex, and integer inputs.
124If the input is an integral type, the output is a double precision float.
125
126
127[heading Oracle Signal-to-noise ratio]
128
129The function `oracle_snr` computes the ratio \u2016 /s/ \u2016[sub 2][super 2] / \u2016 /s/ - /x/ \u2016[sub 2][super 2], where /s/ is signal and /x/ is a noisy signal.
130The function `oracle_snr_db` computes 10`log`[sub 10](\u2016 /s/ \u2016[super 2] / \u2016 /s/ - /x/ \u2016[super 2]).
131The functions are so named because in general, one does not know how to decompose a real signal /x/ into /s/ + /w/ and as such /s/ is regarded as oracle information.
132Hence this function is mainly useful for unit testing other SNR estimators.
133
134Usage:
135
136    std::vector<double> signal(500, 3.2);
137    std::vector<double> noisy_signal(500);
138    // fill 'noisy_signal' signal + noise
139    double snr_db = boost::math::statistics::oracle_snr_db(signal, noisy_signal);
140    double snr = boost::math::statistics::oracle_snr(signal, noisy_signal);
141
142The input can be real, complex, or integral.
143Integral inputs produce double precision floating point outputs.
144The input data is not modified and must satisfy the requirements of a `RandomAccessContainer`.
145
146[heading /M/[sub 2]/M/[sub 4] SNR Estimation]
147
148Estimates the SNR of a noisy signal via the /M/[sub 2]/M/[sub 4] method.
149See [@https://doi.org/10.1109/26.871393  Pauluzzi and N.C. Beaulieu] and [@https://doi.org/10.1109/ISIT.1994.394869 Matzner and Englberger] for details.
150
151    std::vector<double> noisy_signal(512);
152    // fill noisy_signal with data contaminated by Gaussian white noise:
153    double est_snr_db = boost::math::statistics::m2m4_snr_estimator_db(noisy_signal);
154
155The /M/[sub 2]/M/[sub 4] SNR estimator is an "in-service" estimator, meaning that the estimate is made using the noisy, data-bearing signal, and does not require a background estimate.
156This estimator has been found to be work best between roughly -3 and 15db, tending to overestimate the noise below -3db, and underestimate the noise above 15db.
157See [@https://www.mdpi.com/2078-2489/8/3/75/pdf Xue et al] for details.
158
159The /M/[sub 2]/M/[sub 4] SNR estimator, by default, assumes that the kurtosis of the signal is 1 and the kurtosis of the noise is 3, the latter corresponding to Gaussian noise.
160These parameters, however, can be overridden:
161
162    std::vector<double> noisy_signal(512);
163    // fill noisy_signal with the data:
164    double signal_kurtosis = 1.5;
165    // Noise is assumed to follow Laplace distribution, which has kurtosis of 6:
166    double noise_kurtosis = 6;
167    double est_snr = boost::math::statistics::m2m4_snr_estimator_db(noisy_signal, signal_kurtosis, noise_kurtosis);
168
169Now, technically the method is a "blind SNR estimator", meaning that the no /a-priori/ information about the signal is required to use the method.
170However, the performance of the method is /vastly/ better if you can come up with a better estimate of the signal and noise kurtosis.
171How can we do this? Suppose we know that the SNR is much greater than 1.
172Then we can estimate the signal kurtosis simply by using the noisy signal kurtosis.
173If the SNR is much less than one, this method breaks down as the noisy signal kurtosis will tend to the noise kurtosis-though in this limit we have an excellent estimator of the noise kurtosis!
174In addition, if you have a model of what your signal should look like, you can precompute the signal kurtosis.
175For example, sinusoids have a kurtosis of 1.5.
176See [@http://www.jcomputers.us/vol8/jcp0808-21.pdf here] for a study which uses estimates of this sort to improve the performance of the /M/[sub 2]/M/[sub 4] estimator.
177
178
179/Nota bene/: The traditional definition of SNR is /not/ mean invariant.
180By this we mean that if a constant is added to every sample of a signal, the SNR is changed.
181For example, adding DC bias to a signal changes its SNR.
182For most use cases, this is really not what you intend; for example a signal consisting of zeros plus Gaussian noise has an SNR of zero,
183whereas a signal with a constant DC bias and random Gaussian noise might have a very large SNR.
184
185The /M/[sub 2]/M/[sub 4] SNR estimator is computed from mean-invariant quantities,
186and hence it should really be compared to the mean-invariant SNR.
187
188/Nota bene/: This computation requires the solution of a system of quadratic equations involving the noise kurtosis, the signal kurtosis, and the second and fourth moments of the data.
189There is no guarantee that a solution of this system exists for all value of these parameters, in fact nonexistence can easily be demonstrated for certain data.
190If there is no solution to the system, then failure is communicated by returning NaNs.
191This happens distressingly often; if a user is aware of any blind SNR estimators which do not suffer from this drawback, please open a github ticket and let us know.
192
193The author has not managed to fully characterize the conditions under which a real solution with /S > 0/ and /N >0/ exists.
194However, a very intuitive example demonstrates why nonexistence can occur.
195Suppose the signal and noise kurtosis are equal.
196Then the method has no way to distinguish between the signal and the noise, and the solution is non-unique.
197
198
199[heading References]
200
201* Mallat, Stephane. ['A wavelet tour of signal processing: the sparse way.] Academic press, 2008.
202* Hurley, Niall, and Scott Rickard. ['Comparing measures of sparsity.] IEEE Transactions on Information Theory 55.10 (2009): 4723-4741.
203* Jensen, Arne, and Anders la Cour-Harbo. ['Ripples in mathematics: the discrete wavelet transform.] Springer Science & Business Media, 2001.
204* D. R. Pauluzzi and N. C. Beaulieu, ['A comparison of SNR estimation techniques for the AWGN channel,] IEEE Trans. Communications, Vol. 48, No. 10, pp. 1681-1691, 2000.
205* Hoyer, Patrik O. ['Non-negative matrix factorization with sparseness constraints.], Journal of machine learning research 5.Nov (2004): 1457-1469.
206
207[endsect]
208[/section:signal_statistics Signal Statistics]
209