[/
  Copyright 2018 Nick Thompson

  Distributed under the Boost Software License, Version 1.0.
  (See accompanying file LICENSE_1_0.txt or copy at
  http://www.boost.org/LICENSE_1_0.txt).
]

[section:bivariate_statistics Bivariate Statistics]

[heading Synopsis]

``
#include <boost/math/statistics/bivariate_statistics.hpp>

namespace boost{ namespace math{ namespace statistics {

    template<class Container>
    auto covariance(Container const & u, Container const & v);

    template<class Container>
    auto means_and_covariance(Container const & u, Container const & v);

    template<class Container>
    auto correlation_coefficient(Container const & u, Container const & v);

}}}
``

[heading Description]

This file provides functions for computing bivariate statistics.

[heading Covariance]

Computes the population covariance of two datasets:

    std::vector<double> u{1,2,3,4,5};
    std::vector<double> v{1,2,3,4,5};
    double cov_uv = boost::math::statistics::covariance(u, v);

The implementation follows [@https://doi.org/10.1109/CLUSTR.2009.5289161 Bennett et al.].
The data is not modified. Requires a random-access container.
Inputs must be real-valued; complex-valued inputs are not supported.

The algorithm used herein simultaneously generates the mean values of the input data /u/ and /v/.
For certain applications, it might be useful to get them in a single pass through the data.
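
To make the single-pass idea concrete, the following is a minimal sketch of a textbook running update for the two means and the co-moment. It is an illustration only, not the library's implementation: the function name `single_pass_means_and_covariance` is hypothetical, the inputs are assumed to be `std::vector<double>` of equal, nonzero length, and the update shown is the standard one-sample recurrence rather than anything taken from the Boost source.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical helper, not part of Boost.Math: returns (mean of u, mean of v,
// population covariance) from a single pass over both datasets.
std::tuple<double, double, double>
single_pass_means_and_covariance(std::vector<double> const & u,
                                 std::vector<double> const & v)
{
    // Assumed preconditions for this sketch: equal, nonzero lengths.
    assert(u.size() == v.size());
    assert(!u.empty());

    double mu_u = 0;      // running mean of u
    double mu_v = 0;      // running mean of v
    double comoment = 0;  // running sum of (u_i - mean_u)*(v_i - mean_v)

    for (std::size_t i = 0; i < u.size(); ++i)
    {
        double const n = static_cast<double>(i + 1);
        double const du = u[i] - mu_u;  // deviation from the old mean of u
        mu_u += du / n;
        mu_v += (v[i] - mu_v) / n;
        // Pair the old-mean deviation of u with the new-mean deviation of v.
        comoment += du * (v[i] - mu_v);
    }

    // Dividing by n (not n - 1) gives the population covariance.
    return std::make_tuple(mu_u, mu_v, comoment / static_cast<double>(u.size()));
}
```

In this sketch the means are by-products of the same loop that accumulates the co-moment, so returning them alongside the covariance costs no additional pass over the data.
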
As such, we provide `means_and_covariance`:

    std::vector<double> u{1,2,3,4,5};
    std::vector<double> v{1,2,3,4,5};
    auto [mu_u, mu_v, cov_uv] = boost::math::statistics::means_and_covariance(u, v);

[heading Correlation Coefficient]

Computes the [@https://en.wikipedia.org/wiki/Pearson_correlation_coefficient Pearson correlation coefficient] of two datasets /u/ and /v/:

    std::vector<double> u{1,2,3,4,5};
    std::vector<double> v{1,2,3,4,5};
    double rho_uv = boost::math::statistics::correlation_coefficient(u, v);
    // rho_uv = 1.

The containers must be random access, and the data cannot be complex.

If one or both of the datasets is constant, the correlation coefficient is an indeterminate form (0/0), and a convention must be introduced to assign it a value.
We use the following: If both datasets are constant, then the correlation coefficient is 1.
If one dataset is constant and the other is not, then the correlation coefficient is zero.

[heading References]

* Bennett, Janine, et al. ['Numerically stable, single-pass, parallel statistics algorithms.] Cluster Computing and Workshops, 2009. CLUSTER'09. IEEE International Conference on. IEEE, 2009.

[endsect]
[/section:bivariate_statistics Bivariate Statistics]