[section:lanczos The Lanczos Approximation]

[h4 Motivation]

['Why base gamma and gamma-like functions on the Lanczos approximation?]

First of all I should make clear that for the gamma function
over real numbers (as opposed to complex ones)
the Lanczos approximation (see [@http://en.wikipedia.org/wiki/Lanczos_approximation Wikipedia] or
[@http://mathworld.wolfram.com/LanczosApproximation.html Mathworld])
appears to offer no clear advantage over more traditional methods such as
[@http://en.wikipedia.org/wiki/Stirling_approximation Stirling's approximation].
__pugh carried out an extensive comparison of the various methods available
and discovered that they were all very similar in terms of complexity
and relative error. However, the Lanczos approximation does have a couple of
properties that make it worthy of further consideration:

* The approximation has an easy-to-compute truncation error that holds for
all /z > 0/. In practice that means we can use the same approximation for all
/z > 0/, and be certain that no matter how large or small /z/ is, the truncation
error will /at worst/ be bounded by some finite value.
* The approximation has a form that is particularly amenable to analytic
manipulation: in particular, ratios of gamma or gamma-like functions
are easy to compute without resorting to logarithms.

It is the combination of these two properties that makes the approximation
attractive: Stirling's approximation is highly accurate for large /z/, and
has some of the same analytic properties as the Lanczos approximation, but
can't easily be used across the whole range of /z/.
As the simplest example, consider the ratio of two gamma functions: one could
compute the result via lgamma:

   exp(lgamma(a) - lgamma(b));

However, even if lgamma is uniformly accurate to 0.5ulp, the worst case
relative error in the above can easily be shown to be:

   Erel > a * log(a)/2 + b * log(b)/2

For small /a/ and /b/ that's not a problem, but to put the relationship another
way: ['each time a and b increase in magnitude by a factor of 10, at least one
decimal digit of precision will be lost.]

In contrast, by analytically combining like power
terms in a ratio of Lanczos approximations, these errors can be virtually eliminated
for small /a/ and /b/, and kept under control for very large (or very small,
for that matter) /a/ and /b/. Of course, computing large powers is itself a
notoriously hard problem, but even so, analytic combinations of Lanczos
approximations can make the difference between obtaining a valid result, or
simply garbage. Refer to the implementation notes for the __beta function for
an example of this method in practice. The incomplete
[link math_toolkit.sf_gamma.igamma gamma_p gamma] and
[link math_toolkit.sf_beta.ibeta_function beta] functions
use similar analytic combinations of power terms to combine gamma and beta
functions divided by large powers into single (simpler) expressions.
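The naive approach above can be sketched directly (`gamma_ratio_naive` is a
hypothetical helper written for illustration, not a Boost.Math function):

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: the naive ratio of two gamma functions computed
// via lgamma, exactly as in the expression above.  Fine for small
// arguments, but each tenfold increase in a and b costs at least one
// decimal digit of precision, per the error bound Erel above.
double gamma_ratio_naive(double a, double b)
{
    return std::exp(std::lgamma(a) - std::lgamma(b));
}
```

For example, `gamma_ratio_naive(5, 3)` recovers [Gamma](5)\/[Gamma](3) = 4!\/2! = 12
essentially exactly, but for a = b = 100000 the bound above is roughly 10[super 6]
ulp, or about 10[super -10] relative error in double precision: six of the
sixteen available decimal digits are already gone.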
[h4 The Approximation]

The Lanczos Approximation to the Gamma Function is given by:

[equation lanczos0]

where S[sub g](z) is an infinite sum that is convergent for all /z > 0/,
and /g/ is an arbitrary parameter that controls the "shape" of the
terms in the sum, which is given by:

[equation lanczos0a]

with individual coefficients defined in closed form by:

[equation lanczos0b]

However, evaluation of the sum in that form can lead to numerical instability
in the computation of the ratios of rising and falling factorials (effectively
we're multiplying by a series of numbers very close to 1, so roundoff errors
can accumulate quite rapidly).

The Lanczos approximation is therefore often written in partial fraction form
with the leading constants absorbed by the coefficients in the sum:

[equation lanczos1]

where:

[equation lanczos2]

Again, the parameter /g/ is an arbitrarily chosen constant, and /N/ is an arbitrarily
chosen number of terms to evaluate in the "Lanczos sum" part.

[note
Some authors
choose to define the sum from k=1 to N, and hence end up with N+1 coefficients.
This happens to confuse both the following discussion and the code (since C++
deals with half-open array ranges, rather than the closed range of the sum).
This convention is consistent with __godfrey, but not __pugh, so take care
when referring to the literature in this field.]

[h4 Computing the Coefficients]

The coefficients C[sub 0]..C[sub N-1] need to be computed from /N/ and /g/
at high precision, and then stored as part of the program.
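As a concrete (if non-production) sketch of the partial fraction form, here is
the gamma function evaluated with the widely published g = 7, N = 9 coefficient
set reproduced in the Wikipedia article linked above. Boost.Math does /not/ use
these coefficients; its own values of /N/ and /g/ are chosen as described in
the rest of this section, and the function name here is purely illustrative.

```cpp
#include <cassert>
#include <cmath>

// Illustrative Lanczos evaluation of Gamma(z) using the widely
// published g = 7, N = 9 coefficient set (not the Boost.Math set).
double lanczos_gamma(double z)
{
    static const double pi = 3.14159265358979323846;
    static const double g  = 7.0;
    static const double c[9] = {
         0.99999999999980993,
         676.5203681218851,
        -1259.1392167224028,
         771.32342877765313,
        -176.61502916214059,
         12.507343278686905,
        -0.13857109526572012,
         9.9843695780195716e-6,
         1.5056327351493116e-7
    };
    if (z < 0.5)  // reflection formula: Gamma(z)Gamma(1-z) = pi/sin(pi z)
        return pi / (std::sin(pi * z) * lanczos_gamma(1.0 - z));
    z -= 1.0;
    double sum = c[0];               // the partial-fraction "Lanczos sum"
    for (int k = 1; k < 9; ++k)
        sum += c[k] / (z + k);
    double t = z + g + 0.5;
    return std::sqrt(2.0 * pi) * std::pow(t, z + 0.5) * std::exp(-t) * sum;
}
```

With these coefficients the result is accurate to roughly 13-15 significant
digits over much of the range, e.g. `lanczos_gamma(5.0)` returns 24 to within
a few ulp.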
Calculation of the coefficients is performed via the method of __godfrey;
let the constants be contained in a column vector P, then:

P = D B C F

where B is an NxN matrix:

[equation lanczos4]

D is an NxN matrix:

[equation lanczos3]

C is an NxN matrix:

[equation lanczos5]

and F is an N-element column vector:

[equation lanczos6]

Note that the matrices B, D and C contain all-integer terms and depend
only on /N/: their product should be computed first, and then multiplied
by /F/ as the last step.

[h4 Choosing the Right Parameters]

The trick is to choose
/N/ and /g/ to give the desired level of accuracy: choosing a small value for
/g/ leads to a strictly convergent series, but one which converges only slowly.
Choosing a larger value of /g/ causes the terms in the series to be large
and\/or divergent for about the first /g-1/ terms, and to then suddenly converge
with a "crunch".

__pugh has determined the optimal
value of /g/ for /N/ in the range /1 <= N <= 60/: unfortunately in practice choosing
these values leads to cancellation errors in the Lanczos sum, as the largest
term in the (alternating) series is approximately 1000 times larger than the result.
These optimal values appear not to be useful in practice unless the evaluation
can be done with a number of guard digits /and/ the coefficients are stored
at higher precision than that desired in the result. These values are best
reserved for, say, computing to float precision with double precision arithmetic.
[table Optimal choices for N and g when computing with guard digits (source: Pugh)
[[Significand Size][N][g][Max Error]]
[[24][6][5.581][9.51e-12]]
[[53][13][13.144565][9.2213e-23]]
]

The alternative described by __godfrey is to perform an exhaustive
search of the /N/ and /g/ parameter space to determine the optimal combination for
a given /p/-digit floating-point type. Repeating this work found a good
approximation for double precision arithmetic (close to the one __godfrey found),
but failed to find really
good approximations for 80 or 128-bit long doubles. Further, it was observed
that the approximations obtained tended to be optimised for the small values
of z (1 < z < 200) used to test the implementation against the factorials.
Computing ratios of gamma functions with large arguments was observed to
suffer from error resulting from the truncation of the Lanczos series.

__pugh identified all the locations where the theoretical error of the
approximation was at a minimum, but unfortunately has published only the largest
of these minima. However, he makes the observation that the minima
coincide closely with the locations where the first neglected term (a[sub N]) in the
Lanczos series S[sub g](z) changes sign. These locations are quite easy to
locate, albeit at the cost of considerable computer time. These "sweet spots" need
only be computed once, tabulated, and then searched when required for an
approximation that delivers the required precision for some fixed precision
type.

Unfortunately, following this path failed to find a really good approximation
for 128-bit long doubles, and those found for 64 and 80-bit reals required an
excessive number of terms. There are two competing issues here: high precision
requires a large value of /g/, but avoiding cancellation errors in the evaluation
requires a small /g/.
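The cancellation problem is not specific to the Lanczos sum: any alternating
series whose largest term dwarfs the result behaves this way. A minimal,
self-contained illustration, using the Taylor series of exp(-x) as a stand-in
for the alternating Lanczos sum (function names are illustrative only):

```cpp
#include <cassert>
#include <cmath>

// Alternating Taylor series for exp(-x).  For x = 30 the largest term
// is ~8e11 while the true result is ~9e-14, so roundoff in the partial
// sums completely swamps the answer.
double exp_neg_alternating(double x)
{
    double sum = 1.0, term = 1.0;
    for (int n = 1; n < 200; ++n)
    {
        term *= -x / n;   // terms alternate in sign
        sum += term;
    }
    return sum;
}

// The same series rearranged with all-positive terms, computing
// 1/exp(x) instead: no cancellation, accurate to a few ulp.
double exp_neg_positive(double x)
{
    double sum = 1.0, term = 1.0;
    for (int n = 1; n < 200; ++n)
    {
        term *= x / n;    // all terms positive
        sum += term;
    }
    return 1.0 / sum;
}
```

In the Lanczos case the largest term is "only" about 1000 times the result, so
roughly three decimal digits are lost rather than all of them - but that is
already fatal when the goal is an approximation accurate to machine epsilon.
Summing positive terms sidesteps the problem entirely.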
At this point note that the Lanczos sum can be converted into rational form
(a ratio of two polynomials, obtained from the partial-fraction form using
polynomial arithmetic),
and doing so changes the coefficients so that /they are all positive/. That
means that the sum in rational form can be evaluated without cancellation
error, albeit with double the number of coefficients for a given /N/. Repeating
the search of the "sweet spots", this time evaluating the Lanczos sum in
rational form, and testing only those "sweet spots" whose theoretical error
is less than the machine epsilon for the type being tested, yielded good
approximations for all the types tested. The optimal values found were quite
close to the best cases reported by __pugh (just slightly larger /N/ and slightly
smaller /g/ for a given precision than __pugh reports), and even though converting
to rational form doubles the number of stored coefficients, it should be
noted that half of them are integers (and therefore require less storage space),
and the approximations require a smaller /N/ than would otherwise be required,
so fewer floating point operations may be required overall.

The following table shows the optimal values for /N/ and /g/ when computing
at fixed precision. These should be taken as work in progress: there are no
values for 106-bit significand machines (Darwin long doubles & NTL quad_float),
and further optimisation of the values of /g/ may be possible.
Errors given in the table
are estimates of the error due to truncation of the Lanczos infinite series
to /N/ terms. They are calculated from the sum of the first five neglected
terms - and are known to be rather pessimistic estimates - although it is noticeable
that the best combinations of /N/ and /g/ occurred when the estimated truncation error
almost exactly matches the machine epsilon for the type in question.
[table Optimal values for N and g when computing at fixed precision
[[Significand Size][Platform/Compiler Used][N][g][Max Truncation Error]]
[[24][Win32, VC++ 7.1][6][1.428456135094165802001953125][9.41e-007]]
[[53][Win32, VC++ 7.1][13][6.024680040776729583740234375][3.23e-016]]
[[64][Suse Linux 9 IA64, gcc-3.3.3][17][12.2252227365970611572265625][2.34e-024]]
[[116][HP Tru64 Unix 5.1B \/ Alpha, Compaq C++ V7.1-006][24][20.3209821879863739013671875][4.75e-035]]
]

Finally, note that the Lanczos approximation can be written as follows
by removing a factor of exp(g) from the denominator, and then dividing
all the coefficients by exp(g):

[equation lanczos7]

This form is more convenient for calculating lgamma, but for the gamma
function the division by /e/ turns a possibly exact quantity into an
inexact value: this reduces accuracy in the common case that
the input is exact, and so isn't used for the gamma function.

[h4 References]

# [#godfrey]Paul Godfrey, [@http://my.fit.edu/~gabdo/gamma.txt "A note on the computation of the convergent
Lanczos complex Gamma approximation"].
# [#pugh]Glendon Ralph Pugh,
[@http://bh0.physics.ubc.ca/People/matt/Doc/ThesesOthers/Phd/pugh.pdf
"An Analysis of the Lanczos Gamma Approximation"],
PhD Thesis, November 2004.
# Viktor T. Toth,
[@http://www.rskey.org/gamma.htm "Calculators and the Gamma Function"].
# Mathworld, [@http://mathworld.wolfram.com/LanczosApproximation.html
The Lanczos Approximation].

[endsect][/section:lanczos The Lanczos Approximation]

[/
  Copyright 2006 John Maddock and Paul A. Bristow.
  Distributed under the Boost Software License, Version 1.0.
  (See accompanying file LICENSE_1_0.txt or copy at
  http://www.boost.org/LICENSE_1_0.txt).
]