1 2 3Bignum support (Fast Number Theoretic Transform or FNT): 4======================================================== 5 6Bignum arithmetic in libmpdec uses the scheme for fast convolution 7of integer sequences from: 8 9J. M. Pollard: The fast Fourier transform in a finite field 10http://www.ams.org/journals/mcom/1971-25-114/S0025-5718-1971-0301966-0/home.html 11 12 13The transform in a finite field can be used for convolution in the same 14way as the Fourier Transform. The main advantages of the Number Theoretic 15Transform are that it is both exact and very memory efficient. 16 17 18Convolution in pseudo-code: 19~~~~~~~~~~~~~~~~~~~~~~~~~~~ 20 21 fnt_convolute(a, b): 22 x = fnt(a) # forward transform of a 23 y = fnt(b) # forward transform of b 24 z = pairwise multiply x[i] and y[i] 25 result = inv_fnt(z) # backward transform of z. 26 27 28Extending the maximum transform length (Chinese Remainder Theorem): 29------------------------------------------------------------------- 30 31The maximum transform length is quite limited when using a single 32prime field. However, it is possible to use multiple primes and 33recover the result using the Chinese Remainder Theorem. 34 35 36Multiplication in pseudo-code: 37~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 38 39 _mpd_fntmul(u, v): 40 c1 = fnt_convolute(u, v, P1) # convolute modulo prime1 41 c2 = fnt_convolute(u, v, P2) # convolute modulo prime2 42 c3 = fnt_convolute(u, v, P3) # convolute modulo prime3 43 result = crt3(c1, c2, c3) # Chinese Remainder Theorem 44 45 46Optimized transform functions: 47------------------------------ 48 49There are three different fnt() functions: 50 51 std_fnt: "standard" decimation in frequency transform for array lengths 52 of 2**n. Performs well up to 1024 words. 53 54 sixstep: Cache-friendly algorithm for array lengths of 2**n. Outperforms 55 std_fnt for large arrays. 56 57 fourstep: Algorithm for array lengths of 3 * 2**n. Also cache friendly 58 in large parts. 59 60 61List of bignum-only files: 62-------------------------- 63 64Functions from these files are only used in _mpd_fntmul(). 65 66 umodarith.h -> fast low level routines for unsigned modular arithmetic 67 numbertheory.c -> routines for setting up the FNT 68 difradix2.c -> decimation in frequency transform, used as the 69 "base case" by the following three files: 70 71 fnt.c -> standard transform for smaller arrays 72 sixstep.c -> transform large arrays of length 2**n 73 fourstep.c -> transform arrays of length 3 * 2**n 74 75 convolute.c -> do the actual fast convolution, using one of 76 the three transform functions. 77 transpose.c -> transpositions needed for the sixstep algorithm. 78 crt.c -> Chinese Remainder Theorem: use information from three 79 transforms modulo three different primes to get the 80 final result. 81 82 83 84