Copyright: (c) 2009-2013 by Apple Inc. All Rights Reserved.

math_brute_force test                                   Feb 24, 2009
=====================

Usage:

    Please run the executable with --help for usage information.



System Requirements:

    This test requires support for correctly rounded single and double precision arithmetic.
The current version also requires a reasonably accurate operating system math library to
be present. The OpenCL implementation must be able to compile kernels online. The test assumes
that the host system stores its floating point data according to the IEEE-754 binary single and
double precision floating point formats.


Test Completion Time:

    This test takes a while. Modern desktop systems can usually finish it in 1-3
days. Engineers doing OpenCL math library software development may find wimpy mode (-w)
a useful screen to quickly look for problems in a new implementation, before committing
to a lengthy test run. Likewise, it is possible to run just a range of tests, or specific
tests. See Usage above.


Test Design:

    This test is designed to do a somewhat exhaustive examination of the single
and double precision math library functions in OpenCL, for all vector lengths. Math
library functions are compared against results from a higher precision reference
function to determine correctness. All possible inputs are examined for unary
single precision functions. Other functions are tested against a table of difficult
values, followed by a few billion random values. If an error is found in a function,
the test for that function terminates early, reports an error, and moves on to the
next test, if any.

The test currently doesn't support half precision math functions covered in section
9 of the OpenCL 1.0 specification, but does cover the half_func functions covered in
section 6. It also doesn't test the native_<funcname> functions, for which any result
is conformant.
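
The exhaustive comparison described under Test Design can be illustrated with a
short sketch. It is written in Python for brevity and is not taken from the test
sources. All names in it (float_from_bits, round_to_single, ulp_error, scan_unary)
are hypothetical, and the stride parameter is only an illustrative way to shorten
a run, not a description of how wimpy mode is implemented. The sketch assumes that
both functions passed to scan_unary return NaN for invalid inputs rather than
raising, and that a double precision reference is accurate enough to judge a single
precision result. It covers only the bitwise-identical, ulp tolerance, and NaN
rules, and it does not model FTZ behavior or premature overflow and underflow.

```python
import math
import struct

FLT_MANT_DIG = 24    # significand bits of IEEE-754 binary32
FLT_MIN_EXP = -125   # as in C: the smallest normal float is 2**(FLT_MIN_EXP - 1)


def float_from_bits(bits):
    """Reinterpret a 32-bit pattern as a single precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def round_to_single(x):
    """Round a double to single precision; values that overflow become infinity."""
    try:
        return struct.unpack("<f", struct.pack("<f", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


def ulp_error(test, reference):
    """Error of `test`, in single precision ulps measured at `reference`."""
    if math.isnan(reference):
        # Any NaN is accepted when the reference result is a NaN.
        return 0.0 if math.isnan(test) else math.inf
    if math.isnan(test):
        return math.inf
    if test == reference:
        return 0.0  # also covers matching infinities and zeros of either sign
    if math.isinf(test) or math.isinf(reference):
        return math.inf
    exponent = FLT_MIN_EXP - 1  # floor used for zero and subnormal references
    if reference != 0.0:
        frac, exp2 = math.frexp(reference)  # reference == frac * 2**exp2
        log2_floor = exp2 - 1
        if abs(frac) == 0.5:
            log2_floor -= 1  # exact power of two: use the finer spacing below it
        exponent = max(exponent, log2_floor)
    ulp_exp = exponent - (FLT_MANT_DIG - 1)
    return math.ldexp(test - reference, -ulp_exp)


def scan_unary(func, reference_func, max_ulps, stride=1):
    """Return the first failing input bit pattern, or None if all inputs pass."""
    for bits in range(0, 1 << 32, stride):
        x = float_from_bits(bits)
        got = func(x)
        want = reference_func(x)
        if got == round_to_single(want):
            continue  # identical to the rounded reference result
        if abs(ulp_error(got, want)) > max_ulps:
            return bits  # stop at the first error found for this function
    return None
```

With stride=1 the loop visits every one of the 2**32 single precision bit patterns,
which is slow in Python. A call such as
scan_unary(f, math.atan, max_ulps=4.0, stride=1 << 16) checks 65536 evenly spaced
patterns instead, where f stands for whatever single precision function is under
examination.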

For the OpenCL 1.0 time frame, the reference library shall be the operating system
math library, as modified by the test itself to conform to the OpenCL specification.
That will help ensure that all devices on a particular operating system are returning
similar results. Going forward to future OpenCL releases, it is planned to gradually
introduce a reference math library directly into the test, so as to reduce
inter-platform variance between OpenCL implementations.

Generally speaking, this test will consider a result correct if it is one of the following:

    1) bitwise identical to the output of the reference function,
       rounded to the appropriate precision

    2) within the allowed ulp error tolerance of the infinitely precise
       result (as estimated by the reference function)

    3) If the reference result is a NaN, then any NaN is deemed correct.

    4) if the device is running in FTZ mode, then the result is also correct
       if the infinitely precise result (as estimated by the reference
       function) is subnormal, and the returned result is a zero

    5) if the device is running in FTZ mode, then we also calculate the
       estimate of the infinitely precise result with the reference function
       with subnormal inputs flushed to +- zero. If any of those results
       are within the error tolerance of the returned result, then it is
       deemed correct

    6) half_func functions may flush per 4&5 above, even if the device is not
       in FTZ mode.

    7) Functions are allowed to prematurely overflow to infinity, so long as
       the estimated infinitely precise result is within the stated ulp
       error limit of the maximum finite representable value of appropriate
       sign

    8) Functions are allowed to prematurely underflow (and if in FTZ mode,
       have behavior covered by 4&5 above), so long as the estimated
       infinitely precise result is within the stated ulp error limit
       of the minimum normal representable value of appropriate sign

    9) Some functions have limited range. Results of inputs outside that range
       are considered correct, so long as a result is returned.

    10) Some functions have infinite error bounds. Results of these functions
        are considered correct, so long as a result is returned.

    11) The test currently does not discriminate based on the sign of zero.
        We anticipate a later test will.

    12) The test currently does not check to make sure that edge cases called
        out in the standard (e.g. pow(1.0, any) = 1.0) are exactly correct.
        We anticipate a later test will.

    13) The test doesn't check IEEE flags or exceptions. See section 7.3 of the
        OpenCL standard.



Performance Measurement:

    There is also some optional timing code available, currently turned off by default.
It may be useful for tracking internal performance regressions, but is not required to
be part of the conformance submission.


If the test is believed to be in error:

The above correctness heuristics shall not be construed to be an alternative to the correctness
criteria established by the OpenCL standard. An implementation shall be judged correct
or not on appeal based on whether it is within prescribed error bounds of the infinitely
precise result. (The ulp is defined in section 7.4 of the spec.)
If the input value corresponds to an edge case listed in the OpenCL specification
sections covering edge case behavior, or similar sections in the C99 TC2 standard
(sections F.9 and G.6), the function shall return exactly that result, and the sign
of a zero result shall be correct. In the event that the test is found to be faulty,
resulting in a spurious failure result, the committee shall make a reasonable
attempt to fix the test. If no practical and timely remedy can be found, then the implementation
shall be granted a waiver.


Guidelines for reference function error tolerances:

    Errors are measured in ulps, and stored in a single precision representation. So as
to avoid introducing error into the error measurement due to error in the reference function
itself, the reference function should attempt to deliver 24 bits more precision than the test
function return type. (All functions are currently either required to be correctly rounded or
may have >= 1 ulp of error. This places the 1's bit at the LSB of the result, with 23 bits of
sub-ulp accuracy. One more bit is required to avoid accrual of extra error due to
round-to-nearest behavior. If we start to require sub-ulp precision, then the accuracy
requirements for reference functions increase.) Therefore reference functions for single
precision should have 24+24 = 48 bits of accuracy, and reference functions for double
precision should ideally have 53+24 = 77 bits of accuracy.

A double precision system math library function should be sufficient to safely verify a single
precision OpenCL math library function. A long double precision math library function may or
may not be sufficient to verify a double precision OpenCL math library function, depending on
the precision of the long double type.
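
The bit budget above can be checked with a few lines of arithmetic. The sketch below
is illustrative only. The function names are hypothetical, and the two long double
layouts it lists (a 64-bit significand for x87 extended precision and a 113-bit
significand for IEEE-754 binary128) are common examples supplied as assumptions.
This README does not name them. On platforms where long double is the same as
double, the "double" entry applies.

```python
import sys

# Headroom the guidelines ask for: 23 sub-ulp bits plus one bit for rounding.
EXTRA_BITS = 24

# Significand widths in bits, counting the implicit leading bit where present.
SIGNIFICAND_BITS = {
    "single": 24,
    "double": sys.float_info.mant_dig,  # 53 on IEEE-754 hosts
    "x87 extended": 64,                 # assumed example of a long double layout
    "binary128": 113,                   # assumed example of a long double layout
}


def required_reference_bits(result_format):
    """Bits of accuracy a reference function should deliver for this result type."""
    return SIGNIFICAND_BITS[result_format] + EXTRA_BITS


def reference_is_sufficient(reference_format, result_format):
    """True if the reference format has room for the required accuracy."""
    return SIGNIFICAND_BITS[reference_format] >= required_reference_bits(result_format)
```

Under these assumptions a double reference covers single precision (53 >= 48), an
x87 extended reference falls short for double precision (64 < 77), and a binary128
reference covers it (113 >= 77).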
A later version of these tests is expected to replace long double with a head+tail
double-double representation that can represent sufficient precision, on all
platforms that support double.


Revision history:

    Feb 24, 2009    IRO    Created README
                           Added some reference functions so the test will run on Windows.