Name                                 Date         Size       #Lines  LOC
..                                   -            -
CMakeLists.txt                       03-May-2024  730        36      31
FunctionList.cpp                     03-May-2024  18.9 KiB   206     169
FunctionList.h                       03-May-2024  2.8 KiB    99      68
README.txt                           03-May-2024  7.6 KiB    151     106
Sleep.cpp                            03-May-2024  4.1 KiB    119     73
Sleep.h                              03-May-2024  719        25      5
Utility.cpp                          03-May-2024  4.1 KiB    170     115
Utility.h                            03-May-2024  8.5 KiB    255     176
binary.cpp                           03-May-2024  72.1 KiB   1,557   1,249
binaryOperator.cpp                   03-May-2024  66.3 KiB   1,463   1,182
binary_i.cpp                         03-May-2024  55.7 KiB   1,234   1,005
binary_two_results_i.cpp             03-May-2024  49.6 KiB   1,133   934
i_unary.cpp                          03-May-2024  23.5 KiB   628     494
macro_binary.cpp                     03-May-2024  53.6 KiB   1,235   1,012
macro_unary.cpp                      03-May-2024  37.4 KiB   990     799
mad.cpp                              03-May-2024  56.6 KiB   1,129   507
main.cpp                             03-May-2024  60 KiB     1,842   1,516
reference_math.cpp                   03-May-2024  176.5 KiB  5,470   3,818
reference_math.h                     03-May-2024  9.1 KiB    233     196
run_math_brute_force_in_parallel.py  03-May-2024  2.7 KiB    111     79
ternary.cpp                          03-May-2024  67.4 KiB   1,360   1,124
unary.cpp                            03-May-2024  46.8 KiB   1,210   992
unary_two_results.cpp                03-May-2024  41.3 KiB   993     829
unary_two_results_i.cpp              03-May-2024  32.5 KiB   802     655
unary_u.cpp                          03-May-2024  26 KiB     693     553

README.txt

Copyright:	(c) 2009-2013 by Apple Inc. All Rights Reserved.

math_brute_force test                                                   Feb 24, 2009
=====================

Usage:

        Please run the executable with --help for usage information.



System Requirements:

        This test requires support for correctly rounded single and double precision arithmetic.
The current version also requires a reasonably accurate operating system math library to
be present. The OpenCL implementation must be able to compile kernels online. The test assumes
that the host system stores its floating point data according to the IEEE-754 binary single and
double precision floating point formats.


Test Completion Time:

        This test takes a while. Modern desktop systems can usually finish it in 1-3
days. Engineers doing OpenCL math library software development may find wimpy mode (-w)
a useful screen to quickly look for problems in a new implementation, before committing
to a lengthy test run. Likewise, it is possible to run just a range of tests, or specific
tests. See Usage above.


Test Design:

        This test is designed to do a somewhat exhaustive examination of the single
and double precision math library functions in OpenCL, for all vector lengths. Math
library functions are compared against results from a higher precision reference
function to determine correctness. All possible inputs are examined for unary
single precision functions. Other functions are tested against a table of difficult
values, followed by a few billion random values. If an error is found in a function,
the test for that function terminates early, reports an error, and moves on to the
next test, if any.
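
As a rough illustration of the exhaustive part of this design, the sketch below walks
single precision inputs by their 32-bit patterns. It is written in Python for brevity
and is not taken from the test sources; the helper names float_from_bits and
all_float_inputs are hypothetical. A stride is used so the demonstration finishes
quickly; a step of 1 would visit all 2**32 patterns.

```python
import struct


def float_from_bits(bits):
    """Reinterpret a 32-bit unsigned integer as an IEEE-754 single precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def all_float_inputs(start=0, stop=1 << 32, step=1):
    """Yield single precision values by walking their bit patterns in order.

    Hypothetical helper: with the defaults it covers every possible input,
    including both zeros, subnormals, infinities and NaNs.
    """
    for bits in range(start, stop, step):
        yield float_from_bits(bits)


# Coarse stride for demonstration only: 2**32 / 2**28 = 16 sample inputs.
sample = list(all_float_inputs(step=1 << 28))
print(len(sample), sample[:4])
```

Walking bit patterns rather than numeric values is what makes "all possible inputs"
well defined: every encoding is visited exactly once, whatever its value.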

The test currently doesn't support half precision math functions covered in section
9 of the OpenCL 1.0 specification, but does cover the half_func functions covered in
section 6. It also doesn't test the native_<funcname> functions, for which any result
is conformant.

For the OpenCL 1.0 time frame, the reference library shall be the operating system
math library, as modified by the test itself to conform to the OpenCL specification.
That will help ensure that all devices on a particular operating system are returning
similar results. Going forward to future OpenCL releases, it is planned to gradually
introduce a reference math library directly into the test, so as to reduce
inter-platform variance between OpenCL implementations.

Generally speaking, this test will consider a result correct if it is one of the following:

        1) bitwise identical to the output of the reference function,
                rounded to the appropriate precision

        2) within the allowed ulp error tolerance of the infinitely precise
                result (as estimated by the reference function)

        3) If the reference result is a NaN, then any NaN is deemed correct.

        4) if the device is running in FTZ mode, then the result is also correct
                if the infinitely precise result (as estimated by the reference
                function) is subnormal, and the returned result is a zero

        5) if the device is running in FTZ mode, then we also calculate the
                estimate of the infinitely precise result with the reference function
                with subnormal inputs flushed to +- zero.  If any of those results
                are within the error tolerance of the returned result, then it is
                deemed correct

        6) half_func functions may flush per 4&5 above, even if the device is not
                in FTZ mode.

        7) Functions are allowed to prematurely overflow to infinity, so long as
                the estimated infinitely precise result is within the stated ulp
                error limit of the maximum finite representable value of appropriate
                sign

        8) Functions are allowed to prematurely underflow (and if in FTZ mode,
                have behavior covered by 4&5 above), so long as the estimated
                infinitely precise result is within the stated ulp error limit
                of the minimum normal representable value of appropriate sign

        9) Some functions have limited range. Results of inputs outside that range
                are considered correct, so long as a result is returned.

        10) Some functions have infinite error bounds. Results of these functions
                are considered correct, so long as a result is returned.

        11) The test currently does not discriminate based on the sign of zero.
                We anticipate a later test will.

        12) The test currently does not check to make sure that edge cases called
                out in the standard (e.g. pow(1.0, any) = 1.0) are exactly correct.
                We anticipate a later test will.

        13) The test doesn't check IEEE flags or exceptions. See section 7.3 of the
                OpenCL standard.
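
The acceptance rules above can be sketched roughly as follows. This is a simplified
Python illustration for single precision results checked against a double precision
reference, not the code used by the test. It covers only rules 1 to 4 and 11; the
flushed-input re-evaluation of rule 5, premature overflow and underflow (7, 8), and the
per-function exemptions (6, 9, 10) are left out. The names to_float32, ulp32, ulp_error
and is_acceptable are hypothetical, and ulp32 is only an approximation of the ulp
definition in the OpenCL specification.

```python
import math
import struct

FLT_MIN = 2.0 ** -126        # smallest normal single precision value
FLT_TRUE_MIN = 2.0 ** -149   # smallest subnormal single precision value


def to_float32(x):
    """Round a Python float (a double) to the nearest single precision value."""
    try:
        return struct.unpack("<f", struct.pack("<f", x))[0]
    except OverflowError:
        # struct refuses finite values that round to infinity; do it by hand.
        return math.copysign(math.inf, x)


def ulp32(reference):
    """Approximate size of one single precision ulp at a finite reference value."""
    if reference == 0.0:
        return FLT_TRUE_MIN
    mantissa, exp = math.frexp(abs(reference))
    exponent = exp - 1            # floor(log2(|reference|))
    if mantissa == 0.5:           # exact power of two: use the finer spacing below it
        exponent -= 1
    return math.ldexp(1.0, max(exponent, -126) - 23)


def ulp_error(test, reference):
    """Signed error of test relative to reference, measured in single precision ulps."""
    return (test - reference) / ulp32(reference)


def is_acceptable(test, reference, max_ulps, ftz=False):
    if math.isnan(reference):
        return math.isnan(test)                    # rule 3: any NaN matches a NaN
    if math.isnan(test):
        return False
    if test == to_float32(reference):
        return True                                # rule 1 (-0.0 == 0.0, see rule 11)
    if math.isinf(test) or math.isinf(reference):
        return False                               # rule 7 is not modelled here
    if abs(ulp_error(test, reference)) <= max_ulps:
        return True                                # rule 2
    if ftz and abs(reference) < FLT_MIN and test == 0.0:
        return True                                # rule 4: subnormal flushed to zero
    return False


print(ulp_error(1.5 + 2.0 ** -21, 1.5))
print(is_acceptable(0.0, 2.0 ** -130, 4.0, ftz=True))
```

The checks are ordered from cheapest to most permissive, so a result that is already
bitwise correct never reaches the ulp or FTZ logic.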



Performance Measurement:

        There is also some optional timing code available, currently turned off by default.
It may be useful for tracking internal performance regressions, but is not required to
be part of the conformance submission.


If the test is believed to be in error:

The above correctness heuristics shall not be construed to be an alternative to the correctness
criteria established by the OpenCL standard. An implementation shall be judged correct
or not on appeal based on whether it is within prescribed error bounds of the infinitely
precise result. (The ulp is defined in section 7.4 of the spec.) If the input value corresponds
to an edge case listed in OpenCL specification sections covering edge case behavior, or
similar sections in the C99 TC2 standard (sections F.9 and G.6), then the function shall return
exactly that result, and the sign of a zero result shall be correct. In the event that the test
is found to be faulty, resulting in a spurious failure result, the committee shall make a reasonable
attempt to fix the test. If no practical and timely remedy can be found, then the implementation
shall be granted a waiver.


Guidelines for reference function error tolerances:

        Errors are measured in ulps, and stored in a single precision representation. So as
to avoid introducing error into the error measurement due to error in the reference function
itself, the reference function should attempt to deliver 24 bits more precision than the test
function return type. (All functions are currently either required to be correctly rounded or
may have >= 1 ulp of error. This places the 1's bit at the LSB of the result, with 23 bits of
sub-ulp accuracy. One more bit is required to avoid accrual of extra error due to
round-to-nearest behavior. If we start to require sub-ulp precision, then the accuracy
requirements for reference functions increase.) Therefore reference functions for single
precision should have 24+24 = 48 bits of accuracy, and reference functions for double precision
should ideally have 53+24 = 77 bits of accuracy.
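
The bit budget above can be written out as a small calculation. This is only a sketch:
the table of significand widths and the helper names are assumptions made for
illustration, and the two "long double" entries are common examples rather than a
statement about any particular platform.

```python
EXTRA_BITS = 24  # 23 sub-ulp bits plus 1 guard bit for round-to-nearest

# Significand widths in bits, counting the leading bit.
SIGNIFICAND_BITS = {
    "single": 24,
    "double": 53,
    "x87 extended": 64,   # one common long double format
    "binary128": 113,     # another possible long double format
}


def required_reference_bits(result_type):
    """Bits of accuracy the reference should deliver for a given result type."""
    return SIGNIFICAND_BITS[result_type] + EXTRA_BITS


def is_sufficient_reference(reference_type, result_type):
    """True if the reference format is wide enough to hold the required accuracy."""
    return SIGNIFICAND_BITS[reference_type] >= required_reference_bits(result_type)


for result_type in ("single", "double"):
    print(result_type, required_reference_bits(result_type))
```

Under these assumptions a double reference clears the 48-bit bar for single precision,
while a 64-bit extended format falls short of the 77 bits wanted for double precision.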

A double precision system math library function should be sufficient to safely verify a single
precision OpenCL math library function. A long double precision math library function may or
may not be sufficient to verify a double precision OpenCL math library function, depending on
the precision of the long double type. A later version of these tests is expected to replace
long double with a head+tail double double representation that can represent sufficient precision
on all platforms that support double.
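
For readers unfamiliar with the head+tail idea, the sketch below shows how a pair of
doubles can carry a value that a single double cannot. It uses Knuth's error-free
two_sum as a building block; dd_add is a deliberately simple, non-rigorous addition.
Both names are illustrative and nothing here is taken from the test's reference
library.

```python
from fractions import Fraction


def two_sum(a, b):
    """Error-free addition: returns (s, e) with s + e equal to a + b exactly."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e


def dd_add(x, y):
    """Add two (head, tail) pairs and renormalize the result (simple variant)."""
    s, e = two_sum(x[0], y[0])
    e += x[1] + y[1]
    return two_sum(s, e)


# 1 + 2**-80 does not fit in one double, but fits exactly in a head+tail pair.
head, tail = two_sum(1.0, 2.0 ** -80)
exact = Fraction(head) + Fraction(tail)
print(head, tail, exact == 1 + Fraction(1, 2 ** 80))
```

Two 53-bit significands give roughly 106 bits when the tail sits just below the head,
which is comfortably above the 77 bits asked for in the guideline above.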


Revision history:

 Feb 24, 2009                IRO        Created README
                                        Added some reference functions so the test will run on Windows.