Lines Matching refs:the
13 the IEC/IEEE Standard for Binary Floating-Point Arithmetic. As many as four
15 precision, and quadruple precision. All operations required by the standard
18 This document gives information about the types defined and the routines
19 implemented by SoftFloat. It does not attempt to define or explain the
20 IEC/IEEE Floating-Point Standard. Details about the standard are available
30 particular, the distributed header files will not be acceptable to any
33 Support for the extended double-precision and quadruple-precision formats
34 depends on a C compiler that implements 64-bit integer arithmetic. If the
35 largest integer format supported by the C compiler is 32 bits, SoftFloat is
36 limited to only single and double precisions. When that is the case, all
37 references in this document to the extended double precision, quadruple
68 part by the International Computer Science Institute, located at Suite 600,
70 provided by the National Science Foundation under grant MIP-9311980. The
72 a fixed-point vector processor in collaboration with the University of
85 When 64-bit integers are supported by the compiler, the `softfloat.h' header
89 terms of 32-bit and 64-bit integer types, respectively, while the `float128'
91 the byte order of the particular machine being used. The `floatx80' type
93 the machine's byte order again determining the order of the `high' and `low'
96 When 64-bit integers are _not_ supported by the compiler, the `softfloat.h'
99 the `float32' type is identified with an appropriate integer type. The
100 `float64' type is defined as a structure of two 32-bit integers, with the
101 machine's byte order determining the order of the fields.
103 In either case, the types in `softfloat.h' are defined such that if a system
104 implements the usual C `float' and `double' types according to the IEC/IEEE
105 Standard, then the `float32' and `float64' types should be indistinguishable
106 in memory from the native `float' and `double' types. (On the other hand,
108 the compiler, the type of registers used may differ from those used for the
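The memory-layout claim above can be checked with a small sketch that copies a native `float' into a 32-bit integer and inspects the bit pattern. The names `float32_sketch' and `float32_from_native' are hypothetical stand-ins, not SoftFloat's own definitions, and the sketch assumes the host implements `float' as an IEC/IEEE single-precision value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 32-bit float container; the real
   definition lives in SoftFloat's headers and may differ. */
typedef uint32_t float32_sketch;

/* Copy the bytes of a native float into the integer container.
   memcpy is used so that no pointer-cast aliasing is involved. */
static float32_sketch float32_from_native(float f)
{
    float32_sketch bits;
    assert(sizeof bits == sizeof f);   /* assumption: float is 32 bits */
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

On a host with IEC/IEEE single-precision `float', passing 1.0f yields the pattern 0x3F800000 (sign 0, biased exponent 127, zero fraction).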
111 SoftFloat implements the following arithmetic operations:
113 -- Conversions among all the floating-point formats, and also between
114 integers (32-bit and 64-bit) and any of the floating-point formats.
119 -- For each format, the floating-point remainder operation defined by the
123 rounds to the nearest integer value in the same format. (The floating-
126 -- Comparisons between two values in the same floating-point format.
128 The only functions required by the IEC/IEEE Standard that are not provided
135 All four rounding modes prescribed by the IEC/IEEE Standard are implemented
137 by the global variable `float_rounding_mode'. This variable may be set
138 to one of the values `float_round_nearest_even', `float_round_to_zero',
146 For extended double precision (`floatx80') only, the rounding precision
147 of the standard arithmetic operations is controlled by the global variable
153 operations are rounded (as usual) to the full precision of the extended
155 or to 64 causes the operations listed to be rounded to reduced precision
158 bits in the result significand beyond the rounding point are set to zero.
160 than 32, 64, or 80 is not specified. Operations other than the ones listed
167 All five exception flags required by the IEC/IEEE Standard are
168 implemented. Each flag is stored as a unique bit in the global variable
169 `float_exception_flags'. The positions of the exception flag bits within
170 this variable are determined by the bit masks `float_flag_inexact',
175 An individual exception flag can be cleared with the statement
179 where `<exception>' is the appropriate name. To raise a floating-point
180 exception, the SoftFloat function `float_raise' should be used (see below).
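As a rough model of the flag handling described above, the following sketch keeps the flags in one integer and uses bit masks to set, clear, and test them. Everything prefixed `sketch_' is hypothetical, and the mask values are assumptions made for illustration only; the real masks and the real `float_raise' are defined by SoftFloat itself.

```c
/* Assumed mask values, chosen only for this sketch; the real values
   are fixed by SoftFloat's headers for each target. */
enum {
    sketch_flag_inexact = 1,
    sketch_flag_invalid = 16
};

/* Local stand-in for the global variable `float_exception_flags'. */
static unsigned sketch_exception_flags = 0;

/* Set every flag named in mask (a model of raising, with no trap). */
static void sketch_raise(unsigned mask)
{
    sketch_exception_flags |= mask;
}

/* Clear the flags named in mask and leave the others untouched. */
static void sketch_clear(unsigned mask)
{
    sketch_exception_flags &= ~mask;
}

/* Return 1 if any flag named in mask is set, 0 otherwise. */
static int sketch_test(unsigned mask)
{
    return (sketch_exception_flags & mask) != 0;
}
```

Because each flag occupies its own bit, clearing one flag with an AND of the complemented mask cannot disturb the others.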
182 In the terminology of the IEC/IEEE Standard, SoftFloat can detect tininess
184 the global variable `float_detect_tininess', which can be set to either
198 All conversions among the floating-point formats are supported, as are all
217 Each conversion function takes one operand of the appropriate type and
224 Conversions from floating-point to integer raise the invalid exception if
225 the source value cannot be rounded to a representable integer of the desired
226 size (32 or 64 bits). If the floating-point operand is a NaN, the largest
227 positive integer is returned. Otherwise, if the conversion overflows, the
228 largest integer with the same sign as the operand is returned.
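The result rules above can be modeled with a short sketch. It uses a native `double' in place of a SoftFloat type, always rounds toward zero (an assumption of this sketch, not the general behavior), and records the invalid exception in a local variable. All names are hypothetical and none of this is SoftFloat's own code.

```c
#include <stdint.h>

/* Local stand-in for the invalid exception flag. */
static int sketch_invalid = 0;

/* Convert a double to a 32-bit integer, truncating toward zero.
   NaN gives the largest positive integer; overflow gives the largest
   integer with the sign of the operand. Both raise "invalid". */
static int32_t sketch_to_int32_round_to_zero(double x)
{
    if (x != x) {                  /* true only for a NaN */
        sketch_invalid = 1;
        return INT32_MAX;
    }
    if (x >= 2147483648.0) {       /* truncates above INT32_MAX */
        sketch_invalid = 1;
        return INT32_MAX;
    }
    if (x <= -2147483649.0) {      /* truncates below INT32_MIN */
        sketch_invalid = 1;
        return INT32_MIN;
    }
    return (int32_t)x;             /* in range, so the cast is defined */
}
```

The range checks come before the cast because converting an out-of-range floating-point value to an integer type is undefined in C.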
230 On conversions to integer, if the floating-point operand is not already an
231 integer value, the operand is rounded according to the current rounding
233 languages) require that conversions to integers be rounded toward zero, the
255 The operands and result are all of the same type.
257 Rounding of the extended double-precision (`floatx80') functions is affected
258 by the `floatx80_rounding_precision' variable, as explained above in the
264 For each format, SoftFloat implements the remainder function according to
265 the IEC/IEEE Standard. The remainder functions are:
273 of the same type. Given operands x and y, the remainder functions return
274 the value x - n*y, where n is the integer closest to x/y. If x/y is exactly
275 halfway between two integers, n is the even integer closest to x/y. The
278 Depending on the relative magnitudes of the operands, the remainder
279 functions can take considerably longer to execute than the other SoftFloat
280 functions. This is inherent in the remainder operation itself and is not a
281 flaw in the SoftFloat implementation.
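The definition x - n*y, with n rounded to nearest and ties going to the even integer, can be illustrated with the sketch below. It works on native `double' values and is only meant for small, exactly representable quotients: it computes x/y in floating point, so it is not a substitute for the exact remainder that SoftFloat computes. Both function names are hypothetical.

```c
#include <assert.h>

/* Round q to the nearest integer, with ties going to the even one.
   Sketch only: assumes q fits comfortably in a long long. */
static double sketch_nearest_even(double q)
{
    long long t = (long long)q;        /* truncates toward zero */
    double frac = q - (double)t;       /* fractional part, sign of q */
    long long step = (q < 0) ? -1 : 1; /* direction away from zero */

    if (frac < 0)
        frac = -frac;
    /* Move away from zero when past the halfway point, or exactly
       halfway with an odd truncated value. */
    if (frac > 0.5 || (frac == 0.5 && t % 2 != 0))
        t += step;
    return (double)t;
}

/* Remainder as x - n*y, where n is the integer nearest to x/y. */
static double sketch_remainder(double x, double y)
{
    assert(y != 0.0);                  /* zero divisor not modeled */
    return x - sketch_nearest_even(x / y) * y;
}
```

With x = 5 and y = 2 the quotient 2.5 is a tie, so n is the even integer 2 and the result is 1. With x = 7 and y = 2 the quotient 3.5 rounds to 4 and the result is -1, which shows that the remainder can be negative even when both operands are positive.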
286 For each format, SoftFloat implements the round-to-integer function
287 specified by the IEC/IEEE Standard. The functions are:
295 the same type. (Note that the result is not an integer type.) The operand
296 is rounded to an exact integer according to the current rounding mode, and
297 the resulting integer value is returned in the same floating-point format.
309 Each function takes two operands of the same type and returns a 1 or 0
315 (!=) functions are easily obtained using the functions provided. The
316 not-equal function is just the logical complement of the equal function.
317 The greater-than-or-equal function is identical to the less-than-or-equal
318 function with the operands reversed; and the greater-than function can be
319 obtained from the less-than function in the same way.
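The derivations just described can be sketched as follows. Native `double' comparisons stand in for the three provided functions, and every name here is hypothetical; with SoftFloat the same pattern would wrap the library's own equal, less-than-or-equal, and less-than functions for the chosen format.

```c
/* Stand-in type for this sketch; not a SoftFloat type. */
typedef double f64_sketch;

/* The three comparisons assumed to be provided. */
static int sketch_eq(f64_sketch a, f64_sketch b) { return a == b; }
static int sketch_le(f64_sketch a, f64_sketch b) { return a <= b; }
static int sketch_lt(f64_sketch a, f64_sketch b) { return a < b; }

/* Not-equal is the logical complement of equal. */
static int sketch_ne(f64_sketch a, f64_sketch b)
{
    return !sketch_eq(a, b);
}

/* Greater-than-or-equal is less-than-or-equal with operands swapped. */
static int sketch_ge(f64_sketch a, f64_sketch b)
{
    return sketch_le(b, a);
}

/* Greater-than is less-than with operands swapped. */
static int sketch_gt(f64_sketch a, f64_sketch b)
{
    return sketch_lt(b, a);
}
```

Swapping the operands, rather than negating the result, keeps the behavior correct for NaN inputs: `sketch_ge' returns 0 when either operand is a NaN, whereas a negated less-than would return 1.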
321 The IEC/IEEE Standard specifies that the less-than-or-equal and less-than
322 functions raise the invalid exception if either input is any kind of NaN.
323 The equal functions, on the other hand, are defined not to raise the invalid
324 exception on quiet NaNs. For completeness, SoftFloat provides the following
332 The `signaling' equal functions are identical to the standard functions
333 except that the invalid exception is raised for any NaN input. Likewise,
334 the `quiet' comparison functions are identical to their counterparts except
335 that the invalid exception is not raised for quiet NaNs.
348 The functions take one operand and return 1 if the operand is a signaling
358 The function takes a mask indicating the set of exceptions to raise. No
359 result is returned. In addition to setting the specified exception flags,
360 this function may cause a trap or abort appropriate for the current system.
368 At the time of this writing, the most up-to-date information about
369 SoftFloat and the latest release can be found at the Web page `http://