This reference page gives various technical details
of floating-point (FP) numbers in Java. This information is quite
useful if you plan on doing extensive numerical calculations with
Java. We recommend that newcomers to Java just scan the information
and come back to it later as needed.
Floating-Point Representations
Floating-point values in Java, which follows most
of the IEEE 754 floating-point standard, are represented
by two types: float and double.
As shown previously, the bit representation for float is

  Sign     Exponent   Significand
  1 bit    8 bits     23 bits

and for the double type

  Sign     Exponent   Significand
  1 bit    11 bits    52 bits

For float
the 8 bits of the exponent give values in the range of 0 to 255.
However, 0 and 255 are special values (discussed below), so the
allowed values range from 1 to 254. A bias of 127 is subtracted
to give an unbiased exponent range of -126 to 127.
Similarly, for double
the 11 bits of the exponent give values in the range of 0 to 2047.
In this case, 0 and 2047 are special values (discussed below),
so the allowed values range from 1 to 2046. A bias of 1023 is subtracted
to give an unbiased exponent range of -1022 to 1023.
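The field layout and the bias can be checked directly on a float's raw bits. The following is a minimal sketch, not part of the original page: the class name FloatFields and its method names are made up for illustration, while Float.floatToIntBits is the standard library call that exposes the bit pattern.

```java
// Illustrative sketch: FloatFields and its method names are hypothetical.
class FloatFields {
    // Sign bit: the top bit of the 32-bit pattern.
    static int sign(float f) {
        return Float.floatToIntBits(f) >>> 31;
    }

    // Biased exponent: the next 8 bits, giving values 0 to 255.
    static int biasedExponent(float f) {
        return (Float.floatToIntBits(f) >>> 23) & 0xFF;
    }

    // Unbiased exponent: subtract the bias of 127.
    // Only meaningful when the biased exponent is in 1 to 254.
    static int unbiasedExponent(float f) {
        return biasedExponent(f) - 127;
    }

    // Stored significand bits: the low 23 bits.
    static int significandBits(float f) {
        return Float.floatToIntBits(f) & 0x7FFFFF;
    }

    public static void main(String[] args) {
        float[] samples = { 1.0f, -2.0f, 0.15625f, Float.MAX_VALUE };
        for (float f : samples) {
            System.out.println(f + ": sign=" + sign(f)
                + " biased exponent=" + biasedExponent(f)
                + " unbiased exponent=" + unbiasedExponent(f)
                + " significand bits=0x" + Integer.toHexString(significandBits(f)));
        }
    }
}
```

The same approach should work for double with Double.doubleToLongBits, an 11-bit mask (0x7FF), a shift of 52, and a bias of 1023.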
The float
representation gives 6 to 9 digits of decimal precision while
double
gives 15 to 17 digits of decimal precision.
When the exponent values are in their allowed unbiased
ranges, the representations are said to be normalized. In the normalized
mode, the b_{0} value in
(-1)^{s} · (b_{0} + b_{1}·2^{-1} + b_{2}·2^{-2} + b_{3}·2^{-3} + ... + b_{n-1}·2^{-(n-1)}) · 2^{exponent}
is taken as 1, so that the effective number of bits
is increased to 24 for float
and 53 for double.
When the biased exponent is zero (i.e. all of its bits
are zero), the value is denormalized and the b_{0}
value is taken as 0. The exponent is taken to be -126 for float
and -1022 for double.
The denormalized mode allows for a "smoother approach to
zero" at the smallest value range.
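To make the two modes concrete, the sketch below rebuilds a float's value from its bit fields, taking b_{0} as 1 with exponent (biased - 127) in the normalized case and b_{0} as 0 with exponent -126 in the denormalized case. The class FloatDecode and the method valueFromBits are hypothetical names used only for this illustration; the arithmetic is done in double, which can hold every float value exactly.

```java
// Illustrative sketch: FloatDecode and valueFromBits are hypothetical names.
class FloatDecode {
    // Rebuilds the numeric value of a float bit pattern from its fields.
    static double valueFromBits(int bits) {
        int sign = bits >>> 31;
        int biased = (bits >>> 23) & 0xFF;
        int fraction = bits & 0x7FFFFF;          // bits b_1 ... b_23
        if (biased == 255) {
            throw new IllegalArgumentException("Infinity or NaN has no finite value");
        }
        // Denormalized: b_0 = 0 and exponent fixed at -126.
        // Normalized:   b_0 = 1 and exponent = biased - 127.
        double b0 = (biased == 0) ? 0.0 : 1.0;
        int exponent = (biased == 0) ? -126 : biased - 127;
        double significand = b0 + fraction / 8388608.0;   // divide by 2^23
        double magnitude = Math.scalb(significand, exponent);
        return (sign == 0) ? magnitude : -magnitude;
    }

    public static void main(String[] args) {
        float[] samples = { 1.0f, -0.15625f, Float.MIN_NORMAL, Float.MIN_VALUE };
        for (float f : samples) {
            double rebuilt = valueFromBits(Float.floatToIntBits(f));
            System.out.println(f + " rebuilt as " + rebuilt
                + " (match: " + (rebuilt == (double) f) + ")");
        }
    }
}
```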
The following shows the minimum and maximum values
possible with these types in the two different modes:
 float
   Normalized: -127 < exponent < +128
     min = 2^{-126} * 1.00000000000000000000000 = 1.17549435E-38
     max = 2^{+127} * 1.11111111111111111111111 = 3.4028235E+38
   Denormalized: exponent = -126
     min = 2^{-126} * 0.00000000000000000000001 = 1.4012985E-45
     max = 2^{-126} * 0.11111111111111111111111 = 1.1754942E-38
 double
   Normalized: -1023 < exponent < +1024
     min = 2^{-1022} * 1.0000000000000000000000000000000000000000000000000000
         = 2.2250738585072014E-308
     max = 2^{+1023} * 1.1111111111111111111111111111111111111111111111111111
         = 1.7976931348623157E+308
   Denormalized: exponent = -1022
     min = 2^{-1022} * 0.0000000000000000000000000000000000000000000000000001
         = 4.9E-324
     max = 2^{-1022} * 0.1111111111111111111111111111111111111111111111111111
         = 2.225073858507201E-308
The normalized/denormalized modes are not usually something the
programmer has to deal with, but they can matter in numerical
computing.
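Most of these limits are available as constants in the wrapper classes, so they rarely need to be typed in by hand. The sketch below prints them; the class name FpLimits and its two helper methods are made up for this illustration, while Float.MIN_NORMAL, Float.MIN_VALUE, Float.MAX_VALUE, their Double counterparts, and Math.nextDown are standard library members (MIN_NORMAL needs Java 6 or later, Math.nextDown Java 8 or later).

```java
// Illustrative sketch: FpLimits and its helper names are hypothetical.
class FpLimits {
    // Largest denormalized float: the value just below the smallest normalized one.
    static float largestDenormalFloat() {
        return Math.nextDown(Float.MIN_NORMAL);
    }

    // Largest denormalized double, found the same way.
    static double largestDenormalDouble() {
        return Math.nextDown(Double.MIN_NORMAL);
    }

    public static void main(String[] args) {
        System.out.println("float  normalized min   = " + Float.MIN_NORMAL);
        System.out.println("float  normalized max   = " + Float.MAX_VALUE);   // 3.4028235E38
        System.out.println("float  denormalized min = " + Float.MIN_VALUE);   // 1.4E-45
        System.out.println("float  denormalized max = " + largestDenormalFloat());
        System.out.println("double normalized min   = " + Double.MIN_NORMAL);
        System.out.println("double normalized max   = " + Double.MAX_VALUE);  // 1.7976931348623157E308
        System.out.println("double denormalized min = " + Double.MIN_VALUE);  // 4.9E-324
        System.out.println("double denormalized max = " + largestDenormalDouble());
    }
}
```

Note that Float.MIN_VALUE and Double.MIN_VALUE are the smallest positive denormalized values, not the most negative values of the types.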
Next we look at the other special floating-point values.
Floating-Point Special Values
Operations with floating-point values never result in a
thrown exception. (Exceptions
are Java error conditions, to be discussed later.) For example,
even if an operation results in a divide by zero, no exception
is thrown. (An integer divided by zero does give
an exception.)
Instead of signaling an error for abnormal operations,
the floating-point result is filled with one of several special
floating-point values. The special cases include:
 +/- Zero: if the bits
 in both the exponent and the significand all equal 0, then
 the FP value is -0 or +0 depending on the sign bit.
   Positive zero is produced by underflow from the
   positive direction, e.g. in float arithmetic
     x = 2.0e-45f * 1.0e-10f
   Negative zero is produced by underflow from the
   negative direction, e.g. in float arithmetic
     x = -2.0e-45f * 1.0e-10f

 +/- Infinity: if all the bits in the exponent
 equal 1 and all the bits in the significand equal 0, then
 the FP value is -Infinity or +Infinity
 depending on the sign bit.
   Positive infinity is produced by overflow of
   a positive value
   Negative infinity is produced by overflow of
   a negative value

 NaN: if all the bits
 in the exponent equal 1 and any of the bits in the significand
 equal 1, then the FP value is Not-a-Number and the sign value
 is ignored. Produced by operations such as zero divided by zero
 and the square root of -1.
Overflows, underflows and divide by zero in Java
do not lead to error states. A division by zero leads to
the +/-Infinity
value unless the numerator also equals zero, in which case the NaN
value appears. You can test for such values using methods from
the floating-point wrapper classes (see Chapter
3: Java.) such as Double.isNaN(double
x). Also, the NaN
value can be checked for with the test if
(x != x), which is true only for NaN
values.
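The checks described above can be sketched as follows. The class SpecialValueChecks and the method classify are hypothetical names for this illustration; Double.isNaN and Double.isInfinite are the standard wrapper-class tests.

```java
// Illustrative sketch: SpecialValueChecks and classify are hypothetical names.
class SpecialValueChecks {
    // Labels a double as NaN, +Infinity, -Infinity, or finite.
    static String classify(double x) {
        if (Double.isNaN(x)) {
            return "NaN";
        }
        if (Double.isInfinite(x)) {
            return (x > 0) ? "+Infinity" : "-Infinity";
        }
        return "finite";
    }

    // The self-comparison test: true only when x is NaN.
    static boolean isNaNBySelfComparison(double x) {
        return x != x;
    }

    public static void main(String[] args) {
        double zero = 0.0;
        System.out.println("1.0 / 0.0            -> " + classify(1.0 / zero));
        System.out.println("-1.0 / 0.0           -> " + classify(-1.0 / zero));
        System.out.println("0.0 / 0.0            -> " + classify(zero / zero));
        System.out.println("sqrt(-1.0)           -> " + classify(Math.sqrt(-1.0)));
        System.out.println("Double.MAX_VALUE * 2 -> " + classify(Double.MAX_VALUE * 2.0));
        System.out.println("x != x for 0.0 / 0.0 -> " + isNaNBySelfComparison(zero / zero));
    }
}
```

Double.isNaN is generally the clearer choice in real code; the self-comparison is shown only because the text mentions it.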
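Finite floating-point numbers and the special values
are ordered from smallest to largest as follows:
negative infinity, negative finite nonzero values, negative and
positive zero, positive finite nonzero values, and positive infinity.
The positive and negative zero values act as follows:

 Positive zero and negative zero compare as equal

 1.0 / (positive zero) ==> POSITIVE_INFINITY

 1.0 / (negative zero) ==> NEGATIVE_INFINITY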
The NaN values are unordered. This means that:

 Numerical comparisons and tests for numerical
 equality result in false if either or both operands are NaN.

 A test for numerical equality of a value against
 itself results in false if and only if the value is NaN.

 A test for numerical inequality results in true
 if either operand is NaN.
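These comparison rules are easy to confirm with a few expressions. The sketch below is only an illustration; the class ZeroAndNaNRules and its method isNegativeZero are hypothetical names. Since the two zeros compare as equal, the helper tells them apart by dividing 1.0 by the value and checking the sign of the resulting infinity.

```java
// Illustrative sketch: ZeroAndNaNRules and isNegativeZero are hypothetical names.
class ZeroAndNaNRules {
    // True only for -0.0: it equals zero, but 1.0 / x is negative infinity.
    static boolean isNegativeZero(double x) {
        return x == 0.0 && 1.0 / x == Double.NEGATIVE_INFINITY;
    }

    public static void main(String[] args) {
        double posZero = 0.0;
        double negZero = -0.0;
        double nan = Double.NaN;

        System.out.println("+0.0 == -0.0      : " + (posZero == negZero));  // true
        System.out.println("1.0 / +0.0        : " + (1.0 / posZero));       // Infinity
        System.out.println("1.0 / -0.0        : " + (1.0 / negZero));       // -Infinity
        System.out.println("isNegativeZero(-0): " + isNegativeZero(negZero));

        System.out.println("NaN < 1.0         : " + (nan < 1.0));           // false
        System.out.println("NaN > 1.0         : " + (nan > 1.0));           // false
        System.out.println("NaN == NaN        : " + (nan == nan));          // false
        System.out.println("NaN != 1.0        : " + (nan != 1.0));          // true
    }
}
```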
Extended Exponents
Versions of the JVM
Specification after 1.1 allow an implementation
to use extended-exponent versions of either or both the
float and double types during intermediate calculations to
avoid overflows and underflows.
The table below lists the floating-point parameters allowed for
the four formats, where

 N = number of bits in the significand (mantissa)

 K = number of bits in the exponent

 Emax = maximum value of the exponent

 Emin = minimum value of the exponent

 Parameter   float   float-extended-exponent   double   double-extended-exponent
 N           24      24                        53       53
 K           8       > 10                      11       > 14
 Emax        +127    > +1022                   +1023    > +16382
 Emin        -126    < -1021                   -1022    < -16381

The final accessible floating-point results will be in float
or double
types, but intermediate floating-point values can use the larger
extended-exponent representations if the platform processor allows
it. The Java programmer has no access to the extended-exponent
types.
The JVM does not support either the official IEEE 754 single extended
or double extended format since these extended formats require extended
precision, i.e. longer significand, in addition to the extended
exponent ranges shown in the above table.
The documentation for a particular JVM should indicate whether
it allows for the extended exponent options.
The modifier strictfp
in front of a method forces every intermediate value within that
method to stay in the standard float and double formats, with no
extended exponent range. This is useful if
one wants to ensure exactly the same results regardless of the
platform or JVM implementation.
(This is not related to the StrictMath
class discussed in the Math class section.)
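As a rough illustration of what strictfp guarantees, the sketch below overflows an intermediate value on purpose. The class StrictDemo and the method doubleThenHalve are hypothetical names. Under strict evaluation the intermediate product must overflow to Infinity; an implementation using an extended exponent range for a method without the modifier could, in principle, return the original value instead. As far as we know, Java 17 and later evaluate all floating-point expressions strictly, so on those versions the modifier is redundant and the compiler may print a warning.

```java
// Illustrative sketch: StrictDemo and doubleThenHalve are hypothetical names.
class StrictDemo {
    // With strictfp the intermediate product a * 2.0 must be a standard double,
    // so it overflows to Infinity whenever a > Double.MAX_VALUE / 2.
    static strictfp double doubleThenHalve(double a) {
        return (a * 2.0) / 2.0;
    }

    public static void main(String[] args) {
        System.out.println("1.0e300          -> " + doubleThenHalve(1.0e300));
        System.out.println("Double.MAX_VALUE -> " + doubleThenHalve(Double.MAX_VALUE));
    }
}
```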
Floating-Point Literals and Rounding Rules
Some more notes about
Java floating-point include:
Literals
Literals default to double
unless appended with f or F:
  float x = 1.0;   // compile-time error
  float x = 1.0f;  // OK
  double x = 1.0;  // OK
Floating-point rounding:
The JVM uses IEEE 754 round-to-nearest mode: inexact
results are rounded to the nearest representable value, with
ties going to the value with a zero least-significant bit.
Instructions that convert values of floating-point types to
integer values will round towards zero.
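Both rounding rules can be seen in a small example. The sketch below is an illustration only; the class RoundingDemo and its method names are hypothetical. Above 2^24 = 16777216 adjacent float values are 2 apart, so adding 1.0f lands exactly halfway between two representable values and the tie rule decides the result.

```java
// Illustrative sketch: RoundingDemo and its method names are hypothetical.
class RoundingDemo {
    // Float addition: the exact sum is rounded to the nearest float,
    // with ties going to the value whose least-significant bit is zero.
    static float addFloats(float a, float b) {
        return a + b;
    }

    // Conversion to int rounds towards zero (the fraction is discarded).
    static int toInt(double x) {
        return (int) x;
    }

    public static void main(String[] args) {
        // 16777217 is a tie between 16777216 and 16777218.
        System.out.println("16777216f + 1f = " + addFloats(16777216.0f, 1.0f));
        // 16777219 is a tie between 16777218 and 16777220.
        System.out.println("16777218f + 1f = " + addFloats(16777218.0f, 1.0f));
        System.out.println("(int)  2.7 = " + toInt(2.7));    // 2
        System.out.println("(int) -2.7 = " + toInt(-2.7));   // -2
    }
}
```

The first sum should come out as 16777216 and the second as 16777220, since those are the neighbours with a zero least-significant bit. If rounding to the nearest integer is wanted instead of truncation, Math.round is the usual alternative to a cast.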
Floating-Point Programming Notes
In general, it is safest to do floating-point calculations in double
type. This helps to reduce round-off errors that can reduce precision
during intermediate calculations. (You can always cast the final
value to float if that is a more convenient size for I/O or storage.)
There can be some performance tradeoff, since double operations
involve more data transfer, but the size of the tradeoff depends
on the JVM and the platform. (In Chapter
12 we discuss techniques for measuring code performance.)
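The effect of accumulated round-off is easy to demonstrate. The sketch below adds 0.1 a million times in each type and compares the results with the ideal value of 100000. The class AccumulationDemo and its method names are hypothetical, and the exact error sizes printed are not quoted here since they are best observed by running the code.

```java
// Illustrative sketch: AccumulationDemo and its method names are hypothetical.
class AccumulationDemo {
    // Adds 0.1f to a float accumulator n times.
    static float sumFloat(int n) {
        float sum = 0.0f;
        for (int i = 0; i < n; i++) {
            sum += 0.1f;
        }
        return sum;
    }

    // Adds 0.1 to a double accumulator n times.
    static double sumDouble(int n) {
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += 0.1;
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 1000000;
        double ideal = 100000.0;
        float f = sumFloat(n);
        double d = sumDouble(n);
        System.out.println("float  sum = " + f + ", error = " + Math.abs(f - ideal));
        System.out.println("double sum = " + d + ", error = " + Math.abs(d - ideal));
    }
}
```

Neither result is exact, because 0.1 has no finite binary representation, but the double accumulator should stay far closer to the ideal value.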
The representations of the primitives are the same on all machines
to ensure the portability of the code. However, during calculations
involving floating-point values, intermediate values can exceed
the standard exponent ranges if allowed by the particular processor
(see table above).
The strictfp
modifier of classes or methods requires that the values remain within
the range allowed by the Java specifications throughout the calculation
to ensure the same results on all platforms.
Floating-Point Demo
Here we use an applet to display results of several
math expressions. To see outputs from the print
statements run with an appletviewer or look in the browser's
Java
console. You can also run it as
an application. Try to predict the results before looking at the
output.

import java.applet.Applet;
import java.awt.*;

/** This applet tests various math expressions.
  * Run with appletviewer to see print out on
  * screen or with a browser Java console.
 **/
public class FPSpecialValues extends Applet {

  public void init() {
    // FP literals are double type by default.
    // Append F or f to make float or cast to float
    float x = 5.1f;
    float y = 0.0f;
    float div_by_zero = x/y;
    System.out.println("Divide By Zero = x/y = " + div_by_zero + "\n");

    // A negative value divided by zero gives -Infinity
    x = -1.0f;
    div_by_zero = x/y;
    System.out.println("Divide negative by zero = x/y = " + div_by_zero + "\n");

    // Product is too small for a float: underflows to +0.0
    x = 2.0e-45f;
    y = 1.0e-10f;
    float positive_underflow = x*y;
    System.out.println("Positive underflow = " + positive_underflow + "\n");

    // Same magnitude with a negative sign: underflows to -0.0
    x = -2.0e-45f;
    y = 1.0e-10f;
    float negative_underflow = x*y;
    System.out.println("Negative underflow = " + negative_underflow + "\n");

    // Dividing by negative zero gives -Infinity
    x = 1.0f;
    y = negative_underflow;
    float div_by_neg_zero = x/y;
    System.out.println("Divide 1 by negative zero = " + div_by_neg_zero + "\n");

    // Zero divided by zero gives NaN
    x = 0.0f;
    y = 0.0f;
    float div_zero_by_zero = x/y;
    System.out.println("Divide zero by zero = " + div_zero_by_zero + "\n");
  }

  public void paint(Graphics g) {
    g.drawString("Math tests", 20, 20);
  }
}

Latest update: Oct. 15, 2004
