floating point – Float and double datatype in Java

floating point – Float and double datatype in Java

The Wikipedia page on it is a good place to start.

To sum up:

  • float is represented in 32 bits, with 1 sign bit, 8 bits of exponent, and 23 bits of the significand (or what follows from a scientific-notation number: 2.33728*1012; 33728 is the significand).

  • double is represented in 64 bits, with 1 sign bit, 11 bits of exponent, and 52 bits of significand.

By default, Java uses double to represent its floating-point numerals (so a literal 3.14 is typed double). Its also the data type that will give you a much larger number range, so I would strongly encourage its use over float.

There may be certain libraries that actually force your usage of float, but in general – unless you can guarantee that your result will be small enough to fit in floats prescribed range, then its best to opt with double.

If you require accuracy – for instance, you cant have a decimal value that is inaccurate (like 1/10 + 2/10), or youre doing anything with currency (for example, representing $10.33 in the system), then use a BigDecimal, which can support an arbitrary amount of precision and handle situations like that elegantly.

A float gives you approx. 6-7 decimal digits precision while a double gives you approx. 15-16. Also the range of numbers is larger for double.

A double needs 8 bytes of storage space while a float needs just 4 bytes.

floating point – Float and double datatype in Java

Floating-point numbers, also known as real numbers, are used when evaluating expressions that require fractional precision. For example, calculations such as square root, or transcendentals such as sine and cosine, result in a value whose precision requires a floating-point type. Java implements the standard (IEEE–754) set of floatingpoint types and operators. There are two kinds of floating-point types, float and double, which represent single- and double-precision numbers, respectively. Their width and ranges are shown here:

   Name     Width in Bits   Range 
    double  64              1 .7e–308 to 1.7e+308
    float   32              3 .4e–038 to 3.4e+038


The type float specifies a single-precision value that uses 32 bits of storage. Single precision is faster on some processors and takes half as much space as double precision, but will become imprecise when the values are either very large or very small. Variables of type float are useful when you need a fractional component, but dont require a large degree of precision.

Here are some example float variable declarations:

float hightemp, lowtemp;


Double precision, as denoted by the double keyword, uses 64 bits to store a value. Double precision is actually faster than single precision on some modern processors that have been optimized for high-speed mathematical calculations. All transcendental math functions, such as sin( ), cos( ), and sqrt( ), return double values. When you need to maintain accuracy over many iterative calculations, or are manipulating large-valued numbers, double is the best choice.

Leave a Reply

Your email address will not be published.