@@ -211,11 +211,11 @@ overflow 11111 111
211211
212212== Floating Point Numbers
213213
214- So far we have a look at whole numbers, let's have a look at numbers with a decimal part, aka floating point numbers.
214+ So far we have looked at whole numbers; now let's have a look at numbers with a fractional part, aka floating point numbers.
215215
216216=== Decimal System
217217
218- What you are use to in the decimal system.
218+ What you are used to in the decimal system.
219219
220220[source]
221221----
@@ -288,6 +288,14 @@ Formula:
288288(sign) x (1 + mantissa) x 2^(exponent - 127)
289289----
290290
291+ The `1 +` in the formula reflects the *hidden (implicit) bit*.
292+ In normalised form the leading bit before the binary point is always `1`
293+ (e.g. `1.10010001 x 2^6`).
294+ Because it is always `1`, it does not need to be stored: the 23 mantissa bits
295+ only store the *fractional part* after the `1.`.
296+ When interpreting a stored normalised value you add that implicit `1` back,
297+ which is what the `1 + mantissa` in the formula does.
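
The hidden bit can be made visible by taking a float apart and putting it back together. The following is a minimal sketch, not part of the original text: the class `FloatFields` and its method names are hypothetical, and `rebuild` assumes a normal (non-zero, non-subnormal, finite) value. It only relies on `Float.floatToIntBits` and `Math.scalb` from the standard library.

```java
final class FloatFields {
    // Split the raw 32-bit pattern into its three fields.
    static int sign(float f)     { return Float.floatToIntBits(f) >>> 31; }           // 1 bit
    static int exponent(float f) { return (Float.floatToIntBits(f) >>> 23) & 0xFF; }  // 8 bits, biased
    static int fraction(float f) { return Float.floatToIntBits(f) & 0x7FFFFF; }       // 23 stored bits

    // Rebuild a normal float: (-1)^sign x (1 + mantissa) x 2^(exponent - 127).
    static double rebuild(float f) {
        double mantissa = fraction(f) / (double) (1 << 23); // stored bits as a fraction in [0, 1)
        double significand = 1.0 + mantissa;                 // add the hidden bit back
        double value = Math.scalb(significand, exponent(f) - 127);
        return sign(f) == 0 ? value : -value;
    }

    public static void main(String[] args) {
        System.out.println(exponent(100.25f));                          // biased exponent
        System.out.println(Integer.toBinaryString(fraction(100.25f)));  // stored bits only, no leading 1
        System.out.println(rebuild(100.25f));
    }
}
```

For `100.25f` the stored fraction bits start with `10010001`, without the leading `1`; the `1.0 +` in `rebuild` is what restores it.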
298+
291299Example:
292300
293301[source]
@@ -378,7 +386,7 @@ Final float binary:
378386
379387=== Why 0.1 Can't Be Represented Exactly
380388
381- There are some problems with floating point numbers, on of the is that not every number can be represented correctly.
389+ There are some problems with floating point numbers; one of them is that not every number can be represented exactly.
382390
383391[source]
384392----
@@ -412,6 +420,41 @@ The double is defined as follows:
412420* 11 bits exponent
413421* 52 bits mantissa
414422
423+ The formula is the same as for a float, but with a larger exponent bias of *1023*:
424+
425+ [source]
426+ ----
427+ (-1)^sign x (1 + mantissa) x 2^(exponent - 1023)
428+ ----
429+
430+ Using our earlier example `100.25`:
431+
432+ [source]
433+ ----
434+ Integer part: 1100100
435+ Fractional part: 01
436+ Combined: 1100100.01 x 2^0
437+
438+ Normalise (shift left 6): 1.10010001 x 2^6
439+
440+ Exponent: 6 + 1023 = 1029
441+ 1029 in binary: 10000000101 (11 bits)
442+
443+ Mantissa (52 bits, fractional part only):
444+ 1001000100000000000000000000000000000000000000000000
445+
446+ Final double binary:
447+ 0 10000000101 1001000100000000000000000000000000000000000000000000
448+ ----
449+
450+ Show in Java:
451+
452+ [source,java]
453+ ----
454+ var bits = Double.doubleToLongBits(100.25);
455+ Long.toBinaryString(bits); // leading zeros are dropped, so the sign bit 0 is not printed
456+ ----
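
Because `Long.toBinaryString` omits leading zeros, its output for a positive double is shorter than 64 characters and does not line up with the layout above. The following is a small sketch, assuming Java 11 or newer for `String.repeat`; the class `DoubleBits` and its methods are hypothetical names chosen for illustration. It pads the pattern to 64 bits and splits it into sign, exponent and mantissa.

```java
final class DoubleBits {
    // Returns the 64-bit pattern as "sign exponent mantissa", zero-padded on the left.
    static String fields(double d) {
        long bits = Double.doubleToLongBits(d);
        String raw = Long.toBinaryString(bits);
        String padded = "0".repeat(64 - raw.length()) + raw;
        return padded.substring(0, 1) + " "      // 1 sign bit
             + padded.substring(1, 12) + " "     // 11 exponent bits
             + padded.substring(12);             // 52 mantissa bits
    }

    // Removes the bias of 1023 from the stored exponent.
    static int unbiasedExponent(double d) {
        long bits = Double.doubleToLongBits(d);
        return (int) ((bits >>> 52) & 0x7FF) - 1023;
    }

    public static void main(String[] args) {
        System.out.println(fields(100.25));
        System.out.println(unbiasedExponent(100.25));
    }
}
```

For `100.25`, `fields` should reproduce the three groups of the final double binary worked out by hand above.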
457+
415458== BigDecimal
416459
417460If we want to keep precision, Java supports `BigDecimal` and `BigInteger`, which have arbitrary precision, limited only by available memory.
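
As a rough illustration of the difference, the sketch below adds `0.1` and `0.2` once as doubles and once as `BigDecimal` values. The class name `ExactDecimal` and its methods are hypothetical; the `BigDecimal` values are created from strings on purpose, since `new BigDecimal(0.1)` would carry over the inexact binary value of the double.

```java
import java.math.BigDecimal;

final class ExactDecimal {
    // Binary floating point: 0.1 and 0.2 are both approximations, and so is their sum.
    static double doubleSum() {
        return 0.1 + 0.2;
    }

    // Decimal arithmetic: the string constructor keeps the decimal digits exactly.
    static BigDecimal exactSum() {
        return new BigDecimal("0.1").add(new BigDecimal("0.2"));
    }

    public static void main(String[] args) {
        System.out.println(doubleSum() == 0.3);                        // false
        System.out.println(exactSum().equals(new BigDecimal("0.3")));  // true
    }
}
```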