28 Jan 2015.
I need to generate a random floating point number.
I know I don't want to do something like rand() % 100 / 100.
A double
is just 64-bits, so generate a 64-bit string that looks like
0x3FFn nnnn nnnn nnnn
Where the n
s are random hex digits.
As a double
, this will be a random number in the range \([1, 2)\).
Floating point numbers generated this way have the same distribution as the underlying generator used.
Why not zero to one?
The range \([1, 2)\) uses a single exponent, whereas the range \([0, 1)\) spans 1074 different exponents.
Because of the way the floating point encoding works, if I use uniformly distributed random numbers to generate all the bits of the double
(the exponent as well as the significand) up to 1.0, I will end up with the same number of samples in the range \([2^{-1074}, 2^{-1023})\) as I will in the range \([0.5, 1.0)\).
The benefit of using the range \([1, 2)\) is that I can use a fixed exponent and generate just the significand, and the distribution stays the same.
Floating point encoding
IEEE 754-1985 double-precision floating point numbers are encoded as:
- Sign bit.
- 11 bits of exponent.
- 52 bits of significand (sometimes called fraction or mantissa).
Laid out like this:
6 6 5 5 4 3 2 1 0 3210987654321098765432109876543210987654321098765432109876543210 +EEEEEEEEEEEssssssssssssssssssssssssssssssssssssssssssssssssssss | | | sign bit exponent significand
The exponent is encoded with an offset:
- A zero value means the exponent is \(2^{-1022}\) and the number is a "denormal."
- A nonzero value, \(E\), means the exponent is \(2^{E-1023}\) and the significand is "normalized."
How to interpret the significand:
- For example, the bit string
1011100000000000000000000000000000000000000000000000
. - This corresponds to the binary number \(1.10111_2\).
- The leading
1.
is implicit in the encoding:- The exponent is chosen so that the first significant bit is 1.
- Since the first bit is known, we can elide it.
- If it were a denormal number, the significand would be \(0.10111_2\).
- \(1.10111_2\) means \(1 + \frac{1}{2} + \frac{1}{8} + \frac{1}{16} + \frac{1}{32}\).
- Or, in decimal: \(1.718750_{10}\).
Floating point 0.0 is all zeroes:
+EEEEEEEEEEEssssssssssssssssssssssssssssssssssssssssssssssssssss 0000000000000000000000000000000000000000000000000000000000000000
The next number after zero (i.e. nexttoward(0., 1.)
) is:
+EEEEEEEEEEEssssssssssssssssssssssssssssssssssssssssssssssssssss 0000000000000000000000000000000000000000000000000000000000000001 1 2 3 4 5 bit index: 1234567890123456789012345678901234567890123456789012
Here the exponent is \(2^{-1022}\), the number is denormal, and the significand has the least significant bit set, so the number is:
$$ 2^{-1022} \times 0.0000000000000000000000000000000000000000000000000000000000000001_2\\ = 2^{-1022} \times 2^{-52}\\ = 2^{-1074} $$
The floating point representation of 1.0 is:
+EEEEEEEEEEEssssssssssssssssssssssssssssssssssssssssssssssssssss 0011111111110000000000000000000000000000000000000000000000000000
The exponent is encoded as \(01111111111_2 = 1023_{10}\) which corresponds to \(2^0\).
The significand is \(1.0_2\) because a normalized number has the implicit bit and the encoded significand is all zeroes after it.