Floating Point Precision with Log Likelihoods

Computing the log likelihood is a common task in probabilistic machine learning, but it can easily under- or overflow. I discuss one such issue and its resolution.

The problem: floating point precision

In my research, I encountered an odd issue: my log likelihood was always -inf. The problem was that this term always evaluated to -inf:

\log(\det(A))

for some diagonal matrix A. To understand what was happening, consider the scenario where the elements of A are very small. Since the determinant of a diagonal matrix is just the product of the elements along the diagonal,

\det(A) = \prod_i a_i

if the values a_i are small, this number might be rounded to 0 due to our computer’s floating point precision. Then when we take the log of 0, we get -inf. To see this, consider this Python snippet:

>>> import numpy as np
>>> A = np.ones(100) * 1e-5
>>> np.linalg.det(np.diag(A))
0.0

To convince ourselves that floating point precision is the culprit, let us check whether the above calculation results in a number that is too small for our machine to represent. On my machine, the smallest positive floating point number Python can represent is

>>> import sys
>>> sys.float_info.min * sys.float_info.epsilon
5e-324
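
We can check this threshold directly in the REPL (this sketch assumes standard IEEE 754 doubles, which is what CPython uses):

>>> 5e-324
5e-324
>>> 5e-325
0.0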

So 5e-324 is handled fine, while 5e-325 is rounded to 0.0. Now consider our previous snippet of code. The determinant is going to be 1e-5 ** 100, or 1e-500. This is clearly too small to be represented in Python, and so it will be rounded to 0. Since 324 / 5 = 64.8, we can predict that (1e-5)**64 will cause no issues, while (1e-5)**65 will be rounded to 0:

>>> (1e-5)**64
1e-320
>>> (1e-5)**65
0.0
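
In fact, the entries do not even have to be that small if there are enough of them. Here is a quick sketch, with 400 entries of 0.1 chosen purely for illustration:

>>> np.prod(np.full(400, 0.1))
0.0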

So we have found our issue: for a large enough A, the numbers along the diagonal do not have to be especially small before the determinant underflows. How can we fix this?

The solution: a little math

As mentioned above, the determinant of a diagonal matrix A is the product of the elements along its diagonal,

\det(A) = \prod_i a_i

So the log of this is

\log(\det(A)) = \log \big( \prod_i a_i \big) = \sum_i \log(a_i)

If we compute the right-hand side instead of the left-hand side, we avoid the precision issue: each \log(a_i) is a moderately sized (negative) number, and summing them never requires representing the vanishingly small product itself. Let’s try it:

>>> import numpy as np
>>> A = np.ones(100) * 1e-5
>>> np.linalg.det(np.diag(A))
0.0
>>> np.log(A).sum()
-1151.2925464970228

Okay! Now we get a finite, accurate value despite our machine’s limited floating point precision; indeed, 100 × log(1e-5) ≈ -1151.29. As a sanity check, if our numbers are not small, these two calculations should give us roughly the same number:

>>> A = np.ones(100) * 2
>>> np.log(np.linalg.det(np.diag(A)))
69.31471805599459
>>> np.log(A).sum()
69.31471805599453

And they do.
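
As an aside, the sum-of-logs trick relies on A being diagonal. For general matrices, NumPy provides np.linalg.slogdet, which returns the sign and the log of the absolute value of the determinant and avoids the same underflow. A rough sketch on our small-valued example, with the outputs summarized approximately in the comment:

>>> A = np.ones(100) * 1e-5
>>> np.linalg.slogdet(np.diag(A))  # sign is 1.0, log-determinant is about -1151.29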


Acknowledgements

I want to thank Bianca Dumitrascu for catching this bug and explaining the issue.