Free, Fast, Accurate — Pick Any Three!

This technical note describes how Gnumeric does calculations internally and why certain simple calculations like 0.3-(3*0.1) give unexpected non-zero results.

Numbers in Gnumeric are so-called floating-point numbers described in the IEEE-754 standard. Specifically we are using the double precision format described therein. (It is possible to set up Gnumeric to use another format, but it is very rarely done. Unless you have worked hard on doing that, you may assume that you are indeed using "double precision".)

This choice of number system has consequences just like making another choice of number system would have come with a different set of consequences. Here are some of the consequences of using "double precision":

- Gnumeric can only work with numbers whose magnitude are
less than about 10
^{300}. This does not normally cause any problems. - Numbers that are too close to zero will be rounded to
zero. The limit is about 10
^{-300}. Such a rounding to zero is called underflow. **Not all numbers can be precisely represented.**When a number that cannot be represented is used, it will be internally rounded to the nearest number that can be represented. The distance between representable numbers is much smaller for numbers close to zero than it is for large numbers.- All integers up to about 10
^{16}can be precisely represented. That is the good news. **The numbers 0.1, 0.01, ... cannot be precisely represented.**This is for the same reason that a desk calculator cannot precisely represent 1/3 but uses something like 0.33333333 instead.

"Double precision" has some special values like "Infinity" and "Not-a-number". When encountered, these are turned into error values in Gnumeric. We also do our best to hide the value "-0" from users.

Note: "double precision" is what pretty much every program uses because it is what the hardware works with. Microsoft Excel, LibreOffice, OpenOffice, and R all are based on "double precision".

When an operation like addition is performed in Gnumeric,
that operation will get a number of argument values and
produce one or more output values. It is important to realize
that the argument values that the operation gets are always
representable numbers, i.e., *any rounding needed to make
numbers representable has already taken place.* The job of
the operation is to calculate its result using the arguments
and round the result to the nearest representable number.

For example, for the division 1/10 where the arguments ("1" and "10") are both precisely representable, we calculate the mathematical value one-tenth and round that into the nearest representable value which happens to be exactly 0.1000000000000000000013552527156068805425093160010874271392822265625. That value clearly is a tiny bit bigger than one tenth in the same way the desk calculator's 0.33333333 is a tiny bit smaller than one third.

Rounding errors tend to add up. Computing 0.3-(3*0.1), for
example, one does not get zero as one might have hoped for.
Each individual sub-result is subject to rounding to
representable values, so we are really computing
r(r(0.3)-r(r(3)*r(0.1))) where "r" is the rounding operation.
The end result is about -6*10^{-17}. That is an
unpleasant result of working with finite-precision math.

For more complex operations, like ERF(0.25), the goal is still to produce the exact result rounded to nearest representable value. Again, the term "exact result" is based on the argument value as already rounded to a representable number. We do not always achieve this goal, i.e., we sometimes calculate a value that is slightly off from the nearest representable value. If we have at least about 15 correct significant digits we are not too concerned by this, but anything less is worth reporting as a bug. Note, however, that Gnumeric in general produces results that are far more accurate than other spreadsheets.

Microsoft Excel uses the same number system as Gnumeric, but Microsoft Excel does some operations differently.

If the top-level operand of an expression is a subtraction
and the result of the subtraction is very close to zero
compared to the two arguments, then the result will be rounded
to zero. This operation, called *snap-to-zero*, is
also performed for addition and the subtraction implied by
comparisons. Is is not performed for operations deeper in an
expression, not even for subtractions surrounded by nothing
but parentheses.

Snap-to-zero has the effect of hiding the small rounding errors that arise by the representation error on 0.01. That is, accounting calculations on amounts of dollars and cents will appear to be accurate most of the time.

But snap-to-zero comes at a price:

- Snap-to-zero will take place even if the difference between two numbers is not the cause of representation errors. Excel will claim that two numbers are equal even when they are not.
- It becomes possible to have three numbers A1, A2, and A3, where A1=A2 and A2=A3, but A1<>A3.
- It is possible for A1-5005=0 and at the same time A1-5004-1<>0.
- It is possible to have A1=A2, but (say) TAN(A1)<>TAN(A2).
- Snap-to-zero can greatly increase rounding errors in scientific calculations.

Gnumeric does not do snap-to-zero. We recommend an explicit rounding operation where it matters.

LibreOffice and OpenOffice both use the same number system as Gnumeric. These programs both use snap-to-zero (see above), but use it much more aggressively than Microsoft Excel. In particular, every subtraction is subject to snap-to-zero and there are no easy ways of performing a subtraction that does not snap-to-zero. Snap-to-zero is also done internally in many sheet functions, for example in VAR. There does not seem to be any accurate and complete documentation for snap-to-zero.

The lack of an honest subtraction operation makes it hard to even test the accuracy of sheet-level functions. LibreOffice and OpenOffice will claim that a result matches a reference value even when they do not. (Update: LibreOffice nowadays has a function that allows raw subtraction.)

We sometimes get asked why Gnumeric is not using some other number system that would avoid a particular problem a user is seeing.

**Arbitrary precision:**computers are powerful today and should be able to just using enough precision that all these rounding issues would go away, right? Well, no. If one tried that, an immediate consequence would be that sometimes basic operators like "+" and comparisons like "<" would take an unbounded amount of time, i.e., the program would "hang" instead of producing an answer. This is a consequence of the undecidable halting problem and there is no escaping it.**A base-10 format:**if we used a base-10 ("decimal") format then 0.1 and 0.01 would be precisely representable, even if some other set of numbers would not be. We could, in fact, do that. We do not do so because we have no support from the compiler and lower-level libraries that we use. Until they support such a number system, we cannot. To do this right would be a fantastic amount of work and we do not expect it to happen.**As entered:**some people believe we should keep values precisely as entered, i.e., basically as strings. This is severely misguided. It would have the same problems as for arbitrary precision above, and on top of that it would have extra problems. For example, if a value of "999,123" is entered into a spreadsheet somewhere in Europe and that file is subsequently sent to The United States, then the initial value of a little less than one thousand might be interpreted as a value of a little less than one million. Clearly that is madness. (There are further problems that are outside the scope of this note.)