Excel bug calls for wrong number (and GHP CDs)

Welcome to Wednesday's IT Blogwatch: in which Excel 2007 gets its sums badly wrong. Not to mention the discography of probably the best mashup DJ around...

Egan Orion reports

Excel 2007 loses its grip with arithmetic that involves the number 65,535 ... perhaps the simplest of which is the calculation ( 850 X 77.1 ), which should produce 65,535 but instead returns 100,000.

There's all sorts of speculation as to how this bug occurred, postulating floating-point and rounding errors and the like, but it seems much more likely that some Excel developer simply punted at some point and the Vole's stringent quality control (cough) never caught it.

Some might recall that mathematical errors have been discovered in Excel periodically in various releases going back at least as far as Excel 5. Microsoft people appear to have been involved in the discussion and confirmed the bug. [more]

Molham Serry is the Egyptian horse's mouth:

Simply when you try to multiply 850 by 77.1 excel display the result to be 100000. [more]

Matthew Sparkes adds: [Groan -Ed.]

You can replicate the problem by placing the formula =850*77.1 into an empty cell. Some other combinations of sums that equal 65,535 are also affected by the bug, but others are returning the correct number.

The number 65,535 should be familiar to programmers, as it is the highest number that can be stored in a 16-bit unsigned binary number. A problem with transitioning between 16-bit and 32-bit numbers may well be the cause. [more]

Neil Rubenking expands:

We all learned how to multiply with pencil and paper, even great big numbers and decimals. But when it comes to something important like a blueprint or a scientific formula we reach for a calculator - or a spreadsheet. That's much more reliable, right? Well, not if the spreadsheet is Excel 2007 ... [which] thinks that 850*77.1 is 100,000. What's the correct answer? Anybody? Anybody? Bueller? Anybody? Right, it should be 65,535.
There are tons of other problem numbers, as I discovered for myself. I set up a spreadsheet to divide 65,535 by every number from 1 to 65,535 itself, then multiply the number by that result. So, for example, it divided 65,535 by 26 to get 2,520.577. Then it multiplied 26 by 2,520.577 to get... 100,000?! Over ten thousand of these simple calculations gave the wrong answer.

We won't know just why the problem comes up until Microsoft speaks out, but there is one thing about 65535 - it's the very largest 16-bit number. In binary it's a string of 16 ones. In hexadecimal (the programmer's friend) it's FFFF. But converting the "problem" results to hexadecimal in Excel yields FFFE. That's surely a clue. Meanwhile, if you have any spreadsheets where some results hit the range around 65535, it might be a good idea to double-check with your trusty calculator... or a pencil. [more]

Microsoft's David Gainer grovels:

Yesterday we were alerted to an issue in Excel 2007 (and Excel Services 2007) involving calculation of numbers around 65,535 ... our testing showed that this really didn't have anything do to with multiplication - it manifested itself with many but not all calculations in Excel that should have resulted in 65,535 (=65535*1 and =16383.75*4 worked for instance). Further testing showed a similar phenomenon with 65,536 as well.
Excel incorrectly displays the result of a calculation in 12 very specific cases ... Of the 9.214*10^18 different floating point numbers ... that Excel 2007 can store, there are 6 floating point numbers (using binary representation) between 65534.99999999995 and 65535, and 6 between 65535.99999999995 and 65536 that cause this problem.
We take calculation in Excel very seriously and we do everything we can in order to ensure that calculation is correct for all cases. We’ve come up with a fix for this issue and are in the final phases of a broad test pass in order to ensure that the fix works and doesn’t introduce any additional issues - especially any other calculation issues. This fix then needs to make its way through our official build lab and onto a download site - which we expect to happen very soon. [more]

Mark Sofroniou has more background:

How hard can arithmetic be? Today we were reminded again ... Why is arithmetic so difficult to get right? The standard "schoolbook" algorithms are pretty easy. But they're inefficient and often unnecessarily inaccurate. So people like me have done a huge amount of work to find more efficient and accurate algorithms. And the problem is that these algorithms are inevitably more complicated, and one has to be very careful to avoid insidious bugs.
Today's Excel bug is actually not about underlying multiplication--but instead about displaying numbers in base 10. Current computers essentially all represent numbers in binary. But when one outputs them, one has to convert to base 10. Doing this conversion might seem to be the "easy part." But actually, it's really subtle. The main issue is being able to round the underlying binary form to the "nearest representation" in decimal. If one gets it wrong, very bizarre things can happen.
The tricky part is that for some numbers, the [conversion] needs to be done to a higher precision than the numbers themselves in order to get the correct answers. Most computer systems just work with numbers at a fixed precision that's immediately available from the underlying hardware. And if one can't do anything to increase the precision, it's simply not possible to always get the right answer for binary-to-decimal and decimal-to-binary conversion.
This has happened many times. The Pentium FDIV bug in 1994 occurred only for particular numbers that happened to sample an incorrect entry in a lookup table. The Ariane 5 rocket explosion in 1996 occurred because a conversion between number types failed for particular numbers. And so on. [more]

And protoslo pontificates:

It sounds like they are doing small-number math in one representation (perhaps they use short fixed-width decimal representations) and then switching to another method (arbitrary length decimal numbers?) at the binary-inspired boundary 2^16...but somehow they got it mixed up with a different decimal boundary in the edge case.

Clearly the error is weirdly subtle, if 5.1*12850 gives the bugged behavior, but 8.5*7710 works just fine. In fact, I verified that all permutations of a bugged combination =A*B of the form =A/2*B*2 are bugged. Further...all of the buggy decimal values have no perfect floating point binary representation. 77.1 has an infinite binary expansion using IEE 754, while 8.5 has an exact representation. It seems likely that they are only using their BCD format (or whatever) when binary floating (or fixed) point just won't cut it, but then their internal->decimal conversion code chokes on 2^16 for some reason, while the binary (whether it is floating or fixed point) conversion works just fine (possibly because it doesn't have a boundary at 2^16--maybe it has its own threshold bugs ;p). [more]

Inevitably, Scott Morris compares with OpenOffice.org:

How many ways do I love open source? Let me count the ways. But not in Excel.

And before you get ahead of yourself, OpenOffice does this calculation properly ... Better switch to open source if you are doing any important financial spreadsheets. [more]

And Christopher Neher yells, "Get a Mac":

Here's another little equation: $400 for Office 2008 for Mac - $79 for iWork '08 = $321 in your pocket and the ability to multiply correctly. [more]

Sun's Brian Nitz spoofs it up:

The following is based on part of the OOXML "standard" Microsoft submitted to the ISO committee for fast tracking: (look up the paragraph referenced to see Microsoft's actual autoSpaceLikeWord95 "standard") multiplyLikeExcel2007 (Emulate Excel 2007 multiplication product output.) The * operation, previously known as "multiply" shall be replaced with a new operation which will be known as multiply. A description of this function shall emulate the behavior of a previously existing word processing application (Microsoft Excel 2007) when determining the function output of values near 65535, the resulting output (also known as "product") shall behave identically to Excel 2007. [Guidance: To faithfully replicate this behavior, applications must imitate the behavior of that application, which involves many possible behaviors (erroneous, random or otherwise) and cannot be faithfully placed into narrative for this Office Open XML Standard. If applications wish to match this behavior, they must utilize and duplicate the output of those applications (perhaps using a table or random number generator?) It is recommended that applications not intentionally replicate this behavior as it was deprecated due to issues with its output, and is maintained only for compatibility with existing documents from that application. end guidance]. [more]

Buffer overflow:

Around the Net Around Computerworld Previously in IT Blogwatch

And finally... Go Home Productions' official complete discography (CDs 1-12) now torrenting

Richi Jennings is an independent analyst/adviser/consultant, specializing in blogging, email, and spam. A 20 year, cross-functional IT veteran, he is also an analyst at Ferris Research. You too can pretend to be Richi's friend on Facebook, or just use boring old email: blogwatch@richi.co.uk.
Shop Tech Products at Amazon