This week we celebrate the 20th anniversary of the FDIV bug, an error in the then-new Intel Pentium processor. It was discovered by Thomas Nicely, a professor of mathematics, on 19 October 1994 and reported to Intel five days later. Then on 30 October 1994, he wrote a fateful email to "a number of individuals and organizations" that set the wheels in motion.
The processor floating-point divide problem was caused by a subtle but specific circuit-design error; the flaw was easily corrected with changes to masks in the next regular production revision of the chip, in 1994.
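For readers curious about what the flaw looked like in practice, the check most often quoted at the time was a single division: 4195835 divided by 3145727. The sketch below is ours, not Intel's; the operands are the widely cited test pair rather than anything taken from the account that follows, and the function name `fdiv_residual` is purely illustrative.

```python
# Widely cited FDIV test operands (an assumption here; not taken from this article).
X = 4195835
Y = 3145727


def fdiv_residual(x: float = X, y: float = Y) -> float:
    """Return x - (x / y) * y, which should be zero or vanishingly small."""
    return x - (x / y) * y


if __name__ == "__main__":
    # The division is carried out by the processor's floating-point unit.
    print("quotient:", X / Y)
    print("residual:", fdiv_residual())
```

On a correctly working processor the quotient is roughly 1.33382 and the residual is essentially zero. Affected Pentiums were reported to return roughly 1.33374 instead, which leaves a residual of 256.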
Although its actual impact would have been negligible, the flaw snowballed into something much bigger thanks to national press coverage, and it can legitimately be called the first computer hardware problem to have made headlines worldwide, well before the Millennium Bug.
Why was there a bug in the first place? Because microprocessors are such complicated pieces of technology that, even back then, they involved millions of transistors. The whole industry was still learning about the tools, processes (steppings) and mechanisms (like microcode and firmware updates) that would help reduce defects and errors, but even today bugs are very much part and parcel of any processor.
We asked Intel for some background on how the company lived through what could be called its first real PR crisis, and Tom Waldrop, an Intel veteran who witnessed the drama unfold first-hand, was kind enough to contribute [note that his account is entirely based on Andy Grove's "Only the Paranoid Survive" and Albert Yu's "Creating the Digital Future: The Secrets of Consistent Innovation at Intel"].
"Intel learnt a lot from that issue; previously, the firm had been the classic engineering-driven company. For the twenty-six years, we had decided what was good and what wasn't when it came to our own products. We set our own quality levels and specifications, and shipped when we decided a product met our own criteria. Furthermore, since we generally didn't sell microprocessors to computer users but to computer makers, whatever problems we had in the past, we used to handle with the computer manufactures, engineer to engineer, based on data analysis.
But in 1991, Intel had introduced the "Intel Inside" program, a major merchandising campaign intended to give Intel distinction and an identity, and to help build computer-user communities' awareness of Intel and our products. Hundreds of manufacturers, domestic and international, participated in this campaign, and Intel spent a lot of money promoting the brand. By 1994, research showed that our "Intel Inside" logo had become one of the most recognised logos in consumer merchandising.
So when problems developed with our flagship Pentium chip, our merchandising pointed the users directly back to us. In addition, we had become the world's largest semiconductor manufacturer, and we were growing faster than most large companies. We had become gigantic in the eyes of computer buyers. The old rules of our business no longer applied. The trouble was, not only didn't we realize that the rules had changed – what was worse, we didn't know what rules we now had to abide by.
Intel CEO Andy Grove said at the time: "Bad companies are destroyed by crises. Good companies survive them. Great companies are improved by them."
Intel survived the crisis and was made stronger by it. We dramatically improved our validation methodology to quickly capture and fix errata, and investigated innovative ways to design products that are error-free right from the beginning. We set up permanent phone support teams and web-based discussion groups to listen to and respond to consumer needs. We found we could shorten our response time from days to minutes on urgent matters.
Lastly, we embarked on a policy to publish all errata we found so that our end-users would have open and full disclosure from us. We explain all errata to OEMs and ISVs and work with them to devise workarounds so that there is minimal impact on end-users."
Intel's 15-core Xeon Ivy Bridge-EX CPU has more than 4.3 billion transistors, about 1,400 times as many as the original Pentium processor packed.