If that guess proves to be wrong, the result of one of the parallel instructions has to be discarded. A final point is that if only the arithmetic/logic and floating-point units are duplicated, there is still contention for the other resources within the processor.
All of the approaches to parallelism described above have one common factor. The parallel resources are managed within the processor, which means that they are transparent to both the operating system and application software.
In other words, if you'd swapped from a processor without pipelining to one that supports it, or if you'd upgraded from a pipelined processor with one arithmetic/logic unit to one with two, you'd have seen an immediate performance increase with the same software. Sadly, this does not happen with multi-core processors; as some industry experts have commented, there's no longer such a thing as a free lunch. For the first time, parallel resources will only work with specially written software that employs a technique called 'multi-threading'.
At its simplest, a thread can be an application. In this case, all you need to make use of multi-core processors is an operating system that supports multi-threading. Happily, all versions of Windows since XP have this support. Now you can run two programs at once without them competing for the resources of a single core.
This helps if you want to run a power-hungry application in the background – such as rendering – without the foreground task being affected; but it doesn't help if you want that power-hungry application to run faster. To achieve that, the application itself has to support multi-threading. This involves splitting up the operation of the code into logical tasks that can be performed independently and in parallel. We'll investigate this issue in quite a bit more detail later on.
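To make the idea of splitting work into independent tasks concrete, here's a minimal Python sketch (not from the article; the `render_tile` function and its workload are invented for illustration). Each "tile" is an independent unit of work handed to a pool of threads, which a multi-core operating system can then schedule onto separate cores:

```python
# A minimal sketch of application-level multi-threading: the work is
# split into independent chunks, each handed off to a worker thread.
from concurrent.futures import ThreadPoolExecutor

def render_tile(tile):
    # Stand-in for one independent unit of work, e.g. one tile of a
    # background render. Real speed-up depends on the workload.
    return sum(i * i for i in range(tile * 1000))

tiles = range(8)  # eight tasks with no dependencies between them

# The executor distributes the tasks across worker threads; on a
# multi-core processor the OS can run several of them at once.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(render_tile, tiles))

print(len(results))  # all eight tiles completed
```

The key design point is the one the article describes: the programmer, not the processor, has to identify pieces of work that can safely run in parallel before any extra cores can be used.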
The gains on offer
The idea that going from one to two or two to four cores will double the performance of a chip is, sadly, naive. So what performance gains can we reasonably expect in the real world? I asked James Reinders, Director of Marketing and Business Development for Intel's Software Development Products, what he expected to happen.
"Some programs scale well," he told me, "and may see gains close to the number of cores. Occasionally, this might mean a four-fold speed increase on four cores, but even a three-fold increase on four cores is good. Anything which has reasonably independent work to do can see such speed-ups, for instance database queries. But we will not see doubling cores make a system twice as fast in general."
So why does each additional core give less advantage than the previous one? "Eventually, the overhead of distributing work catches up to any program," Reinders explained. "For some it might be a few cores, others it might be tens of thousands. For anyone who has been up against a deadline, you know that having someone offer to help can sometimes be the answer to your prayers, and other times be simply of no use at all. Sometimes the overhead of breaking down short term work can be more trouble than it is worth. On the other hand, the more work there is to do, the more likely it is that you can use the help."
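Reinders' point about diminishing returns is often captured by Amdahl's law: if only a fraction of a program's work can be parallelised, the serial remainder puts a hard ceiling on the overall speed-up, however many cores you add. A small sketch (the 90% figure is an illustrative assumption, not a number from the interview):

```python
# Amdahl's law: if a fraction p of the work can run in parallel,
# the speed-up on n cores is 1 / ((1 - p) + p / n).
def speedup(p, n):
    return 1.0 / ((1.0 - p) + p / n)

# Even with 90% of the work parallelisable, each extra core
# contributes less than the one before it:
for n in (1, 2, 4, 8):
    print(n, "cores:", round(speedup(0.9, n), 2), "x")
```

With 90% parallel work, doubling from one core to two gives roughly a 1.8x speed-up, but doubling again from four to eight adds comparatively little, which matches Reinders' observation that doubling the cores does not, in general, double the speed.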
Reinders also had a cautionary tale to tell about how all this extra power might end up being used. "I used to have a 40MHz processor in my laptop, now I have one with a processor running at over 2GHz. Is that fifty times as fast? Not really, and the speed boosts it does have are used for many things other than running a single application faster, such as smooth scrolling, higher resolution screens with more colours, wireless connectivity and virus checkers. The extra power is used more for things that I didn't have before than for making my old applications faster. We will see history repeat itself with multi-core processors."