The basic idea of multi-GPU graphics cards makes a lot of sense. Graphics rendering is an inherently parallel workload. So, why not strap a few GPUs together and crank out even more pixels?
When that approach works, it works really well. Catch SLI and Crossfire in full multi-chip flow and both produce huge performance. The problem is, they don't always work.
Currently, it all boils down to drivers. Both SLI and Crossfire are dependent on driver profiles to detect games and apply the correct multi-GPU scaling method. If there's no driver profile for a given game or if the game is not detected correctly, you're in trouble. At best, you'll get single-GPU performance. At worst, the game won't run at all.
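As a rough illustration of that dependency, the profile lookup can be pictured as a simple table with a fallback. This is a toy sketch only: the profile table, the executable names and the mode labels are all hypothetical and are not taken from any real AMD or NVIDIA driver.

```python
# Toy model of driver-profile lookup; every name here is hypothetical.
from typing import Dict

# Hypothetical profile table: game executable -> multi-GPU rendering mode.
PROFILES: Dict[str, str] = {
    "game_a.exe": "alternate_frame",
    "game_b.exe": "split_frame",
}

SINGLE_GPU = "single_gpu"


def pick_render_mode(executable: str, profiles: Dict[str, str] = PROFILES) -> str:
    """Return the multi-GPU mode for a detected game, else fall back to one GPU."""
    # Detection is modelled as a case-insensitive match on the executable name.
    return profiles.get(executable.lower(), SINGLE_GPU)


print(pick_render_mode("Game_A.exe"))   # profiled game: a multi-GPU mode applies
print(pick_render_mode("unknown.exe"))  # no profile: single-GPU performance at best
```

The fallback branch is the "at best" case described above. The worst case, where the game won't run at all, is not modelled here.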
So, the task for AMD's latest dual-chip beast, the ATI Radeon HD 4870 X2, is clear enough. It must sidestep all suggestion of multi-GPU iffiness and deliver stability and ease of use comparable to that of conventional single-chip graphics cards.
The big idea is smaller chips
Actually, it's a particularly important task for AMD since the company has effectively given up duking it out with NVIDIA for the fastest single graphics chip prize. Instead, AMD's ruse is to slap NVIDIA's monstrous GeForce GTX 280 around with a pair of slightly smaller 4870 chips.
But will it work? At first glance, the new 4870 X2 certainly looks like a step up for twin-chip technology. Compared to AMD's previous dual-chipper, the Radeon HD 3870 X2, the inter-GPU PCI Express bridge chip has been upgraded from 1.0 to 2.0 specification.
AMD has also added a 5GB/s bi-directional sideport to each GPU. The end result is a boost in inter-GPU bandwidth from 6.8GB/s to 21.8GB/s.
But perhaps AMD's niftiest move is to give this massive card fully 2GB of graphics memory. It's often overlooked that all multi-GPU platforms to date use discrete memory for each graphics chip. Hence, the graphics data must be copied to memory for each GPU. In other words, a 1GB dual-chip card is actually 2x 512MB.
In the most advanced games at really high resolutions, 512MB may not be enough to store all the game data. When that happens, a graphics card is forced to use the PCI Express bus to fetch data. And that means hideously slow frame rates. But with 1GB per GPU, the 4870 X2 should suffer from no such shortcoming.
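The memory arithmetic is worth spelling out. Here is a minimal sketch, assuming the data is fully mirrored across both GPUs as described above. The 700MB working set is a made-up figure used purely for illustration, not a measurement from any game.

```python
# Sketch of mirrored multi-GPU memory; the 700MB working set is made up.

def usable_memory_mb(total_card_mb: int, gpu_count: int) -> int:
    """Each GPU keeps its own copy of the data, so usable memory is per GPU."""
    return total_card_mb // gpu_count


def fits_in_local_memory(working_set_mb: int, total_card_mb: int, gpu_count: int) -> bool:
    """True if the working set fits without fetching over the PCI Express bus."""
    return working_set_mb <= usable_memory_mb(total_card_mb, gpu_count)


print(usable_memory_mb(1024, 2))           # a "1GB" dual-chip card: 512
print(usable_memory_mb(2048, 2))           # the 2GB 4870 X2: 1024
print(fits_in_local_memory(700, 1024, 2))  # False
print(fits_in_local_memory(700, 2048, 2))  # True
```

On those assumptions, a hypothetical 700MB working set overflows a 2x 512MB card but sits comfortably inside each GPU's 1GB on the X2.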
The 4870 X2 is also very user friendly by multi-GPU standards. Notably, it's capable of full multi-monitor support. NVIDIA's SLI platform, whether in the form of two cards or a dual-chip board like the old GeForce 7950 GX2, does not support multiple displays.
Two chips on one card
As for the rest of the 4870 X2's nitty-gritty, it largely mirrors the single-GPU Radeon HD 4870 from which it is derived. The GPUs themselves are essentially the same 55nm items running at an identical 750MHz.
Of course, there are two of 'em, and that makes for a faintly silly grand total of 1,600 stream processors and a theoretical maximum compute performance of 2.4TFLOPS. For the record, that's the same as the fastest supercomputer in the world circa 1999.
Yes, the 4870 X2 is a match for a room-filling machine from the last century covering goodness knows how many thousands of square metres. Insane.
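For the curious, that 2.4TFLOPS figure can be reproduced from the numbers above. This is a back-of-the-envelope sketch that assumes each stream processor retires one multiply-add, counted as two floating-point operations, per clock. That per-clock figure is our assumption and is not stated elsewhere in this review.

```python
# Back-of-the-envelope peak compute; FLOPS_PER_SP_PER_CLOCK is an assumption.

FLOPS_PER_SP_PER_CLOCK = 2  # assumed: one multiply-add per stream processor per clock


def peak_tflops(stream_processors: int, clock_mhz: float) -> float:
    """Theoretical peak in TFLOPS: processors x ops per clock x clock rate."""
    flops = stream_processors * FLOPS_PER_SP_PER_CLOCK * clock_mhz * 1e6
    return flops / 1e12


print(f"{peak_tflops(1600, 750):.1f} TFLOPS")  # both GPUs combined
print(f"{peak_tflops(800, 750):.1f} TFLOPS")   # one GPU's share
```

As with any theoretical peak, it says nothing about what a real game or application will sustain.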
Anyhow, the X2 also matches the single-chip 4870 for memory speed, with a freakishly fast 3.6Gbps data rate from its GDDR5 graphics memory. Remember, though, that it has 1GB per GPU to the standard 4870's 512MB. Like every other 4800 series card, the X2 is the full DirectX 10.1 Monty. For what it's worth, NVIDIA's competing GPUs remain 10.0 bound.
And so to the really important question. How well does the 4870 X2 perform? When it works, it's unconscionably quick, and clearly the fastest single card on the planet.
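To put that per-pin data rate in context, here is a rough sketch of what it means for memory bandwidth per GPU. It assumes a 256-bit memory bus for each GPU, a figure that does not appear elsewhere in this review, so treat the result as indicative only.

```python
# Rough per-GPU memory bandwidth; the 256-bit bus width is an assumption.

ASSUMED_BUS_WIDTH_BITS = 256  # assumed memory bus width per GPU


def memory_bandwidth_gb_per_s(data_rate_gbps: float,
                              bus_width_bits: int = ASSUMED_BUS_WIDTH_BITS) -> float:
    """Per-pin data rate x bus width, divided by 8 to convert bits to bytes."""
    return data_rate_gbps * bus_width_bits / 8


print(f"{memory_bandwidth_gb_per_s(3.6):.1f} GB/s per GPU")
```

Because each GPU has its own memory, the figure applies to each chip separately.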
Much quicker than the GeForce GTX 280, then? Yes, but that is no surprise given that a single Radeon HD 4870 GPU isn't all that far behind.
Most impressive is how the X2 occasionally puts real distance between itself and a pair of standard Radeon HD 4870s running in Crossfire mode. For instance, the 4870s really fall off a cliff at the uber-high resolution of 2,560 x 1,600 (thanks to data swapping over the PCI Express bus).
No such problems for the X2 and its 1GB-per-GPU memory buffers. It just keeps on pumping out highly playable frame rates.
It's not all good news, however. On our Intel Skulltrail test platform, the X2 is highly unstable when running Crytek's state-of-the-art shooter, Crysis. Indeed, the 4870s in Crossfire exhibit identical behaviour. Neither will last more than five seconds after a level load before locking up.
No doubt we are unlucky that our particular test configuration doesn't jibe well with Crossfire. Likewise, a fix will probably be forthcoming. But it's the sort of problem that you encounter time and again with multi-GPU solutions. And it's incredibly infuriating.
One day, a company will produce a multi-chip solution that behaves like a single rendering device and we'll all forget the flakiness of current efforts. Despite the 4870 X2's often impressive performance, that day has not yet arrived.