Nvidia's GeForce GTS 250 and the ATI Radeon HD 4770 from AMD share a common purpose.
For gaming junkies, you might even say it's a sacred calling. Both aim to deliver the maximum performance in return for the minimum pecunias. Other 3D cards might be faster, but none can match the bang-for-buck ratio achieved by these mass market pixel pumpers.
The funny thing is that the way they go about it couldn't be more different. Take the GTS 250. Contrary to what the branding insinuates, this is not a new mid-range derivative of Nvidia's mighty GT200 GPU, the graphics chip that forms the beating heart of both the GeForce GTX 285 and 295. It's yet another rehash of the trusty old G92 core that began life eons ago in the GeForce 8800 GT.
Since then, it has fought Nvidia's cause in a number of different guises and yet remarkably little has changed. Now as then, the latest version of G92 packs 128 stream processors, the mini-programmable execution cores responsible for calculating funky visual effects in the latest games.
Likewise it still has 64 texture filtering units, 16 pixel outputs and a 256-bit memory bus. In fact, the only significant tweak involves the manufacturing process used to produce G92 dies.
What began life as a 65nm chip has since been given a slight 55nm squeeze. Consequently, each one is physically smaller. And smaller dies make for cheaper chips. All of which means the GTS 250 adds up to a slightly dated enthusiast class GPU sold at a mainstream price. You can now bag a 512MB GTS 250 for well under £100. But what about AMD's Radeon HD 4770?
Well, it's new from the ground up and sports an architecture optimised to give the best possible performance where it really counts for mainstream customers. AMD has therefore decided to focus the chip's resources heavily on shader processing.
With no fewer than 640 stream processors and a core clockspeed of 750MHz, the 4770 has around 75 per cent of the raw computational power of AMD's fastest single GPU, the Radeon HD 4890. That's a card that typically sells for nearly £200 and is therefore two and a half times more expensive than the 4770.
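For the curious, that comparison can be roughed out on the back of an envelope. The sketch below is a minimal estimate, not a benchmark: it assumes each stream processor completes one multiply-add (two floating point operations) per clock, and the 4890's 800 stream processors and 850MHz clock are reference specifications rather than figures quoted in this piece. The `peak_gflops` helper is our own illustrative name.

```python
def peak_gflops(stream_processors, clock_mhz, flops_per_clock=2):
    """Theoretical peak single-precision throughput in GFLOPS.

    Assumes every stream processor retires `flops_per_clock` operations
    each cycle (2 = one multiply-add), which real games never sustain.
    """
    return stream_processors * clock_mhz * flops_per_clock / 1000.0


# Radeon HD 4770: stream processor count and clock as quoted in the article.
hd4770 = peak_gflops(640, 750)

# Radeon HD 4890: assumed reference specs (not taken from this article).
hd4890 = peak_gflops(800, 850)

ratio = hd4770 / hd4890

print(f"HD 4770 peak: {hd4770:.0f} GFLOPS")
print(f"HD 4890 peak: {hd4890:.0f} GFLOPS")
print(f"4770 as a share of 4890: {ratio:.0%}")
```

On those assumptions the 4770 lands at roughly 70 per cent of the 4890, in the same region as the figure quoted above. Peak numbers like these say nothing about how well a game actually keeps the shaders fed.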
A question of bandwidth
Indeed, the 4770 also matches the 4890 for pixel output with 16 ROPs and comes close in the texture processing department with 32 units, just eight fewer than its bigger brother.
In fact, in terms of floating point processing power – an interesting if somewhat academic measure of a graphics chip's computational grunt – the 4770 even manages to get within about 10 per cent of Nvidia's mighty GeForce GTX 285, a graphics card that sells for around £300. So, how has AMD pulled this off at such a low price point? By reducing the size of the 4770's die, that's how.
For starters, thanks to the use of 40nm chip production technology the 4770 has the tiniest transistors yet seen in any GPU. But AMD has also made one very significant compromise in architectural terms. The 4770 has a 128-bit memory bus.
That's half the width of the GTS 250's memory bus and one quarter the width of a GeForce GTX 285's. The upside is lower manufacturing costs. The narrower bus requires fewer connections, making both the chip packaging and graphics board design simpler and cheaper.
The penalty, of course, is less bandwidth into and out of the GPU. That sounds bad, but AMD knows that at lower resolutions bandwidth is less critical.
And given that the 4770 is a mainstream board, it's not likely to be paired with large, high resolution monitors - in single-card configurations, at least. Instead, the 4770 will typically be driving 20- or 22-inch monitors with 1,680 x 1,050 pixel grids.
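To put the bus-width trade-off in rough numbers, peak memory bandwidth is simply the bus width multiplied by the data rate of each pin. The sketch below uses the three bus widths mentioned above but a single hypothetical per-pin data rate of 2Gbit/s for every card, purely to isolate the effect of width. That rate and the `peak_bandwidth_gb_s` helper are illustrative assumptions, not real specifications.

```python
def peak_bandwidth_gb_s(bus_width_bits, data_rate_gbit_per_pin):
    """Theoretical peak memory bandwidth in GB/s.

    bus width (bits) x per-pin data rate (Gbit/s) / 8 bits per byte.
    """
    return bus_width_bits * data_rate_gbit_per_pin / 8.0


# Hypothetical per-pin rate, identical for all cards (illustration only).
HYPOTHETICAL_DATA_RATE = 2.0  # Gbit/s per pin

# Bus widths as described in the article.
bus_widths = {
    "Radeon HD 4770": 128,
    "GeForce GTS 250": 256,
    "GeForce GTX 285": 512,
}

bandwidth = {
    card: peak_bandwidth_gb_s(bits, HYPOTHETICAL_DATA_RATE)
    for card, bits in bus_widths.items()
}

for card, gb_s in bandwidth.items():
    print(f"{card}: {gb_s:.0f} GB/s at the assumed data rate")
```

At a like-for-like data rate, each halving of the bus halves the bandwidth. In practice the cards don't share a memory speed, so the real gaps will differ from this idealised picture.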
You may be wondering what all this has to do with multi-GPU performance. Actually, it's highly relevant for reasons that ultimately involve memory bandwidth. For starters, any multi-GPU setup comes with increased expectations.
What with the multiple cards and the mobo needed to support them, you're looking at a fairly expensive rig. That in turn means you're more likely to be running at higher resolutions.
The mechanics of multi-GPU technology also count. To cut a long story short, the most common multi-GPU rendering method is alternate-frame rendering (AFR), which, as the name suggests, involves the GPUs taking turns drawing full frames.
That requires both cards to have a complete copy of the graphics data, which further compounds the problem of data bandwidth. This is precisely where the differences between the Radeon HD 4770 and GeForce GTS 250 are most telling.
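The turn-taking described above can be sketched in a few lines. This is a toy model of the idea only, not how any driver actually schedules work: the function names are hypothetical, and it simply shows that frames alternate between GPUs while the usable graphics memory stays at one card's worth because every card holds the same data.

```python
from typing import List


def afr_schedule(num_frames: int, num_gpus: int) -> List[int]:
    """Return which GPU (by index) draws each frame under AFR.

    Frames are handed out round-robin: GPU 0, GPU 1, GPU 0, ...
    """
    return [frame % num_gpus for frame in range(num_frames)]


def usable_memory_mb(per_card_mb: int, num_gpus: int) -> int:
    """Graphics memory effectively available to a game under AFR.

    Each GPU keeps its own complete copy of the data, so adding cards
    does not add capacity; `num_gpus` is deliberately ignored.
    """
    return per_card_mb


schedule = afr_schedule(num_frames=6, num_gpus=2)
pool = usable_memory_mb(per_card_mb=512, num_gpus=2)

print("Frame -> GPU:", schedule)
print(f"Usable memory with two 512MB cards: {pool}MB")
```

The upshot is that two cards can double the rate at which frames are drawn, but each one still has to cope with the full data set through its own memory bus and its own frame buffer.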
As our benchmark results show, a pair of Radeon HD 4770s in dual-GPU CrossFireX configuration have a nasty habit of losing the plot at higher resolutions. Far Cry 2 is the best example, with performance plummeting horribly above 1,680 x 1,050. Yup, it's the 4770's poxy 128-bit memory bus doing the damage.
Making matters worse, early examples of the 4770, including the HIS boards tested here, are limited to 512MB. At really high resolutions and detail settings, that can force the cards to use main system memory to store graphics data, which further reduces performance.
By contrast, the GTS 250's enthusiast-class origins and 256-bit memory bus make a much better platform for multi-GPU antics. As the resolutions ramp up, it maintains its composure and performs in a much more linear fashion.
The fact that the Gigabyte and Zotac GTS 250s used for this test have 1GB frame buffers also helps. That's particularly true at the epic 2,560 x 1,600 resolution where data swapping over the PCI-e bus can become a major handicap for 512MB cards.
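To get a feel for how much more work the top resolution involves, the sketch below compares the pixel counts of the two resolutions mentioned above and estimates the size of a single colour buffer. It assumes 32-bit colour (4 bytes per pixel) and optional multisampling. Both the helper names and those parameters are illustrative assumptions, and real memory use is dominated by textures and other buffers that this ignores.

```python
def pixel_count(width, height):
    """Number of pixels in one frame."""
    return width * height


def colour_buffer_mib(width, height, bytes_per_pixel=4, samples=1):
    """Size of one colour buffer in MiB.

    Assumes 32-bit colour by default; `samples` models multisample
    anti-aliasing storing several samples per pixel.
    """
    return width * height * bytes_per_pixel * samples / (1024 * 1024)


mainstream = pixel_count(1680, 1050)
epic = pixel_count(2560, 1600)
scale = epic / mainstream

print(f"1,680 x 1,050: {mainstream:,} pixels")
print(f"2,560 x 1,600: {epic:,} pixels ({scale:.2f}x as many)")
print(f"Colour buffer at 2,560 x 1,600: {colour_buffer_mib(2560, 1600):.1f} MiB")
print(f"Same with 4x multisampling: {colour_buffer_mib(2560, 1600, samples=4):.1f} MiB")
```

Every frame at 2,560 x 1,600 carries well over twice the pixels of one at 1,680 x 1,050, so both bandwidth demands and buffer sizes grow accordingly before textures are even considered.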