With the name Tahiti being dangled before us, we were hoping for a more luxurious surrounding for an insight into AMD's new Southern Islands graphics processor than the basement of a Soho hotel.
At least there were no pesky cocktails by the pool or white sandy beaches to distract us from the real business at hand: Graphics Core Next.
You may have heard about GCN in the autumn of last year, but now that advanced technology has finally touched down, and we've had a chance to sit down with the AMD engineers and grill them about what they've been up to.
The Tahiti XT and Tahiti Pro are the first 28nm PCIe 3.0 graphics cards around, and AMD is rightly excited about being the first to hit the market with two brand new bits of graphics technology.
But the benefits of a new production process and a new connection interface only tell part of the story of AMD's latest opus, because there's brand new silicon here too. Until now, the two main players in the graphics world had taken diverging paths when putting together their GPUs.
Nvidia, from the word go, has been working towards a 'scalar' architecture - a brute force approach that involves having a large number of simple processors in an array, giving them each one thing to work on at a time until all the instructions are completed. It's not the most efficient way of going about a task at hand, but it is the most flexible approach.
The evidence of this is in how much GPGPU computing the Nvidia CUDA cores are capable of. Nvidia also threw a lot of these cores at its high-end processors, which meant that in both computing and graphical terms they were seriously powerful cards.
AMD, however, was more concerned with raw graphical processing than general computing and as such opted for a four-way 'vector' processor instead. That meant sorting out single instructions into batches before firing them down the GPU pipelines. It was a much more elegant solution, and was incredibly efficient for fixed graphical processing.
The resulting HD 4xxx through to HD 6xxx series cards were great pixel pushers with considerably lower power requirements than their peers.
Times are changing though, and general purpose computing is becoming more and more necessary for a company putting a lot of its eggs in the APU basket, with its conjoined CPU/GPU twins. As such, the very long instruction word (VLIW) 'vector' processor of the old AMD days is now coming into line with Nvidia's 'scalar' architecture.
"From a purely technical point of view, the architecture that we've moved to is similar," says Mike Mantor, AMD's senior fellow architect. "On the GPU side though we've been investigating this scalar architecture for some time. We had good ideas how we'd go about it, what our gains would be and the cost of actually implementing it."
So, what are the gains? "It has its advantages when you really do have data dependencies," says Mantor. "That's not the case across all games." This means that AMD still sees the old VLIW architecture as effective in gaming, which explains why within the scalar architecture there are still the remnants of four-way vector units.
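To see why data dependencies matter, here is a toy Python sketch (ours, not AMD's) that packs instructions into four-wide bundles roughly the way a VLIW compiler might. The function names, the greedy in-order packing and the two sample workloads are all assumptions made for illustration; real shader compilers and hardware schedulers are far more involved.

```python
# Toy model only: every name here is hypothetical, not an AMD tool or API.

def vliw_bundles(deps, width=4):
    """Greedily pack instructions, in program order, into bundles of `width`.

    deps[i] is the set of earlier instruction indices that instruction i needs.
    An instruction cannot share a bundle with anything it depends on.
    Returns a list of bundles, each a list of instruction indices.
    """
    bundles = []
    current = []
    for i, needs in enumerate(deps):
        # Start a new bundle if this one is full or holds a value we need.
        if len(current) == width or any(j in current for j in needs):
            bundles.append(current)
            current = []
        current.append(i)
    if current:
        bundles.append(current)
    return bundles


def utilisation(bundles, width=4):
    """Fraction of the available bundle slots that hold an instruction."""
    if not bundles:
        return 0.0
    used = sum(len(b) for b in bundles)
    return used / (len(bundles) * width)


# Eight instructions with no dependencies at all.
independent = [set() for _ in range(8)]
# Eight instructions where each one needs the result of the one before it.
chain = [set()] + [{i - 1} for i in range(1, 8)]

independent_use = utilisation(vliw_bundles(independent))
chain_use = utilisation(vliw_bundles(chain))
print(independent_use, chain_use)
```

In this toy model the independent work fills every slot, while the dependent chain leaves three of every four slots idle. That idle capacity is the kind of waste a scalar design is meant to avoid.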
The Graphics Core Next (GCN) architecture that's going in the Tahiti GPU is built of Compute Units (CUs), each of which comprises four vector units with 16 unified shaders apiece. The CUs can be thought of as similar to Nvidia's Streaming Multiprocessors and are essentially self-contained processors capable of acting independently of the whole.
It's this combination of the four-way vector processing and the scalar architecture that AMD hopes will push its Tahiti-based GPUs to the top of the graphics card pile. The fact that the CUs can be put to work on more compute-oriented tasks makes them ideal going forward.
"GCN is optimised for heterogeneous computing," says Mantor. But AMD hasn't been resting on its gaming laurels either. "Delivering the performance improvement in games reliably has been the important thing," he adds.
"We modified the architecture not just to get to that scalar bit." The top-end Tahiti, the Radeon HD 7970, is shipping with 32 Compute Units, making a grand total of 2,048 Radeon Cores in the GPU. The Radeon HD 7950, lower down the pecking order, still comes in with 28 CUs and 1,792 Radeon Cores.
While it may well be an apples vs oranges comparison between the old and new architecture, the HD 6970 was sporting a relatively lowly 1,536 Radeon Cores.
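For anyone wanting to check the sums, the core counts fall straight out of the Compute Unit layout described earlier. This is a minimal sketch using only the figures quoted in this article; the function and constant names are our own.

```python
# Per-CU layout as quoted in the article.
VECTOR_UNITS_PER_CU = 4
SHADERS_PER_VECTOR_UNIT = 16


def radeon_cores(compute_units: int) -> int:
    """Total Radeon Cores for a GCN part with the given number of CUs."""
    return compute_units * VECTOR_UNITS_PER_CU * SHADERS_PER_VECTOR_UNIT


hd7970_cores = radeon_cores(32)  # Tahiti XT
hd7950_cores = radeon_cores(28)  # Tahiti Pro
print(hd7970_cores, hd7950_cores)  # prints "2048 1792"
```

That works out at 64 shaders per CU, so the four CUs the HD 7950 gives up account for its 256-core deficit.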
Beat the FLOP
What does that all mean in performance terms? Well, the fact that the Tahiti XT GPU, in all its PCIe 3.0 and 28nm finery, is capable of around 3.8TFLOPs of processing performance should tell you something. The previous generation was batting around 2.7TFLOPs, so the jump should equate to some serious speed boosts.
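Those headline numbers can be roughly reproduced from core count and clock speed. The sketch below is ours and rests on two assumptions that don't come from AMD's briefing: that each core retires one fused multiply-add (counted as two floating point operations) per clock, and that the HD 6970 runs at 880MHz. The 925MHz figure is Tahiti XT's reference clock.

```python
def peak_tflops(cores: int, clock_mhz: float, flops_per_core_per_clock: int = 2) -> float:
    """Theoretical single-precision peak in TFLOPs.

    Assumes one fused multiply-add (2 FLOPs) per core per clock by default.
    """
    return cores * flops_per_core_per_clock * clock_mhz * 1e6 / 1e12


tahiti_xt = peak_tflops(2048, 925)  # HD 7970 at its reference clock
hd6970 = peak_tflops(1536, 880)     # 880MHz is our assumption, not from the article
print(f"{tahiti_xt:.2f} {hd6970:.2f}")  # prints "3.79 2.70"
```

These are theoretical peaks, of course; how much of that makes it into real frame rates is another matter.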
And it really should, when you consider that this vast GPU is housing 4.3 billion transistors. Though with AMD's current numeracy issues (it recently revised the transistor count of the Bulldozer CPU down from 2bn to 1.2bn) that could always change further down the road.
With the Tahiti XT chip sitting at a 925MHz clock speed out of the box, you could be forgiven for thinking there wouldn't be much headroom left in the chip. According to Zvika Greenstein, AMD director of Product Management for Discrete Graphics, AMD "made a conscious decision to leave a lot of overclocking headroom."
But why not clock it higher out of the box if it's capable? "One of the things the enthusiast likes to do with our cards is overclock them. They pay a premium for that," says Greenstein. "We can position the HD 7970 as the fastest graphics card in the market at the reference clocks, so we thought that we might as well leave it to the end users."
We would have thought that having a 1GHz GPU at launch would have really pushed Nvidia to compete when it brings Kepler to market in a couple of months' time, but it looks as though AMD is leaving that option to its motherboard manufacturers and their factory overclocked offerings.
AMD is still confident that it's the fastest single GPU around, but wasn't willing to give us any concrete numbers. So we got our hands on a selection of Southern Islands cards to see if AMD can be trusted in its claims.