Q+A: Nvidia and AMD talk up DX11

One of DirectX 11's greatest features is its ability to render more realistic pebbles. Yay!

One of the key changes in Windows 7 – behind the fancy desktop dressing and pared-down control panels – is DirectX 11.

As the name suggests, it's the successor to DirectX 10, Microsoft's home-grown API for software developers to interact with graphics hardware.

Name: Tony Tamasi
Job title:
What he does: Oversees Nvidia's interaction with games engine designers and incorporates their wishlists into future hardware plans

Name: Richard Huddy
Job title: Senior Manager Developer Relations, AMD
What he does: Heads up the team of AMD software engineers that visit games studios and help programmers use the latest hardware features

We're not in graphics anymore

There are five key new features that arrive with DX11, and the first has nothing to do with gaming graphics.

DirectX Compute is Microsoft's GPGPU language, joining OpenCL and Nvidia's CUDA in allowing graphics cards to be used for things like PhysX, AI and video transcoding – the latter of which is supported as an impressively quick drag and drop feature in Windows 7.

Importantly, any DX10 graphics card will be able to run some DX Compute code, although because of hardware differences it's likely that most in-game routines will target DX11-class cards only.

Tony says:

"GPGPU, in general, is the most important initiative for the whole graphics card industry, period.

"First of all, it's an entirely new market of potential software developers and customers that we can reach. The market for people who play games is obviously very large and very important to us, but the market for people who watch video, record things on YouTube and use Flash is much larger still, and in that market today there's no strong motivation for those people to buy high-end graphics processors.

"In GPGPU computing there's a whole realm of application possibilities and potentially large growth. Nvidia believes, at its core, that the more ways you have to make use of the graphics processor for any kind of parallel process is good.

"The exciting thing is that now we have Microsoft getting into the game and standardising the process, and completely legitimising GPGPU computing to the extent that Windows 7 has core functionality that will be accelerated.

"Using the GPU for general computing isn't just about Photoshop or video transcoding though. It's also about game functions outside of rendering. We're helping developers work on physics and artificial intelligence, for driving animation systems and so on. It can be used for a whole host of things."

Richard says:

"I've got 30GB of video on my hard drive and my portable video player just died, so I'm probably going to have to move it to another platform and reconvert my video. Transcoding that on the GPU is typically three times as fast as doing it on the CPU, and better than realtime.

"In Windows 7 it's so easy that my mother could transcode video just by dragging and dropping one file into another folder. It's a thing of beauty, it hides all of the tech from you.

"Compute shaders in games are very difficult to write for at the moment, though: even my engineers are struggling with it. They can run anything from half the speed of the original code if you mess it up to a factor of three or four times faster if it works well.

"A member of my staff now has his compute shader running three or four times faster than the pixel shader, and we like that. It means that things like post-processing effects will typically benefit from using compute shaders from now on."
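
For readers wondering why post-processing suits compute shaders, here is a rough CPU-side sketch in Python of how such work is typically split up: the image is covered by a grid of fixed-size thread groups, and each thread handles one pixel. This is an illustration only, not DirectX code. The 8x8 group size, the function names and the invert effect are all assumptions chosen for the example.

```python
# Hypothetical CPU-side model of a compute-shader dispatch; not real DirectX code.
GROUP_W, GROUP_H = 8, 8  # assumed thread-group size, in the spirit of numthreads(8, 8, 1)


def groups_needed(width, height):
    """Number of thread groups needed to cover the image, rounding up."""
    gx = (width + GROUP_W - 1) // GROUP_W
    gy = (height + GROUP_H - 1) // GROUP_H
    return gx, gy


def invert_kernel(image, x, y):
    """Per-thread work: a stand-in post-processing effect (invert one pixel)."""
    return 255 - image[y][x]


def dispatch(image):
    """Run the kernel once per pixel, walking the image group by group."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    gx, gy = groups_needed(width, height)
    for group_y in range(gy):
        for group_x in range(gx):
            for ty in range(GROUP_H):
                for tx in range(GROUP_W):
                    x = group_x * GROUP_W + tx
                    y = group_y * GROUP_H + ty
                    # Threads that fall past the image edge do nothing.
                    if x < width and y < height:
                        out[y][x] = invert_kernel(image, x, y)
    return out
```

On a GPU the four nested loops would run in parallel rather than one after another. The sketch only shows how the pixels are carved up, not where the speed comes from.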

Extra instructions

As part of the DirectX overhaul, the High Level Shading Language (HLSL) developers use to write shaders is being expanded to allow for longer shaders and subroutines, and for more to be done in a single pass.

Tony says:

"This is a big deal because now developers can write what I call 'uber-shaders'. It sets the stage for increasingly sophisticated shadow and lighting models, and the ability to have sub-routines allows you to have increasingly complex code, but in a manageable way."
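
As a loose analogy for the subroutine idea, the Python sketch below keeps a single shading function and swaps the lighting step in and out by name, rather than maintaining a separate copy of the whole shader for each lighting model. Plain Python functions stand in for HLSL subroutines. The function names, the lookup table and the two lighting models are hypothetical choices for illustration, not anything either company has described.

```python
# Loose analogy only: plain Python functions stand in for HLSL subroutines.

def lambert(n_dot_l):
    """Simple diffuse term: clamp the cosine of the light angle to zero."""
    return max(n_dot_l, 0.0)


def half_lambert(n_dot_l):
    """Softer diffuse term that wraps light around the surface."""
    return (n_dot_l * 0.5 + 0.5) ** 2


# Hypothetical table of lighting models the application can choose from.
LIGHTING_MODELS = {"lambert": lambert, "half_lambert": half_lambert}


def shade(albedo, n_dot_l, model):
    """One 'uber-shader' body; the lighting step is a swappable subroutine."""
    light = LIGHTING_MODELS[model]
    return albedo * light(n_dot_l)
```

Calling shade(0.5, 1.0, "lambert") and shade(0.5, 1.0, "half_lambert") runs the same outer code with a different lighting routine, which is the manageability point being made above.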

Richard says:

"The thing that strikes me as the most instantly useful part of it is the ability to pick up four texels at once in a single clock cycle.

"Typically, when you look down at the fine grain detail of a GPU, we're limited by one of two things: math operations and texture fetches.

"Either one of those will be the killer for any shader, the bottleneck. Texture fetches are a real problem, they're limited by bandwidth and they're limited by the number of texture instructions you fit into the shader and get them through the pipeline.

"If you can fetch four texels at once, you may be able to do four times as much texture work, especially if you're working with information that's in the texture cache. This can be a big win in some cases.
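
To put the fetch-count arithmetic in concrete terms, here is a minimal Python sketch that averages a 2x2 block of texels two ways: with four single-texel fetches, and with one four-texel fetch. The Texture class, its method names and the order in which the four texels come back are assumptions for illustration and do not model any real hardware or API. The counter simply tallies how many fetch instructions were issued.

```python
# Hypothetical texture model: a 2D list of single-channel texel values.

class Texture:
    def __init__(self, texels):
        self.texels = texels
        self.fetch_instructions = 0  # how many fetch instructions were issued

    def _read(self, x, y):
        # Clamp coordinates to the texture edge.
        h, w = len(self.texels), len(self.texels[0])
        return self.texels[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    def load(self, x, y):
        """One instruction, one texel."""
        self.fetch_instructions += 1
        return self._read(x, y)

    def gather4(self, x, y):
        """One instruction, the 2x2 block whose top-left texel is (x, y)."""
        self.fetch_instructions += 1
        return (self._read(x, y), self._read(x + 1, y),
                self._read(x, y + 1), self._read(x + 1, y + 1))


def average_2x2_loads(tex, x, y):
    """Average a 2x2 block using four separate fetches."""
    vals = [tex.load(x, y), tex.load(x + 1, y),
            tex.load(x, y + 1), tex.load(x + 1, y + 1)]
    return sum(vals) / 4


def average_2x2_gather(tex, x, y):
    """Average the same block using a single four-texel fetch."""
    return sum(tex.gather4(x, y)) / 4
```

Both functions return the same average, but the first bumps the counter by four and the second by one. That one-in-four ratio is the best case behind the "four times as much texture work" figure. Real gains depend on bandwidth and the texture cache, as noted above.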