The GPGPU revolution is more like a slow spin.
If you're an average home user, what has GPGPU ever done for you other than provide another bullet point on a graphics card manufacturer's box?
The truth is that professional high-performance computing offered via GPGPU is big business. Around a quarter of Nvidia's $1 billion turnover comes from 'professional' services, if you know what we mean. And we mean supercomputing. For specific applications – often 3D rendering and scientific modelling – networked GPGPU render farms provide a cost-effective way of delivering massive computational power.
In days gone by, if you wanted teraflops of processing power you'd trundle over to the likes of IBM or Cray with your cheque book open, and let them get on with designing bespoke supercomputer systems with cryogenic cooling. Today, off-the-shelf components can do the same work thanks to bespoke software solutions.
All this hard development work eventually trickles down to the lowly individual users like you and me.
The bad news, depending on your outlook, is that even after being trickled, GPGPU tools all tend to be heavily maths biased. Great if you love maths, but who does?
Do the maths
Maths isn't a bad thing; ultimately 3D games are maths in the form of matrix transformations. It's just that any GPGPU functions need to work on a similar level. This limits applications until conditional branching becomes mainstream, which is happening, but slowly.
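To make the matrix point concrete, here's a minimal sketch in plain Python of the kind of work involved: one rotation matrix applied to a list of vertices. The function names and the sample vertices are our own, purely for illustration. The thing to notice is that every vertex gets exactly the same arithmetic and none depends on another, which is the shape of job a GPU can spread across its shaders.

```python
import math

def rotation_z(angle_deg):
    """Return a 3x3 matrix that rotates points about the z axis."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def transform(matrix, vertex):
    """Multiply one 3x3 matrix by one three-component vertex."""
    return tuple(sum(row[i] * vertex[i] for i in range(3)) for row in matrix)

def transform_all(matrix, vertices):
    # Each vertex is processed independently and without branching.
    # On a GPU, each of these would typically be handled by its own thread.
    return [transform(matrix, v) for v in vertices]

# Hypothetical sample data: three vertices rotated by 90 degrees.
vertices = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 2.0)]
rotated = transform_all(rotation_z(90), vertices)
```

Add an if statement that sends different vertices down different code paths and the work stops being uniform, which is roughly why branching has been a sticking point.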
From the other direction, most GPGPU programs support all cards. With the exception of dedicated Nvidia CUDA builds, the main difference is the amount of work the card is capable of, and so its ultimate top speed. In some cases it's quite possible that an older graphics card could be outperformed by a more modern processor, although in our encoding tests a relatively weak Nvidia 6600 GT still did relatively well.
So what programs can you hunt down that will take advantage of your lazy, good-for-nothing GPU shaders? Well, to start, WinZip offers OpenCL acceleration for compressing and decompressing files with a 20-30 per cent increase in speed.
One of the original uses, and one that remains strong, is cracking encryption and passwords. Have a look at CRARk. A clever play on Crack RAR, this unfriendly-looking command line program is in fact a hardcore RAR password cracker, which uses GPGPU to increase attack speeds by up to 20-fold. Using its benchmark mode, password checks jumped from 283 per second to 4,281 per second, which works out at roughly a 15-fold gain. We doubt it'll be particularly useful, but it's a real example of what can be achieved. If you want something slightly easier to use, try Parallel Recovery.
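As a quick sanity check on those benchmark figures, the speed-up is simply the accelerated rate divided by the CPU-only rate. The helper name below is ours; the two numbers are the ones quoted above.

```python
def speedup(baseline_per_sec, accelerated_per_sec):
    """Return how many times faster the accelerated run is."""
    return accelerated_per_sec / baseline_per_sec

# Password checks per second from the benchmark figures quoted in the text.
gain = speedup(283, 4281)
print(f"GPGPU speed-up: about {gain:.1f}x")
```

That comes out at a little over 15 times, so the benchmark result sits somewhat below the 20-fold headline figure.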
Another oldie but goodie is Folding@Home. This was – and still is – one of the best known applications of GPGPU, which was made extra famous by taking advantage of the PlayStation 3 Cell processor. Equally clever is the distributed modelling system that hands out work tasks to individual systems. However, despite its cleverness and the fact that it could be helping humanity advance, Folding@Home doesn't actually do anything practical.
The first truly useful program is Musemage, an image processor written from the ground up around GPGPU acceleration. This makes it lightning-fast, and enables it to apply filters, effects and image manipulations in real time. It's an impressively swift package, and it's interesting how a little load on the GPU makes for a huge gain in program performance. For example, adjusting blur levels adds just a 5 per cent GPGPU load.
The same thing is coming to GIMP via a technology called GEGL, but this isn't due to be fully implemented until version 2.10. There was talk of it being partially implemented on certain filters for 2.8 RC1, but that seems to be unavailable for now.
The big guns are also turning their attention to GPGPU acceleration. Adobe has already rolled out GPGPU acceleration in Photoshop and Premiere, while Sony offers it in Vegas Movie Studio 11. A host of encoding tools also take full advantage of GPGPU. Freemake boasts video encoding that leverages CUDA and DXVA, but CyberLink Media Espresso is excellent too.
Finally, you can give 3D rendering a speed boost, using the powerful (but free) Blender and LuxRender combo. Even with GPGPU acceleration, ray-tracing remains an arduous task, but the results can be worth it.
One issue with rendering is that not all operations can be supported by pure GPGPU calculations. Memory access for the GPU remains an issue, and sometimes means that the CPU ends up holding the GPU back through many operations.
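One rough way to picture this is Amdahl's law, which we're bringing in here as an illustration rather than anything the renderers themselves report. If only part of a job can run on the GPU, the part left on the CPU caps the overall gain. The fractions and speed-ups below are made-up example values, not measurements.

```python
def overall_speedup(gpu_fraction, gpu_speedup):
    """Amdahl's law: overall gain when only part of a job is accelerated.

    gpu_fraction -- share of the original run time that moves to the GPU (0 to 1)
    gpu_speedup  -- how many times faster that share runs on the GPU
    """
    cpu_fraction = 1.0 - gpu_fraction
    return 1.0 / (cpu_fraction + gpu_fraction / gpu_speedup)

# Hypothetical figures: the GPU runs its share 10 times faster.
for fraction in (0.5, 0.9, 1.0):
    print(f"{fraction:.0%} on GPU -> {overall_speedup(fraction, 10):.2f}x overall")
```

Under these assumptions, even with 90 per cent of the work on a GPU that's 10 times faster, the whole job only speeds up a little over five times.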
If you have an old or slow CPU, your GPU can cut encode times:
Available (as the name suggests) for free, Freemake Video Converter uses CUDA and general hardware acceleration via DXVA. When you install it, remember to deselect all the adware bundled with it. There's quite a lot in there. Hardware acceleration is enabled by default, so all you need to do is load up a suitable video file ready for converting.
Set and go
Now you just need to drag your video to the interface and select one of the preset options that run along the bottom of the window. We're going to encode our own custom video for an Android tablet, so we chose the 'MP4 > Add your preset' option to give us custom options across the board. Start by setting the resolution to that of the device, such as 1,280 x 720 or 800 x 480.
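If your source video isn't the same shape as the device's screen, you'll want a resolution that fits inside it without stretching. Here's a small sketch of that arithmetic. The function name is ours, and rounding down to even numbers is an assumption on our part, based on the fact that many H.264 encoders expect even dimensions. It has nothing to do with how Freemake works internally.

```python
def fit_to_device(src_w, src_h, dev_w, dev_h):
    """Scale a video to fit inside a device screen, keeping its aspect ratio."""
    # Compare aspect ratios with integer maths to avoid rounding surprises.
    if src_w * dev_h >= src_h * dev_w:
        # Source is at least as wide as the screen shape: width is the limit.
        out_w = dev_w
        out_h = src_h * dev_w // src_w
    else:
        # Source is taller than the screen shape: height is the limit.
        out_h = dev_h
        out_w = src_w * dev_h // src_h
    # Assumption: round down to even numbers for encoder compatibility.
    return out_w - out_w % 2, out_h - out_h % 2

# A 1080p source on the two example devices from the text.
print(fit_to_device(1920, 1080, 1280, 720))  # (1280, 720)
print(fit_to_device(1920, 1080, 800, 480))   # (800, 450)
```

On the 800 x 480 screen the 16:9 source ends up at 800 x 450, leaving thin bars top and bottom rather than a squashed picture.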
Damn you Intel
The issue for GPGPU encoding in the modern world is that Intel came along with its Quick Sync technology. That's dedicated hardware for encoding media right on the processor. Offloading to the GPU when there's something on the processor makes no sense and isn't as fast, so if you're running an Intel Sandy Bridge Core processor or later, this will be the best option.
Blend it like Blender
GPGPU acceleration for the über-powerful but free renderer:
For our project we're using the 64-bit version of Blender. The GPGPU-accelerated 64-bit LuxRender is available too. You should also install Microsoft Visual C++ 2008 in its 64-bit version, or the 32-bit version if you're using 32-bit builds.
We're not going to look at Blender in any detail – you could write a whole book about just configuring its interface. We need to add LuxBlend as an external rendering plug-in. Click the 'File > User preferences' menu. Select the 'Addons' tab, then click the bottom 'Install addon' button and find the LuxBlendXX_64bit.zip file in the Program Files/LuxBlend Install directory. Select this and click 'Install'.
This installs the plug-in that enables Blender to send render jobs to LuxRender, but we need to enable it before it will work. First, in the 'User preferences > Addons' tab, select the render categories on the left and check the 'Render: LuxRender' tick box. It can take a few seconds to register. Select the 'System' tab, and over in the bottom left select 'OpenCL'. Save as 'Defaults'.
From the top Info bar, open the central pull-down menu that lists the available render engines and select 'LuxRender'. Over on the right, a number of interface menus will change. Open the 'Render' tab and set the Path to point to where LuxRender has been installed within Program Files. Scroll down and find 'LuxRender render settings'. Under 'Rendering mode' select 'Hybrid path'.
Engage with GPGPU
Make sure 'Use GPUs' is selected. The bar below will enable you to select the OpenCL device to use if you have more than one graphics card or an OpenCL-capable processor installed. At this point you can scroll back up and click the 'Render' button. All being well, LuxRender will open and render the slowest cube you've ever seen.
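If you're not sure which OpenCL devices your system actually exposes, you can list them from outside Blender. The sketch below assumes you have the third-party PyOpenCL package installed, which is our choice and not something LuxRender requires. It returns an empty list if the package or an OpenCL driver is missing, and the function name is our own.

```python
def list_opencl_devices():
    """Return (platform name, device name, device type) for each OpenCL device."""
    try:
        import pyopencl as cl  # third-party package; may not be installed
    except ImportError:
        return []
    found = []
    try:
        for platform in cl.get_platforms():
            for device in platform.get_devices():
                kind = cl.device_type.to_string(device.type)
                found.append((platform.name, device.name, kind))
    except Exception:
        # Typically means no OpenCL driver or platform is available.
        return found
    return found

for platform_name, device_name, kind in list_opencl_devices():
    print(f"{kind:>12}  {device_name}  ({platform_name})")
```

Both graphics cards and OpenCL-capable processors should show up here, which ought to correspond to the choices offered in the device selector.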
Away you go
You're now ready for a bit of GPGPU-accelerated rendering. Blender is a complex beast, but LuxRender offers a beginner's tutorial that's well worth trying. Your first job will be simply getting used to its complex interface. Our top survival trick is holding [Ctrl] and clicking the corner anchors to drag panels around.
Get more from GPU
Tasty tools for testing and teasing your GPGPU:
Caps Viewer is very handy when you're playing with GPGPU. It's not essential, but if you're trying to tune or compare systems running GPGPU-accelerated software, it has a host of information and tests you'll want to use. The opening tab shows the current GPU load and temperatures, with benchmarks available at the bottom.
Smallpt GPU is more of historic interest, but still stands as a valid GPGPU performance test. Smallpt GPU was the original test software to validate GPGPU as a way to accelerate ray tracing work. It sits alongside the CPU-only version Smallpt CPU. It's based on the Cornell scene – a single room containing balls, to which you can add various lighting and environmental effects.
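To see why a room full of balls makes such a good GPGPU test, here's a heavily simplified sketch of the core job in plain Python: fire one ray per pixel and check whether it hits a sphere. This isn't Smallpt's code, and there's no lighting or bouncing. The scene, camera position and function names are all invented for illustration.

```python
import math

def hit_sphere(origin, direction, centre, radius):
    """Return the distance to the nearest hit along a ray, or None for a miss.

    direction is assumed to be a unit vector.
    """
    oc = [o - c for o, c in zip(origin, centre)]
    b = sum(d * v for d, v in zip(direction, oc))
    c = sum(v * v for v in oc) - radius * radius
    disc = b * b - c
    if disc < 0:
        return None  # the ray misses the sphere entirely
    root = math.sqrt(disc)
    for t in (-b - root, -b + root):
        if t > 1e-9:
            return t  # nearest intersection in front of the ray origin
    return None

def render(width=16, height=8):
    """Render one sphere as rows of '#' (hit) and '.' (miss)."""
    rows = []
    for y in range(height):
        row = ""
        for x in range(width):
            # Map the pixel onto a screen plane one unit in front of the camera.
            px = (x + 0.5) / width * 2 - 1
            py = 1 - (y + 0.5) / height * 2
            length = math.sqrt(px * px + py * py + 1)
            ray = (px / length, py / length, 1 / length)
            hit = hit_sphere((0, 0, 0), ray, (0, 0, 3), 1)
            row += "#" if hit is not None else "."
        rows.append(row)
    return rows

print("\n".join(render()))
```

No pixel needs to know anything about its neighbours, so in principle every ray can be traced at the same time. A real path tracer repeats this test many times per pixel as rays bounce around the scene, which is where the GPU's parallelism pays off.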
The final validation of LuxRender appeared as LuxMark, a usable GPGPU benchmark test that was designed to see how well the rendering system worked over different OpenCL and CUDA systems. It enabled the team to collate a lot of real-world testing and provide a useful benchmark for people to test their GPGPU speeds.
Finally, let's have a bit of fractal fun. Fragmentarium is an interesting 3D fractal explorer that makes full use of your GPGPU to render some impressive 3D fractals. Fire it up and select 'Octobulb' from the right-hand drop-down menu, then click the 'Apply' button to see it in action. It's a wee bit complex, but it's worth the time playing around with for some stunning images.