Nvidia reveals the personal supercomputer

Nvidia Tesla
Nvidia has quoted '250 times the computing performance of a PC'.

Nvidia has, inevitably, bigged up the idea of harnessing the GPU for more than basic graphics output as the next big leap in computing. It is so confident that general-purpose GPUs (GPGPUs) will be in every PC of the future that it has released a range of high-end GPU-based plug-in cards under the name Tesla.

The numbers make for compelling reading, with the latest 10-series products in Nvidia's Tesla range quoting up to 960 cores and up to 4 teraflops of performance. Before you get over-excited, though, the cores referred to are in fact shader processors rather than CPU-style cores.
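The headline figures can be sanity-checked with some back-of-envelope arithmetic. The sketch below assumes the 960-core figure refers to a four-GPU unit, with 240 shader processors per GPU, a shader clock of about 1.44GHz and three floating point operations per processor per clock. None of these per-GPU values are quoted in this article, so treat them as illustrative assumptions rather than official specifications.

```python
# Back-of-envelope check of the quoted "960 cores" and "4 teraflops".
# All four constants are illustrative assumptions, not figures from the article.
GPUS_PER_UNIT = 4                    # assumed: a four-GPU rack unit
SHADER_PROCESSORS_PER_GPU = 240      # assumed per-GPU shader processor count
SHADER_CLOCK_HZ = 1.44e9             # assumed shader clock (1.44GHz)
FLOPS_PER_PROCESSOR_PER_CLOCK = 3    # assumed peak single-precision rate


def total_cores(gpus, processors_per_gpu):
    """Total shader processors across every GPU in the unit."""
    return gpus * processors_per_gpu


def peak_teraflops(cores, clock_hz, flops_per_clock):
    """Theoretical peak throughput in teraflops (10**12 operations per second)."""
    return cores * clock_hz * flops_per_clock / 1e12


cores = total_cores(GPUS_PER_UNIT, SHADER_PROCESSORS_PER_GPU)
peak = peak_teraflops(cores, SHADER_CLOCK_HZ, FLOPS_PER_PROCESSOR_PER_CLOCK)
print(f"{cores} shader processors, roughly {peak:.2f} teraflops peak")
```

This is a theoretical peak: it assumes every shader processor is kept busy on every clock cycle, which real applications rarely manage.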

Unless your chosen application is well suited to an accelerated floating point environment, you won't see much of Nvidia's quoted '250 times the computing performance of a PC'. That said, Nvidia's CUDA programming environment (see below) is designed to tap into Tesla's purpose-built architecture and start the ball rolling for the next generation of mainstream application development.
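One way to see why the quoted figure depends so heavily on the application is Amdahl's law, a standard rule of thumb that this article does not itself invoke. The sketch below assumes a program in which some fraction of the run time can be moved to the GPU and sped up 250 times, while the rest runs at its original speed. The function name and the sample fractions are hypothetical.

```python
def overall_speedup(accelerated_fraction, accelerator_speedup):
    """Amdahl's law: whole-program speedup when only part of it is accelerated.

    accelerated_fraction: share of the original run time that benefits (0 to 1).
    accelerator_speedup: how many times faster that share becomes.
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if accelerator_speedup <= 0:
        raise ValueError("accelerator_speedup must be positive")
    unaccelerated = 1.0 - accelerated_fraction
    return 1.0 / (unaccelerated + accelerated_fraction / accelerator_speedup)


# Hypothetical workloads, each paired with the quoted 250x figure.
for fraction in (0.5, 0.9, 0.99, 1.0):
    speedup = overall_speedup(fraction, 250.0)
    print(f"{fraction:.0%} accelerated -> {speedup:.1f}x overall")
```

Under these assumptions, an application that spends only half its time in GPU-friendly code ends up just under twice as fast overall, however quick the card is. The full 250 times is reached only when essentially all of the work can be accelerated.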

Data-intensive computations

Tesla's origins are fairly humble: it stemmed from the GeForce 8 Series GPUs, hence the original Tesla 8 Series products. As if to emphasise its intentions for the high performance computing (HPC) market, Nvidia has done away with anything silly like display outputs. The applications and sectors currently targeted with this product need the horsepower to perform complex, data-intensive computations right at the desk, processing more data faster and cutting the time it takes to get results.

Dr. Graham Pullen of the University of Cambridge gave a compelling example, pitting a Tesla deskside system against a system powered only by a 2.5GHz quad-core CPU. Using fluid dynamics analysis, the shape and positions of the blades of a turbine are adjusted for optimal flow.

Test results that normally took over 12 hours to come back on the CPU-only system were returned in minutes on Tesla-enhanced systems. When the analysis focussed on a single blade, the results came back in near real-time. This allows engineers to adjust for best performance on-the-fly and so arrive at optimal designs much faster.

The benefit of such a hike in computational performance is not just the speed at which engineers receive test results: the massive reduction in computational effort also leads to massive energy savings. Dr. Pullen highlighted another major benefit too, namely tackling the overcompensation built into engineering design thresholds, many of which are in place for safety reasons.

The time taken to complete extremely complex tests with any real accuracy is often prohibitive, so the engineering world has had to live with inefficient estimates for some time now. With the scientific world reporting tests running up to 250 times faster than on standard set-ups, designs that use less material should become the norm, giving lighter and stronger equipment and, in the case of engines, more efficient fuel usage.

Trickling down from the scientific community, a number of well-known PC manufacturers, including Dell, have already jumped on the Tesla cart. Available now is Dell's Precision T7400 workstation featuring the single-GPU C1060 model of Nvidia's Tesla, with a four-GPU S1070 1U rack system also available. Prices depend on the overall spec chosen, but expect to pay less than $10,000 for a system that, for certain applications at least, could out-perform a CPU-only system costing 10 times as much.

Intel, with Larrabee, and AMD/ATi, with Stream, are both likely to get in on the GPGPU act from late 2009. Once we get some decent price competition, you could be looking at a sub-£1,000 supercomputer sitting under your desk chucking out 10,000 fps of Crysis mayhem!

What is CUDA?

CUDA, a name originally derived from Compute Unified Device Architecture, is the computing engine inside Nvidia's GPUs. It can be programmed directly using the industry standard C programming language, with a few extensions. APIs such as OpenCL and DirectX 11 are also supported. OpenCL, or the Open Computing Language, is used for heterogeneous programming, which means distributing data processing across GPUs and CPUs working in parallel. That points to a future in which mainstream applications benefit from Tesla and similar products.

Using CUDA, the latest Nvidia GPUs effectively become open architectures, much like CPUs, with the caveat that GPUs are best suited to applications where the data sets are vast and formatted to run over the parallel "many-core" architecture of a GPU's shader thread environment. Traditionally, games rendering and physics calculations have been perfectly suited to this environment, but now CUDA is being used to accelerate non-graphical applications such as engineering and stock market trend analysis.
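To give a feel for what "formatted to run over" a many-core architecture means, here is a minimal sketch of the data-parallel style in plain Python. It is an emulation, not CUDA code: the names saxpy_kernel and launch are hypothetical, and the launcher simply visits each thread index in turn, where a GPU would run many of them at once. The point is that the work is written as a small function handling one element of the data, identified by its thread index.

```python
def saxpy_kernel(thread_index, a, x, y, out):
    """Work done by one emulated thread: a single element of a * x + y."""
    # Guard: the grid can be larger than the data, so surplus threads do nothing.
    if thread_index < len(x):
        out[thread_index] = a * x[thread_index] + y[thread_index]


def launch(kernel, num_blocks, threads_per_block, *args):
    """Emulate a kernel launch by visiting every (block, thread) pair in turn.

    On a GPU these invocations would be spread across the shader processors;
    here they run one after another, which is only valid because no
    invocation depends on the result of another.
    """
    for block in range(num_blocks):
        for thread in range(threads_per_block):
            kernel(block * threads_per_block + thread, *args)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [10.0, 20.0, 30.0, 40.0, 50.0]
result = [0.0] * len(x)

threads_per_block = 4
# Round up so that every element is covered by at least one thread.
num_blocks = (len(x) + threads_per_block - 1) // threads_per_block

launch(saxpy_kernel, num_blocks, threads_per_block, 2.0, x, y, result)
print(result)  # [12.0, 24.0, 36.0, 48.0, 60.0]
```

Because each element is computed independently, the same function could in principle be handed to thousands of threads at once, which is why workloads with vast, regular data sets suit this model so well.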

