Amazon’s Inferentia chip looks to bring machine learning to all – at Nvidia’s expense?

Over at AWS re:Invent 2019, Amazon has officially launched its new Inferentia chip, which is designed for machine learning.

Specifically, AWS Inferentia is a custom-built chip designed for faster and more cost-effective machine learning inference, meaning the use of models you’ve already trained to perform tasks and make predictions.
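To make the training/inference split concrete, here is a minimal sketch in Python. It uses scikit-learn purely for illustration (Inferentia itself targets deep learning frameworks such as TensorFlow and PyTorch), and the data and filenames are stand-ins:

```python
# Illustrative only: a tiny model trained once, then reused for inference.
from sklearn.linear_model import LogisticRegression
import joblib

# Training: done once, typically on GPUs or other accelerators.
X_train = [[0.0], [1.0], [2.0], [3.0]]
y_train = [0, 0, 1, 1]
model = LogisticRegression().fit(X_train, y_train)
joblib.dump(model, "model.joblib")

# Inference: the already-trained model is loaded and queried repeatedly.
# This repeated prediction phase is what Inferentia is built to accelerate.
model = joblib.load("model.joblib")
print(model.predict([[2.5]]))
```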

AWS says that Inferentia will deliver high-throughput inference performance at an “extremely low cost”, with a pay-as-you-go usage model. Low latency is also promised, courtesy of a hefty amount of on-chip memory.

In terms of that inference throughput, each Inferentia chip is capable of up to 128 TOPS (trillions of operations per second), and multiple chips can be combined if you really want to push the performance boundaries.

TOPS trumps

As TechCrunch reports, Amazon’s new Inf1 instances promise up to 2,000 TOPS, no less – at 128 TOPS per chip, that works out to roughly 16 Inferentia chips in the largest instance size (16 × 128 = 2,048 TOPS). Compared to a regular G4 instance on EC2 – which uses Nvidia’s latest T4 GPUs – Amazon claims that these new instances boast three times the throughput at a 40% lower cost-per-inference, so they make for a compelling offering indeed.
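For readers wondering what using one of these instances looks like in practice, here is a hedged sketch of requesting an Inf1 instance through boto3, the AWS SDK for Python. The AMI ID is a placeholder, and instance-type availability varies by region:

```python
# Sketch: launching an Inf1 instance via boto3. The ImageId below is a
# placeholder; in practice you'd pick an AMI with the Neuron SDK installed.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder AMI ID
    InstanceType="inf1.xlarge",       # smallest Inf1 size; larger sizes pack more Inferentia chips
    MinCount=1,
    MaxCount=1,
)
print(response["Instances"][0]["InstanceId"])
```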

Currently, Inferentia is only available through Amazon EC2, but it will be brought to other Amazon services, including SageMaker and Amazon Elastic Inference, soon enough.

Inferentia comes with the AWS Neuron SDK, which lets developers run complex neural network models that have been created and trained in popular frameworks.

Amazon observes: “Neuron consists of a compiler, run-time, and profiling tools and is pre-integrated into popular machine learning frameworks including TensorFlow, Pytorch, and MXNet to deliver optimal performance of EC2 Inf1 instances.”
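As a rough illustration of that workflow, here is a sketch of compiling a trained PyTorch model for Inferentia using the torch-neuron package from the Neuron SDK. The model, input shape, and filenames are assumptions for illustration, and exact package versions and APIs may differ:

```python
# Sketch: ahead-of-time compilation of a PyTorch model for Inf1.
import torch
import torch_neuron  # registers the torch.neuron namespace (assumed installed)
import torchvision.models as models

model = models.resnet50(pretrained=True)  # stand-in for your own trained model
model.eval()

# Neuron compiles the model against an example input of the expected shape.
example = torch.zeros(1, 3, 224, 224)
model_neuron = torch.neuron.trace(model, example_inputs=[example])

# The compiled artifact is saved, then loaded for serving on an Inf1 instance.
model_neuron.save("resnet50_neuron.pt")
```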

Furthermore, Amazon notes that Inferentia – and particularly its cost-effectiveness – is part of a broader drive to make machine learning accessible to all developers.

While the Inferentia chip may not pose an immediate danger to Nvidia, the path Amazon is looking to carve for the future, and the customers it may attract to its relatively affordable cloud-based model, could threaten Nvidia’s sales in the machine learning arena. And of course, Amazon itself won’t need to buy Nvidia’s Tesla accelerators if the firm is using its own hardware…

Custom-built chips have obvious benefits in terms of the performance they can be driven to achieve, and Amazon isn’t the only innovator with ideas in this space: Google has been pushing forward with its own Tensor Processing Unit (TPU) for some years now.
