Is the cloud the wrong place for AI?
The cloud revolution is cracking, and AI is the fault line

The enterprise software playbook seemed clear: everything moves to the cloud eventually. Applications, databases, storage: they all followed the same inevitable arc from on-premises to software-as-a-service.
But with the arrival and boom of artificial intelligence, we’re seeing a different story play out, one where the cloud is just one chapter rather than the entire book.
Director of SaaS and Infrastructure at Speechmatics.
AI systems
AI workloads are fundamentally different beasts than the enterprise applications that defined the cloud migration wave. Traditional software scales predictably, processes data in batches, and can tolerate some latency.
AI systems are non-deterministic, require massive parallel processing, and often need to respond in real-time. These differences reshape the entire economic equation of where and how you run your infrastructure.
Take the challenge of long-running training jobs. Machine learning models don't train on a schedule; they train until they converge. This could take hours, days, or weeks. Cloud providers excel at provisioning infrastructure at short notice, but GPU capacity at hyperscalers can be hard to secure without a one-year reservation.
The result is either paying for guaranteed capacity you might not fully use, or risking that your training job gets interrupted when using spot instances to reduce costs.
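The interruption risk is usually managed with periodic checkpointing, so a preempted spot instance loses at most one interval of work. A minimal sketch of the pattern (the file path and step counter are illustrative stand-ins for a real training framework's checkpoint mechanism):

```python
import json
import os

CHECKPOINT = "checkpoint.json"  # hypothetical path; real jobs write to durable storage


def load_state():
    """Resume from the last checkpoint if a previous run was interrupted."""
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT) as f:
            return json.load(f)
    return {"step": 0}


def save_state(state):
    """Persist progress so an interruption costs at most one checkpoint interval."""
    with open(CHECKPOINT, "w") as f:
        json.dump(state, f)


def train(total_steps=100, checkpoint_every=10):
    state = load_state()
    # Resume from wherever the last run stopped, rather than from step 0.
    for step in range(state["step"], total_steps):
        state["step"] = step + 1  # stand-in for a real optimizer step
        if state["step"] % checkpoint_every == 0:
            save_state(state)
    return state


final = train()
```

If the instance is reclaimed mid-run, relaunching the same script picks up from the last saved step instead of restarting the job from scratch.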
Then there's the inference challenge. Unlike web applications that might see traffic spikes during Black Friday, AI services often need to scale continuously as customer usage grows.
The token-based pricing models that govern large language models make this scaling unpredictable in ways that traditional per-request pricing never was. A single customer query might consume 10 tokens or 10,000, depending on the complexity of the response and the size of the context window.
Hybrid approaches
The most intriguing development involves companies discovering hybrid approaches that acknowledge these unique requirements rather than abandoning the cloud. They're using on-premises infrastructure for baseline, predictable workloads while leveraging cloud resources for genuine bursts of demand.
They're co-locating servers closer to users for latency-sensitive applications like conversational AI. They're finding that owning their core infrastructure gives them the stability to experiment more freely with cloud services for specific use cases.
This evolution is being accelerated by regulatory requirements that simply don't fit the cloud-first model. Financial services, healthcare, and government customers often cannot allow data to leave their premises.
For these sectors, on-premises or on-device inference represents a compliance requirement rather than a preference. Rather than being a limitation, this constraint is driving innovation in edge computing and specialized hardware that makes local AI deployment increasingly viable.
Infrastructure strategies
The cloud providers aren't standing still, of course. They're developing AI-specific services, improving GPU access, and creating new pricing models. But the fundamental mismatch between AI's resource requirements and traditional cloud economics suggests that the future won't be a simple rerun of the SaaS revolution.
Instead, we're heading toward a more nuanced landscape where different types of AI workloads find their natural homes. Experimentation and rapid prototyping will likely remain cloud-native. Production inference for established products might move closer to owned infrastructure. Training runs might split between cloud spot instances for cost efficiency and dedicated hardware for mission-critical model development.
This approach represents a step toward infrastructure strategies that match the actual needs of AI systems, rather than forcing them into patterns designed for different types of computing.
The most successful AI companies of the next decade will likely be those that think beyond the cloud-first assumptions and build infrastructure strategies as sophisticated as their algorithms.
This article was produced as part of TechRadarPro's Expert Insights channel where we feature the best and brightest minds in the technology industry today. The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you are interested in contributing find out more here: https://www.techradar.com/news/submit-your-story-to-techradar-pro