What are foundation models?


Foundation models are the big daddy of modern AI: the neural networks that act as the backbone of today's artificial intelligence systems.

Some of the most famous of these are Google's Gemini, Anthropic's Claude, OpenAI's GPT range and Llama from Meta.

The key feature of these models is that they have been built from scratch and trained on a vast amount of data culled from multiple domains, such as text, audio, video and images.

What makes them special is they are designed to understand and process a huge range of information, for use in a diverse number of applications.

In many ways they are the fundamental building blocks on which more specialist models are created, which then produce specific outputs for health, financial services, and other business and industrial needs.

Secret training, stunning results

The ultra-specialist training methods used for these massive, billion-dollar models are often a closely guarded secret, although there are a number of smaller foundation models which are at least partly or fully open.

This training typically involves exposing the neural network to an enormous amount of data, using either supervised or unsupervised learning techniques. The network learns how to identify patterns and relationships within the data, eventually without the need for explicit human supervision.
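As a loose illustration of that idea (not how any production model is actually trained), here is a toy next-word predictor in Python. The point is that the training text supplies its own labels, since each word's successor is the thing to predict, so no explicit human supervision is needed:

```python
from collections import Counter, defaultdict

def train_bigram_model(text):
    """Learn next-word counts from raw text. The text itself provides
    the 'labels' (each word's successor), so no human annotation is
    required -- a crude stand-in for self-supervised pre-training."""
    words = text.lower().split()
    model = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        model[current][nxt] += 1
    return model

def predict_next(model, word):
    """Return the most frequently observed successor of `word`."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

corpus = "the cat sat on the mat and the cat slept"
model = train_bigram_model(corpus)
print(predict_next(model, "the"))  # prints "cat" -- its most common successor
```

Real foundation models replace these word counts with billions of learned neural-network parameters, but the principle of extracting patterns directly from the data is the same.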

This means it can develop a deep and comprehensive understanding of the world, which lets it output valuable and relevant responses to user requests on demand.

However, the massive compute requirements of these huge models give them limited application away from their cloud computing homes. The really interesting work therefore often comes once the foundation model is fine-tuned for more general deployment.

Fine-tuning spreads the love


In many cases this fine-tuning is done by the owners themselves, such as Google or OpenAI, but some models, like Llama or DeepSeek, are actively fine-tuned by the public at large and released to the world under an open license.

Because they're optimized for more modest computing requirements, rather than massive data centers, these smaller models can reach a much wider demographic of users across the world, and for a variety of uses.
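Continuing the toy word-count analogy from above (again, only a sketch, not how real systems work), fine-tuning means updating an already-trained model with a smaller, domain-specific dataset rather than retraining from scratch:

```python
from collections import Counter, defaultdict

def train_bigram_model(text):
    """Build next-word counts from raw text (stands in for pre-training)."""
    words = text.lower().split()
    model = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        model[current][nxt] += 1
    return model

def fine_tune(model, domain_text, weight=3):
    """Update an existing model in place with a small domain corpus.
    `weight` over-counts the new data so a small dataset can shift
    behaviour learned from the much larger original corpus."""
    words = domain_text.lower().split()
    for current, nxt in zip(words, words[1:]):
        model[current][nxt] += weight
    return model

# 'Pre-train' on general text, then nudge the model towards finance language.
base = train_bigram_model("the bank of the river the bank of the river")
fine_tune(base, "the bank approved the loan")
# 'bank' now points most strongly to 'approved' rather than 'of'
```

The economics are the key point: adjusting an existing model with a modest dataset costs a tiny fraction of the original training run, which is why third parties can afford to do it.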

This wide ranging flexibility has given rise to some extremely powerful AI systems for delivering applications such as video and image generation, language translation, music generation and much much more.

In each case the model has either been tuned by the brand owner itself, or through work done by third-party research and commercial agencies.

Good examples of the specialist models derived from foundation models are the multimodal products which can handle different inputs such as images, audio and even video.

Recently we have also witnessed the astonishing rise of what are called reasoning models, which are trained specifically to think about the task at hand in logical steps before delivering their answers. This has been a step change in AI's utility across a wide range of applications.
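A toy analogy for that behaviour (real reasoning models generate their intermediate steps as learned text; this function just hard-codes the idea): the output contains the working as well as the answer, instead of the answer alone.

```python
def solve_with_steps(a, b, c):
    """Toy illustration of 'reasoning' output: compute a * b + c while
    recording each intermediate step before the final answer."""
    steps = []
    subtotal = a * b
    steps.append(f"Step 1: {a} * {b} = {subtotal}")
    total = subtotal + c
    steps.append(f"Step 2: {subtotal} + {c} = {total}")
    return steps, total

steps, answer = solve_with_steps(4, 5, 6)
print(answer)  # prints 26, with the working preserved in `steps`
```

Exposing the intermediate steps is what makes these models easier to check: a wrong answer can be traced back to the step where the logic went astray.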

The issue of safety


Because foundation models are designed specifically to deliver a vast range of application utility, they are typically subject to stringent controls to prevent abuse at the hands of unscrupulous users.

This aspect of AI 'safety' is becoming increasingly important as the models grow in power. Brand owners struggle to maintain a balance between open-ended utility and the need to prevent abuse in areas like video and image production.

One of the major concerns over the development of advanced artificial intelligence has been the lack of globally coordinated governance to ensure the delivery of safe AI which mitigates or minimizes potential threats to the world at large.

The other important aspect of the rise of these mega models around the world is the question of responsible deployment.

There are legitimate concerns that widespread use without some form of planned implementation could lead to massive disruption in labor markets, geopolitical relations and more.

As we look towards the future, we can only hope that the public demand for ethical, sustainable AI will ensure that these amazing technological products will deliver all the benefits that society needs, without any of the peril and drama.

Nigel Powell
Tech Journalist

Nigel Powell is an author, columnist, and consultant with over 30 years of experience in the tech industry. He produced the weekly Don't Panic technology column in the Sunday Times newspaper for 16 years and is the author of the Sunday Times book of Computer Answers, published by Harper Collins. He has been a technology pundit on Sky Television's Global Village program and a regular contributor to BBC Radio Five's Men's Hour. He's an expert in all things software, security, privacy, mobile, AI, and tech innovation.
