How does ChatGPT actually know what to say? Here's how the AI generates its answers

ChatGPT logo
(Image credit: ilgmyzin/Unsplash)

If you’ve ever found yourself marveling at how ChatGPT can come up with coherent, contextually relevant replies (most of the time), you’re not alone. But what’s actually going on behind the scenes?

You might think it doesn’t really matter how ChatGPT works. But I’d argue it’s crucial to at least get your head around the basics, especially as this kind of AI becomes more deeply embedded in how we live and work. If you treat it like “magic,” you’re more likely to over-rely on it or use it in ways that don’t serve you.

In this guide, we’ll demystify the process behind its skills and go some way towards understanding how ChatGPT actually works.

How does ChatGPT break down what you say?

ChatGPT is a type of AI called a large language model (LLM). More specifically, it’s a causal language model, which means it generates text by predicting the next word (or part of a word) based on what came before it. Think of it like predictive text on your phone, just way more advanced.

Before ChatGPT can make predictions, it needs to process what you ask it to do in a way that it can understand. This is where tokenization comes in.

Tokens are the basic units of text ChatGPT works with. A token can be as short as a single character or as long as a whole word. For example, the word “ChatGPT” might be split into the tokens “Chat” and “GPT.”

When you enter a prompt, ChatGPT converts it into a sequence of these tokens. It then analyzes the tokens and begins predicting the next one over and over again until you see a complete response.
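To make the idea concrete, here’s a minimal sketch of how subword tokenization can work. It uses a greedy longest-match strategy over a tiny invented vocabulary; real tokenizers like OpenAI’s use byte-pair encoding with vocabularies of tens of thousands of tokens, so treat this purely as an illustration.

```python
def tokenize(text, vocab):
    """Greedily split text into the longest pieces found in vocab.

    A simplified stand-in for real subword tokenization: at each
    position, take the longest vocabulary entry that matches, and
    fall back to a single character if nothing matches.
    """
    tokens = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try longest match first
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            tokens.append(text[i])  # unknown character: its own token
            i += 1
    return tokens


# Invented toy vocabulary for demonstration.
print(tokenize("ChatGPT", {"Chat", "GPT"}))
```

With the toy vocabulary above, “ChatGPT” comes out as the two tokens “Chat” and “GPT”, matching the example in the text.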

How does ChatGPT decide what to say next?

A man in a suit using a laptop with a projected display showing a mockup of the ChatGPT interface.

(Image credit: Shutterstock)

ChatGPT generates its responses to you one token at a time. Here's how that works:

Input processing: The prompt you write into ChatGPT is broken down into tokens.

Contextual analysis: ChatGPT analyzes these tokens to understand the context of what you’re asking.

Next-token prediction: It then calculates which token is most likely to come next.

Iteration: That predicted token is added to the sequence. The cycle then repeats until the model completes its response to you.

This real-time, token-by-token generation is why ChatGPT’s replies often appear as if they're being typed out. Because, in a way, they are; ChatGPT is building its answer to you one token at a time.
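The four steps above can be sketched as a toy loop. The `NEXT` table and its probabilities are invented for illustration; a real model scores every token in its vocabulary using a neural network that considers the whole context, but the predict-append-repeat cycle is the same.

```python
# Invented next-token probabilities for a tiny "model".
# A real model computes these scores from the full context.
NEXT = {
    "<start>": {"ChatGPT": 0.9, "It": 0.1},
    "ChatGPT": {"generates": 0.8, "is": 0.2},
    "generates": {"text": 0.95, "words": 0.05},
    "text": {"<end>": 1.0},
}


def generate(start="<start>"):
    """Repeatedly pick the most likely next token until <end>."""
    tokens, current = [], start
    while True:
        # Greedy decoding: take the highest-probability candidate.
        current = max(NEXT[current], key=NEXT[current].get)
        if current == "<end>":
            return tokens
        tokens.append(current)


print(generate())  # the response is built one token at a time
```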

How does ChatGPT know what matters?

ChatGPT runs on a type of deep learning system called a Transformer.

Transformers rely on something known as self-attention. This helps ChatGPT decide how important each word in a sentence is relative to the others.

It’s this process that allows ChatGPT to consider words not just in isolation, but in relation to the full sentence or prompt. And that’s crucial for its ability to handle nuance and ambiguity.

For example, let’s take the sentence “The bank will not approve the loan.”

The word “bank” could mean a financial institution or the side of a stream. Thanks to self-attention, ChatGPT can look at the words around it to figure out which meaning fits best.
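Here’s a stripped-down sketch of the attention calculation at the heart of a Transformer, assuming each word has already been turned into a small vector (the vectors in the test usage are made up). A query is scored against every key, the scores are normalized into weights, and the output is a weighted blend of the values — which is how surrounding words like “loan” can pull the representation of “bank” toward the financial meaning.

```python
import math


def softmax(xs):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    Scores the query against each key, softmaxes the scores into
    weights, and returns the weighted average of the value vectors.
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]
```

When the query matches one key far more strongly than the others, the output is dominated by that key’s value — that’s the model “paying attention” to the most relevant word.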

How did ChatGPT learn to talk like us?

We’ve covered how ChatGPT reads context, predicts what should come next, pays attention to the right bit and repeats the cycle. But how does it actually know which words to use?

Well, ChatGPT’s abilities come from extensive training on huge and varied datasets – massive collections of text from a wide range of sources. Think of it as ChatGPT’s library of reading material for learning how we write and speak.

The process happens in two main stages:

Pre-training: ChatGPT learns to predict the next token in a sentence by analyzing vast amounts of text. This helps it grasp grammar, facts about the world and even some basic reasoning skills.

Fine-tuning: The model is fine-tuned on more specific datasets. This often involves human reviewers, who provide feedback to help shape the model’s behavior and make its responses more useful, appropriate and aligned with our expectations.
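The pre-training objective can be sketched in a couple of lines: at each position in the training text, the model is penalized by the negative log of the probability it assigned to the token that actually came next (the probabilities below are invented). Minimizing this loss across billions of tokens is what “learning to predict the next token” means in practice.

```python
import math


def next_token_loss(probs, target):
    """Cross-entropy loss for one next-token prediction.

    probs maps candidate tokens to the model's predicted
    probabilities; target is the token that actually came next.
    A confident correct prediction gives a loss near zero, while
    a confident wrong one is penalized heavily.
    """
    return -math.log(probs[target])


# Invented example: the model is 90% sure the next token is "cat".
probs = {"cat": 0.9, "dog": 0.1}
print(next_token_loss(probs, "cat"))  # small loss: good prediction
print(next_token_loss(probs, "dog"))  # large loss: bad prediction
```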

Why doesn't ChatGPT always give the same answer?

A ChatGPT window on a laptop screen

(Image credit: OpenAI / Future)

When ChatGPT predicts the next token, it’s not picking a word at random. Instead, it calculates a probability for every possible token and then selects one of the most likely candidates.

This approach is what allows ChatGPT to generate responses that are usually coherent and contextually appropriate.

Interestingly, this also explains why the same prompt can produce different replies from one day to the next. Rather than always taking the single most likely token, the model samples from its probability distribution – so when several next-token options have similar probabilities, any one of them might be chosen.
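Here’s a minimal sketch of how that choice is typically made in practice. Rather than always taking the top token, models usually sample from the probability distribution, often reshaped by a “temperature” setting; the distribution below is invented for illustration.

```python
import math
import random


def sample_next(probs, temperature=1.0, rng=random):
    """Sample a next token from a probability distribution.

    At temperature 0 this reduces to always picking the most
    likely token (deterministic output). Higher temperatures
    flatten the distribution, making less likely tokens more
    probable - which is why identical prompts can yield
    different replies.
    """
    tokens = list(probs)
    if temperature == 0:
        return max(tokens, key=lambda t: probs[t])
    logits = [math.log(probs[t]) / temperature for t in tokens]
    m = max(logits)
    weights = [math.exp(l - m) for l in logits]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Invented distribution: "sunny" is likelier, but "rainy" can win.
print(sample_next({"sunny": 0.7, "rainy": 0.3}))
```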

ChatGPT is a smart tool, but not a mind

Even though ChatGPT can produce impressively human-like responses, it’s important to remember that it doesn’t actually understand language the way you and I do.

There’s no awareness or comprehension going on here. Instead, it works to identify patterns and find correlations in data it’s presented with and trained on. Essentially, it’s an advanced prediction machine.

Keeping this in mind really matters when weighing up AI’s limitations. ChatGPT doesn’t truly understand what you’re saying or grasp the meaning behind it; it’s generating text based on statistical patterns.

This also explains why AI tools can “hallucinate” – the term used to describe the way ChatGPT and other LLMs can confidently produce incorrect, vague or nonsensical responses.

We also need to bear in mind what that vast training dataset (the initial reading material) consists of. Because ChatGPT learns from existing text, it can unintentionally reflect biases that were present in that all-important training material. It’s like reading the history of the world, but only from the perspective of a few hundred people.

ChatGPT is an impressive technology and its ability to generate fluent, relevant responses makes it a powerful tool for everything from productivity to creative brainstorming.

But hopefully what this guide shows you is that it’s still just a tool. It doesn’t think or understand like we do; it predicts. And keeping that in mind is key to using it wisely and effectively.


Becca Caddy

Becca is a contributor to TechRadar, a freelance journalist and author. She’s been writing about consumer tech and popular science for more than ten years, covering all kinds of topics, including why robots have eyes and whether we’ll experience the overview effect one day. She’s particularly interested in VR/AR, wearables, digital health, space tech and chatting to experts and academics about the future. She’s contributed to TechRadar, T3, Wired, New Scientist, The Guardian, Inverse and many more. Her first book, Screen Time, came out in January 2021 with Bonnier Books. She loves science-fiction, brutalist architecture, and spending too much time floating through space in virtual reality. 
