OpenAI has 3 new AI voice models that the ChatGPT maker says will ‘unlock a new class of voice apps for developers’


  • OpenAI has launched three new artificial intelligence (AI) models
  • They’re for real-time voice tasks: reasoning, translation, and transcription
  • Each one is designed to be integrated into developers’ AI apps

If you’re a regular ChatGPT user, you might be aware that you don’t have to interact with the artificial intelligence (AI) chatbot purely through text — it can speak to you and take your voice requests, too. Now, ChatGPT maker OpenAI has announced three new voice models that it believes will “unlock a new class of voice apps for developers.”

Each model serves a different purpose: in-depth reasoning, translation, or transcription. If you’re building an app that needs one of those capabilities, they could be worth a shot.

According to OpenAI, the new models include the following:

  • “GPT‑Realtime‑2, our first voice model with GPT‑5‑class reasoning that can handle harder requests and carry the conversation forward naturally.
  • “GPT‑Realtime‑Translate, a new live translation model that translates speech from 70+ input languages into 13 output languages while keeping pace with the speaker.
  • “GPT‑Realtime‑Whisper, a new streaming speech-to-text that transcribes speech live as the speaker talks.”

OpenAI’s news post explains that the company has seen developers use AI voice models in three distinct ways: by asking the AI to carry out a task; by having the AI explain a situation (such as a travel delay) to the user; and by having conversations in the user’s local language.

It’s those use cases that OpenAI is trying to address with its new voice models. Each is designed for developers to use in their own apps, and all three are available as part of OpenAI’s Realtime API. GPT-Realtime-2 will cost $32 per one million input tokens and $64 per one million output tokens. GPT-Realtime-Translate is priced at $0.034 per minute, while GPT-Realtime-Whisper costs $0.017 per minute.
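To get a feel for what those prices mean in practice, here is a rough back-of-the-envelope sketch in Python. The rates are the ones quoted above; the function names and the example token counts are hypothetical, chosen only for illustration, and real bills may differ depending on how OpenAI counts tokens and minutes.

```python
# Rates as quoted in this article (US dollars). Treat them as assumptions
# and check OpenAI's own pricing before budgeting against them.
REALTIME_2_INPUT_PER_MILLION = 32.00   # GPT-Realtime-2, per 1M input tokens
REALTIME_2_OUTPUT_PER_MILLION = 64.00  # GPT-Realtime-2, per 1M output tokens
TRANSLATE_PER_MINUTE = 0.034           # GPT-Realtime-Translate, per minute
WHISPER_PER_MINUTE = 0.017             # GPT-Realtime-Whisper, per minute


def realtime_2_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate GPT-Realtime-2 cost from token counts (hypothetical helper)."""
    input_cost = input_tokens / 1_000_000 * REALTIME_2_INPUT_PER_MILLION
    output_cost = output_tokens / 1_000_000 * REALTIME_2_OUTPUT_PER_MILLION
    return input_cost + output_cost


def per_minute_cost(minutes: float, rate_per_minute: float) -> float:
    """Estimate cost for a model billed by the minute (hypothetical helper)."""
    return minutes * rate_per_minute


if __name__ == "__main__":
    # Example figures are made up for illustration only.
    print(f"Realtime-2, 1M in / 500k out: ${realtime_2_cost(1_000_000, 500_000):.2f}")
    print(f"Translate, 60 minutes: ${per_minute_cost(60, TRANSLATE_PER_MINUTE):.2f}")
    print(f"Whisper, 60 minutes: ${per_minute_cost(60, WHISPER_PER_MINUTE):.2f}")
```

At the quoted rates, an hour of live translation works out to roughly $2.04 and an hour of streaming transcription to roughly $1.02, while GPT-Realtime-2 costs scale with how much is said rather than how long the session lasts.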

Three new tools for developers


If you’re after an AI model that is able to reason deeply and adapt to conversation flows, OpenAI says the new GPT-Realtime-2 option is for you. Developers can use it to check multiple sources at once, adjust its tone depending on the user’s input, tap into more advanced reasoning levels, and parse specialized terms (such as proper nouns and expressions used in healthcare and production).

Translation apps, on the other hand, can put GPT-Realtime-Translate to use converting speech in real time. Users will be able to speak their own language and have it translated and transcribed without delay. This model works with over 70 input languages and 13 output languages.

And if you want audio to be transcribed quickly and accurately, there’s GPT-Realtime-Whisper. This model is useful for creating captions, meeting notes, and summaries as conversations are ongoing, OpenAI says, which means “live products can feel faster, more responsive, and more natural.”

If you want to try out any of the new models, they’re available on OpenAI’s Playground site. And if you’re using Codex, OpenAI has created a prompt that will directly add GPT-Realtime-2 to the agentic coding platform.


Alex Blake
Freelance Contributor

