Everyone seems to be talking about ChatGPT nowadays thanks to Microsoft Bing, but given the nature of large language models (LLMs), a gamer could be forgiven for feeling a certain déjà vu.
See, even though LLMs run on huge cloud servers, they use special GPUs (e.g., Nvidia A100 or Nvidia H100) to do all the training they need before they can run. Usually, this means feeding a downright obscene amount of data through neural networks running on an array of GPUs with sophisticated tensor cores. Not only does this require a lot of power, but it also requires a lot of actual GPUs to do at scale.
This sounds a lot like cryptomining, but it also doesn't. Cryptomining has nothing to do with machine learning algorithms and, unlike machine learning, cryptomining's only value is producing a highly speculative digital commodity called a token, which some people think is worth something and so are willing to spend real money on.
This gave rise to a cryptobubble that drove a shortage of GPUs over the past two years, as cryptominers bought up all the Nvidia Ampere graphics cards from 2020 through 2022, leaving gamers out in the cold. That bubble has since popped, and GPU stock has stabilized.
But with the rise of ChatGPT, are we about to see a repeat of the past two years? It's unlikely, but it's not out of the question either.
Your graphics card is not going to drive major LLMs
You might think the best graphics card you can buy is the kind of thing that machine learning types would want for their setups, but you'd be wrong. Unless you're at a university and you're researching machine learning algorithms, a consumer graphics card isn't going to be enough to drive the kind of algorithm you need.
Most LLMs, along with other generative AI models that produce images or music, really put the emphasis on the first L: Large. ChatGPT has processed an unfathomably large amount of text, and a consumer GPU isn't as well suited to that task as industrial-strength GPUs that run on server-class infrastructure.
These are the GPUs that are going to be in high demand, and this is what has Nvidia so excited about ChatGPT: not that ChatGPT will help people, but that running it is going to require pretty much all of Nvidia's server-grade GPUs, meaning Nvidia's about to make bank on the ChatGPT excitement.
The next ChatGPT is going to be run in the cloud, not on local hardware
Unless you are Google or Microsoft, you aren't running your own LLM infrastructure. You're using someone else's in the form of cloud services. That means that you're not going to have a bunch of startups out there buying up all the graphics cards to develop their own LLMs.
More likely, we're going to see LLMaaS, or Large Language Models as a Service. You'll have Microsoft Azure or Amazon Web Services data centers with huge server farms full of GPUs ready to rent for your machine learning algorithms. This is the kind of thing that startups love. They hate buying equipment that isn't a ping-pong table or beanbag chair.
That means that as ChatGPT and other AI models proliferate, they aren't going to run locally on consumer hardware, even when the people running them are a small team of developers. They're going to be running on server-grade hardware, so no one is coming for your graphics card.
Gamers aren't out of the woods yet
So, nothing to worry about then? Well...
The thing is, while your RTX 4090 might be safe, the question becomes how many RTX 5090s will Nvidia make when it only has a limited amount of silicon at its disposal, and using that silicon for server-grade GPUs can be substantially more profitable than using it for a GeForce graphics card?
If there's anything to fear from the rise of ChatGPT, really, it's the prospect that fewer consumer GPUs get made because shareholders demand that more server-grade GPUs be produced to maximize profits. That's no idle threat either, since the way the rules of capitalism are currently written, companies are often required to do whatever maximizes shareholder returns, and the cloud will always be more profitable than selling graphics cards to gamers.
On the other hand, this is really an Nvidia thing. Team Green might go all in on server GPUs with a reduced stock of consumer graphics cards, but it isn't the only company making graphics cards.
AMD RDNA 3 graphics cards just introduced AI hardware, but it isn't anything close to the tensor cores in Nvidia cards, which makes Nvidia the de facto choice for machine learning use. That means AMD might become the default card maker for gamers while Nvidia moves on to something else.
It's definitely possible, and unlike with crypto, AMD isn't likely to be the second-choice option that is still good enough for LLMs if you can't get an Nvidia card. AMD really isn't equipped for machine learning at all, especially not at the level that LLMs require, so AMD just isn't a factor here. That means there will always be consumer-grade graphics cards for gamers out there, and good ones as well. There just might not be as many Nvidia cards as there once were.
Team Green partisans might not like that future, but it's the most likely one given the rise of ChatGPT.