Hallucinated packages could be the next big security risk hitting AI developers


The risk that generative AI tools will “hallucinate”, suggesting sources or tools that don’t exist, has long been a concern for developers.

Now, experts have warned that a threat actor who discovers a generative AI hallucination of, say, a software package can actually build that package and make it malicious.

That way, attackers would end up using hugely popular AI tools to distribute malware.

Not purely theoretical

Bar Lanyado, a cybersecurity researcher from Lasso Security, recently set out to see if the risk is purely theoretical, and concluded that it could be abused in the wild. 

For his analysis, he collected almost 50,000 “how to” questions that developers might ask generative AI tools while building a software solution. He focused on five programming languages, Python, Node.js, Go, .NET, and Ruby, and put the questions to GPT-3.5-Turbo, GPT-4, Gemini Pro, and Coral.

GPT-4 hallucinated (essentially, made up software packages) 24.2% of the time, repeating the same answers in 19.6% of cases. GPT-3.5 hallucinated 22.2% of the time, with 13.6% repetitiveness, while Gemini hallucinated 64.5% of the time, with 14% repetitiveness. Finally, Coral hallucinated 29.1% of the time, with 24.2% repetitiveness.
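To make the two figures concrete, here is a minimal Python sketch of how a hallucination rate and a repetition rate could be computed from a batch of model answers. The registry contents, the package names, and the exact definitions of both rates are assumptions made for illustration; the article does not describe Lanyado's actual tooling or formulas.

```python
from collections import Counter

# Hypothetical stand-in for a real registry index (for example, the names on PyPI).
KNOWN_PACKAGES = {"requests", "numpy", "flask"}


def hallucination_stats(answers):
    """Return (hallucination_rate, repetition_rate) for a list of package names.

    answers: one suggested package name per model response (hypothetical data).
    hallucination_rate: share of answers naming a package absent from the registry.
    repetition_rate: share of hallucinated answers whose name occurs more than once.
    Both definitions are assumptions, not the study's published formulas.
    """
    if not answers:
        return 0.0, 0.0
    hallucinated = [name for name in answers if name not in KNOWN_PACKAGES]
    counts = Counter(hallucinated)
    # Count every answer that belongs to a name suggested at least twice.
    repeated = sum(n for n in counts.values() if n > 1)
    hallucination_rate = len(hallucinated) / len(answers)
    repetition_rate = repeated / len(hallucinated) if hallucinated else 0.0
    return hallucination_rate, repetition_rate


# Invented example answers; "fastjsonx" and "quickhttp" are made-up names.
sample = ["requests", "fastjsonx", "fastjsonx", "numpy", "quickhttp"]
rate, repetition = hallucination_stats(sample)
```

Repetition matters as much as the raw rate: a made-up name that comes back again and again is one an attacker can predict and register in advance.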

In theory, then, these four tools would often suggest that developers download the same non-existent packages. If the researcher noticed this, so could hackers, who could create the hallucinated packages, load them with malicious code, and let generative AI promote them.

It works in practice, too, Lanyado said.

He took one of the hallucinated packages and created it. To measure real downloads, Lanyado also uploaded a dummy package, so that automated scanner downloads could be subtracted from the total. “The results are astonishing,” he concluded. “In three months the fake and empty package got more than 30k authentic downloads! (and still counting).”
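The baseline correction described above amounts to simple arithmetic, sketched here in Python. The function name and the sample counts are hypothetical, and the sketch assumes that automated scanners fetch the dummy package about as often as the test package; the article does not report the raw totals.

```python
def authentic_downloads(test_total, dummy_total):
    """Estimate human-driven downloads of the test package.

    test_total: all downloads recorded for the hallucinated-name package.
    dummy_total: downloads of the dummy package, treated as scanner traffic.
    The result is clamped at zero so a noisy baseline cannot go negative.
    """
    return max(test_total - dummy_total, 0)


# Toy numbers for illustration only, not figures from the research.
estimate = authentic_downloads(1000, 200)  # 800
```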


Sead is a seasoned freelance journalist based in Sarajevo, Bosnia and Herzegovina. He writes about IT (cloud, IoT, 5G, VPN) and cybersecurity (ransomware, data breaches, laws and regulations). In his career, spanning more than a decade, he’s written for numerous media outlets, including Al Jazeera Balkans. He’s also held several modules on content writing for Represent Communications.