AI guardrails can be easily beaten, even if you don't mean to

A representational concept of a social media network
(Image credit: Shutterstock / metamorworks)

Guardrails designed to prevent AI chatbots from generating illegal, explicit or otherwise wrong responses can be easily bypassed, according to research from the UK’s AI Safety Institute (AISI).

The AISI found that five undisclosed large language models were “highly vulnerable” to jailbreaks –inputs and prompts crafted to elicit responses that are not intended by their makers.

In a recent report, AISI researchers revealed that the models could be circumvented with minimal effort, highlighting the ongoing safety and security concerns associated with generative AI.

AI chatbots can be jailbroken too easily

The report, which arrived in anticipation of the upcoming AI Safety Summit in Seoul, jointly hosted by South Korea and the UK, noted:

“All tested models remain highly vulnerable to basic “jailbreaks”, and some will produce harmful outputs even without dedicated attempts to circumvent safeguards.”

Despite claims from all of the leading AI developers, such as OpenAI, Meta and Google, about their in-house safety measures, AISI’s findings suggest that significant gaps that could lead to potentially lead to major safety concerns remain.

Although the UK government has withheld the names of the five models it tested, it confirmed that they are publicly available.

The interim report, which precedes a full report expected to be published later this year with research from more than 30 countries, arrived just days before the Seoul-based AI Safety Summit, which is seen as the successor to Britain’s Bletchley Park summit late last year.

At the upcoming Seoul Summit, jointly hosted by South Korean President Yoon Suk Yeol and British Prime Minister Rishi Sunak, global leaders and industry experts are expected to come together to discuss AI safety within the realms of innovation and inclusivity.

More from TechRadar Pro

Craig Hale

With several years’ experience freelancing in tech and automotive circles, Craig’s specific interests lie in technology that is designed to better our lives, including AI and ML, productivity aids, and smart fitness. He is also passionate about cars and the decarbonisation of personal transportation. As an avid bargain-hunter, you can be sure that any deal Craig finds is top value!