GitHub Copilot "highly likely" to introduce bugs and vulnerabilities, report claims

Glasses in front of computer screen (Image credit: Kevin Ku / Pexels)

Academic researchers have found that nearly 40% of the code suggestions made by GitHub’s Copilot tool are vulnerable from a security point of view.

Developed by GitHub in collaboration with OpenAI, and currently in private beta testing, Copilot leverages artificial intelligence (AI) to make relevant coding suggestions to programmers as they write code.

To help quantify the value-add of the system, the academic researchers created 89 different scenarios for Copilot to suggest code for, which produced over 1,600 programs. Reviewing them, the researchers found that almost 40% were vulnerable in one way or another.

“Overall, Copilot’s response to our scenarios is mixed from a security standpoint, given the large number of generated vulnerabilities (across all axes and languages, 39.33 % of the top and 40.48 % of the total options were vulnerable),” note the researchers.

Unfiltered learning

To perform their analysis, the researchers prompted Copilot to generate code in scenarios relevant to common software security weaknesses, and then analyzed the generated code along three distinct axes to gauge how secure its suggestions were.
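To give a sense of what a "common software security weakness" looks like in practice, here is a minimal sketch of one well-known class, SQL injection. It is an illustration only and is not taken from the study: the table, the function names and the payload are all hypothetical, and the code uses Python's standard sqlite3 module.

```python
import sqlite3


def make_db():
    # Hypothetical in-memory table used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "admin"), ("bob", "user")],
    )
    return conn


def find_user_unsafe(conn, name):
    # Vulnerable: user input is concatenated straight into the SQL text,
    # so a crafted value can change the meaning of the query.
    query = "SELECT name, role FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Safer: a parameterized query treats the input purely as data.
    query = "SELECT name, role FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


if __name__ == "__main__":
    conn = make_db()
    payload = "x' OR '1'='1"  # hypothetical malicious input
    print(find_user_unsafe(conn, payload))  # leaks every row in the table
    print(find_user_safe(conn, payload))    # matches no user
```

Both functions look plausible as a code suggestion, which is why a reviewer would need to check which pattern an assistant has actually produced.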

Since Copilot draws on publicly available code in GitHub repositories, the researchers theorize that the generated vulnerable code could perhaps just be the result of the system mimicking the behavior of buggy code in the repositories.

Furthermore, the researchers note that in addition to perhaps inheriting buggy training data, Copilot also fails to consider the age of the training data.

“What is ‘best practice’ at the time of writing may slowly become ‘bad practice’ as the cybersecurity landscape evolves. Instances of out-of-date practices can persist in the training set and lead to code generation based on obsolete approaches,” say the researchers.
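One widely cited case of a practice that has aged badly is hashing passwords with a fast, unsalted digest such as MD5. The sketch below is an illustration under stated assumptions rather than an example from the research: the function names are hypothetical and the iteration count is an assumed figure. It contrasts the older pattern with a salted, deliberately slow key derivation from Python's standard hashlib module.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumed work factor for illustration; real deployments should follow
# current guidance rather than this number.
ITERATIONS = 200_000


def hash_password_outdated(password: str) -> str:
    # Outdated pattern: fast, unsalted MD5. Identical passwords always
    # produce identical hashes, and guesses are cheap to test.
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def hash_password_current(
    password: str, salt: Optional[bytes] = None
) -> Tuple[bytes, bytes]:
    # More current pattern: a random per-password salt plus a slow,
    # iterated key derivation function (PBKDF2-HMAC-SHA256).
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    # Recompute with the stored salt and compare in constant time.
    _, digest = hash_password_current(password, salt)
    return hmac.compare_digest(digest, expected)
```

A model trained on older repositories could plausibly reproduce the first function simply because it appears often in that code, which is the risk the researchers describe.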

GitHub didn’t immediately respond to TechRadar Pro’s email asking for their take on the research.

With almost two decades of writing and reporting on Linux, Mayank Sharma would like everyone to think he’s TechRadar Pro’s expert on the topic. Of course, he’s just as interested in other computing topics, particularly cybersecurity, cloud, containers, and coding.
