Many PyPI code submissions are exposing vital security data


Hackers are being gifted easy access to sensitive databases and important files thanks to sloppy software development practices, new research has claimed. 

A report from GitGuardian found that many developers still mistakenly leave passwords and other secrets in their code, giving unrestricted access to their products to anyone who knows where to look.

The secrets include not just passwords, but also cryptographic keys, security tokens, and other sensitive information. 



Hundreds of valid keys

To draft the report, GitGuardian’s researchers analyzed more than five million files, belonging to 450,000 projects published on PyPI, the official code repository for Python. They found almost 3,000 projects with at least one secret. In some instances, secrets were leaked more than once, and in total, almost 57,000 secrets were exposed. 
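To give a rough idea of what scanning files for secrets can look like, here is a minimal Python sketch. It is not GitGuardian's method, which the report does not detail here; the pattern names, the `PATTERNS` table, and the `scan_text` helper are all hypothetical, and real scanners rely on far more detectors plus validity checks.

```python
import re

# Illustrative patterns only (assumptions for this sketch, not GitGuardian's detectors).
PATTERNS = {
    # AWS access key IDs start with "AKIA" followed by 16 uppercase letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # PEM header that precedes a private key.
    "private_key_header": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    # A quoted string literal assigned to something named "password".
    "hardcoded_password": re.compile(r"(?i)\bpassword\s*=\s*['\"][^'\"]{4,}['\"]"),
}


def scan_text(text):
    """Return (line_number, pattern_name) pairs for each suspicious line."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


if __name__ == "__main__":
    # The key below is the placeholder value AWS uses in its own documentation.
    sample = 'import os\npassword = "hunter2!"\nkey = "AKIAIOSFODNN7EXAMPLE"\n'
    for lineno, name in scan_text(sample):
        print(f"line {lineno}: possible {name}")
```

A pattern match only flags a candidate; as the researchers note further on, whether a leaked secret is still usable is a separate question.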

“Exposing secrets in open-source packages carries significant risks for developers and users alike,” the report states. “Attackers can exploit this information to gain unauthorized access, impersonate package maintainers, or manipulate users through social engineering tactics.”

Through these secrets, hackers can access Microsoft Active Directory servers, OAuth servers that enable single sign-on, SSH servers, and third-party services for customer communications and cryptocurrencies, Ars Technica reported.

In fact, the researchers found valid secrets such as Azure Active Directory API keys, GitHub OAuth app keys, database credentials for providers such as MongoDB, MySQL, and PostgreSQL, Dropbox keys, Auth0 keys, SSH credentials, Coinbase credentials, and Twilio master credentials.

The researchers tested the credentials and concluded that more than 700 were still active. However, this doesn’t mean that the remaining ones are invalid, the researchers further explained: “Only once a secret has been properly rotated can you know if it is invalid. Some types of secrets GitGuardian is still working toward automatically validating include Hashicorp Vault Tokens, Splunk Authentication Tokens, Kubernetes Cluster Credentials, and Okta Tokens.”

Exposing credentials this way makes no sense in any scenario, leading the researchers to conclude that developers do it only by mistake.
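One common way to avoid this kind of mistake is to keep secrets out of the source entirely and read them at runtime. The Python sketch below assumes the secret is supplied through an environment variable; the `get_secret` helper and the `DEMO_API_TOKEN` name are hypothetical and do not come from the report.

```python
import os


def get_secret(name):
    """Read a secret from the environment, failing loudly if it is unset or empty."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required secret: {name}")
    return value


if __name__ == "__main__":
    # DEMO_API_TOKEN is a placeholder name; set it in the shell, not in the code.
    try:
        token = get_secret("DEMO_API_TOKEN")
        print("Token loaded, length:", len(token))
    except RuntimeError as err:
        print(err)
```

Because the value never appears in the published files, uploading the package does not upload the credential with it.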


Sead is a seasoned freelance journalist based in Sarajevo, Bosnia and Herzegovina. He writes about IT (cloud, IoT, 5G, VPN) and cybersecurity (ransomware, data breaches, laws and regulations). In his career, spanning more than a decade, he’s written for numerous media outlets, including Al Jazeera Balkans. He’s also held several modules on content writing for Represent Communications.