Python code libraries are riddled with security holes

Almost half of the packages in the official Python Package Index (PyPI) repository have at least one security issue, according to analysis by Finnish researchers.

The researchers used static analysis to uncover the security issues in the open source packages, which, they reason, end up tainting the software that uses them.

In total, the researchers scanned 197,000 packages and found more than 749,000 security issues.

“With these results and the accompanying discussion, the paper contributes to the field of large-scale empirical studies for better understanding security problems in software ecosystems,” note the researchers in their paper. 

Cause for concern

Explaining their methodology, the researchers note that despite the inherent limitations of static analysis, they still found at least one security issue in about 46% of the packages in the repository.

The paper reveals that most of the issues identified (442,373) are of low severity, while 227,426 are of moderate severity. However, the remaining 80,065 are high-severity issues, and they turn up in 11% of the flagged PyPI packages.
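The severity figures can be checked against the overall total with a few lines of Python. This is only an arithmetic sketch: the counts are the ones quoted in the article, while the variable names are illustrative and do not come from the paper.

```python
# Severity counts as quoted in the article.
severity_counts = {"low": 442_373, "moderate": 227_426, "high": 80_065}

# Sum of the three levels; this should agree with "more than 749,000".
total = sum(severity_counts.values())

# Share of all reported issues that falls into each severity level.
shares = {level: count / total for level, count in severity_counts.items()}

print(f"total issues: {total:,}")
for level, share in shares.items():
    print(f"{level:>8}: {severity_counts[level]:>7,} ({share:.1%})")
```

The three counts add up to 749,864, which matches the reported total, and low-severity findings account for a little under three fifths of it.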

In terms of issue types, problems with exception handling and various forms of code injection were found to be the most prevalent.
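To give a feel for what a static check for these two categories might look like, here is a minimal sketch built on Python's standard ast module. It is not the researchers' tooling, which the article does not describe. The rule set, the find_issues function, and the sample snippet are all hypothetical and cover only two simple patterns.

```python
import ast

# Hypothetical rule set for illustration; real analyzers check far more.
RISKY_CALLS = {"eval", "exec"}


def find_issues(source: str) -> list[tuple[int, str]]:
    """Return (line number, message) pairs for two simple patterns."""
    issues = []
    for node in ast.walk(ast.parse(source)):
        # Pattern 1: an except block whose body is nothing but `pass`.
        if isinstance(node, ast.ExceptHandler):
            if len(node.body) == 1 and isinstance(node.body[0], ast.Pass):
                issues.append((node.lineno, "exception silently ignored"))
        # Pattern 2: a direct call to eval() or exec().
        elif isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name) and func.id in RISKY_CALLS:
                issues.append((node.lineno, f"possible code injection via {func.id}()"))
    return sorted(issues)


# Made-up code under inspection; it is parsed, never executed.
SAMPLE = '''
def load(expr):
    try:
        return eval(expr)
    except Exception:
        pass
'''

for lineno, message in find_issues(SAMPLE):
    print(f"line {lineno}: {message}")
```

Because the code is only parsed and never run, a check like this misses anything that is assembled at runtime, such as a call to eval reached through an alias. That is one example of the inherent limitations of static analysis that the researchers acknowledge.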

“Of the 46% of all packages with at least one issue, the median number of issues is three,” note the researchers. Of course, the issues are not evenly distributed: a few packages are riddled with far more, including five that were found to have more than a thousand issues.

The researchers have reason to be concerned. PyPI has been at the receiving end of several campaigns to poison the repository with malicious packages.

In June this year, PyPI was purged of half a dozen typosquatting packages that contained cryptomining malware, and a month before that the repository was flooded with spam packages.
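Typosquatting relies on package names that are easy to mistake for popular ones. As a rough sketch of how such lookalikes might be flagged, the snippet below compares a candidate name against a short list of well-known names using difflib from the standard library. The POPULAR list, the lookalikes function, and the 0.8 threshold are assumptions made for illustration, and this is not how PyPI itself screens uploads.

```python
from difflib import SequenceMatcher

# Hypothetical shortlist of well-known names; a real check would need a far larger list.
POPULAR = ["requests", "numpy", "matplotlib", "django"]


def lookalikes(name: str, threshold: float = 0.8) -> list[str]:
    """Return popular names that `name` closely resembles without matching exactly."""
    candidate = name.lower()
    return [
        known
        for known in POPULAR
        if candidate != known
        and SequenceMatcher(None, candidate, known).ratio() >= threshold
    ]


print(lookalikes("reqeusts"))  # transposed letters: flagged as close to "requests"
print(lookalikes("requests"))  # the genuine name is not flagged against itself
print(lookalikes("flask"))     # unrelated to every name in the shortlist
```

A similarity threshold is a blunt instrument: set it too low and legitimate packages get flagged, set it too high and creative misspellings slip through.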

Mayank Sharma

With almost two decades of writing and reporting on Linux, Mayank Sharma would like everyone to think he’s TechRadar Pro’s expert on the topic. Of course, he’s just as interested in other computing topics, particularly cybersecurity, cloud, containers, and coding.