16TB of corporate intelligence data exposed in one of the largest lead-generation dataset leaks

  • Researchers found an unprotected 16TB MongoDB database exposing nearly two billion PII‑filled records
  • Data likely scraped from LinkedIn and Apollo.io, possibly tied to a lead‑gen company
  • Database was secured after disclosure, but exposure duration and malicious access remain unknown

More than 16 terabytes of professional and corporate intelligence data, including personally identifiable information (PII), was sitting in an unprotected database, available to anyone who knew where to look.

This is according to cybersecurity researchers at Cybernews, who found the database and described it as “one of the largest lead-generation datasets to have ever leaked.”

Despite the risks and the disruptive potential, unprotected databases remain one of the most common causes of data leaks. In this instance, the researchers found a MongoDB database with almost 4.3 billion documents.

Personally identifiable information

The documents were split into nine collections, including “intent”, “profiles”, “people”, “sitemap”, and “companies”. This structure led the researchers to believe that the data was likely scraped, possibly from LinkedIn and Apollo.io (an AI sales platform).

Of the nine collections, at least three contained personally identifiable information. These collections, holding almost two billion records, exposed people’s names, emails, phone numbers, LinkedIn URLs and profile handles, position titles, employers, employment history, education, degrees and certifications, location data, languages, skills, functions, social media accounts, image URLs, email confidence scoring, and Apollo IDs.

One of the collections also had people’s photographs. The exposed PII puts the affected people at serious risk of identity theft or fraud.

Cybernews says it could not attribute the database to a specific entity beyond reasonable doubt, but it did find clues pointing to a lead generation company.

“The company helps businesses find and connect with potential customers, providing access to a large-scale B2B database of leads that strongly correlates with the type of information included in the exposed database,” the report states. The researchers reached out to that company, and while they did not get confirmation of ownership, the database was locked down two days later.

It is also unknown how long the instance remained open, or whether a malicious actor accessed it before it was secured, though that is certainly possible.

Via Cybernews


Sead is a seasoned freelance journalist based in Sarajevo, Bosnia and Herzegovina. He writes about IT (cloud, IoT, 5G, VPN) and cybersecurity (ransomware, data breaches, laws and regulations). In his career, spanning more than a decade, he’s written for numerous media outlets, including Al Jazeera Balkans. He’s also held several modules on content writing for Represent Communications.
