Why data fragmentation is becoming a business problem, not just a technical one

Gartner expects 60% of AI projects that lack AI-ready data to be abandoned through 2026. "AI-ready" sounds like a technical label. In practice, it describes something much more ordinary: whether a business can actually find, combine, and use the data it already owns.

That is a data fragmentation problem: the state where the information you need to make a business decision is scattered across different systems, formats, teams and legal entities.

Many boards still treat it as an IT issue. It isn't.

Where data fragmentation breaks the business

Engineers describe data fragmentation as silos and duplicated feeds. That is accurate, but it only names the symptoms.

The business version is easier to see. Walk through a typical week at a large company and you will find it:

A global marketer tries to read customer behavior across regions and discovers there are five different definitions of "customer" in use. A publisher wants to show a brand partner which of the brand's customers were actually reached and converted, but proving it means sharing audience data neither side is willing to expose. And those are only two examples of many.

These are composites built from conversations I have had so far this year, not specific clients. But the pattern is the point. Each problem gets logged as a technical ticket and solved in isolation. None of them stays solved, because the next question arrives in a slightly different shape.

That pattern is what costs money. Not any single issue, but the company-wide drag on the speed and quality of decisions made with data the business already owns. It never appears as a line on the finance report. Instead, it shows up as projects that stall, deals that slip, and AI initiatives that never reach production.

The real cost of compressed decisions

The most expensive thing fragmentation does is force decisions under a deadline the data cannot meet.

Take a campaign planning cycle. A brand wants to reach its best customers on premium publisher inventory. Its CRM data exists. The publisher’s audience data exists.

The traditional route requires negotiating a data-sharing agreement, getting legal sign-off on both sides, and building a technical integration. By the time that’s done, the campaign window has passed.

Why the obvious fixes fall short

The instinctive responses to this problem are understandable. Publishers invest further in data management platforms (DMPs) to better package their audiences. Brands build out customer data platforms (CDPs) to unify their customer data. And the major platforms (Google, Amazon, Meta) offer their own clean room environments to enable some degree of collaboration within their walls.

None of these solves the cross-partner problem. DMPs and CDPs are built for internal data orchestration, so they give you a better view of what you already own, but they weren't designed for collaboration with external partners. And walled garden clean rooms solve a narrow version of the problem: you can collaborate with the platform's own data, but you can't bring independent partners in, and the platform itself can see what you're doing.

The common limitation is that each of these tools was built to consolidate or contain data, not to let it work across boundaries while staying where it is. That's a different problem, and it needs a different kind of solution.

Data collaboration is replacing consolidation

A smaller group of organizations has stopped trying to move or merge their data at all. They are using data clean rooms built on privacy-enhancing technologies, or PETs: a family of approaches that includes confidential computing, federated learning, and secure multi-party computation (among others).

The common thread is simple: these tools let two or more organizations ask a question of each other's data, and get a joint answer, without either side ever seeing or copying the other's raw records.
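To make that idea concrete, here is a toy sketch of one of the techniques named above, secure multi-party computation, using additive secret sharing to compute a joint total. It is an illustration only, not a description of how any particular clean room product works. The function names, the modulus, and the three conversion counts are all hypothetical, and the parties are simulated inside a single process.

```python
import secrets

# Public modulus agreed by all parties (an assumption of this sketch).
# Inputs must be non-negative integers smaller than this value.
MODULUS = 2**61 - 1


def make_shares(value: int, n_parties: int) -> list[int]:
    """Split a private value into n random-looking shares that sum to it mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def secure_sum(private_values: list[int]) -> int:
    """Simulate a secure sum: no party ever sees another party's raw value."""
    n = len(private_values)
    # Step 1: each party splits its own value and sends share j to party j.
    distributed = [make_shares(v, n) for v in private_values]
    # Step 2: each party adds up the shares it received, one from every party.
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    # Step 3: only the partial sums are published; together they give the total.
    return sum(partial_sums) % MODULUS


# Hypothetical private conversion counts held by three separate organizations.
conversions = [1200, 800, 450]
print(secure_sum(conversions))  # prints 2450
```

Each individual share is a uniformly random number, so on its own it reveals nothing about the value it came from; only the combined result is meaningful. The sketch uses three parties deliberately: with only two, each side could subtract its own input from the total and recover the other's, which is one reason real deployments restrict which questions may be asked and what outputs may be released.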

At Decentriq we build data clean rooms on this foundation. Global brands, publishers, and especially regulated enterprises use these to collaborate across data that has to stay where it is.

Our Collaborative Audience Platform extends this further, giving organizations a single environment to build, activate, and measure audiences across partner data in real time, without either party’s records ever leaving their own systems.

A few examples of what that looks like in practice:

A major Swiss bank worked with publisher Goldbach to build advertising audiences that resembled its existing best customers. The cost of reaching them dropped 44%. The bank's own customer records never left its systems.

IKEA and Austrian publisher willhaben matched their customer and audience data inside a clean room. Cost per visit fell 30%. The return on each euro of ad spend went up 10%.

In consumer health, Laboratoires Pierre Fabre built a detailed picture of the customers buying its Avène and A-Derma sunscreen brands by combining its own data with audiences from three French publishers (Reworld Media, Groupe Marie-Claire and Média Figaro). No customer data was exposed to any of them.

Ten years ago, each of those questions would have meant pooling the underlying data and absorbing the full compliance overhead. None of these projects did. Each answered one question across one boundary, and left the rest of the data estate as fragmented as it was before.

The question boards should actually be asking

Data fragmentation isn't going to be solved by making every company look like one big database. The organizations pulling ahead have accepted that acting on data doesn't require owning it, and have invested in the ability to collaborate across company boundaries.

For years, boards asked how to collapse the boundaries. A better question is how to act across them. That question has practical answers now, and I see more of them every quarter. The companies asking it aren't waiting for the consolidation project to finish.

"The data you need to understand your customer usually exists, it's just not all in your own systems. The companies pulling ahead have stopped trying to own it all, and started finding ways to act on it together."

This article was produced as part of TechRadar Pro Perspectives, our channel to feature the best and brightest minds in the technology industry today.

The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you are interested in contributing find out more here: https://www.techradar.com/pro/perspectives-how-to-submit

Co-founder and CEO of Decentriq.