Is your data ready? This is the biggest mistake businesses make when building AI systems


Industry 4.0 is reshaping manufacturing, and AI is one of the key catalysts of its success. But given its productivity and quality benefits, why have over 60% of UK manufacturers not implemented AI in their operations? Unlike plug-and-play add-ons, AI needs to be fed quality data, and that is lacking in today's manufacturing industry.

AI systems are only as good as the data they are fed on. Here, Nicholas highlights the roadmap data needs to follow before it can be fed into AI systems to gain actionable insights – from data cleaning and storage to the hybrid pathway to smart factories.

AI success hinges on AI-ready data. So much so that Gartner predicts 60% of AI projects unsupported by AI-ready data will be abandoned by the end of 2026. There is no shortage of data in the manufacturing industry. It comes from machine sensors, IoT devices, and control systems, but this raw data is not AI-ready straight away. First, it must be cleaned, contextualized, structured, and processed.

AI can easily miss signals or raise false alarms when it is fed uncontextualized data. To prevent this, manufacturers need a data platform with secure governance and quality control. This foundation ensures AI is fed accurate and reliable data. Here are the steps manufacturers need to take to make AI-ready data a reality:

Nicholas Lea-Trengrouse

Head of Business Intelligence at Columbus.

1: Data hygiene – simplifying data ensures a smooth AI journey

The multiple sources of data in the manufacturing industry make it hard for AI to interpret and understand that data. Ensuring data is accurate, consistent, and complete is crucial.

Contextual metadata such as machine ID, timestamps, digital product passports, and batch numbers can help manufacturers fix errors, handle missing values, validate sensor outputs, remove duplicates, and flag anomalies before data is fed into AI platforms.
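As a rough illustration of these hygiene steps, the Python sketch below removes duplicates, fills missing values, and flags out-of-range sensor outputs before anything reaches an AI platform. The `Reading` fields, the `clean` function, the forward-fill strategy, and the temperature range are all assumptions made for the example, not a prescribed method.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Reading:
    machine_id: str                 # contextual metadata
    batch: str                      # contextual metadata
    timestamp: str                  # ISO 8601 string, so it sorts chronologically
    temperature_c: Optional[float]  # None models a missing value


def clean(readings, valid_range=(-40.0, 150.0)):
    """Return (cleaned, flagged) lists built from raw readings.

    valid_range is a hypothetical plausibility window for the sensor.
    """
    seen = set()
    last_good = {}                  # last valid temperature per machine
    cleaned, flagged = [], []
    for r in sorted(readings, key=lambda r: (r.machine_id, r.timestamp)):
        key = (r.machine_id, r.timestamp)
        if key in seen:             # remove duplicates
            continue
        seen.add(key)
        value = r.temperature_c
        if value is None:           # handle missing values: forward-fill per machine
            if r.machine_id not in last_good:
                flagged.append((r, "missing, no prior value"))
                continue
            value = last_good[r.machine_id]
        elif not (valid_range[0] <= value <= valid_range[1]):
            flagged.append((r, "out of range"))  # validate sensor output
            continue
        last_good[r.machine_id] = value
        cleaned.append(Reading(r.machine_id, r.batch, r.timestamp, value))
    return cleaned, flagged
```

Flagged rows are returned rather than silently dropped, so an engineer can review them; whether forward-filling is appropriate depends on the sensor and the process.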

2: Management & Security – clear data ownership is key in the most cyberattacked industry

For three years running, manufacturing has been the most cyberattacked industry, and with data coming from multiple sources, the exposure to cyberattacks is vast. Manufacturers need to ensure sensitive data is securely managed by using role-based access control and encryption.

First, clear ownership of data needs to be established, with access rights and compliance rules. Then a data catalogue can be developed so stakeholders know where data is, when it’s available, and how to access it.
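To make ownership, access rights, and the catalogue concrete, here is a minimal Python sketch of a catalogue that refuses datasets without an owner and checks a caller's role before revealing where the data lives. The `CatalogueEntry` and `DataCatalogue` names, the fields, and the role strings are hypothetical; encryption and compliance rules are out of scope for this example.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogueEntry:
    name: str
    owner: str                    # team accountable for the dataset
    location: str                 # where the data is
    refresh: str                  # when it is available
    allowed_roles: set = field(default_factory=set)  # who may access it


class DataCatalogue:
    """Toy in-memory catalogue with a role-based access check."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        if not entry.owner:
            raise ValueError("every dataset needs a named owner")
        self._entries[entry.name] = entry

    def lookup(self, name, role):
        entry = self._entries[name]
        if role not in entry.allowed_roles:
            raise PermissionError(f"role {role!r} may not read {name!r}")
        return entry
```

A call such as `catalogue.lookup("press_line_vibration", role="maintenance_engineer")` would return the entry only if that role was granted access when the dataset was registered.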

3: Make data silos a thing of the past – centralize and contextualize data

One of the biggest barriers to effective data use in manufacturing is the siloing of Operational Technology (OT) and Information Technology (IT). To overcome this, data must be brought into a central platform. But consolidation alone isn’t enough – standardized definitions are needed to align data from the top floor to the factory floor.

Using a unified namespace or knowledge base helps connect equipment, processes, and sensor streams, reducing confusion and enabling consistency across the business.
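One way to picture a unified namespace is as a single naming scheme that every plant and system agrees on. The Python sketch below maps site-specific tag names onto one standardized path; the hierarchy (enterprise, site, area, line, machine, metric), the `STANDARD_METRICS` table, and the `acme` enterprise name are assumptions for illustration only.

```python
# Hypothetical mapping from local tag names to standardized metric definitions.
STANDARD_METRICS = {
    "temp": "temperature_c",
    "tmp_c": "temperature_c",
    "vib": "vibration_mm_s",
}


def to_namespace_path(site, area, line, machine, raw_tag, enterprise="acme"):
    """Build one consistent path for a sensor stream, whatever the local tag name."""
    metric = STANDARD_METRICS.get(raw_tag.strip().lower())
    if metric is None:
        # Unknown tags are rejected so undefined names never enter the namespace.
        raise KeyError(f"no standard definition for tag {raw_tag!r}")
    parts = [enterprise, site, area, line, machine, metric]
    return "/".join(p.strip().lower().replace(" ", "-") for p in parts)
```

With this scheme, a tag called `TMP_C` in one plant and `Temp` in another both resolve to the same `temperature_c` metric, which is the kind of consistency the standardized definitions are meant to provide.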

With these foundations in place, manufacturers can transform raw industrial data into a structured digital twin. The next step? Feeding that data into advanced analytics and machine learning.

4: Old legacy and batch processes won’t cut it – time to upgrade your system

Once raw manufacturing data has been cleaned and contextualized, the next challenge facing manufacturers is where the data can be stored and processed. Many manufacturers still operate with outdated legacy databases and nightly batch processes, which are unable to operate at the speed of Industry 4.0. In fact, over one-quarter of UK organizations have cited legacy technology as a key barrier to AI growth.

Enter modern data architectures. These systems are flexible, scalable, and capable of handling large datasets, and they allow high-stakes manufacturing operations such as supply chains, production lines, and maintenance schedules to run on real-time insights.

5: Unify your business with a data lakehouse

To enable real-time insights, data can no longer simply sit in a data lake, a traditional centralized system that stores large quantities of raw data in its native format. Data lakes are great for capturing sensor readings or machine logs for deeper analysis, but without governance or structure they can quickly turn into ‘data swamps’. To move to the next level, manufacturers need to adopt a data lakehouse.

A data lakehouse is a data architecture that combines the scale and flexibility of data lakes with the governance and schema management of a data warehouse, allowing all areas of the business to work from one unified platform. This means everyone, from data scientists who are interested in the raw, unstructured data to business analysts who want well-structured data tables, can work and collaborate using the same system.

But that’s not all. Data lakehouses store data cheaply while enforcing structure, which supports machine learning, business intelligence, and predictive analysis, fosters collaboration, and speeds up analysis.

6: Speed is everything on the factory floor to reduce maintenance downtime

Given the fast-paced nature of the manufacturing industry, data loses value if it arrives late. Take a factory setting for example. If a crucial machine overheats or malfunctions, it needs to be reported and flagged instantly for maintenance to be actioned.

Real-time streaming technologies process sensor data the moment it’s generated, enabling immediate action when issues arise. But the benefits go further – automated fault detection can spot anomalies in machine temperature or vibration, while live dashboards give operators instant visibility into throughput and quality.

The result? Faster response times, fewer disruptions, and smarter process adjustments reduce downtime, minimize waste, and boost efficiency.
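To give a feel for automated fault detection on a stream, the Python sketch below checks each new sensor value against a rolling window of recent readings and flags large deviations. The `RollingAnomalyDetector` class, the z-score rule, and the window, threshold, and minimum-sample values are illustrative assumptions; a production system would sit behind a real streaming platform and use tuned or learned thresholds.

```python
from collections import deque
from statistics import mean, stdev


class RollingAnomalyDetector:
    """Flag values that sit far from the recent rolling baseline."""

    def __init__(self, window=20, threshold=3.0, min_samples=5):
        self.history = deque(maxlen=window)  # most recent normal readings
        self.threshold = threshold           # allowed deviation in standard deviations
        self.min_samples = min_samples       # readings needed before judging

    def update(self, value):
        """Process one reading as it arrives; return True if it looks anomalous."""
        is_anomaly = False
        if len(self.history) >= self.min_samples:
            mu = mean(self.history)
            sigma = stdev(self.history)
            if sigma > 0 and abs(value - mu) / sigma > self.threshold:
                is_anomaly = True
        if not is_anomaly:
            # Only normal readings update the baseline, so a fault does not mask itself.
            self.history.append(value)
        return is_anomaly
```

Each machine temperature or vibration stream would get its own detector instance, and a `True` result could raise an alert on a live dashboard or open a maintenance ticket.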

7: Unlock smart factories with a hybrid edge-to-cloud model

In today’s manufacturing industry, many data architectures follow an edge-to-cloud model. Edge computing devices in the factory handle the here-and-now tasks such as local inference for anomaly detection or filtering sensor noise, while cloud platforms handle large-scale analytics, historical data analysis, and advanced AI model training. This hybrid model gives manufacturers low latency at the edge and the ability to tap into the vast quantities of data held in the cloud.

The predictive maintenance aspect of the manufacturing digital transformation journey will benefit greatly from this approach, as the edge devices do the real-time monitoring and the cloud uses aggregated data from multiple locations to refine AI models. This is crucial for manufacturers: a recent McKinsey report stated that predictive maintenance can reduce maintenance costs by 10 to 40%, cut downtime by 50%, and increase asset lifetime by 20 to 40%.
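The division of labor between edge and cloud can be sketched in a few lines of Python. In this simplified example, the edge function makes an immediate local check and sends only a compact summary, while the cloud function combines summaries from several sites. The function names, the summary fields, and the 90-degree alarm limit are assumptions, and a simple weighted average stands in for the model training a real cloud platform would perform.

```python
from statistics import mean


def edge_summarise(site, readings, alarm_above=90.0):
    """Runs on the edge device: check locally, then send a compact summary."""
    alarms = [r for r in readings if r > alarm_above]  # low-latency local check
    return {
        "site": site,
        "count": len(readings),
        "mean": mean(readings),
        "max": max(readings),
        "alarms": len(alarms),
    }


def cloud_fleet_baseline(summaries):
    """Runs in the cloud: combine summaries from many sites into one baseline."""
    total = sum(s["count"] for s in summaries)
    # Weight each site's mean by how many readings it contributed.
    fleet_mean = sum(s["mean"] * s["count"] for s in summaries) / total
    return {"sites": len(summaries), "readings": total, "fleet_mean": fleet_mean}
```

Sending summaries rather than every raw reading keeps bandwidth low, although many real deployments also ship raw data to the cloud for historical analysis.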

Manufacturers must handle data with care to unlock AI’s true potential

The manufacturing industry is almost there in unlocking the true potential of AI. Manufacturers already have the data they need to make some strategic shifts and benefit from the powers of Industry 4.0 and AI. Turning raw industrial data into AI-ready data is not a simple process, but the efficiency, quality, and profitability benefits make it worth the effort!


This article was produced as part of TechRadarPro's Expert Insights channel where we feature the best and brightest minds in the technology industry today. The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you are interested in contributing find out more here: https://www.techradar.com/news/submit-your-story-to-techradar-pro


