Why unstructured data poses a challenge for AI adoption

By 2030, AI is set to contribute up to $15.7 trillion to the global economy, driven by productivity gains from businesses automating processes and augmenting existing labor forces. The race is now on for organizations to incorporate AI into their tech stacks and capitalize on the vast potential of this evolving technology. According to IBM’s Global AI Adoption Index, 35% of companies are already using AI in their business, and an additional 42% are exploring how to implement the tech in the near future.

The business case for AI is clear. Automation can streamline workflows, increase employee productivity, assist in risk management and improve customer experience. However, most businesses seeking to adopt AI fail to realize that they need all of their data in one place before they can utilize AI to extract value.

Today, the majority of business data is unstructured. Unlike structured data, which is typically stored in a standardized tabular format on application platforms like Salesforce or Workday, unstructured data is often fragmented across various apps and content repositories. Unstructured data is the business-critical content that lies at the very heart of every organization, from deal-winning legal contracts and sales decks to eye-catching marketing graphics and breakthrough product designs.

But because this data is inherently fragmented, it is incredibly difficult to manage or analyze digitally, and only around 0.5% of unstructured data is currently used by businesses in any meaningful way. Given that 80% of the world’s digital data is expected to be unstructured by 2025, businesses considering AI adoption must get their unstructured data in order first. Otherwise, they will fail to unlock the true potential of what AI can bring to their organizations.

Sébastien Marotte

Sébastien Marotte is the President of Box EMEA.

The challenges of unstructured data

Businesses are generating more and more unstructured data in the form of digital content. Whether it is used for image search in the marketing department, call support in customer service or contract negotiations in the legal department, unstructured data lies at the very heart of a company’s business operations.

Unstructured data is fragmentary by its very nature, and without an efficient content management strategy it can easily get locked in different silos, making it very difficult to analyze. Alarmingly, according to IDC, 90% of unstructured data is never analyzed, meaning that businesses are missing out on valuable insights from data within their company.

This siloed data can also impact productivity levels across an organization. When content is siloed, it can cause data retrieval delays, as employees are forced to navigate different systems or wait for access approval, which can slow down the pace of work. Siloed data can also impede collaboration. Rather than being able to collaborate on a single piece of content in real time, employees may need to manually send files back and forth via email or messaging apps, which not only slows down the collaboration process but can also lead to version control issues.

Unstructured data can also pose significant security risks to organizations. When data is siloed across different apps and locations, it becomes more labor-intensive and expensive to implement a holistic and coordinated security strategy. For instance, each silo may have its own security controls, leading to duplicated effort and increased administrative overhead. Furthermore, each data silo may have varying degrees of access and permission controls, meaning that content could be at risk of getting lost or leaked as a result of user error, or vulnerable to malicious behavior, such as malware, hacks or fraud.

The importance of centralizing unstructured data

Statista predicts that by 2025, global data creation will grow to more than 180 zettabytes. For comparison, in 2020 there were 64.2 zettabytes of data worldwide. In order to get ahead and gain value from all of this data with AI, businesses must first remove silos and centralize their content in the cloud. This will improve productivity and collaboration, and enable the creation of a single centralized security strategy, helping to cut overhead costs.

For example, let’s think about how many teams collaborate on the preparation of a presentation to announce a new product. While the design team is building a deck, the sales team can draft copy together in real time. The product team can input new features directly, rather than sending them internally to the appropriate team. This helps streamline internal and external workflows and enables centralized security protocols, such as permissions management and end-to-end encryption.

Take a global financial services institution as an example. From pitches to client evaluation and documentation, most work gets done through document-centric processes. As such, the majority of the firm’s intelligence is in documents, not in structured data. These documents live in 15 different systems in 5 different regions, making it impossible to simply search for all documents on a single client, and even harder to secure that information. Before this company can even get to AI in order to drive intelligence and efficiency, it must centralize its content as the first step. In the case of an incident like the Bernie Madoff fraud, how could the firm look across all of its documents to determine what its exposure was to a bad actor? Without a centralized hub for business content or the use of AI, it would be a massive project, not a query.
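To make the "project versus query" point concrete, here is a minimal sketch in Python. Everything in it is hypothetical and for illustration only: the system names, the client name, the `Document` fields and the helper functions are assumptions, not a description of any real platform. It simply contrasts visiting every silo in turn with looking a client up in one consolidated index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """A hypothetical record describing one piece of business content."""
    system: str  # made-up name of the system the file lives in
    region: str
    client: str
    title: str


def search_silos(silos: dict[str, list[Document]], client: str) -> list[Document]:
    """Siloed case: every system has to be visited and searched separately."""
    hits: list[Document] = []
    for docs in silos.values():
        hits.extend(d for d in docs if d.client == client)
    return hits


def build_central_index(silos: dict[str, list[Document]]) -> dict[str, list[Document]]:
    """Centralized case: consolidate once, keyed by client."""
    index: dict[str, list[Document]] = {}
    for docs in silos.values():
        for d in docs:
            index.setdefault(d.client, []).append(d)
    return index


# Illustrative data only: two invented systems in two regions.
silos = {
    "crm-emea": [
        Document("crm-emea", "EMEA", "Acme Fund", "Pitch deck"),
        Document("crm-emea", "EMEA", "Other Co", "Client evaluation"),
    ],
    "dms-apac": [
        Document("dms-apac", "APAC", "Acme Fund", "Custody agreement"),
    ],
}

index = build_central_index(silos)

# Once content is consolidated, the exposure question is a single lookup.
exposure = index.get("Acme Fund", [])
print([d.title for d in exposure])
```

In practice the siloed search is far harder than the loop suggests, since each system tends to have its own login, permissions and search interface. The sketch only shows why consolidation turns the question into a single lookup.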

The benefits of AI for business content

Once an organization has a centralized content strategy, it can unlock the true potential of its data with AI. Content can be created in a matter of seconds based on archived materials in the content repository. Manual, repetitive and time-consuming business processes, such as document creation and classification, can be automated and done instantly.

Through image recognition, speech-to-text transcription and text analysis, previously untapped value can be extracted from content, helping to facilitate more data-driven decision making across all aspects of the organization. These capabilities can all help employees save time and focus on key strategic activities to drive business success.

For example, customer service operators can search records of customer conversations for keywords or product names. Graphic designers can search through endless image repositories more easily and quickly to find the right photo for a campaign. Paralegals, who need to understand the key points of a 100-page document, can use AI chatbots to extract the most useful insights in a few short paragraphs. Comms teams can use AI to draw on a company’s content archive to develop new materials, from blogs and press releases to social media posts, that align with the brand’s tone of voice and messaging.

In addition, AI-powered workflows are better protected when it comes to cybersecurity. They can automatically detect malware, hacks, fraud and other security risks, and can also analyze typical user behavior, detecting anomalies and sending warnings when potential breaches are identified.

In the current economic climate, businesses are doing everything they can to get ahead. AI has vast potential in business use cases, but can be redundant if proper organizational structures are not put in place first. Siloed and unstructured datasets pose a huge challenge for AI processing, so businesses must first centralize and consolidate their content in one unified platform before making the leap to adopt these groundbreaking technologies and fully benefit from their true potential.

Sébastien Marotte is the President of Box EMEA. Over a 30+ year career, he’s held executive roles at some of the world's highest-profile software companies including Google, Hyperion, and Oracle.