Making the most of data: from experimentation to action

Therefore, Hadoop is best adopted as part of a matrix of technologies which enable businesses to expand the boundaries of analytics and realise the full potential of their data.

This capability is known as a Unified Data Architecture. In this scenario, Hadoop provides a data lake capability to source and store unlimited data volumes and types. Furthermore, Hadoop is invaluable in preparing information for the analytics most commonly undertaken in a data warehouse.

Data warehouses have evolved to become core business information engines, providing a robust solution that supports analytics across hundreds or thousands of users, with predictable response times and reliable availability, so that business operations can rely on analytics to drive tactical business decisions.

Hadoop complements a data warehouse by lowering the cost of data acquisition and preparation – a key element of the economics of data management.

Increasingly, the ad hoc analytics that have traditionally occurred in a data warehouse do not go far enough to support the wide range of discovery analytics that organisations want to undertake on new sources and with new types of analysis.

Often the data required for this discovery activity is held in raw form in Hadoop, and an exploratory approach is required to unravel the structure and identify the elements of value.

A data warehouse is not the most efficient place to do this, but neither is Hadoop – most organisations lack the data scientists needed to write the complex code that Hadoop requires.

Increasingly, organisations are creating a 'Discovery' environment, which provides user-friendly access to a growing range of analytical techniques. Ideally, this environment should be integrated with Hadoop and the organisation's existing data warehouse to simplify data management.

The next article looks at how fail-fast 'discovery' adopts an iterative approach to problem solving as the basis of a more responsive and flexible development strategy.

  • Duncan Ross is the director of data science at Teradata UK. He works across the International Area and in all industries. Current areas of focus include social network analysis, social media analytics, the integration of big data with transactional data, and driving business decisions through analytics.