IBM to use Apache Spark to underpin pretty much everything it does

Big Blue to assign more than 3,500 IBMers to the open source project

Apache Spark has attracted its biggest supporter yet, with IBM promising to run most of its services on the open source cluster computing framework, including Bluemix, Watson Health Cloud and all of its analytics and commerce platforms.

In addition, Big Blue said that more than 3,500 of its staff will work on Spark-related projects, and that it will open a Spark Technology Center in San Francisco, California.

The company will also release its SystemML machine learning technology to the Spark open source ecosystem, and has pledged to help educate more than one million data scientists and data engineers on Spark.

A new Spark

IBM said that Spark could shake up the data analytics market in the same way that Linux shook up the internet years ago. The company drew a parallel with the popular OS, calling Spark an "analytics operating system".

Spark thrives partly because it excels at analysing data held in memory, with performance improvements of up to 100x in some cases.

Such technology became practical as the average price of system memory fell to the point where large organisations can afford terabytes of it without breaking the bank, provided the gains are tangible.