IBM to use Apache Spark to underpin pretty much everything it does

Apache Spark has attracted its biggest supporter yet, with IBM promising to run most of the company's services on the open source cluster computing framework; that includes Bluemix, Watson Health Cloud and all its analytics and commerce platforms.

In addition, Big Blue said that more than 3,500 of its staff will work on Spark-related projects and that it will create a Spark Technology Center in San Francisco, California.

The company will also release its SystemML machine learning technology to the Spark open source ecosystem, and has pledged to help educate more than one million data scientists and data engineers on the Apache cluster computing technology.

A new Spark

IBM said that Spark could shake up the data analytics market just as Linux did with the internet years ago. The company drew a parallel with the popular OS, calling Spark an analytics operating system.

Spark thrives partly because it excels at analysing data when it is stored in computer memory, with performance improvements of up to 100x in some cases.

Such technology has become practical as the average price of system memory has fallen to the point where, for large organisations, having terabytes of memory doesn't break the bank if the gains are tangible.


Désiré has been musing and writing about technology during a career spanning four decades. He dabbled in website builders and web hosting when DHTML and frames were in vogue and started writing about the impact of technology on society just before the start of the Y2K hysteria at the turn of the last millennium.

