Why even small businesses need big data

Image credit: Shutterstock

Over the past few years, one keyword has come to dominate the digital revolution that has been under way for nearly three decades. That keyword is data.

After years of focusing on processing speed and sophisticated protocols, companies realized that the most valuable asset in this digital age is actually user-generated data, and started restructuring their models accordingly to benefit from every single piece of information generated by every single user.

This strategy was first adopted by companies that were data-centered to begin with, such as Google, but other major corporations quickly followed suit by putting data-driven plans in place and trying to generate as much value as possible from the data at their disposal.

Birth of big data

Three main obstacles prevented a data-centered ecosystem from emerging sooner: the high cost of storage, the difficulty of processing large volumes of data, and the limited amount of user-generated data that companies had immediate access to.

But with giant tech firms dedicating enormous resources to the subject, these three obstacles were quickly overcome, and big data was born. Thanks to achievements at both Google (the MapReduce paper) and Yahoo (the Hadoop project), distributed storage and processing on commodity hardware became a reality. Soon afterwards, distributed in-memory processing followed, marking the start of a new chapter of the digital age: the data-driven era.
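To make the idea behind MapReduce concrete, here is a minimal sketch in plain Python. It is not Hadoop code and uses no Hadoop API: the sample partitions and the function names (map_partition, merge_counts, word_count) are hypothetical, chosen only to illustrate the pattern. Each partition is processed on its own (the map step), and the partial results are then merged into a single answer (the reduce step).

```python
from collections import Counter
from functools import reduce

# Hypothetical input: each string stands in for one block of a large file
# that a real cluster would keep on a different machine.
partitions = [
    "data is the new oil",
    "big data is big",
    "small businesses need data",
]


def map_partition(text):
    """Map step: count the words of one partition, independently of the others."""
    return Counter(text.split())


def merge_counts(left, right):
    """Reduce step: combine two partial word counts into one."""
    return left + right


def word_count(parts):
    # In a cluster, each map_partition call could run on a separate node;
    # here they run one after another on a single machine.
    partial_counts = [map_partition(part) for part in parts]
    return reduce(merge_counts, partial_counts, Counter())


totals = word_count(partitions)
print(totals.most_common(2))
```

Because the map step never looks outside its own partition, adding more data means adding more partitions and more machines rather than a bigger machine, which is what makes commodity hardware sufficient.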

In the years that followed, a steady exponential drop in data-storage costs, accompanied by a steady exponential rise in the amount of data generated every day, meant that big data was the only path forward for major corporations. The open-source Hadoop ecosystem became a crowded space filled with promising technologies and aspiring startups working on big data products.

Yet many small businesses that work with data still keep their distance from this world and its tools, relying instead on trusted SQL-centered software. The reasons such companies give for not adopting big data would have been valid, but only if the question had been asked back in 2011.


Why adopt big data now

We live in an era where data is not only getting more valuable by the minute, but where each minute also produces more data than the one before. Sticking with older technologies will simply delay your business's transition to big data, making it more complex and messy when it eventually happens.

Sure, the first few years were bumpy for big data. Many technologies were introduced as game-changers at the start of the decade but then failed to scale when put to the test in production environments. But twelve years after the first Hadoop release, the ecosystem has matured enough to allow a seamless transition to big data, even for small businesses.

The key advantages of the transition are the following:


Scalability

Scalability is what created the need for big data in the first place. Putting a big data architecture in place means that your infrastructure can scale with your business with little extra effort, so in the long term you can dedicate your resources to other needs instead of constantly adapting your data infrastructure.

Extended functionalities

After nearly a decade of development and improvement by some of the world's best talent, open-source big data technologies offer much more than old-school servers do. With a parallel-processing technology like Apache Spark, you can work with your data in completely new ways and generate insights that are out of reach with older querying models. Additionally, with a tool like Apache Zeppelin, you get direct and immediate access to your data: its interpreters let you query it in whatever way you like and then visualize the output right away in the same notebook.


Speed

With distributed computing, you don't just do more, you do more in less time. In this day and age, in-depth analysis of your data is essential to thrive, and with big data you can extend the reach of your analysis while maintaining near-real-time performance.

With these advantages in mind, sticking with pre-big-data technologies becomes a pointless fight that simply prevents you from generating more value from your most valuable resource. Additionally, thanks to cloud-based services, you can adopt a big data architecture without any hardware hurdles. Going into big data has never been easier, so what are you waiting for?

Mahdi Karabiben, big data consultant at INVIVOO

Mahdi is a data engineer based in Paris. He worked for a data-marketing firm before moving to financial institutions. He works with big data technologies on a daily basis and writes about data on his Medium page.

He is on a mission to prove that all of those petabytes of data stored in distant data centres actually hold the answers to all of our world's problems. After starting his career at Democracy International, where he helped Tunisian ministries work with electoral data, he joined the data-marketing firm Numberly to build data pipelines using big data technologies. Afterward, he moved into the financial sector, working on innovative data projects using cutting-edge technologies.