Innovation, open source and data - how to get ahead

Innovation is essential to staying ahead of the competition. Without new ways of working, new products or new digital services, companies will fall behind rivals that are willing to try new things. For example, EY’s research into the technology, media and telecoms sector found that 87% of companies defined as market leaders believe their success will depend on maintaining their levels of capital investment.

About the author

Bryan Kirshner, Vice President of Strategy, DataStax.

Yet these investments are also risky. Doing something completely new or different can lead to failure: in the same EY report, 63% of respondents admitted that they failed to achieve the returns they had forecast and planned for. At the heart of this is how companies plan ahead to turn their experiments into everyday processes.

Alongside this, our own research in the State of the Data Race found key attributes of “data leaders” - companies that excel at using data to deliver value to customers and are the most likely to derive at least 20% of their revenue from data and analytics. One of those attributes is how these companies combine open source software to build their stacks. Open source plays a significant role in helping these organizations graduate promising small-scale data and analytics initiatives into production and keep them delivering.

Innovation and open source

Open source makes it easier to innovate and to experiment around applications and around data. There are several reasons for this.

The first is the lower cost of deployment. As open source projects have no license costs, it’s easier for organizations to try them out. Alongside this, the community around each project will have experience in combining different projects to meet specific requirements and create more extensive deployments. That experience can help you push ahead with your own experiments and work out how to use data more efficiently.

The second element is how outside pressures can help to spur innovation. The COVID-19 pandemic pushed a lot of businesses to look at new ways of working. It proved a lot in terms of human resolve and kindness, but it also proved that organizations could reinvent many core business processes in days rather than weeks or months. Open source makes a big contribution to that speed of getting started.

Lastly, application modernization projects are growing in importance. IDC found that 86% of respondents this year said they had modernized more than 50% of their legacy applications, up from 65% previously. These projects represent a great opportunity to use open source as a route to updating existing applications, reducing costs, and finding opportunities to innovate. The open source ecosystem offers technologies that range from de facto standards to best-of-breed options for modern, cloud-native applications.

Each of these areas can act as a spur to innovation on its own. What we now understand better is that organizations that put business leaders in charge of their data tend to derive more benefit from this invaluable asset. It might sound a little contrived to say that “information is power”, but data that drives insight and knowledge really is how we empower people to make positive change happen. Open source makes it easier to build these frameworks, and it also keeps the organization in control of how they will run in the future.

Open source also makes it easier to democratize how teams innovate around data. Information and insight projects typically start with the IT function, as this department is used to handling data. However, the data should also be put in the hands of non-technical people to help them innovate around more applied use cases. Presenting data through abstractions and visualization tools makes it easier to interact with over time, and it simplifies how those non-technical teams express what they want to discover.

In the past, we have all suffered from too much top-down development, driven by corporate roadmaps and what the C-suite perceives to be its market-facing targets. We can now look at where the roadblocks to democratizing innovation are and open up opportunities to foster more bottom-up innovation as well.

Putting this into practice

There are so many opportunities to innovate more with data, but we must also face up to the challenges that follow. As we all work more widely with data and the services it can drive, it’s important to keep a handle on the operational side too, as there’s a risk of creating data sprawl and disconnected data silos.

To put this into context, developers can now build data-driven applications using microservices and containers to create each service element. Each of these services is encapsulated, so any one service can be altered or updated without affecting the others, while new services can be added to the mix too. Where previously we may have started our architectural journey by asking what is the “most” we need to build, we can now ask what is the “least” we need to build to run right now. Over time, the data infrastructure to support new innovation can be added.
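
To make that concrete, here is a minimal sketch of one such encapsulated service, assuming a Python stack with FastAPI and uvicorn; the service name, endpoint and in-memory store are hypothetical placeholders rather than a prescribed design.

```python
# Minimal, self-contained "orders" service: the least we need to run right now.
# Assumes FastAPI and uvicorn are installed; names and fields are hypothetical.
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI(title="orders-service")

class Order(BaseModel):
    order_id: str
    customer_id: str
    total: float

# An in-memory store stands in for the data layer we can add later (e.g. Cassandra).
ORDERS: dict[str, Order] = {}

@app.post("/orders")
def create_order(order: Order) -> Order:
    ORDERS[order.order_id] = order
    return order

@app.get("/orders/{order_id}")
def get_order(order_id: str) -> Order:
    if order_id not in ORDERS:
        raise HTTPException(status_code=404, detail="order not found")
    return ORDERS[order_id]

# Run with: uvicorn orders_service:app --port 8000
# Packaging this file in its own container keeps the service encapsulated, so it
# can be updated or replaced without touching other services.
```

Because the service exposes only a small, well-defined interface, swapping the in-memory store for a real database later does not affect anything that calls it.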

To make innovation work, we need to make the data stack work in practice. This means helping businesses make data accessible and reusable, and putting processes in place so that data structures are clearly defined and easy to engineer into other services across the business. With this in place, we can scale up around that data more easily over time. In essence, we need to think about the journey before we start, understand the path ahead of us and then look at how to make that journey easier to carry out.
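
As a small illustration of what a clearly defined, reusable data structure can look like, here is a minimal sketch assuming a Python stack; the entity and field names are hypothetical examples, not a recommended schema.

```python
# A minimal sketch of a shared data contract: one clearly defined structure that
# any team or service across the business can reuse. Field names are hypothetical.
from dataclasses import dataclass, asdict
from datetime import datetime

@dataclass(frozen=True)
class CustomerEvent:
    customer_id: str
    event_type: str          # e.g. "page_view", "purchase"
    occurred_at: datetime
    properties: dict

    def to_record(self) -> dict:
        """Serialize to a plain dict for storage or messaging."""
        record = asdict(self)
        record["occurred_at"] = self.occurred_at.isoformat()
        return record
```

Agreeing on a contract like this once means every downstream service, dashboard or analytics job reads the same fields in the same way.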

Open source plays a central role here at each level of the infrastructure involved. At the database layer, options like Apache Cassandra can provide distributed data support for services that scale to hundreds of thousands or millions of operations per second. Services like Apache Pulsar can act as the message bus, managing event streaming between application components and services. Underneath all of this, Kubernetes can act as the orchestration platform, managing the infrastructure and scaling it as needed. Together, these give us the ability to approach the holy trinity of speed, resilience and scale with confidence.
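
As an illustration of how these pieces can fit together, here is a minimal sketch assuming the Python cassandra-driver and pulsar-client libraries; the keyspace, table and topic names are hypothetical, and a production service would add error handling, configuration and schema management.

```python
# Minimal sketch of the stack described above: Cassandra for distributed storage,
# Pulsar as the message bus. Assumes cassandra-driver and pulsar-client are
# installed; keyspace, table and topic names are hypothetical.
import json
import uuid

from cassandra.cluster import Cluster
import pulsar

# Connect to a local Cassandra node and a local Pulsar broker.
session = Cluster(["127.0.0.1"]).connect("shop")    # hypothetical keyspace
client = pulsar.Client("pulsar://localhost:6650")
producer = client.create_producer("order-events")   # hypothetical topic

def record_order(customer_id: str, total: float) -> str:
    """Write the order to Cassandra, then publish an event for other services."""
    order_id = uuid.uuid4()
    session.execute(
        "INSERT INTO orders (order_id, customer_id, total) VALUES (%s, %s, %s)",
        (order_id, customer_id, total),
    )
    producer.send(json.dumps(
        {"order_id": str(order_id), "customer_id": customer_id, "total": total}
    ).encode("utf-8"))
    return str(order_id)

# In a Kubernetes deployment, the application, Cassandra and Pulsar each run as
# their own workloads, and the orchestrator handles scaling them as needed.
```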

There will still be challenges along the way, and some migrations to new cloud-centric data architectures can appear to be “heavy lift” jobs at the outset. However, most of these roadblocks are caused by issues of perception, process, or culture, not by the technology itself. Working on the people element is the most effective route to long-term success and to ensuring that developers and other staff across the company can build new services and innovate around data. The fact that many of the components involved are open source should help clear those hurdles, as they can be applied in whatever way best meets specific team goals and the ambitions of the wider business.
