As cloud is adopted more widely and its benefits are realised, there is an increased focus on security. Keeping data and infrastructure safe from external threats, and often from internal risks, is a key element of any cloud strategy.

However, beyond that, data hosted by a third party is subject to other types of risk such as unexpected downtime and failure.

Having a comprehensive disaster recovery (DR) and business continuity plan in place is therefore vital for any organisation with data, applications or infrastructure hosted either on-premises or by a third party.

However, while 66% of respondents in recent research by analyst Forrester rated this as a high or critical business priority, disaster recovery accounted for only 5.8% of the overall IT budget for companies in 2013.

Fine-tuning the DR strategy

The first consideration in any successful DR strategy is ensuring that there is a low risk of failure, whether the data is hosted on- or off-premises.

For externally hosted solutions, this means partnering with a provider that has the credentials (such as ISO 27001) and the expertise to supply the infrastructure, connectivity and support needed to guarantee uptime and availability.

The next step in the strategy is developing the actual DR service, detailing the steps to be taken in the event of a failure. DR affects the entire organisation and covers the connectivity around it, the handling of the data and the switchover of users.

How will you access your environment in the event of a disruption? Via a dedicated communications link or a VPN? Does your organisation already have this additional connectivity, or does it still need to be put in place? What is the cost of that additional connectivity?

Another critical issue is whether the DR service can be effectively tested. Often it is difficult to evaluate and monitor it without experiencing a system disruption, and testing complex DR plans can also carry a degree of risk.

Rigorous testing

That said, regular redundancy tests are crucial to ensure that workloads will fail over as planned once the DR service is activated.

Should an incident occur during a test, the DR service can be properly evaluated and its performance measured against what was expected. In this way, areas for improvement can be identified and incorporated into the service after the test, without impact on the production service.

The requirements of an effective DR service might be hard to define, but the need for one remains. Needs may change over time as your infrastructure or resources evolve, so continuous evaluation and testing of the plan must be the final step in its development.

  • Russel Ridgley is an enthusiastic systems architect, developer and product designer at Pulsant. His specialties are hosting technologies, networking, Microsoft .NET development and database design.