Even minor drops in availability can result in lost revenues, reduced output, and decreased productivity.

Take a second to consider this: can you guarantee, beyond any reasonable doubt, that the people using your web or online services will never experience more than a few minutes of downtime?

What do you think happens when service outages plague a system? If your system is a consumer-facing one, customers move elsewhere, or simply complain because of inaccessibility and poor service.

If it’s a service used for internal business processes, your employees lose productivity, or work simply stops.

This is especially true of time-sensitive services such as online banking, stock trading and the like, which require a strong and predictable level of reliability.

Any crippling downtime or delay can result in lost revenues, reduced output, and decreased productivity.

Enterprises that want to scale or online businesses that are already gaining traction will have to implement strategies for ensuring uptime, thereby also ensuring customer satisfaction.

In both the consumer-facing and B2B environment, one of the best ways to combat these issues is by employing high availability strategies, which keep services running at availability levels very close to 100%.

How to achieve high availability

High availability, often shorthanded as the "five 9s" after its 99.999% uptime target (a little over five minutes of downtime in a calendar year), describes a system designed and implemented to keep performing reliably throughout a given period.

It requires eliminating single points of failure, building reliable crossover to redundant components, and detecting potential failures before they happen.

This type of system all but eliminates the chance that transactions won’t be completed due to traffic spikes or infrastructure overload.
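The "five 9s" label is easiest to grasp as a downtime budget. A quick sketch of the arithmetic, using a 365-day year:

```python
# Downtime budget implied by an availability target over one calendar year.
MINUTES_PER_YEAR = 365 * 24 * 60

for label, availability in [("two nines", 0.99),
                            ("three nines", 0.999),
                            ("four nines", 0.9999),
                            ("five nines", 0.99999)]:
    downtime_min = MINUTES_PER_YEAR * (1 - availability)
    print(f"{label:>12}: {downtime_min:8.2f} minutes of downtime per year")
```

Five nines works out to roughly 5.26 minutes of downtime per year, which is why the target is so demanding.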

In industries such as financial and banking services, or any industry that employs mission-critical applications, this will ensure that service-level agreements (SLAs) are adequately met.

Compliance with SLAs will mean happy customers and less likelihood of penalties or losses associated with failure to meet such performance targets.

Here are some of the things you can do to maintain high availability:

1. Use network clustering

High-availability clustering ensures that if an application crashes on the service level, there will be another “failover” server to pick up the slack and continue providing services.

This can involve provisioning failover clusters on on-premises server infrastructure. In the cloud, examples include establishing HBase clusters on Windows Azure or Cluster Compute Instances on AWS.
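The failover idea can be sketched in a few lines: route to the primary while it answers health checks, and fall through to a standby when it stops. The server names and the health-check function here are illustrative, not from any particular clustering product:

```python
def choose_server(servers, is_healthy):
    """Return the first healthy server in priority order; failover
    happens implicitly when the primary stops passing health checks."""
    for server in servers:  # ordered: primary first, then standbys
        if is_healthy(server):
            return server
    raise RuntimeError("no healthy server available")

# Hypothetical cluster: a primary plus one failover standby.
cluster = ["app-primary", "app-standby"]
down = {"app-primary"}                  # simulate a primary crash
healthy = lambda s: s not in down
print(choose_server(cluster, healthy))  # → app-standby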

2. Make strong infrastructure choices

Before you purchase or rent a server, make sure it can carry your projected load for the long term, so you won't be forced into an upgrade any time soon.

Upgrades require downtime.

3. Scale OUT your infrastructure instead of scaling up

Rather than improving the capacity of their existing servers (scaling up), large enterprises expand their network of servers by adding new ones (scaling out). This divides the load across your infrastructure quickly and easily.

This is how a large-scale service like Facebook manages its more than a billion daily users: through a combination of distributed infrastructure and load balancing.
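One common way a scaled-out service spreads users across its machines is hash-based sharding. A minimal sketch (the user id and server count are made up for illustration):

```python
import hashlib

def shard_for(user_id: str, num_servers: int) -> int:
    """Map a user to one of num_servers by hashing the id, so the
    same user always lands on the same shard and load spreads evenly."""
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    return int(digest, 16) % num_servers

# The same user deterministically maps to one shard out of four.
print(shard_for("user-42", 4))
```

Note that plain modulo hashing remaps most keys when `num_servers` changes; production systems usually layer consistent hashing on top so that adding a server moves only a fraction of the users.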

4. Use load balancing

Each of your servers has a finite amount of computing power and bandwidth. Load balancing lets you distribute incoming network traffic to the servers with the most resources to spare.

This involves various architectures and service layers, including on-premises appliances, cloud-based services, and purely DNS/software-based approaches.

There is no one-size-fits-all approach to load balancing, although this resource on load balancing choices identifies various approaches and strategies that are applicable to enterprises of all sizes.
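One widely used strategy is "least connections": send the next request to the backend currently handling the fewest connections. A minimal sketch, with hypothetical backend names:

```python
def pick_backend(connections: dict) -> str:
    """Least-connections strategy: choose the backend with the
    fewest active connections right now."""
    return min(connections, key=connections.get)

# Current active connections per backend (illustrative numbers).
active = {"web-1": 12, "web-2": 4, "web-3": 9}
backend = pick_backend(active)
active[backend] += 1          # the new request is now being served there
print(backend)                # → web-2
```

Round-robin and weighted variants follow the same shape; the difference is only in how the next backend is chosen.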

5. Learn your RTO/RPO

The recovery time objective (RTO) is the amount of time you need to be back up and running before your business can no longer function.

The recovery point objective (RPO) determines how out-of-date your data will be by the time you manage to flip the switch back on, and is governed by how often you back up.

Both of these values help you construct a contingency plan that guarantees you can recover from catastrophic outages within an acceptable window.
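The relationship between these objectives and your plan can be made concrete: a plan is adequate only if restores finish within the RTO and backups run often enough that no restore loses more than the RPO. The targets below are hypothetical examples, not recommendations:

```python
from datetime import timedelta

# Hypothetical targets for a contingency plan.
RTO = timedelta(hours=4)   # max tolerable time to restore service
RPO = timedelta(hours=1)   # max tolerable age of the restored data

def plan_is_adequate(restore_time, backup_interval):
    """Check a plan against both objectives: restores must fit inside
    the RTO, and the backup cadence must fit inside the RPO."""
    return restore_time <= RTO and backup_interval <= RPO

print(plan_is_adequate(timedelta(hours=3), timedelta(minutes=30)))  # → True
print(plan_is_adequate(timedelta(hours=6), timedelta(minutes=30)))  # → False
```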

6. Test your recovery plan

How do you know your plan will work? Have you tried it? Run a small-scale copy of your service locally and simulate a failure, then put your recovery plan into action.

How long did it take? Did you meet your RTO? A third-party approach may be necessary, such as the disaster recovery testing solutions by PlatformLab or RES-Q.
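The drill described above is essentially "break it, then time the restore". A minimal harness for that, where the failure and recovery callables are stand-ins for your real procedures:

```python
import time

def run_recovery_drill(simulate_failure, recovery_plan, rto_seconds):
    """Break a local copy of the service, then run the recovery plan
    with a stopwatch running and compare the result against the RTO."""
    simulate_failure()
    start = time.monotonic()
    recovery_plan()
    elapsed = time.monotonic() - start
    return elapsed, elapsed <= rto_seconds

# Hypothetical stand-ins for a real outage and restore procedure.
elapsed, met_rto = run_recovery_drill(
    simulate_failure=lambda: None,
    recovery_plan=lambda: time.sleep(0.1),  # pretend the restore takes 0.1 s
    rto_seconds=1.0,
)
print(f"recovered in {elapsed:.2f}s, RTO met: {met_rto}")
```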

Maintaining high availability at all times reduces your risk of lost revenue, dwindling customer bases, and lower conversion. Even a downtime of a few minutes can cost upwards of a few thousand dollars.

Mitigating this risk is not only convenient for users, but also creates a stronger reputation for your capability as a business or service provider.

Published 21 April, 2015 by Jacob McMillen

Jacob McMillen is a professional copywriter, marketing blogger, and the content director for CoachTube. Follow him on Twitter.

Comments (3)

Stuart McMillan, Deputy Head of Ecommerce at Schuh

Hi Jacob, I'd actually say you have a point 0 (due to its importance): caching. In my experience an effective caching strategy is the number one way of effectively scaling a web application.

Firstly: get your browser caching sorted. The easiest request to handle is the one never made.

Secondly: Server-side caching. With most websites, the HTML is 100% generated by some sort of scripting/programming language that interacts with the database; this can be a computationally expensive operation, as opposed to simply outputting a static HTML file. Generate this HTML as infrequently as possible and cache the result, either as static files or (better) in memory (for commonly accessed pages).

RAM is cheap these days, so add tons of it to your servers and cache in-memory; serving files from disk is comparatively slow (as most hard disks on servers are still spinning platters). Cache these pages as close to your network edge as possible (or even better, someone else's network edge).

Do absolutely everything you can to avoid "work" within your application. If your application servers and database servers are only lukewarm and your caching layer is red hot, then you're doing it right. It's easier to scale caching.

over 3 years ago


Deri Jones, CEO at SciVisum Ltd

Jacob / Stuart - nice tips!

In case any marketers are thinking this stuff is 'just for the techie teams'...

In the real world, unless marketing and business teams are going to unite to make user experience a priority, then it doesn't happen. At least that's my experience across a range of organisations: my day job is helping companies ensure a day-in-day-out working, fast, error-free site for mobile and non-mobile users.

Jacob expressed it nicely:

> Enterprises that want to scale or online businesses that are already gaining traction will have to implement strategies for ... customer satisfaction.

as those strategies can only start with marketing/commercial teams, who have the best insight into which user types and which online journeys matter most to the business.

over 3 years ago

Jacob McMillen, Copywriter at JacobMcMillen.com

Great point Stuart!

For the purpose of this article, I assumed that readers interested in the above strategies would have already implemented a caching strategy. As you said, it is Step #0!

And Deri,

You are 100% correct! Site performance and availability should be as important to the marketing team as it is to the IT crew.

about 3 years ago
