[Rdo-list] HA pair instances

Mon May 2 15:19:38 UTC 2016

> From: rdo-list-bounces at redhat.com [mailto:rdo-list-bounces at redhat.com] On Behalf Of EXT Netravali, Ganesh
> Sent: Monday, May 02, 2016 5:46 AM
> Subject: [Rdo-list] HA pair instances

> Hi 
> I need to build a HA pair between two instances launched on Openstack. Which is the best
> solution available? Can someone please point me to the details?

Ganesh,
There are many aspects to HA, and you aren't likely to get a simple answer that covers all of
its complexity. Are you concerned with availability of a service, or the likelihood
that a packet sent to a service is processed exactly once, regardless of faults occurring?
Are you concerned that a network link will fail, making your service unavailable? Are
you concerned that the cloud runs on commercial power and that a lightning strike could
disrupt the power? Are you concerned that a server will fail, and that both your instances
might be running on the same server? Are you concerned that a 100-year flood may put your
data center under water? Or are you just concerned that your application may have a bug causing
it to fail, and you want a second instance available to back it up? Does your application
maintain any state, and does that state need to be available even if an application fails?
Is it important to keep out hackers to ensure they don't disrupt your application? Are
you concerned about a denial-of-service attack? Is the latency of your application part
of its perceived availability?

There are approaches to addressing each of the above issues to some level of reliability, but
it's impossible to get to 100% availability. You can get quite close, to five or six nines
(99.999% to 99.9999% available), with the right design choices and long enough operation
to shake out the issues in the application and its HA support. There have been books written
about high availability, and you probably need to read some to avoid simplistic approaches
which may not meet your expectations. Some of the approaches which may be important include:
anti-affinity, network redundancy, emergency power, and geographic redundancy. But you
may find that the cloud you expect to use will never give you the availability characteristics
you need.
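
To put the "nines" in perspective, here is a rough Python sketch that converts an availability
figure into downtime per year and estimates what a redundant pair buys you. The function names
are mine, and the pair formula assumes the two instances fail independently and that failover
is instantaneous. Neither assumption holds if, say, both instances land on the same server,
which is why anti-affinity matters.

```python
# Rough availability arithmetic; names and assumptions are illustrative only.
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # average year, including leap years


def downtime_seconds_per_year(availability: float) -> float:
    """Expected downtime per year for a given availability fraction (0..1)."""
    return (1.0 - availability) * SECONDS_PER_YEAR


def pair_availability(single: float) -> float:
    """Availability of a redundant pair.

    Assumes the two instances fail independently and failover is instant,
    so the pair is down only when both instances are down at once.
    """
    return 1.0 - (1.0 - single) ** 2


if __name__ == "__main__":
    for a in (0.999, 0.99999, 0.999999):
        minutes = downtime_seconds_per_year(a) / 60.0
        print(f"{a:.6f} available -> about {minutes:.2f} minutes down per year")
    print(f"pair of 0.99 instances -> {pair_availability(0.99):.4f}")
```

Five nines works out to a little over five minutes of downtime per year, and six nines to about
half a minute, so there is not much room for manual recovery at those levels.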

Weighing the cost of your application being down against the effort needed to address
each of the potential issues will help you understand what is really important. Will someone
die because your application is not available? Will you lose revenue when your application is
down, or will people wait until it's up again? Will you lose customers to competitors if your
site is unreliable?
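
If the cost of downtime is mostly financial, the comparison can be sketched as simple arithmetic.
The following Python is only an illustration: the function names and every number in the usage
lines are hypothetical placeholders, and it assumes that loss scales linearly with hours of
downtime, which ignores reputational damage and safety-critical cases.

```python
# Illustrative cost comparison; all inputs are hypothetical placeholders.
HOURS_PER_YEAR = 365.25 * 24


def annual_downtime_cost(availability: float, cost_per_hour_down: float) -> float:
    """Expected yearly loss, assuming loss is linear in hours of downtime."""
    return (1.0 - availability) * HOURS_PER_YEAR * cost_per_hour_down


def mitigation_pays_off(current: float, improved: float,
                        cost_per_hour_down: float,
                        annual_mitigation_cost: float) -> bool:
    """True if the yearly loss avoided exceeds the yearly cost of the mitigation."""
    saved = (annual_downtime_cost(current, cost_per_hour_down)
             - annual_downtime_cost(improved, cost_per_hour_down))
    return saved > annual_mitigation_cost


if __name__ == "__main__":
    # Hypothetical: going from 99.9% to 99.99% when an hour of downtime costs 1000.
    print(mitigation_pays_off(0.999, 0.9999, 1000.0, 5000.0))
    print(mitigation_pays_off(0.999, 0.9999, 1000.0, 10000.0))
```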

Regards,
John Haller