Is your service highly available?
Is it 99.999% available?
The real question is, does it matter?
Availability is defined as the total ‘uptime’ of a given service, over a window of time. This metric is typically counted in minutes, and those with experience in a Software-as-a-Service world, typically want to know, “What’s the Availability of the Service?”
Availability becomes important when users are vetting which vendors or businesses they would like to work with. A high level of availability, provides a level of confidence with the vendor or business, so that users know that when they need to use the service, it will be online and available.
But, as technology grows, and man does it grow fast, availability is beginning to become a buzz word, and here’s why.
Availability is Beginning to become a Buzz Word
A service or application can be highly available. It can be online and available for 99.999% of the year, but the real question is, is it working as end users expect it to work?
Let me explain.
Availability, calculated by the number of minutes a service is online over a period of time, is often calculated that simply. Often times a simple ‘ping’ or health check against the hardware and endpoints of the service, is all that is needed to get an accurate reading on availability, but this doesn’t account for bugs, degradations, partial outages, low performance, or missed customer expectations.
If a user attempts to use a service your team is offering, and it’s online, but it takes far to long to complete it’s work, does the user really care that the service is available 99.999% of the time?
Availability metrics like this tell you almost nothing about the reliability, or the quality of the service, beyond that it is online. With the growth of cloud computing and big players like AWS, standing up a handful of servers that are able to reach five nines of availability has never been easier. Being highly available is no longer something that separates the good from the great, it’s a flat out expectation.
Being Highly Available is no longer something that separates the Good, from the Great. It’s a flat out Expectation.
Not to worry though, where High Availability is now becoming the norm, business specific SLO’s and SLA’s, will be the next phase that separates the good from the great. Proper SLO’s that are built around specific business use cases, that meet customers expectations, will soon become the topic of conversation that replaces the low value data around availability.
Superior SLO’s that are instead based on the business, and not the technology, will reign supreme. SLO’s such as:
- Users can execute the search functionality, and it completes successfully 99.9% of the time
- Users can execute a job that completes successfully 99.99% of the time
- Users can perform the key functionality of the service 99.9% of the time
These types of business specific SLO’s that ensure, not only is the system available, but that it’s working exactly as it’s expected to work, will be the next wave of, “How available is your service?”
I highly encourage all teams adopting SRE cultures and SLO’s, to take that next step into what meeting customer expectations is really all about. When your next business opportunity arises and the crowd wants to talk about “Availability” they’ll be blown away by the fact that you’re able to provide a level of detail around the health and quality of your services, that goes so much deeper and is so much more impactful, than simply, “Availability.”