Amazon goes down… Handling your website outage
Last week Amazon (yes, the multi-billion dollar, pioneering, everyone-knows-it-and-trusts-it Amazon) went down. During the roughly 3 hour “service interruption” many customers were intermittently unable to access the site, so orders all but halted, prompting many investors to pull a quick trigger and dump the stock (shares closed down 4.59% after the outage, though on a day when the NASDAQ fell 2.96% in aggregate). Of course many people have been quick to point out that 3 hours of downtime does not equate to 3 hours of lost business; indeed many people likely noticed the outage and returned later. Whether there was a substantial revenue loss, or whether the stock deserved to suffer, is something for a financial blog to ponder. My takeaway from the outage was simply this: any site can go down. But if you’re running a smaller website without the brand power of an Amazon, how do you respond and handle the incident?
As a smaller site, going down means a few things. First, you have an immediate loss of sales and traffic for that period. Unless you offer something very particular or have a wonderful amount of name power, only a small portion of the people who stumbled onto your site are likely to return. The majority will instead wander off to the next site Google has to show them, or forget the need altogether. The other half of the impact falls on your existing traffic and userbase, who may lose faith in your offering and seek greener pastures, especially if the outage is prolonged or a recurring issue. This is especially problematic for community and social sites that live off the content and loyalty of their users. Since practically every niche has multiple sites competing, an outage at a small or even a large community can be a reason for people to jump ship, leaving the owners scrambling to rebuild years of work.
Even if your site doesn’t see a major drop off from an outage, being down and being silent about it is a PR nightmare and yes, PR does matter. Combine this with any other issues, or just a slow news day, and all of a sudden you’ll find the outage becomes more important than all your years of positive contribution to your niche.
Despite the many negative impacts of an outage, I still stumble across downed e-commerce sites, blogs and forums with no messaging and no responsiveness all the time. Sure, if your site’s just a non-revenue generating hobby it can be easy to toss it aside, but if you’ve got something going, why let it take the hit?
The solution to keeping an outage a positive customer experience (or at least a more positive one) is really rather simple, at least from the perspective of saving some face: inform your customers (duh!). When someone types in your URL or clicks through to your site only to find it spitting out some strange error or simply not responding, they’re gone. When they return, and if they return, depends on things that are outside of your control. But what if, instead of going down and spending all your energy on getting back up, you spent a little time updating all the people who are trying to reach you right now and began the process of controlling the spin and the reaction? In short, you want a version b to show your customers, a friendly message for an unfriendly reality: no one can visit your site right now, but that’s ok because you understand their frustration, you know they want to be there and you’re working to get it back up right now.
What I generally suggest to smaller sites is to set a threshold. If after 10 or 15 minutes (even 5 may be good if you’ve got enough technology sense to know that quickly whether the problem is big) you can’t properly bring your complete site back up, it’s time to roll out your version b.
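To make the threshold idea concrete, here is a minimal sketch in Python. The function name should_show_version_b and the 10 minute default are my own assumptions for illustration; use whatever number fits your site and your team.

```python
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical default; the advice above is somewhere between 5 and 15 minutes.
FALLBACK_THRESHOLD = timedelta(minutes=10)


def should_show_version_b(
    down_since: Optional[datetime],
    now: datetime,
    threshold: timedelta = FALLBACK_THRESHOLD,
) -> bool:
    """Return True once the site has been down for at least `threshold`.

    `down_since` is when the outage was first noticed, or None if the
    site is currently up.
    """
    if down_since is None:
        return False
    return now - down_since >= threshold
```

The point of writing the rule down, in code or just on paper, is that nobody has to debate it in the middle of an outage: once the clock passes the threshold, version b goes up.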
What should your version b include? Well, that depends on your site. For most sites a simple “We apologize” page is really all you need. A few lines explaining that you’ve had an unexpected error and are working to resolve the problem is a perfect start. Optionally (and ideally) throw in a contact method (e.g. a phone number for an e-commerce company to capture sales through, an email address for a content site, or even a link to another site where a community or social site’s users can discuss the issue). For more advanced sites and more serious outages, letting people enter their email address into a notification database so you can alert them when you’re back is a great way to get back to people.
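For the simple “We apologize” case, a version b can be very small indeed. The sketch below uses only Python’s standard library; the names build_outage_page, OutageHandler and serve, the wording of the page and the support@example.com contact are all placeholders of mine, not a prescription. Answering with a 503 status and a Retry-After header is my own addition: it tells browsers and search engines the problem is temporary.

```python
from html import escape
from http.server import BaseHTTPRequestHandler, HTTPServer


def build_outage_page(contact: str, detail: str = "") -> str:
    """Render a minimal apology page; `contact` and `detail` are plain text."""
    extra = f"<p>{escape(detail)}</p>" if detail else ""
    return (
        "<!DOCTYPE html><html><head><title>We apologize</title></head><body>"
        "<h1>We apologize</h1>"
        "<p>We've had an unexpected error and are working to resolve it.</p>"
        f"{extra}"
        f"<p>In the meantime you can reach us at {escape(contact)}.</p>"
        "</body></html>"
    )


class OutageHandler(BaseHTTPRequestHandler):
    contact = "support@example.com"  # placeholder contact method

    def do_GET(self):
        body = build_outage_page(self.contact).encode("utf-8")
        self.send_response(503)  # Service Unavailable: a temporary condition
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Retry-After", "600")  # hint: try again in 10 minutes
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Serve the outage page for every GET request until interrupted."""
    HTTPServer(("", port), OutageHandler).serve_forever()
```

Calling serve() on a spare machine is enough to try it out. The optional detail argument is where a longer explanation can go if the outage drags on.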
If your outage becomes more severe than just a few minutes of reboots and tweaks, it may be time to update your version b site with a bit more information. If you can provide a rough estimate of when the site will be back up (be conservative here, very conservative), that’s perfect. If not, explaining that it’s an extended outage will at least reset expectations. Ideally you can also consider kicking over to a secondary server, even if it’s not a perfect replica at this point… but that requires having something in place…
At the risk of sounding overly commercial, there are also a few ways you can benefit (at least in one way or another) during an outage. For community and content sites, outages can present a chance to push for paid memberships and subscriptions, which loyal users are much more likely to spring for if they feel it will help ensure their favorite site stays up and running (this works best for sites with little visible revenue). For more commercial sites, you can try pointing people to your other brands and properties. Indeed, some sites have been seeded out of outages. You’re not likely to come out even with having remained up but hey, it’s something, and as a marketer or small business owner, getting something out of a problem is a whole lot better than getting nothing at all.
One of the most important parts of getting a handle on outages is being prepared for them ahead of time. If you’re running a small site, or really almost any self-funded site, chances are you use only one hosting provider, which means that if you’re really down, and not just unable to access a database or something similarly limited, you can’t put up a version b. Even if your site is only experiencing a few issues, having to write that outage page takes time away from addressing the issue and slows down your notification process. My suggestion is always to have something available and ready at another datacenter or on another hosting provider, with a third party DNS that you can just switch over. After 10 minutes of downtime the switch gets flipped, and people see the message a few minutes later.
The key takeaway from unfortunate examples like those of Amazon is that downtime is always a marketing issue, not just a technology one. If marketing works with IT to develop a plan, the effects are minimized, PR doesn’t go as far south and IT isn’t getting screamed at all day long. So while no one wants to go down for seconds let alone hours, being prepared for it is just like buying auto insurance – you have to do it and you really should do it right.
P.S. A final thought on this topic, a bit more on the IT side of things: if you’re running a smaller site you really do need to think like a big site when it comes to outages and put notifications in place. Outages can happen 24×7 and they don’t care if your team is sitting at their desks or if you’re at a group retreat in the Swiss Alps and it’s 3am after a redeye flight… Be sure you get notified when something goes wrong or all the planning and slick marketing in the world won’t make a bit of difference.
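For the do-it-yourself crowd, a notification check can be sketched in a few lines of Python using only the standard library. The names site_is_up and check_and_alert are hypothetical, as is the rule of alerting after three failed checks in a row; notify stands in for whatever actually wakes you up (email, SMS, a pager service).

```python
import urllib.request
from typing import Callable


def site_is_up(url: str, timeout: float = 10.0) -> bool:
    """Return True if `url` answers with a 2xx or 3xx status within `timeout`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:  # URLError, HTTPError and socket timeouts are OSError subclasses
        return False


def check_and_alert(
    url: str,
    notify: Callable[[str], None],
    probe: Callable[[str], bool] = site_is_up,
    failures_so_far: int = 0,
    alert_after: int = 3,
) -> int:
    """Probe once and return the updated count of consecutive failures.

    `notify` is called exactly once per outage, when the count reaches
    `alert_after`, so a single blip does not page anyone at 3am.
    """
    if probe(url):
        return 0
    failures = failures_so_far + 1
    if failures == alert_after:
        notify(f"{url} has failed {failures} checks in a row")
    return failures
```

Whatever you use, run the check from somewhere other than the server it is watching; a monitor that goes down with the site can’t tell you anything.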
P.P.S. Oh, and to get back to the opening paragraph and the inspiration for this post: Amazon had a message up shortly after their outage explaining that they were working on it (and, since the site was up and down for a bit, advising people to double check their orders). They also had an active discussion going on in their forums, which were apparently running separately from the main site.
Read more about the Amazon outage at the NYT online: http://bits.blogs.nytimes.com/2008/06/06/amazons-web-site-goes-down-an-unplanned-event/?ref=technology
Anyone want to take bets on how long it is before this blog crashes now that I’ve made this post?
