I can guarantee that those affected by the RDS outage will be thinking long and hard about using it again – databases are the crown jewels for any company. Combine temporary data access loss with extremely poor corporate communications from Amazon, and you get a recipe for 96-hour heartburn.
I talked to a rep from AWS at the MySQL conference recently, and he confirmed (their biggest complaint, actually) that you can’t set up your own MySQL slave to pull from RDS, and that there are restrictions on how you can move their backup files, which basically leaves you with mysqldump.
So what a lot of people do is run two regular EC2 instances, each with MySQL installed, as a master-slave pair. Since you have root access on a normal AMI, you can modify the MySQL configuration to enable replication.
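As a rough illustration of the configuration change involved, here is a minimal sketch in Python that renders the `[mysqld]` settings for each half of such a pair. The function name `replication_config` and the log base names `mysql-bin` and `mysql-relay-bin` are placeholders I made up for the example; `server-id`, `log-bin`, `relay-log` and `read-only` are standard MySQL options. It is a starting point under those assumptions, not a complete setup.

```python
def replication_config(server_id: int, role: str) -> str:
    """Render a minimal [mysqld] fragment for one side of a master-slave pair.

    Hypothetical helper for illustration only; the log base names are
    arbitrary placeholders.
    """
    if role not in ("master", "slave"):
        raise ValueError("role must be 'master' or 'slave'")
    if server_id < 1:
        raise ValueError("server_id must be a positive integer")

    # Every server in a replication setup needs its own unique server-id.
    lines = ["[mysqld]", f"server-id = {server_id}"]

    if role == "master":
        # The master must write a binary log for the slave to read from.
        lines.append("log-bin = mysql-bin")
    else:
        # The slave stores fetched events in a relay log; read-only guards
        # against accidental writes from ordinary (non-SUPER) accounts.
        lines.append("relay-log = mysql-relay-bin")
        lines.append("read-only = 1")

    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(replication_config(1, "master"))
    print(replication_config(2, "slave"))
```

After putting each fragment into the corresponding instance's my.cnf and restarting MySQL, you would still need to create a replication user on the master and run `CHANGE MASTER TO ...` followed by `START SLAVE` on the slave; those steps are not covered here.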
I’ve used EC2 for a number of projects, but fortunately did not have any instances online this week.
Amusingly enough, I don’t use a lot of Web 2.0 properties like Reddit, Foursquare, etc., so I did not even notice the outage.
Most companies are better off using hosting providers in the long run, but they still need to understand the HA and DR limitations, and keep a safe copy of their databases and configurations.
What I tell people is that when a big provider goes down, they will just be a number in a queue, with their data floating in the cloud. Some of the people I mentioned that to in the past didn’t quite understand the impact of that, but now they do.
Disclaimer: I have previously done consulting for Amazon on database projects using AWS.
Amazon cloud still on fritz after 36 hours (April 22)
Amazon CTO’s distributed computing pal analyzes EC2 failure
justinsb’s posterous: AWS is down: Why the sky is falling
theregister.co.uk: Amazon fine print limits potential credits for cloud outage
Amazon’s cloud crash destroyed many customers’ data
theregister.co.uk: The Great Amazon Crash of 2011
AWS Blog: Summary of the Amazon EC2 and Amazon RDS Service Disruption in the US East Region (May 3)
theinquirer.net: Amazon says sorry for cloud datacentres outage