By Chad Upton – senior consultant
Flex is a great platform for building applications quickly. Amazon SimpleDB is a cloud database with similar characteristics along with some unique advantages and trade-offs compared to traditional databases.
SimpleDB is a non-relational database that focuses on small data sets. It is built in Erlang and intended to run in the Amazon cloud, but there is an offline/local implementation that you can use for testing called M/DB.
It is not a database you’d use for everything; it focuses specifically on data sets that require high availability and can tolerate low consistency, or “eventual” consistency to be more accurate. If you’re new to this concept, you might not even be able to think of a database use case where consistency is not important. It is perfect for many “social” web applications and non-customer-facing features too.
Here’s how it works. When you write some data and then immediately try to read that data, it may not be available yet. This happens because the database is distributed (replicated) across many servers. The distribution is generally invisible to you, since you read and write data from and to the same destination. Behind the scenes, Amazon routes your read/write requests to many different servers to give you the fastest response time and the highest possible availability and redundancy. So the server you wrote the data to may not be the same server you tried to read the data from. This means you end up with temporary inconsistencies.
In practice, that data may appear to be missing for a brief amount of time while it is replicated to all the servers. Until it is replicated across all the servers, it’s hit or miss whether you will be able to read that data. That’s the downside of this approach, and for some applications it is not acceptable. But there is a huge benefit too: the services are extremely fault tolerant and the database can handle extremely large loads because it is distributed across many servers and geographies. This means that all of your eggs are not in the same basket. If there is a critical failure in one location, you probably won’t even notice.
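To make the read-after-write gap concrete, here is a small Python sketch of the behavior described above. It is a toy model, not Amazon's implementation: the class names, the three-replica setup, the round-robin routing and the explicit replicate() step are all assumptions made for illustration.

```python
import itertools


class Replica:
    """One server's local copy of the data (hypothetical)."""

    def __init__(self):
        self.data = {}


class ToyReplicatedStore:
    """Toy model: a write lands on one replica and reaches the rest later."""

    def __init__(self, replica_count=3):
        self.replicas = [Replica() for _ in range(replica_count)]
        self.pending = []  # (key, value) pairs not yet copied to every replica
        # Round-robin routing is an assumption; it stands in for whatever
        # routing the real service performs behind the scenes.
        self._router = itertools.cycle(range(replica_count))

    def put(self, key, value):
        # The write is stored on whichever replica the router picks.
        target = self.replicas[next(self._router)]
        target.data[key] = value
        self.pending.append((key, value))

    def get(self, key):
        # The read may be served by a different replica than the write.
        source = self.replicas[next(self._router)]
        return source.data.get(key)

    def replicate(self):
        # Copy every pending write to every replica, oldest first.
        for key, value in self.pending:
            for replica in self.replicas:
                replica.data[key] = value
        self.pending.clear()


store = ToyReplicatedStore()
store.put("movie-42-rating", 5)        # stored on replica 0
early = store.get("movie-42-rating")   # served by replica 1, which has no copy yet
store.replicate()                      # replication catches up
late = store.get("movie-42-rating")    # served by replica 2, now up to date
print(early, late)                     # None 5
```

The write succeeds immediately, the first read misses, and the second read succeeds once replication has caught up. That is the temporary inconsistency in miniature.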
Amazon has indicated that Netflix uses SimpleDB for part of their business. It’s hard to say what they use it for, but we can make some guesses. They probably don’t use it for storing which discs you currently have out. To avoid sending you more discs than you pay for, they would need to have high consistency on that data. But when you rate a movie 5 stars, that data gets combined with the ratings of thousands of other people. While your rating is important in the long run, it’s not a big deal if it is not immediately counted in the average rating of the movie. Amazon SimpleDB is a perfect fit for this type of social feature: high availability, eventual consistency.
SimpleDB is also great for storing things like usage and error logs or high score data from games. It fits any time you want the server to always be online to receive read/write requests and a lag in consistency is tolerable in exchange for the higher availability. It’s also great for some content management type data such as a product database. The high availability is valuable there because you always want your visitors to see your product catalog. A short delay between the time you add the product and the time it shows up on your site is usually acceptable.
How long is that delay?
I haven’t seen any delay in my tests. But with large data sets, it is bound to happen. From my experience with web sites that rely on similar data storage techniques, I have seen replication lags of up to ~15 seconds. For most of the examples I have described, that’s completely acceptable.
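If your application occasionally does need to read something it just wrote, one common way to cope with a short lag is to retry the read a few times before giving up. The Python sketch below shows the idea. The read_with_retry helper, its parameters and the simulated responses are hypothetical and not part of any SimpleDB library; in a real app, fetch would wrap whatever call your client library provides.

```python
import time


def read_with_retry(fetch, attempts=5, delay_seconds=0.5):
    """Call fetch() until it returns something other than None, or give up.

    fetch is any zero-argument function that returns the value, or None
    when the data is not visible yet.
    """
    for attempt in range(attempts):
        value = fetch()
        if value is not None:
            return value
        # Wait before the next try, but not after the final attempt.
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return None


# Stand-in for a lagging replica: the value shows up on the third read.
responses = iter([None, None, "5 stars"])
result = read_with_retry(lambda: next(responses), attempts=5, delay_seconds=0.01)
print(result)  # 5 stars
```

The attempt count and delay should be tuned to the lag you can tolerate. With lags in the range mentioned above, a handful of attempts spaced a few seconds apart would cover most cases, at the cost of a slower response when the data really is missing.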
SimpleDB plays well with ActionScript applications, especially if you use Cafe Silencio’s library. I’m going to build an app that uses SimpleDB in a future post; in the meantime, check out this collection of links to help you get started using SimpleDB in your ActionScript projects.
If you want to learn more about Flex, check out our selection of onsite and online training classes.