Sid Anand, who writes the Practical Cloud Computing blog, has a series of posts entitled “SimpleDB Essentials for High Performance Users” in which he outlines a set of best practices and conventions for using SimpleDB effectively. If you are using SimpleDB, or plan to, I highly recommend reading his posts, as they are super hip. Check out:
- SimpleDB Essentials for High Performance Users: Part 1
- SimpleDB Essentials for High Performance Users: Part 2
- SimpleDB Essentials for High Performance Users: Part 3
In particular, he advocates a form of sharding. That is, rather than putting all data into one SimpleDB domain, he recommends spreading it across multiple smaller domains so as to increase throughput. This makes a lot of sense; what’s more, sharding in this case isn’t terribly dangerous, as SimpleDB doesn’t support cross-domain queries to begin with and ID management is up to the application anyway. Lastly, there are limits to the amount of data you can store in a domain; thus, sharding can facilitate growth nicely.
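To make the idea concrete, here is a minimal sketch of how an application might pick a domain for each item. It is not taken from Sid's posts; the function names, the `users` base name, and the shard count of 8 are all assumptions chosen for illustration. It only computes domain names and makes no calls to SimpleDB itself.

```python
import hashlib


def shard_domain(item_id: str, base: str = "users", num_shards: int = 8) -> str:
    """Map an item ID to one of num_shards domain names (hypothetical naming scheme)."""
    # Use a stable hash: Python's built-in hash() is randomized per process,
    # so it would route the same ID to different domains across runs.
    digest = hashlib.md5(item_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % num_shards
    return f"{base}_{index:02d}"


def group_by_domain(item_ids, base: str = "users", num_shards: int = 8) -> dict:
    """Bucket item IDs by target domain so writes can be issued per domain."""
    groups = {}
    for item_id in item_ids:
        domain = shard_domain(item_id, base, num_shards)
        groups.setdefault(domain, []).append(item_id)
    return groups
```

Since the application already owns ID management, routing reads and writes through a function like `shard_domain` keeps the sharding decision in one place. Note that changing `num_shards` later remaps most IDs, so a scheme like this works best when the shard count is fixed up front.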
While not an entry in the aforementioned series, his article entitled “SimpleDB Performance: 5 Steps to Achieving High Write Throughput” is excellent too. Don’t forget to check out my two articles on SimpleDB:
Finally, I highly recommend reading “Eventually Consistent – Revisited” by Werner Vogels (the CTO of Amazon), as it explains the consistency model that underlies SimpleDB.
