Amazon's approach to metered infrastructure

18.07.2006
In March, Amazon.com introduced S3 (Simple Storage Service), a metered storage service for arbitrary blobs of data (http://www.infoworld.com/4297). Recently, Amazon's adventure in metered Web services continued with the announcement that its SQS (Simple Queue Service), which had been in beta since well before the surprise announcement of S3, has now joined S3 as a commercial offering.

SQS is a Web-based queue to which you post messages and from which you read them back -- without worrying about pesky details such as scale, concurrency, reliability, or guaranteed delivery. A message ranges from one byte to 256KB. It costs a dime to transfer a thousand messages, plus the same 20 cents per gigabyte that it costs to pour data into and out of S3 buckets.
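To make those rates concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the two prices quoted above; the function name, the decimal definition of a gigabyte, and the choice to count each message's bytes once (rather than once in and once out) are assumptions made for illustration, not Amazon's billing rules.

```python
from decimal import Decimal

# Rates quoted in the text above.
PER_THOUSAND_MESSAGES = Decimal("0.10")  # a dime per thousand messages
PER_GIGABYTE = Decimal("0.20")           # 20 cents per gigabyte transferred

# Assumption: a gigabyte is 10**9 bytes; actual billing may define it differently.
BYTES_PER_GIGABYTE = 10**9


def estimate_sqs_cost(message_count, average_message_bytes):
    """Estimate the dollar cost of moving messages through the queue.

    Hypothetical helper: counts each message's bytes once, so a message
    that is both posted and read back may cost more in transfer than shown.
    """
    message_fee = PER_THOUSAND_MESSAGES * message_count / 1000
    total_bytes = message_count * average_message_bytes
    transfer_fee = PER_GIGABYTE * total_bytes / BYTES_PER_GIGABYTE
    return message_fee + transfer_fee


if __name__ == "__main__":
    # One million messages of 1,000 bytes each.
    print(estimate_sqs_cost(1_000_000, 1_000))
```

Under these assumptions the per-message fee dominates for small messages: a million 1,000-byte messages add up to about $100 in message charges but only 20 cents in transfer.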

Like S3, SQS is an extremely general-purpose service offering that will undoubtedly be used in ways nobody can predict. It's therefore appropriate that Amazon has tailored both services to the broadest possible swath of developers. I haven't explored SQS in detail yet, but it looks a lot like S3 -- that is, a pragmatic mix of REST (Representational State Transfer), SOAP, plain old XML, and HTTP.

You can layer WS-* standards on top of the SOAP interfaces, but Amazon itself hasn't (at least not yet). Nor does it yet support advanced storage or messaging standards, such as WebDAV, JSR 170, or JMS. Why not? A service based on those advanced standards would have a fairly high activation threshold. To cross it, you'd have to acquire a toolkit and learn how to use it. For lots of potential applications, though, that would be overkill. You just need to know that you can reliably store data and metadata in the cloud, serve it robustly from there, pump messages reliably, and pay a competitive rate.

Advanced toolkits are great when you need to use advanced infrastructure, but there's a trade-off. When you rely on a toolkit's encapsulation of a service, you don't really understand how the service works. Sometimes that's necessary, but in the case of S3 and SQS, it isn't. S3's REST interface, for example, is encapsulated by Amazon's own Java and .NET libraries, and also by third-party libraries for Python, Ruby, and other languages. But I've hardly used these. Instead, I've mainly used s3-curl, which is a simple Perl wrapper around curl, a URL-oriented command-line network transfer tool. As a result, S3 isn't a black box to me, and that's reassuring.

Arguably, Amazon's biggest near-term challenge isn't to support advanced standards, but rather to connect S3 and SQS more gracefully to its own customer-facing services. Currently, you need an Amazon developer account to be billed for usage of these services. Some people are doing just that, but it would be better if developers could have customers charge usage to regular Amazon.com accounts, and if the S3 and SQS permissions mechanism made more use of Amazon's own identity system.