Guest Posts - Written by Bob Walsh on Monday, May 5, 2008 13:28 - 0 Comments

Amazon S3 - a boon for Micro ISVs

by Saurabh Dani,
Chambal.com, Inc.

In early 2007, our team at Chambal.com was busy building a desktop and site search product that would index remote data on a local desktop and provide extensive searches and reports on that indexed data. We were exploring new data sources that none of our competitors were indexing, and Amazon S3 was one of them. It seemed really great, but there was no good interface available for us to use S3 efficiently.

At the same time, two interns joined our team, and we thought building such an interface would be a great starting project for them that would also give us a tool to work with. When they completed the project, we realized that we had a really good product, and since then we have shifted most of our focus to improving and marketing that product, Bucket Explorer.

To give you some background, the Amazon Web Services catalog comprises a few core building-block services for storage, database, computing and queuing. We will limit this article to some of the unique benefits of the Simple Storage Service (S3), especially for Micro ISVs.

Amazon S3 is a “storage service” with some disruptive features. A lot of people refer to S3 as a Web Server replacement but Amazon S3 is not a Web Server. S3 offers some unique features which cause the web server confusion, but those features also make it truly disruptive. Let’s consider an example file “47hats.png” to take a closer look:

Every file has a URI and can be accessed via HTTP:

You store your files in a “bucket” on S3. A bucket is the top-level container of files. S3 provides a URI for every bucket and every file stored in that bucket. It also allows you to access those files using the HTTP or HTTPS protocols. So if our example file was uploaded to a bucket named mybucket, it can be accessed via HTTP using the following two URIs:

http://mybucket.s3.amazonaws.com/47hats.png

http:// s3.amazonaws.com/mybucket/47hats.png

You can also create a torrent URL just by adding “?torrent” to the end of either of these URIs.
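As a small illustration of the two addressing styles and the torrent suffix, here is a minimal Python sketch. The function names `s3_urls` and `torrent_url` are made up for this example and are not part of any Amazon library.

```python
from urllib.parse import quote


def s3_urls(bucket, key, scheme="http"):
    """Return the two URI forms described above for one stored file."""
    safe_key = quote(key)  # percent-encode characters that are unsafe in a URL path
    bucket_as_host = f"{scheme}://{bucket}.s3.amazonaws.com/{safe_key}"
    bucket_in_path = f"{scheme}://s3.amazonaws.com/{bucket}/{safe_key}"
    return bucket_as_host, bucket_in_path


def torrent_url(url):
    """Append the ?torrent suffix to an object URI."""
    return url + "?torrent"
```

Calling `s3_urls("mybucket", "47hats.png")` gives back the two URIs listed above.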

It allows different levels of permissions:

You can upload a file to S3 and keep it private. If you encrypt it before uploading, no one without the key can read that file. At the other extreme, you can make a file world-readable. An object that has a URI, can be accessed via HTTP and grants ‘read’ permission to everyone is comparable to an object being served from a web server.

It allows DNS aliases:

If the bucket name is a fully qualified host name, it can be accessed via a DNS alias. If we create a bucket named images.example.com, save the file 47hats.png in this bucket and point the CNAME entry for images.example.com to s3.amazonaws.com, then we can access this file using this new URI: http://images.example.com/47hats.png

You can create up to 100 different buckets:

When I checked the home page of 47hats.com with web-site-grader, it warned of “too many images”. The site had 29 images on its home page that day. This is typical of most sites these days: when one page is served to the end user, the browser may download several individual files from the web server (images, CSS, JavaScript, etc.). The browser usually limits itself to 4 concurrent connections to a single host, which increases the page load time. Using S3, you can host these files in multiple buckets and trick the browser into loading all of them concurrently, as it will treat each bucket as a different host.
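One way to spread files over several buckets is to pick the bucket from a stable hash of the file path, so a given file always maps to the same host and stays cacheable. The sketch below assumes three hypothetical buckets and a helper named `asset_url`; none of these names come from the site discussed above.

```python
import zlib

# Hypothetical bucket names; each bucket must actually hold the files mapped to it.
BUCKETS = ["mysite-static1", "mysite-static2", "mysite-static3"]


def asset_url(path, buckets=BUCKETS):
    """Map a file path to one bucket host, always the same one for that path."""
    path = path.lstrip("/")
    # crc32 is stable across runs, unlike Python's built-in hash() for strings.
    index = zlib.crc32(path.encode("utf-8")) % len(buckets)
    return f"http://{buckets[index]}.s3.amazonaws.com/{path}"
```

Either upload every file to all of the buckets, or use the same mapping at upload time so each file lands in the bucket its URL will point to.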

Authenticated Requests with Expiration Time:

When you sign up for Amazon S3, you get an access key (a public identifier) and a secret key. Most HTTP requests sent to S3 need to be signed using the secret key. S3 allows you to add an “Expires” value when creating a signed request. This feature allows you to send someone a link to a private file stored on S3, which automatically expires at a future date and time.
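To make the idea concrete, here is a sketch of the query-string signing scheme S3 has historically documented: an HMAC-SHA1 signature over the verb, the expiry time and the resource. Amazon has since introduced a newer signing scheme, so treat this as an illustration of the mechanism and not as current practice. The function name `signed_get_url` and the key values in the usage note are placeholders.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote, urlencode


def signed_get_url(bucket, key, access_key_id, secret_key, expires_at):
    """Build a GET URL that stops working at expires_at (seconds since the epoch)."""
    resource = f"/{bucket}/{quote(key)}"
    # The two empty lines stand for Content-MD5 and Content-Type, unused for a GET.
    string_to_sign = f"GET\n\n\n{expires_at}\n{resource}"
    digest = hmac.new(secret_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    signature = base64.b64encode(digest).decode("ascii")
    # urlencode percent-encodes the '+', '/' and '=' that base64 can produce.
    query = urlencode({
        "AWSAccessKeyId": access_key_id,
        "Expires": expires_at,
        "Signature": signature,
    })
    return f"http://{bucket}.s3.amazonaws.com/{quote(key)}?{query}"
```

For a link that is valid for one hour, pass `int(time.time()) + 3600` as `expires_at`. Only the access key appears in the URL; the secret key is used for signing and never leaves your machine.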

Range GETs

You can ask S3 to give you specific bytes of a file. So you can download the first few megabytes of a large file and start processing that data while you request the next chunk. This makes S3 a perfect “web server” :) for serving media files. A Flash player can download the first few bytes and start playing them while it downloads more data.
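A Range GET is an ordinary HTTP request with a `Range` header. The sketch below splits a file into chunks and prepares one request per chunk using only the Python standard library. The helper names `byte_ranges` and `range_request` are made up for this example, and the requests are only built, not sent.

```python
import urllib.request


def byte_ranges(total_size, chunk_size):
    """Yield inclusive (start, end) byte offsets that cover total_size bytes."""
    for start in range(0, total_size, chunk_size):
        yield start, min(start + chunk_size, total_size) - 1


def range_request(url, start, end):
    """Create, without sending, a GET request for bytes start..end inclusive."""
    return urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
```

Passing one of these requests to `urllib.request.urlopen` should return just that slice of the file, so a player or parser can start on the first chunk while later ones are still downloading.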

User Metadata on files:

You can specify custom metadata on every file. Each piece of user metadata is a key-value pair. S3 simply stores it with the file and passes it back when you ask for it. This allows you to specify any “custom headers” that you would want to serve from a web server. There are many use cases for this feature; here are just two examples:

  • S3 automatically serves a Last-Modified header when a file is requested. So if a browser is caching our image file 47hats.png, it will check with S3 whether the file has changed and will not download it again if it has not. You can make it even better by adding an ‘Expires’ header to 47hats.png, which tells browsers not to even check for updates until a future expiration date.
  • We can gzip 47hats.png before uploading it to S3 and add the header “Content-Encoding: gzip”. Browsers will automatically decompress the file, and this will save us bandwidth and storage costs (read: faster load times).
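Both examples boil down to preparing a body and a set of headers before the upload. The sketch below does only that preparation and leaves the upload itself to whatever S3 client you use. The function name `prepare_upload` is hypothetical; the `x-amz-meta-` prefix is the convention S3 uses for user-defined metadata.

```python
import gzip
from email.utils import formatdate


def prepare_upload(data, content_type, expires_epoch, user_metadata=None):
    """Return (compressed_body, headers) ready to send with an upload request."""
    body = gzip.compress(data)
    headers = {
        "Content-Type": content_type,
        "Content-Encoding": "gzip",  # tells the browser to decompress the body
        "Expires": formatdate(expires_epoch, usegmt=True),  # HTTP date format
    }
    # User-defined key-value pairs are stored under the x-amz-meta- prefix.
    for name, value in (user_metadata or {}).items():
        headers[f"x-amz-meta-{name.lower()}"] = value
    return body, headers
```

Text formats such as CSS and JavaScript usually shrink far more under gzip than PNG images, which are already compressed.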

It can scale

S3 can scale, and it can scale massively.

  • All Amazon web services are designed for concurrency; you can send several simultaneous requests to S3 and expect it to respond well.
  • S3 supports the RFC 1323 model for TCP window scaling. In short, this lets large transfers make fuller use of fast, high-latency connections.
  • When you release a newer version of a product or you are in the news and suddenly get a lot of traffic, every user shares the same 10 or 100 Mbps pipe that your web host provides to a single server. With S3, each user’s download speed is limited by their own internet connection, not by the server side.
  • Amazon has multiple data centers, and multiple copies of a file are always stored in different geographical locations (reliable and always available).

As I mentioned in the beginning, S3 is not a web server, and Amazon has said that it is not a CDN (Content Delivery Network) because it does not provide edge caching. Yet the support for all these features, the scalability and the lowest “pay as you go” pricing in the market make Amazon S3 the best “poor man’s CDN”.

If you are looking to try out Amazon S3, I can highly recommend Bucket Explorer ;). Seriously, we have some of the best reviews in this market. If you are interested in updates on our progress you can subscribe to my product blog at Bucket Explorer Blog.
