HTML Direct Upload Data to Amazon S3 Part 1: S3 Details

Overview

This post covers the Amazon S3 details (buckets, objects, and access control such as CORS) that you need in order to upload data to Amazon S3 directly from an HTML form.

To get the whole system working and upload data from the browser, you need to configure a few things correctly in your S3 console:

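1) Create a bucket, say “DirectUploads” (S3 now requires lowercase names for new buckets, so in practice use something like “directuploads”)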

2) Configure the CORS property for “DirectUploads” (put the following into the CORS configuration) to allow requests from any domain:

<CORSConfiguration>
    <CORSRule>
        <AllowedOrigin>*</AllowedOrigin>
        <AllowedMethod>GET</AllowedMethod>
        <AllowedMethod>POST</AllowedMethod>
        <AllowedMethod>PUT</AllowedMethod>
        <AllowedHeader>*</AllowedHeader>
    </CORSRule>
</CORSConfiguration>

3) In your Security Credentials, download your access key and secret key and store them somewhere secure for later use

In the next section, I explain what these steps are about and the key concepts you need to know about Amazon S3.

Amazon S3 Key Concepts

1) Objects: 

Amazon S3 objects are any data you upload and store in Amazon S3: an image, a video, a PDF file, you name it. Each Amazon S3 object has data, a key, and metadata. The object key (or key name) uniquely identifies the object in a bucket. Object metadata is a set of name-value pairs. As you will see later, the key is quite an important attribute when you upload data through an HTTP form.

Remarks: you will probably see a “create folder” operation in your Amazon S3 console, which gives the impression that there is a folder structure within a bucket. I prefer to treat folder names as key name prefixes rather than true folders. Another way to think of the key is as the full path of an object within a bucket. Quoted from the official documentation: “The Amazon S3 data model is a flat structure: you create a bucket, and the bucket stores objects. There is no hierarchy of subbuckets or subfolders; however, you can emulate a folder hierarchy.”
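To make the "keys are just prefixes" idea concrete, here is a minimal Python sketch that emulates a folder listing over a flat list of keys. The function name `list_folder` and the sample keys are made up for illustration; this is not S3's API, only a model of how a prefix plus a delimiter can look like folders.

```python
def list_folder(keys, prefix="", delimiter="/"):
    """Emulate a folder listing over a flat set of object keys.

    Returns (files, subfolders) found directly under `prefix`.
    """
    files, subfolders = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if not rest:
            continue  # the key is the prefix itself, nothing to list
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # A delimiter remains, so the object sits in a deeper "folder".
            subfolders.add(prefix + head + delimiter)
        else:
            files.append(key)
    return sorted(files), sorted(subfolders)


# Hypothetical keys in one bucket; no real directories exist anywhere.
keys = [
    "Dev/Proj1/README.txt",
    "Dev/Proj1/logo.png",
    "Dev/Proj2/notes.txt",
    "index.html",
]
print(list_folder(keys))                # top level
print(list_folder(keys, "Dev/"))        # inside "Dev/"
print(list_folder(keys, "Dev/Proj1/"))  # inside "Dev/Proj1/"
```

The "folders" only exist because several keys happen to share a prefix; delete the objects and the folders disappear with them.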

2) Bucket:

An Amazon S3 bucket is a container for objects stored in Amazon S3. I like to think of a bucket as something like the “C:” or “D:” drive in Windows. The bucket name is also an important part of the URL to which you send requests. You can therefore imagine a bucket as the named storage location inside the Amazon cloud where you put all your data.

Remarks: You do not have to create folders before you upload data with a specific key. For instance, say you want to upload README.txt into a bucket B1 and, to manage your data better, you want to put it into a folder Dev/Proj1. The full path, or key, of this text file within B1 is then Dev/Proj1/README.txt, and the only thing you need to do is specify the key as Dev/Proj1/README.txt when you send the upload request. Amazon S3 does not need the folders to exist beforehand; the console simply displays the Dev/ and Dev/Proj1/ prefixes as folders.
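Since both the bucket and the key end up in the request URL, a small Python sketch may help to see how they fit together. The helper `object_url` is a hypothetical name of mine, and the bucket name is written in lowercase as "b1"; the two URL shapes shown are the common virtual-hosted style and path style on the generic s3.amazonaws.com endpoint, and a region-specific endpoint may be needed in practice.

```python
from urllib.parse import quote


def object_url(bucket, key, path_style=False):
    """Build the URL of an object from its bucket name and key."""
    # Keep "/" unescaped so the key still reads like a path.
    encoded_key = quote(key, safe="/")
    if path_style:
        return f"https://s3.amazonaws.com/{bucket}/{encoded_key}"
    return f"https://{bucket}.s3.amazonaws.com/{encoded_key}"


print(object_url("b1", "Dev/Proj1/README.txt"))
print(object_url("b1", "Dev/Proj1/README.txt", path_style=True))
```

Note that nothing about Dev/ or Dev/Proj1/ is created separately; the whole "path" is simply the key.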

3) Access Control (e.g., CORS and sign request): 

There are several aspects of access control in Amazon S3 that I can think of.

  1. CORS

The CORS configuration from step 2 tells S3 to allow any domain to access the bucket and lets requests contain any headers, which is generally fine for testing. When deploying, however, a better practice is to change ‘AllowedOrigin’ so that it only accepts requests from our own domain.
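To see what restricting ‘AllowedOrigin’ buys us, here is a rough Python sketch that reads a CORS configuration and decides whether a given origin and method would pass. It is a simplified model only: the origin https://www.example.com and the helper names are placeholders of mine, and real CORS evaluation also looks at request headers, which I ignore here.

```python
import xml.etree.ElementTree as ET

# Same rule as before, but locked down to one (hypothetical) origin.
CORS_XML = """
<CORSConfiguration>
    <CORSRule>
        <AllowedOrigin>https://www.example.com</AllowedOrigin>
        <AllowedMethod>GET</AllowedMethod>
        <AllowedMethod>POST</AllowedMethod>
        <AllowedMethod>PUT</AllowedMethod>
        <AllowedHeader>*</AllowedHeader>
    </CORSRule>
</CORSConfiguration>
"""


def origin_matches(pattern, origin):
    """Match an origin against a pattern with at most one '*' wildcard."""
    if "*" not in pattern:
        return pattern == origin
    prefix, _, suffix = pattern.partition("*")
    return (
        len(origin) >= len(prefix) + len(suffix)
        and origin.startswith(prefix)
        and origin.endswith(suffix)
    )


def is_allowed(cors_xml, origin, method):
    """Return True if any CORSRule permits this origin and method."""
    root = ET.fromstring(cors_xml)
    for rule in root.iter("CORSRule"):
        origins = [e.text.strip() for e in rule.findall("AllowedOrigin")]
        methods = [e.text.strip() for e in rule.findall("AllowedMethod")]
        if method in methods and any(origin_matches(p, origin) for p in origins):
            return True
    return False


print(is_allowed(CORS_XML, "https://www.example.com", "POST"))
print(is_allowed(CORS_XML, "https://some-other-site.example", "POST"))
```

With the wildcard origin used for testing, the second call would be allowed too, which is exactly why we tighten it before deploying.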

  2. Sign request:

As the official documentation says: “Requests to AWS must be signed—that is, they must include information that AWS can use to authenticate the requestor. Requests are signed using the access key ID and secret access key of an account or of an IAM user”. So this is why we need step 3 to download our confidential data for later use. I explain more on that:

Generally, we can create the access key ID and secret access key for our account under Security Credentials. However, account credentials provide unlimited access to our AWS resources, so using them is not recommended.
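To give a feeling for what "signing" means with these two keys, here is a hedged Python sketch that signs an upload policy for a browser POST using AWS Signature Version 4. I am assuming Signature Version 4 here, which may differ from the scheme a given upload form uses, and every value below (keys, bucket, dates, region) is a made-up placeholder. The secret key must stay on the server; only the encoded policy and the signature go into the HTML form.

```python
import base64
import hashlib
import hmac
import json


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def sign_post_policy(secret_key, policy, date_stamp, region):
    """Return (base64 policy, hex signature) for an S3 POST policy.

    The signing key is derived step by step from the secret key, and
    the string that gets signed is the base64-encoded policy document.
    """
    policy_b64 = base64.b64encode(json.dumps(policy).encode("utf-8")).decode("ascii")
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, "s3")
    k_signing = _hmac_sha256(k_service, "aws4_request")
    signature = hmac.new(k_signing, policy_b64.encode("ascii"), hashlib.sha256).hexdigest()
    return policy_b64, signature


# Placeholder values for illustration only, not real credentials.
policy = {
    "expiration": "2030-01-01T12:00:00.000Z",
    "conditions": [
        {"bucket": "directuploads"},
        ["starts-with", "$key", "uploads/"],
        {"x-amz-algorithm": "AWS4-HMAC-SHA256"},
        {"x-amz-credential": "AKIDEXAMPLE/20300101/us-east-1/s3/aws4_request"},
        {"x-amz-date": "20300101T000000Z"},
    ],
}
policy_b64, signature = sign_post_policy("EXAMPLE-SECRET-KEY", policy, "20300101", "us-east-1")
print(policy_b64)
print(signature)
```

The access key ID appears in the clear inside the credential string, while the secret access key is only ever used as HMAC key material and never leaves our side.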

The best practice is to create and use AWS Identity and Access Management (IAM) users with limited permissions. However, this complicates things and buries us in a whole bunch of documentation to read (which is what I don’t like about the Amazon S3 documentation; it is really clunky sometimes). If you use temporary credentials, the key form field is:

x-amz-security-token: if you are using IAM temporary credentials, you need to put your session token here

To figure out how to obtain that token, we would have to read “http://docs.aws.amazon.com/AmazonS3/latest/dev/walkthrough1.html”, “http://docs.aws.amazon.com/STS/latest/APIReference/Welcome.html”, etc. You will probably feel that you have no idea where to start, so we don’t bother with this in our test example. For testing purposes, I think it is fine to simplify things first; the key point is to make it work.
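As a rough Python sketch of where that field goes, the helper below collects the hidden form fields for an upload and adds x-amz-security-token only when a session token is supplied. The function name `build_form_fields` and all argument values are hypothetical, and I am assuming the Signature Version 4 field names; a form signed with a different scheme would use different fields.

```python
def build_form_fields(key, policy_b64, signature, credential, amz_date, session_token=None):
    """Collect the hidden fields an HTML upload form would carry."""
    fields = {
        "key": key,
        "policy": policy_b64,
        "x-amz-algorithm": "AWS4-HMAC-SHA256",
        "x-amz-credential": credential,
        "x-amz-date": amz_date,
        "x-amz-signature": signature,
    }
    if session_token:
        # Only temporary credentials come with a session token.
        fields["x-amz-security-token"] = session_token
    return fields


# Placeholder values; a real policy and signature come from the signing step.
fields = build_form_fields(
    key="uploads/README.txt",
    policy_b64="<base64 policy>",
    signature="<hex signature>",
    credential="AKIDEXAMPLE/20300101/us-east-1/s3/aws4_request",
    amz_date="20300101T000000Z",
    session_token="<session token>",
)
for name, value in fields.items():
    print(f'<input type="hidden" name="{name}" value="{value}">')
```

As far as I understand, the token also has to be listed among the policy conditions when it is used, so it is not enough to add it to the form alone.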

Remarks: one interesting fact about the access key and secret key is that we get only one opportunity to download the credentials. If we don’t download and save them at the moment we create them, we will have to create a new access key later.

Summary

To summarize, this post discussed the details of Amazon S3 buckets, objects, and access control (e.g., CORS and sign request) that matter for uploading data to Amazon S3 directly from an HTML form. I explained what I consider the important things you need to know about Amazon S3, and I hope that gets you started without having to figure out unrelated Amazon S3 issues (my experience tells me that Amazon S3 is really clunky). Please feel free to leave any comments.


(Please specify the source 烟客旅人 sigmainfy, http://www.sigmainfy.com, as well as the original permalink URL for any reprints, and please do not use it for commercial purposes)

Written on October 7, 2014