HTML Direct Upload Data to Amazon S3 Part 5: Memory Data

Overview

Uploading data that resides in memory (e.g., a byte array) directly to Amazon S3 requires some transformation: first convert it to a Uint8Array and then to a Blob object.

Details and JavaScript Code

In this post I try to explain how to upload array data held in memory to Amazon S3, rather than disk files selected through an HTML file input. The specific scenario is that you somehow have an array containing meaningful data such as XML, text, or even PDF/image data. I want to emphasize "meaningful": you cannot randomly generate, say, an integer array and expect to upload it to S3 as is. You would immediately ask yourself what exactly that object is. If you have no idea how such data would be stored on disk, how could you expect S3 to handle it successfully?

Back to the specific problem: suppose we do have some meaningful data stored in an array, ary_origin. For example, it could be an XML file, with each element of the array storing one character of the XML data (as its numeric character code), or it could be an image, with the array storing each byte of that image. All of this real data makes sense, and it turns out we need to do the following to upload such in-memory data to Amazon S3 successfully (first take a look at how, and then I will explain why, based on my own understanding):

    var ary_origin = new Array();
    // (1) do things and add the real content into the array,
    //     one numeric byte value (0-255) per element
    // ...
    var ary_byte  = new Uint8Array(ary_origin);     // (2) Convert to a byte array
    var ary_obj   = ary_byte.buffer;                // the underlying ArrayBuffer
    var data_blob = new Blob([ary_obj]);            // (3) Convert to a Blob

    // (4) the following is the same as in the last post
    //     (filename must already be defined at this point)
    var fd = new FormData();
    var filekeypath = 'Prefix/' + filename;
    fd.append('key', filekeypath);
    fd.append('AWSAccessKeyId', document.getElementsByName("accesskey").item(0).value);
    fd.append('acl', 'public-read');
    fd.append('policy', document.getElementsByName("policy").item(0).value);
    fd.append('signature', document.getElementsByName("signature").item(0).value);
    fd.append("file", data_blob, filekeypath); // (5) put the real data_blob here

    var theURL = "http://s3.amazonaws.com/YourOwnBucketName/";
    var xhr = new XMLHttpRequest();
    xhr.open('POST', theURL);
    xhr.send(fd);

Well, facts are facts: after a lot of trial and error, I got the above code working! You can generate something like an XML document in JavaScript and try it out yourself. Now let me try to explain what we saw:

  1. A plain Array() does not make sense as the upload payload. This is not difficult to see, because a JavaScript Array() can even contain elements of different types. My first attempt was to put ary_origin directly into the file field at the line commented as (5). That makes no sense at all!

  2. Convert to a byte array in (2). ary_origin does not fit our purpose because it is too flexible: we cannot determine the format of the object from the content of ary_origin. More specifically, we at least need to know the size of each element in the array to segment things correctly (e.g., some files use the first 16 bytes as metadata). So one easy approach is to convert it into a Uint8Array, which is similar to an Array() where each item is an 8-bit (1-byte) unsigned integer. Think about it: the object we finally upload is just a byte array, right? Things become consistent here.

  3. The conversion to a Blob object cannot be omitted either. Think about this: we ultimately need to store the data as an object, or physical file, on Amazon S3, and an array is not such a physical file at all. What makes things consistent again is exactly the Blob object: Blobs are immutable objects that represent raw data. Blobs allow you to construct file-like objects on the client that you can pass to APIs that expect URLs instead of requiring the server to provide the file. For example, you can construct a blob containing the data for an image, use URL.createObjectURL() to generate a URL, and pass that URL to HTMLImageElement.src to display the image you created without talking to a server. By the official definition of Blob, this is exactly what we want.
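To make the three points concrete, here is a minimal sketch of the conversion on its own, without the upload. The helper names textToByteArray and byteArrayToBlob are hypothetical (they are not part of my original code), and the MIME type argument is an optional extra I assume here for illustration. It also shows a caveat with point 1: an array of one-character strings is not an array of bytes, so the characters have to be turned into numbers first.

```javascript
// Hypothetical helper: turn a string into a plain Array of UTF-8 byte values (0-255).
function textToByteArray(text) {
  return Array.from(new TextEncoder().encode(text));
}

// Hypothetical helper: steps (2) and (3) from the post in one place.
function byteArrayToBlob(aryOrigin, mimeType) {
  // (2) every element is coerced to an 8-bit unsigned integer
  var aryByte = new Uint8Array(aryOrigin);
  // (3) wrap the underlying ArrayBuffer in a Blob; the type is optional
  return new Blob([aryByte.buffer], { type: mimeType || 'application/octet-stream' });
}

var xmlText = '<?xml version="1.0"?><note>hello</note>';
var aryOrigin = textToByteArray(xmlText);              // array of numbers, one per byte
var dataBlob = byteArrayToBlob(aryOrigin, 'text/xml'); // ready to append to a FormData

// Caveat: non-numeric strings coerce to NaN, which a Uint8Array stores as 0,
// so an array of raw characters would silently become an array of zeros.
var wrong = new Uint8Array(['<', 'a', '>']);
```

The resulting dataBlob plays the same role as data_blob at the line commented as (5).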

Based on the three points above, I can understand (and I hope it makes a bit more sense to you too) why we need to convert the array data in this way to upload in-memory array data to Amazon S3 successfully.
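If you want to reuse the upload part, one possible way to package it is sketched below. This is only a sketch under my own assumptions: buildUploadForm and postToS3 are hypothetical helper names, the field values are placeholders, and the callback style is just one option. The form fields themselves are the same ones used in the code above, with the file field appended last.

```javascript
// Hypothetical helper: collect the POST fields and the Blob into one FormData.
// `fields` carries the values the post reads from the page
// (access key, policy, signature).
function buildUploadForm(fields, dataBlob, fileKeyPath) {
  var fd = new FormData();
  fd.append('key', fileKeyPath);
  fd.append('AWSAccessKeyId', fields.accessKey);
  fd.append('acl', 'public-read');
  fd.append('policy', fields.policy);
  fd.append('signature', fields.signature);
  fd.append('file', dataBlob, fileKeyPath); // the file field goes last
  return fd;
}

// Hypothetical helper: send the form and report the HTTP status to a callback.
// It needs a browser, because it relies on XMLHttpRequest.
function postToS3(url, fd, done) {
  var xhr = new XMLHttpRequest();
  xhr.open('POST', url);
  xhr.onload = function () { done(null, xhr.status); };
  xhr.onerror = function () { done(new Error('network error')); };
  xhr.send(fd);
}

// Placeholder values for illustration only.
var form = buildUploadForm(
  { accessKey: 'EXAMPLE_KEY', policy: 'EXAMPLE_POLICY', signature: 'EXAMPLE_SIGNATURE' },
  new Blob([new Uint8Array([60, 97, 47, 62]).buffer]),
  'Prefix/example.xml'
);
```

In a page you would then call postToS3 with your own bucket URL and the form, and check the status passed to the callback to see whether S3 accepted the upload.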

Summary

I have tried to explain why my code needs this specific transformation to upload data residing in memory (e.g., a byte array) directly to Amazon S3: first convert it to a Uint8Array and then to a Blob object. The reasoning here is not very rigorous, though, so I hope readers can offer more accurate (probably lower-level) or more insightful explanations of how such objects work with Amazon S3 and share them with us. Thanks a lot!


(Please specify the source, 烟客旅人 sigmainfy, http://www.sigmainfy.com, as well as the original permalink URL for any reprints, and please do not use it for commercial purposes)

Written on October 7, 2014