Sunday, November 29, 2009

A brief on using amazon web services.

Recently we decided to use aws for doing some benchmarks. Why - it is cheaper than getting a system, setting them up and configuring them. But using aws for a long time may prove costly.

So, i went to and signed up with the service. You need to provide a valid credit card no and your phone no before the services are activated. The UI is quite confusing and at times you are not aware whether you are logged in or not. There might also be a case where you think that you are logged in but you would still be asked your password to access some other service.

The best place to start is the "Your Account" section on the top right hand corner. There are two services which are of interest.

Amazon s3 : Simple storage service
Amazon ec2 : Elastic compute cloud

There are two ways to access these services. All ways are available in the "security credentials" tab on the left hand panel after logging in. One way is to create an access key - which provides a access key id and a secret access key - similar to a username & password. Both need to be passed while communicating with the machines. Another way is to generate a X.509 certificate - it also generates a public and a private key.

S3 is a storage space on amazon. The easiest way to access it is using the s3cmd tool available at You can create your own directory and put and get files from the s3 store. Another way of accessing S3 is by using the firefox extension - elasticfox. Once the keys are in place in elasticfox, it provides an interface similar to fileftp with drag-drop features. It looks cool. S3 could only be used to store and retrieve data. If you are thinking about mounting an s3 drive on an ec2 instance to transfer data - better forget it. It would require you to mount the s3 device using fuse and it might take a lot of time to set it up. And there is another way of storing data for EC2.

EC2 is a virtual machine available for computing. You can start an instance of the virtual machine and run your processing on it. The good thing is that there is no need to install the OS of your choice. All you need to do is simply choose the type of machine and the OS and within seconds, the machine is available. You can simply ssh into the machine and start running your processing on it. The bad thing about ec2 is that it does not store your data. Once your processing is done and you turn off or terminate the machine, there is no way of getting back the machine or the data that was there on the machine. The data is lost in the cloud. Like discussed earlier storing data in S3 and mounting it on EC2 might be an option. But a better option is to create an EBS(Elastic Block Storage) volume and attach it to the EC2 instance.

The way to go about all this is logging in to the AWS management console - which is again on the left hand panel. It gives you a dashboard. First of all try launching an instance. It would ask you for the type of instance you want to launch - so you can launch a 32 bit centos instance or a 64 bit ubuntu instance and all you have to do is choose the related AMI (Amazon Machine Image). But be careful, there are some paid AMIs as well - so something like the RHEL instance might be paid and you will have to pay to launch it.

For creating an EBS, you will have to go to the left hand panel again and create a volume. You have to specify the size and the availability zone. An EBS can only be mounted on an Instance if it is in the same availability zone. So be careful about it. Select the zone, where your instance is running. You can create as large a volume you want because you are charged for the data in the volume and not for the volume size. So, if you create a 100 GB volume and put data of only 5 GB in it, you will be charged only for 5GB. Once the volume is created - attach it to the instance - the option is availabe on the web interface.

Now to access the volume you will need to create a filesystem on it and mount it. So, for example if you have attached the EBS volume at /dev/sdf on the instance currently running simply create a filesystem on it.

ssh -i private_key.pem [login to your ec2 instance]
mke2fs -j /dev/sdf [create ex3 filesystem]
mkdir /mnt/vol
mount /dev/sdf /mnt/vol [mount the volume]

Now you can work on your ec2 machine and store data on /mnt/vol. When you are done, it is better to take a snapshot of your volume using the tools in the amazon web console and then turn off your instance. Next time you need to work, simply mount the EBS volume on a new instance that you have started and all your data is readily available.

Another way of going about it is creating an amazon AMI after you have done setup of your machine - and use the AMI to boot further machines that you would need. You can download amazon ec2 api tools and the amazon ec2 ami tools which can help in creating AMIs and running them.

If you want to setup networking between the multiple instances, you need to do some extra effort. Of course by default all amazon running instances could ssh to each other using their private IPs and public DNSes. But to communicate on other ports, the ports need to be opened up. To do this simply use the "security groups" link on the left hand panel of the AWS management console. Select a group - default if you did not create any other group and open up the ports that you need.

And the most important advice - do not forget to turn down your instance after you are through with it - remember, you are billed by the hour.


S3 Browser Team said...

If you are on Windows, you can use S3 Browser( ) to manage your Amazon S3 Buckets and files.

andy said...

I always enjoy learning how other people employ Amazon S3 online storage. I am wondering if you can check out my very own tool CloudBerry Explorer that helps to manage S3 on Windows .