This is the third part of the EC2 storage series. Today, we will discuss Amazon S3 and instance store.
These will be the topics of this part:
- Instance stores
- Introductory notions about Amazon S3
- Operations related to Amazon S3 buckets and objects
Until now, we have discussed Amazon Elastic Block Store as a storage method for EC2 instances.
There are two other options: instance store and Amazon S3.
We will discuss Amazon S3 later, so let’s start with instance store.
Instance store is disk storage that an instance accesses from disks physically attached to the host computer on which the EC2 instance runs. In other words, a server has a local hard drive, and the EC2 instance launched on that server uses disk space from that drive. That space is the instance store.
This means that, although an instance store is dedicated to one particular instance, the underlying disk is shared among all the EC2 instances on that host.
And as a picture is worth a thousand words, this picture shows what an instance store looks like in AWS:
An instance store can be used by only one instance, meaning that two instances cannot share the same instance store. Instance stores also cannot be attached to or detached from an instance while it is running.
The data in an instance store is available only during the lifetime of the instance that uses it. If the instance reboots, regardless of the reason, the data is still available. However, the data does not persist if the underlying disk fails or if the instance is stopped or terminated.
Each instance store has volumes, and these volumes are virtualized as block devices. The block devices are called ephemeral and are numbered from 0 to 23: ephemeral0, ephemeral1 and so on up to ephemeral23.
If an instance has only one instance store volume, then that volume is called ephemeral0. If it has four, then the volumes are called ephemeral0, ephemeral1, ephemeral2 and ephemeral3.
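The naming rule above can be sketched as a small shell function. This is only an illustration of the numbering scheme; `ephemeral_names` is a hypothetical helper, not an AWS tool, and it does not query a real instance.

```shell
# Print the instance store volume names for an instance with N volumes.
# Hypothetical helper: it only illustrates the ephemeral0..ephemeral23
# numbering described above.
ephemeral_names() {
    count=$1
    # The names run from ephemeral0 to ephemeral23, so at most 24 volumes.
    if [ "$count" -lt 1 ] || [ "$count" -gt 24 ]; then
        echo "volume count must be between 1 and 24" >&2
        return 1
    fi
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "ephemeral$i"
        i=$((i + 1))
    done
}
```

Calling `ephemeral_names 4` prints ephemeral0 through ephemeral3, one per line.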
Instance store volumes are great when you need to store temporary information that changes very often, such as caches and buffers.
The second part of the article will discuss Amazon Simple Storage Service (S3).
This will be at an introductory level, just enough to familiarize the reader with Amazon S3. Amazon S3 is an AWS service in its own right and it can get pretty complex. It will be discussed in more detail in a dedicated future series.
Amazon S3 is storage for the Internet. It provides a fast and reliable storage infrastructure that allows you to store and retrieve data whenever needed, either from Amazon EC2 or from anywhere else on the Internet.
Whatever data you put in Amazon S3 is stored redundantly and can be accessed (read or written) by multiple applications at the same time.
What does Amazon EC2 have to do with Amazon S3? Well, in the second part of the series, we discussed snapshots. Whenever a volume snapshot is taken, that snapshot is stored in Amazon S3.
Also, the Amazon Machine Images (AMIs) used by Amazon EC2 are stored in Amazon S3. As you know, AMIs are used to launch EC2 instances.
The entities that Amazon S3 works with are called objects. Every object must belong to a bucket, and the bucket identifies the owner of the storage. Each object has a key that is unique within its bucket.
A user can have more than one bucket, each with its own permissions. When a new bucket is created, the user can control who can add, delete and list objects in the bucket, and in which region the bucket is stored. There are many other things that the user can control regarding Amazon S3 buckets.
The rest of the article will show you how to create a bucket, add objects to it and access the data in an Amazon S3 bucket from an EC2 instance.
So let’s create an S3 bucket.
From the AWS Management console, choose ‘S3’ from ‘Storage and Content Delivery’:
On the next page, click on ‘Create Bucket’, fill out the information required and click on ‘Create’:
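For readers who prefer the command line, the same step can be sketched with the AWS CLI's `aws s3 mb` (make bucket) command. This assumes the AWS CLI is already installed and configured with credentials. The function name `create_bucket` and the bucket name and region in the usage comment are illustrative, not part of the walkthrough.

```shell
# Create an S3 bucket in a given region.
# Assumes the AWS CLI is installed and configured with credentials.
create_bucket() {
    bucket=$1   # bucket names must be unique across all AWS accounts
    region=$2   # the region where the bucket will be stored
    aws s3 mb "s3://$bucket" --region "$region"
}

# Usage (hypothetical bucket name and region, not run here):
#   create_bucket my-example-bucket us-east-1
```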
Next, you will see a list of your buckets on the left side and the properties of the selected bucket on the right side. As you can see, there are a lot of properties that you can play around with:
Let’s add a few files to the bucket. Click on the bucket where you want to add the files and then on the ‘Upload’ button. Add the files you want and then click on ‘Start Upload’:
Now, inside the bucket, you will see the list of files. Let’s select one and see what we have on the properties page:
You can access the files by using the following URL: https://s3.amazonaws.com/bucket_to_store_things/instance_storage.png
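The URL above follows a simple pattern: the S3 endpoint, then the bucket name, then the object key. A minimal sketch of that pattern, assuming keys that need no URL encoding; `s3_object_url` is a hypothetical helper name.

```shell
# Build the path-style URL for an object, following the pattern of the
# URL shown in this article. Assumes the key contains no characters that
# would need URL encoding (spaces, '+', and so on).
s3_object_url() {
    bucket=$1
    key=$2
    printf 'https://s3.amazonaws.com/%s/%s\n' "$bucket" "$key"
}

# Usage:
#   s3_object_url bucket_to_store_things instance_storage.png
```

S3 also accepts a virtual-hosted form in which the bucket name is part of the host name, as in `https://<bucket>.s3.amazonaws.com/<key>`.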
By default, the files in the bucket are not accessible to anyone other than the owner. This means that, if I paste the link in the browser, I will not see the file. Let’s grant public access to the PNG file and see whether you can view it in the browser, compared with the TXT file.
To make it public, select the file and, from the ‘Actions’ menu, select ‘Make Public’:
Once we do this, let’s access both files and see the result. This is for the PNG file:
And this is for the TXT file:
As you can see, the TXT file is not accessible because it doesn’t have the right permissions for this.
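The same experiment can be sketched from a terminal: grant public read access to one object through its ACL, then compare the HTTP status codes of the two URLs. This assumes the AWS CLI is installed and configured and that `curl` is available. The function names are illustrative. On newer buckets, ACLs may be disabled or public access blocked by default, in which case the ACL call is rejected.

```shell
# Grant public read access to a single object through its ACL.
# Assumes the AWS CLI is installed and configured, and that the bucket
# settings allow public ACLs.
make_object_public() {
    bucket=$1
    key=$2
    aws s3api put-object-acl --bucket "$bucket" --key "$key" --acl public-read
}

# Print only the HTTP status code returned for a URL.
# A public object typically answers 200; a private one typically 403.
http_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Usage (not run here):
#   make_object_public bucket_to_store_things instance_storage.png
#   http_status https://s3.amazonaws.com/bucket_to_store_things/instance_storage.png
#   http_status https://s3.amazonaws.com/bucket_to_store_things/TEST.txt
```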
This is how you can delete an object from a bucket. Select the file and, from the ‘Actions’ menu, select ‘Delete’:
And this is how you can delete a bucket. From the bucket list, select the bucket and, from the ‘Actions’ menu, select ‘Delete’:
In order to delete a bucket, you must make sure that it’s empty first.
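The "empty first, then delete" order can be sketched with the AWS CLI as well. This assumes the CLI is installed and configured; `delete_bucket` is a hypothetical wrapper name. Both commands permanently remove data, so treat this as an illustration rather than something to run as-is.

```shell
# Empty a bucket and then delete it.
# Assumes the AWS CLI is installed and configured with credentials.
delete_bucket() {
    bucket=$1
    # Step 1: remove every object; a non-empty bucket cannot be deleted.
    aws s3 rm "s3://$bucket" --recursive || return 1
    # Step 2: remove the now-empty bucket itself.
    aws s3 rb "s3://$bucket"
}

# Usage (hypothetical bucket name, not run here):
#   delete_bucket my-example-bucket
```

The CLI also offers `aws s3 rb s3://<bucket> --force`, which performs both steps in a single command.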
Now, let’s see how an EC2 instance can benefit from the content of an S3 bucket.
If the file has the correct permissions, you can retrieve it using the wget command.
In my case, I have a Linux EC2 instance already running and I’m able to get the TXT file:
[ec2-user@ip-172-31-29-28 ~]$ pwd
/home/ec2-user
[ec2-user@ip-172-31-29-28 ~]$ ls -l
total 0
[ec2-user@ip-172-31-29-28 ~]$ wget https://s3.amazonaws.com/bucket_to_store_things/TEST.txt
--2014-08-01 08:17:05-- https://s3.amazonaws.com/bucket_to_store_things/TEST.txt
Resolving s3.amazonaws.com (s3.amazonaws.com)... 22.214.171.124
Connecting to s3.amazonaws.com (s3.amazonaws.com)|126.96.36.199|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 42 [text/plain]
Saving to: ‘TEST.txt’
100%[=======================================================================================================================================>] 42 --.-K/s in 0s
2014-08-01 08:17:05 (2.20 MB/s) - ‘TEST.txt’ saved [42/42]
[ec2-user@ip-172-31-29-28 ~]$ cat TEST.txt
FILE THAT WAS UPLOADED TO AMAZON S3 BUCKET
[ec2-user@ip-172-31-29-28 ~]$
Another option is to copy the file using the AWS Command Line Interface (CLI). Installing and configuring it follows a specific procedure that is outside the scope of this article, but this is how it is done:
[ec2-user@ip-172-31-29-28 ~]$ sudo aws s3 cp s3://bucket_to_store_things/TEST.txt TEST.txt
download: s3://bucket_to_store_things/TEST.txt to TEST.txt
[ec2-user@ip-172-31-29-28 ~]$ cat TEST.txt
FILE THAT WAS UPLOADED TO AMAZON S3 BUCKET
[ec2-user@ip-172-31-29-28 ~]$
You can install the AWS CLI on Windows as well, and use it to perform many operations on objects in S3 buckets, such as uploading objects to buckets and copying objects between buckets.
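The two operations just mentioned can be sketched with `aws s3 cp`, the same command used for the download above. This assumes the AWS CLI is installed and configured. The function names and the bucket names in the usage comments are hypothetical.

```shell
# Upload a local file to a bucket under the given key.
# Assumes the AWS CLI is installed and configured with credentials.
upload_object() {
    file=$1
    bucket=$2
    key=$3
    aws s3 cp "$file" "s3://$bucket/$key"
}

# Copy an object from one bucket to another, keeping the same key.
copy_between_buckets() {
    source_bucket=$1
    target_bucket=$2
    key=$3
    aws s3 cp "s3://$source_bucket/$key" "s3://$target_bucket/$key"
}

# Usage (hypothetical bucket names, not run here):
#   upload_object TEST.txt my-example-bucket TEST.txt
#   copy_between_buckets my-example-bucket my-backup-bucket TEST.txt
```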
And with this, we have reached the end of the article on instance stores and Amazon S3.
Together with the other two parts of the series, you should now have a pretty good understanding of EC2 storage options.
You should now know what the purpose of EBS volumes and snapshots is and how to use them. You should also be familiar with instance stores and Amazon S3 storage and what they are used for.
The last part of the series will discuss EBS-backed AMIs.