This article discusses two S3 bucket features. These topics will be covered:
Access logging concepts and configuration
Object lifecycle concepts and configuration
First we will cover the concepts behind both features and then demonstrate what they do and how they can be configured.
Access logging is a feature you can use to track requests made against S3 buckets. Each record holds details about a single access request. This feature is helpful in security audits, and it lets you learn where your users access the objects from and which objects are accessed the most.
By default, this feature is disabled. Enabling access logging takes two steps.
The first one is, of course, enabling the access logging on the bucket for which you want to have the logs. This is commonly called the source bucket.
Secondly, you need to grant the S3 log delivery group write permission on the bucket where you want to save the logs. This is called the target bucket.
You can save the logs to the same bucket for which you are collecting the logs.
When logging is enabled on the source bucket, you have to specify the target bucket and, optionally, a prefix to be assigned to all log objects. The prefix helps you identify the log files more easily.
The access records are collected periodically and uploaded to the target bucket as log objects. Access logs are delivered on a best-effort basis; that is, most access requests will generate a log record, and the records will be delivered within a few hours of the moment they were generated. However, there is no guarantee that all log records will be delivered, and there is no fixed time interval within which a log record will be delivered once the access request was received.
The second feature is lifecycle configuration. The lifecycle configuration controls how Amazon S3 manages objects during their lifetime. Using lifecycle configuration rules, you can define how particular sets of objects are treated. Take log files, for instance: you can upload log files to S3 and keep them for a predefined period of time. After that, you can either archive them in Amazon Glacier or delete them. You can find more about Amazon Glacier in this article: Amazon Web Services (AWS): Understanding Glacier.
There are a few things to note regarding archiving:
Objects in the GLACIER class are not available instantly.
The transition to the GLACIER class is one-way.
Archived objects are available only through Amazon S3, not through the Amazon Glacier service.
The lifecycle configuration is composed of rules.
Each rule has the following:
ID element—Uniquely identifies the rule. A lifecycle configuration can have up to 1,000 rules.
Status element—It can be Enabled or Disabled. If it's disabled, the actions from that rule are not executed.
Prefix element—Identifies the objects the rule applies to: all objects whose name (key) starts with the specified prefix.
Action elements—These are the actions performed on the objects. There are a few predefined actions: Transition, Expiration, NoncurrentVersionTransition, and NoncurrentVersionExpiration.
As we know, you can enable versioning on buckets. Depending on whether versioning is enabled, the action elements mentioned above apply differently, or not at all.
Let’s discuss each action:
Transition action—In a non-versioned bucket, it sets the storage class of the object to GLACIER. In a version-enabled bucket, it sets the storage class of the current version to GLACIER.
Expiration action—In a non-versioned bucket, the object is deleted. In a version-enabled bucket, it has no effect on noncurrent versions; for the current version, it turns the current version into a noncurrent version by introducing a delete marker as the new current version.
NoncurrentVersionTransition action—Specifies how long a noncurrent version of an object stays in the STANDARD class before it moves to the GLACIER class.
NoncurrentVersionExpiration action—Specifies how long a noncurrent version of an object is stored before it is permanently deleted.
So let's create a lifecycle configuration for a version-enabled bucket.
Select the bucket from S3 Console and from the “Properties” menu, expand the “Lifecycle” menu and then click on “Add Rule”:
The wizard will guide you through the configuration. In the first step, you need to specify the resource to which you want to apply the rule. It can be the whole bucket or a folder:
Then you need to configure the rule. My rule should be configured like this:
I want to archive the current version of the object on the same day that I put it in the bucket. This means I need to specify “0” days.
I want the current version to expire 10 days after it was initially created.
I want noncurrent versions to be archived 10 days after they become noncurrent.
I want the archived noncurrent versions to be deleted after 20 more days (30 days after becoming noncurrent).
So, this is what I need to configure:
In the next step, you need to specify a name for the rule and then create and activate the rule:
Now the rule has been applied. Note that you can modify the rule after it was created:
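The same rule can also be defined programmatically. Below is a minimal sketch using a boto3-style rule structure; the bucket name and rule ID are hypothetical placeholders, and the day values mirror the ones chosen above (transition at 0 days, expiration at 10 days, noncurrent transition at 10 days, noncurrent expiration at 30 days):

```python
# Sketch of the lifecycle rule above as a boto3-style structure.
lifecycle_config = {
    "Rules": [
        {
            "ID": "archive-and-expire",  # hypothetical rule name
            "Filter": {"Prefix": ""},    # empty prefix = whole bucket
            "Status": "Enabled",
            # Archive the current version to GLACIER on the day it is created.
            "Transitions": [{"Days": 0, "StorageClass": "GLACIER"}],
            # Expire the current version 10 days after creation.
            "Expiration": {"Days": 10},
            # Archive noncurrent versions 10 days after they become noncurrent.
            "NoncurrentVersionTransitions": [
                {"NoncurrentDays": 10, "StorageClass": "GLACIER"}
            ],
            # Permanently delete noncurrent versions after 30 days.
            "NoncurrentVersionExpiration": {"NoncurrentDays": 30},
        }
    ]
}

# To apply it (requires boto3 and AWS credentials; bucket name is hypothetical):
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="s3-bucket-001", LifecycleConfiguration=lifecycle_config)
```

Keeping the rule in a structure like this makes it easy to review and reuse across buckets, instead of clicking through the console each time.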
How do you know when the expiration date is? Just check the properties of the objects you are interested in. For instance, this is for the current version of the object:
As you can see, it's pretty straightforward to play around with lifecycle configuration. The hardest part is understanding what each action means in the context of non-versioned and version-enabled buckets. For instance, the expiration action means different things for the two bucket types.
Let’s go ahead and see how access logging can be enabled.
I created another bucket that will be used to store the logs for one of my S3 buckets.
So this is how you can enable the logging.
From the S3 console, select the bucket for which you would like to enable the logging, then choose “Properties” and expand the “Logging” menu. Check the box that says “Enabled” and fill in the fields asking for the bucket name that will store the logs and the folder inside that bucket.
Then, on the bucket holding the logs, expand the “Permissions” menu and add a grant allowing write permission to the “Log Delivery” group, as shown below:
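These console steps have a programmatic equivalent as well. A minimal sketch, with hypothetical bucket names and a boto3-style request shape:

```python
# Sketch of the logging configuration; both bucket names are hypothetical.
logging_config = {
    "LoggingEnabled": {
        "TargetBucket": "s3-bucket-001-logs",  # bucket that stores the logs
        "TargetPrefix": "logs/",               # folder/prefix for log objects
    }
}

# To apply it (requires boto3, AWS credentials, and write permission for the
# Log Delivery group on the target bucket):
# import boto3
# boto3.client("s3").put_bucket_logging(
#     Bucket="s3-bucket-001", BucketLoggingStatus=logging_config)
```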
After a while, you will see the logs being generated in the destination bucket in the folder that we configured:
Let’s check the oldest log file (the top one) and see what it contains:
bffa05e48f23ca7987afe981d3b1800fcc73b8c949a2ae31f5a7d627acecae6f s3-bucket-001-03 [01/Nov/2014:15:36:01 +0000] 10.194.64.24 bffa05e48f23ca7987afe981d3b1800fcc73b8c949a2ae31f5a7d627acecae6f 733EE326D98BAA55 REST.GET.VERSIONING - "GET /s3-bucket-001-03?versioning HTTP/1.1" 200 - 113 - 19 - "-" "S3Console/0.4" -
The logs have a specific format from which you can get a lot of information: the bucket and object that were accessed, the IP address of the client, the time, the size of the object, and more. You can find the complete list of log record fields here: Server Access Log Format.
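To illustrate the format, here is a small parser sketch that splits a record like the one above into named fields. The field names follow the documented field order, and the regex covers only the leading fields of a record:

```python
import re

# Regex for the leading fields of an S3 server access log record:
# quoted strings and the bracketed timestamp are treated as single fields.
LOG_PATTERN = re.compile(
    r'(\S+) (\S+) \[([^\]]+)\] (\S+) (\S+) (\S+) (\S+) (\S+) '
    r'"([^"]*)" (\S+) (\S+) (\S+) (\S+) (\S+) (\S+) "([^"]*)" "([^"]*)"'
)

FIELDS = ("bucket_owner", "bucket", "time", "remote_ip", "requester",
          "request_id", "operation", "key", "request_uri", "http_status",
          "error_code", "bytes_sent", "object_size", "total_time",
          "turnaround_time", "referrer", "user_agent")

def parse_log_record(line):
    """Return a dict of the leading fields, or None if the line doesn't match."""
    match = LOG_PATTERN.match(line)
    return dict(zip(FIELDS, match.groups())) if match else None

# The record shown earlier in this article:
sample = ('bffa05e48f23ca7987afe981d3b1800fcc73b8c949a2ae31f5a7d627acecae6f '
          's3-bucket-001-03 [01/Nov/2014:15:36:01 +0000] 10.194.64.24 '
          'bffa05e48f23ca7987afe981d3b1800fcc73b8c949a2ae31f5a7d627acecae6f '
          '733EE326D98BAA55 REST.GET.VERSIONING - '
          '"GET /s3-bucket-001-03?versioning HTTP/1.1" 200 - 113 - 19 - '
          '"-" "S3Console/0.4" -')

record = parse_log_record(sample)
```

A parser like this is handy once you have many log objects and want to answer questions such as "which objects are requested most often?"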
To disable the logging, it’s enough to clear the checkbox from “Logging” section of the bucket properties.
You can combine this with the lifecycle feature discussed earlier to keep the logs for a number of days, months, or years and then automatically delete them at the end of their life.
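For example, a lifecycle rule that cleans up old log objects might look like the sketch below; the "logs/" prefix matches the log folder used earlier, while the rule ID and the one-year retention period are illustrative choices:

```python
# Sketch: permanently delete objects under the log prefix 365 days
# after they were created. Rule ID and retention period are examples.
log_expiry_rule = {
    "ID": "expire-old-logs",
    "Filter": {"Prefix": "logs/"},  # apply only to the log objects
    "Status": "Enabled",
    "Expiration": {"Days": 365},
}
```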
And this is it about lifecycle configuration and access logging. We discussed how these two features operate, showed how to enable access logging, and created a rule as part of lifecycle management.