#How to Use AWS Object Lambda to Transform S3 Objects on Request – CloudSavvy IT

Object Lambda lets you put a Lambda function in front of S3 objects, allowing them to be transformed on request by your own custom code. Since it runs automatically on Lambda, you don’t have to worry about running your own proxy layer.

What Is Object Lambda?

Object Lambda basically takes the place of an API in front of S3. Previously, you had to set up a proxy layer on your own infrastructure to transform objects on request. That added complexity, so AWS added a managed alternative.

Instead of accessing objects directly, you’ll do so through an Object Lambda Access Point. When you make a GET request through that access point, its Lambda function is called automatically, given access to the original object, and expected to return a transformed object to the application.

The uses for this can be basic, like redacting info or converting JSON to XML, but since it’s your own code, you can do whatever you’d like. You could, for example, run a database lookup and return a transformed object with new data, or make requests to external APIs.
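
As a rough illustration of the JSON-to-XML case, the transformation itself can be an ordinary function that the Lambda handler calls on the object body. This is a minimal sketch that assumes the object is a flat JSON object whose keys are valid XML tag names; the function name `json_to_xml` and the `root_tag` parameter are made up for this example.

```python
import json
import xml.etree.ElementTree as ET


def json_to_xml(json_text: str, root_tag: str = "object") -> str:
    """Convert a flat JSON object into a simple XML document.

    Assumes every key is a valid XML tag name and every value is a scalar.
    """
    data = json.loads(json_text)
    root = ET.Element(root_tag)
    for key, value in data.items():
        # One child element per JSON key, with the value as its text
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


print(json_to_xml('{"name": "report", "pages": 12}'))
```

Nested objects, arrays, and keys that aren’t valid tag names would need extra handling that this sketch leaves out.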

You can have multiple access points per bucket, each representing a different “view” of the underlying data. Switching to an access point requires very little change to client code: simply replace the bucket name with the ARN of the Object Lambda Access Point.

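import boto3
s3 = boto3.client('s3')
# Pass the Object Lambda Access Point ARN in place of the bucket name.
response = s3.get_object(
    Bucket='arn:aws:s3-object-lambda:us-east-1:123412341234:accesspoint/myolap',
    Key='s3.txt')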

You also don’t need to access the original object by the exact name. For example, your application could request picture_1920x1080.jpg, which would find picture.jpg and resize it to the given dimensions. In this case, the Lambda function would need extra permissions to access the bucket contents.
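
One way to handle a request like this is to split the requested key into the original key and the target dimensions before fetching anything. The sketch below assumes a `<name>_<width>x<height>.<ext>` naming convention, which is only an illustration; the helper name `split_sized_key` is hypothetical, and the actual resizing step is left out.

```python
import re
from typing import Optional, Tuple

# Hypothetical naming convention: <name>_<width>x<height>.<ext>
_SIZED_KEY = re.compile(r"^(?P<stem>.+)_(?P<w>\d+)x(?P<h>\d+)(?P<ext>\.[^./]+)$")


def split_sized_key(requested_key: str) -> Tuple[str, Optional[Tuple[int, int]]]:
    """Return (original_key, (width, height)), or (requested_key, None)
    when the key does not carry a size suffix."""
    match = _SIZED_KEY.match(requested_key)
    if match is None:
        return requested_key, None
    original_key = match.group("stem") + match.group("ext")
    size = (int(match.group("w")), int(match.group("h")))
    return original_key, size


print(split_sized_key("picture_1920x1080.jpg"))
```

The function would then read `original_key` from the bucket itself, which is why it needs read permission on the bucket contents.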

Of course, you’ll need to pay for all the time spent running Lambda functions. If you’re running a lot of functions through a user-facing access point, this could start to add up. If your transformations are static, you might want to consider caching the objects in a separate S3 bucket. For example, if you have a function that applies filters/compression to an image, you might want to cache the results instead of rebuilding on every request. For things that depend on external state, though, this won’t be possible.
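
The caching idea boils down to checking a cache before doing the expensive work. Here is a minimal sketch of that pattern, with an in-memory dictionary standing in for the separate S3 bucket; the function name `get_transformed` and the callables passed to it are assumptions made for the example, not part of any AWS API.

```python
from typing import Callable, List, MutableMapping


def get_transformed(
    key: str,
    cache: MutableMapping[str, bytes],
    fetch_original: Callable[[str], bytes],
    transform: Callable[[bytes], bytes],
) -> bytes:
    """Return the cached result for key, or build and store it on a miss."""
    if key in cache:
        return cache[key]
    result = transform(fetch_original(key))
    cache[key] = result
    return result


# Demo with stand-ins: a dict as the cache and a fake fetch that records calls
fetch_calls: List[str] = []


def fake_fetch(key: str) -> bytes:
    fetch_calls.append(key)
    return b"hello"


cache: dict = {}
first = get_transformed("s3.txt", cache, fake_fetch, bytes.upper)
second = get_transformed("s3.txt", cache, fake_fetch, bytes.upper)
print(first, second, fetch_calls)
```

In a real setup, the cache lookup and write would be reads and writes against the second bucket, and the second call above is the case where no transformation is paid for.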

Using Object Lambda

Head over to the S3 Management Console to get started. Each Object Lambda Access Point needs a regular access point behind it. You’ll need to create this from Access Points > Create in the sidebar.

Enter a name and select a bucket, and make sure to select “Internet” unless this bucket is limited to a single VPC. Once it’s created, copy the ARN for the access point.

Create an Object Lambda Access Point:

Give it a name and paste in the ARN of the access point; the console should then display the name of the underlying bucket.

At this point, you’ll need to select a Lambda function. If you have one prepared, you can enter the ARN or select it from the list. Otherwise, you’ll need to head over to the Lambda Management Console to create one.
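
If you’d rather script this step than click through the console, boto3’s `s3control` client exposes `create_access_point_for_object_lambda`. The following is a sketch under a few assumptions: the helper names, the default region, and any function ARN you pass in are placeholders, and the call needs valid AWS credentials and an existing Lambda function to succeed.

```python
from typing import Any, Dict


def build_olap_configuration(supporting_access_point_arn: str,
                             function_arn: str) -> Dict[str, Any]:
    """Build the Configuration argument for an Object Lambda Access Point
    that runs one Lambda function on GetObject requests."""
    return {
        "SupportingAccessPoint": supporting_access_point_arn,
        "TransformationConfigurations": [
            {
                "Actions": ["GetObject"],
                "ContentTransformation": {
                    "AwsLambda": {"FunctionArn": function_arn}
                },
            }
        ],
    }


def create_olap(account_id: str, name: str, supporting_access_point_arn: str,
                function_arn: str, region: str = "us-east-1") -> Dict[str, Any]:
    """Create the Object Lambda Access Point (requires AWS credentials)."""
    import boto3  # imported here so the builder above works without boto3

    s3control = boto3.client("s3control", region_name=region)
    return s3control.create_access_point_for_object_lambda(
        AccountId=account_id,
        Name=name,
        Configuration=build_olap_configuration(
            supporting_access_point_arn, function_arn
        ),
    )
```

Keeping the configuration in its own function makes it easy to inspect or unit test without touching AWS.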

The code is up to you, although AWS provides the following example, which takes the original object and transforms it to uppercase. No matter what language you end up using, you’ll need to read the getObjectContext section of the event, fetch the original object from the presigned URL it contains, transform the object, and then write the response using the new WriteGetObjectResponse API, returning an HTTP status code afterward.

import boto3
import requests

def lambda_handler(event, context):
    print(event)

    # Routing details S3 uses to match this response to the original request
    object_get_context = event["getObjectContext"]
    request_route = object_get_context["outputRoute"]
    request_token = object_get_context["outputToken"]
    s3_url = object_get_context["inputS3Url"]

    # Get the original object from S3 using the presigned URL
    response = requests.get(s3_url)
    original_object = response.content.decode('utf-8')

    # Transform the object (here, convert the text to uppercase)
    transformed_object = original_object.upper()

    # Write the transformed object back to S3 Object Lambda
    s3 = boto3.client('s3')
    s3.write_get_object_response(
        Body=transformed_object,
        RequestRoute=request_route,
        RequestToken=request_token)

    return {'status_code': 200}
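
The example above assumes the fetch from S3 always succeeds. WriteGetObjectResponse also accepts `StatusCode`, `ErrorCode`, and `ErrorMessage` parameters, so one possible way to pass a failure back to the caller is to build the arguments conditionally. This is only a sketch: the helper name `response_kwargs` and the error code `UpstreamFetchFailed` are invented for illustration.

```python
from typing import Any, Dict


def response_kwargs(route: str, token: str,
                    upstream_status: int, upstream_body: bytes) -> Dict[str, Any]:
    """Build keyword arguments for write_get_object_response, returning
    either the uppercased body or an error description."""
    kwargs: Dict[str, Any] = {"RequestRoute": route, "RequestToken": token}
    if upstream_status == 200:
        kwargs["Body"] = upstream_body.decode("utf-8").upper()
    else:
        # Forward the failure instead of transforming an error page
        kwargs["StatusCode"] = upstream_status
        kwargs["ErrorCode"] = "UpstreamFetchFailed"  # hypothetical code
        kwargs["ErrorMessage"] = (
            f"Fetching the original object returned HTTP {upstream_status}"
        )
    return kwargs


print(response_kwargs("route", "token", 404, b""))
```

Inside the handler, the result would be passed along as `s3.write_get_object_response(**kwargs)`.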

The event object that Lambda receives will look something like this:

{
    "xAmzRequestId": "1a5ed718-5f53-471d-b6fe-5cf62d88d02a",
    "getObjectContext": {
        "inputS3Url": "https://myap-123412341234.s3-accesspoint.us-east-1.amazonaws.com/s3.txt?X-Amz-Security-Token=...",
        "outputRoute": "io-iad-cell001",
        "outputToken": "..."
    },
    "configuration": {
        "accessPointArn": "arn:aws:s3-object-lambda:us-east-1:123412341234:accesspoint/myolap",
        "supportingAccessPointArn": "arn:aws:s3:us-east-1:123412341234:accesspoint/myap",
        "payload": "test"
    },
    "userRequest": {
        "url": "/s3.txt",
        "headers": {
            "Host": "myolap-123412341234.s3-object-lambda.us-east-1.amazonaws.com",
            "Accept-Encoding": "identity",
            "X-Amz-Content-SHA256": "e3b0c44297fc1c149afbf4c8995fb92427ae41e4649b934ca495991b7852b855"
        }
    },
    "userIdentity": {
        "type": "IAMUser",
        "principalId": "...",
        "arn": "arn:aws:iam::123412341234:user/myuser",
        "accountId": "123412341234",
        "accessKeyId": "..."
    },
    "protocolVersion": "1.00"
}

Besides getObjectContext, there are two important pieces of info here: the userRequest section, which contains info about the initial request, like URL and HTTP headers, and the userIdentity section, which can be used to personalize the response based on the IAM user.
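
To make that concrete, here is a small sketch that pulls the requested key and the caller’s identity out of an event shaped like the one above. The helper name `describe_request` and the keys of the dictionary it returns are assumptions for this example; the sample event is trimmed to the fields the helper reads.

```python
from urllib.parse import unquote, urlparse


def describe_request(event: dict) -> dict:
    """Extract the requested key, headers, and caller from the event."""
    user_request = event["userRequest"]
    identity = event["userIdentity"]
    # urlparse copes with both a bare path and a full URL
    path = urlparse(user_request["url"]).path
    return {
        "key": unquote(path.lstrip("/")),
        "headers": user_request.get("headers", {}),
        "caller_arn": identity.get("arn"),
        "caller_type": identity.get("type"),
    }


sample_event = {
    "userRequest": {
        "url": "/s3.txt",
        "headers": {"Accept-Encoding": "identity"},
    },
    "userIdentity": {
        "type": "IAMUser",
        "arn": "arn:aws:iam::123412341234:user/myuser",
    },
}

print(describe_request(sample_event))
```

A handler could branch on `caller_arn` to decide, for instance, whether a given user gets the redacted or the full version of an object.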
