AWS CDK - S3 Same-Region Replication

Ever needed to replicate objects from AWS bucket A to AWS bucket B? This post shows how to set up a simple CDK stack for replication between two buckets located in the same account and region. The example code is written in TypeScript.

Prerequisites

You should be able to create an AWS CDK TypeScript project and to deploy or destroy a stack. Check out the official documentation when needed.
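As a quick reminder, a new CDK TypeScript project can be bootstrapped with the CDK CLI roughly like this (the project name is just an example):

mkdir s3-replication && cd s3-replication
cdk init app --language typescript
# Only needed once per account/region combination
cdk bootstrap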

Requirements

  • Source bucket with versioning enabled

  • Target bucket with versioning enabled

  • IAM role that will be used in the replication configuration of the source bucket

  • Permissions that will be added to the Policy of this IAM role

  • Replication configuration attached to the source bucket

Source and target bucket

Creating the source and target bucket is straightforward and can be done with the code snippets below.

The must-have setting here is versioned: true; it is required for setting up replication. See the S3 user guide on replication requirements.

The other settings mainly keep your bucket private and ensure that the contents, and the bucket itself, are deleted when the stack is destroyed by running cdk destroy.
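For completeness, the snippets in this post assume the usual aws-cdk-lib v2 imports and live inside a stack class; the class name below is just an example placeholder:

import { Duration, RemovalPolicy, Stack, StackProps } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as s3 from 'aws-cdk-lib/aws-s3';
import { BlockPublicAccess, BucketEncryption, ObjectOwnership } from 'aws-cdk-lib/aws-s3';
import * as iam from 'aws-cdk-lib/aws-iam';

// Example stack class; the bucket, role and replication snippets
// from this post go inside the constructor
export class S3ReplicationStack extends Stack {
    constructor(scope: Construct, id: string, props?: StackProps) {
        super(scope, id, props);
        // ... snippets from this post ...
    }
}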

// Create source bucket
const sourceBucket = new s3.Bucket(this, 'sourceBucket', {
    blockPublicAccess: BlockPublicAccess.BLOCK_ALL,
    encryption: BucketEncryption.S3_MANAGED,
    versioned: true,
    removalPolicy: RemovalPolicy.DESTROY,
    enforceSSL: true,
    autoDeleteObjects: true,
    objectOwnership: ObjectOwnership.BUCKET_OWNER_ENFORCED,
    lifecycleRules: [{
        expiration: Duration.days(1),
        id: 'expirationLifeCycleRule',
    }]
});

// Create target bucket
const targetBucket = new s3.Bucket(this, 'targetBucket', {
    blockPublicAccess: BlockPublicAccess.BLOCK_ALL,
    encryption: BucketEncryption.S3_MANAGED,
    versioned: true,
    removalPolicy: RemovalPolicy.DESTROY,
    enforceSSL: true,
    autoDeleteObjects: true,
    objectOwnership: ObjectOwnership.BUCKET_OWNER_ENFORCED,
    lifecycleRules: [{
        expiration: Duration.days(1),
        id: 'expirationLifeCycleRule',
    }]
});

IAM role and Policy Statements

Now let's create the IAM role that will be used for replication. For details, see the user guide on replication configuration permissions.

The next snippet creates the trust relationship, allowing the S3 service to assume this role.

const replicationRole = new iam.Role(this, 'ReplicationRole', {
    assumedBy: new iam.ServicePrincipal('s3.amazonaws.com'),
});

The result for the IAM role in the AWS console will look like this:
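In JSON form, the trust relationship that CDK generates for a ServicePrincipal like this should look roughly as follows:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "s3.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}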

Next we set the permissions needed for replication. Details on what is required can be found in the AWS documentation linked at the start of this section.

// Attach policy for reading replication config from 
// source bucket and list contents of source bucket
replicationRole.addToPolicy(new iam.PolicyStatement({
    actions: [
        "s3:GetReplicationConfiguration",
        "s3:ListBucket"
    ],
    resources: [sourceBucket.bucketArn]
}));

// Attach policy for reading from the source bucket
replicationRole.addToPolicy(new iam.PolicyStatement({
    actions: [
        "s3:GetObjectVersionForReplication",
        "s3:GetObjectVersionAcl",
        "s3:GetObjectVersionTagging"
    ],
    resources: [
        `${sourceBucket.bucketArn}/*`,
    ]
}));

// Attach policy for destination bucket
replicationRole.addToPolicy(new iam.PolicyStatement({
    actions: [
        "s3:ReplicateObject",
        "s3:ReplicateDelete",
        "s3:ReplicateTags"
    ],
    resources: [
        `${targetBucket.bucketArn}/*`
    ]
}));

The result for the IAM role in the AWS console will look like this.

The full policy JSON that will be generated (bucket names will of course be different for your stack):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "s3:GetReplicationConfiguration",
                "s3:ListBucket"
            ],
            "Resource": "arn:aws:s3:::<your-source-bucket>",
            "Effect": "Allow"
        },
        {
            "Action": [
                "s3:GetObjectVersionAcl",
                "s3:GetObjectVersionForReplication",
                "s3:GetObjectVersionTagging"
            ],
            "Resource": "arn:aws:s3:::<your-source-bucket>/*",
            "Effect": "Allow"
        },
        {
            "Action": [
                "s3:ReplicateDelete",
                "s3:ReplicateObject",
                "s3:ReplicateTags"
            ],
            "Resource": "arn:aws:s3:::<your-target-bucket>/*",
            "Effect": "Allow"
        }
    ]
}

Replication configuration on source bucket

What I found most difficult was configuring the replication on the source bucket. As it turns out, the L2 construct Bucket does not let you configure replicationConfiguration, but the L1 construct CfnBucket does.

At first I created a CfnBucket as the source bucket, with most of the features mentioned above, so that I could set the replication configuration directly. But losing the benefits you get for free with the L2 Bucket construct, like the auto-delete-objects Lambda, was not so nice. It works, but it takes quite a few more lines of code and is more difficult to build and maintain: low-level CloudFormation work.

It turns out there is a better and easier way: using an escape hatch to access the lower-level CfnBucket that is embedded in the L2 Bucket construct. For details see https://docs.aws.amazon.com/cdk/v2/guide/cfn_layer.html#develop-customize-escape.

With one simple line of code, we can create a const pointing to the lower-level CfnBucket embedded in the sourceBucket:

const cfnBucket = sourceBucket.node.defaultChild as s3.CfnBucket;

Then we can set the replication configuration on the source bucket. The console.info statements first show an undefined configuration and then the added configuration when running cdk synth or cdk deploy.

const cfnBucket = sourceBucket.node.defaultChild as s3.CfnBucket;
console.info('Cloud formation resource: ', cfnBucket.replicationConfiguration);
cfnBucket.replicationConfiguration = {
    role: replicationRole.roleArn,
    rules: [
        {
            destination: {
                bucket: targetBucket.bucketArn
            },
            status: "Enabled"
        }
    ]
};
console.info('Cloud formation resource: ', cfnBucket.replicationConfiguration);
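After cdk synth, the relevant part of the generated CloudFormation template should look roughly like this; the logical IDs in the Fn::GetAtt references are placeholders for the hash-suffixed IDs CDK generates:

"ReplicationConfiguration": {
    "Role": { "Fn::GetAtt": ["<ReplicationRoleLogicalId>", "Arn"] },
    "Rules": [
        {
            "Destination": {
                "Bucket": { "Fn::GetAtt": ["<TargetBucketLogicalId>", "Arn"] }
            },
            "Status": "Enabled"
        }
    ]
}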

The result in the AWS Console will look like this:

Check replication status using console

Upload a couple of files to the source bucket using the AWS console. Check the replication by refreshing the contents of the target bucket; it may take a short while for replication to complete.
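If you prefer the CLI, uploading a test file can also be done like this (the bucket name is a placeholder for your generated source bucket name):

aws s3 cp hello1.txt s3://<your-source-bucket>/hello1.txt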

contents of source bucket

contents of target bucket

The replication status of the source and target objects can be viewed by checking the details of the object. As an example I used hello1.txt.

The replication status of the object in the source bucket is COMPLETED; other possible values are PENDING and FAILED. The replication status of the object in the target bucket is REPLICA.

replication status of hello1.txt in source bucket

replication status of hello1.txt in target bucket

Check replication status using CLI

Replication status can also be checked using the CLI:

[cloudshell-user@ip-00-000-00-000 ~]$ aws s3api head-object \
    --bucket s3replicationimprovedstack-sourcebucket5c83c9d6-hffu2kwh9oop \
    --key hello1.txt

{
    "AcceptRanges": "bytes",
    "Expiration": "expiry-date=\"Mon, 08 Apr 2024 00:00:00 GMT\", rule-id=\"expirationLifeCycleRule\"",
    "LastModified": "2024-03-31T19:09:38+00:00",
    "ContentLength": 167,
    "ETag": "\"ad6519a04147294cfde21ed43fe4d54a\"",
    "VersionId": "90feKks1MJJ4iBO5OfZuwPynpMoEm6K8",
    "ContentType": "text/plain",
    "ServerSideEncryption": "AES256",
    "Metadata": {},
    "ReplicationStatus": "COMPLETED"
}
[cloudshell-user@ip-11-111-11-111 ~]$ aws s3api head-object \
    --bucket s3replicationimprovedstack-targetbucket19145fbb-gugdliipgzkk \
    --key hello1.txt

{
    "AcceptRanges": "bytes",
    "Expiration": "expiry-date=\"Mon, 08 Apr 2024 00:00:00 GMT\", rule-id=\"expirationLifeCycleRule\"",
    "LastModified": "2024-03-31T19:09:38+00:00",
    "ContentLength": 167,
    "ETag": "\"ad6519a04147294cfde21ed43fe4d54a\"",
    "VersionId": "90feKks1MJJ4iBO5OfZuwPynpMoEm6K8",
    "ContentType": "text/plain",
    "ServerSideEncryption": "AES256",
    "Metadata": {},
    "ReplicationStatus": "REPLICA"
}

Full code

Link to the full source code of the replication stack on GitHub: S3 Replication Stack

Avoid costs: clean up your stack!

If you decide to deploy and test the above stack, make sure to cdk destroy the stack when done to avoid costs. SSE-S3 encryption does not incur extra costs; SSE-KMS does, regardless of the free tier, so be careful with the encryption setting.

As long as you are within the AWS Free Tier you probably won't incur costs. Nevertheless, always clean up your experimental stacks. Better safe than sorry.
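Cleaning up is a single command from the project directory; thanks to autoDeleteObjects the buckets are emptied before removal:

# Tear down the stack and its buckets
cdk destroy

# Optionally verify the buckets are gone
aws s3 ls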