I tried CRUD operations on the new S3 Annotations feature and cross-object search with Annotation Table + Athena

I tried out Amazon S3 Annotations, announced on June 16, 2026, using boto3 and the AWS CLI. I verified the Put/Get/List/Delete operations for Annotations, enabled the Annotation Table, and ran cross-object searches with SQL from Athena.

suzuki.ryo

2026.06.17


Introduction

On June 16, 2026, a new Amazon S3 feature called "S3 Annotations" was announced. This feature allows you to attach up to 1,000 custom metadata items (annotations), each up to 1 MB, to S3 objects.

A comparison with conventional S3 metadata is shown below.

Item	user-defined metadata	object tags	S3 Annotations
Maximum count	—	10	1,000
Size limit	2KB within request header	Key up to 128 chars, Value up to 256 chars	1MB each
Format	Key-Value (ASCII)	Key-Value	Any (JSON, text, etc.)
Update	Object re-PUT required	Individual update possible	Individual update possible
Lifecycle integration	—	✅	—
Access control integration	—	✅ (condition keys)	—
Cross-object search (Athena)	—	Via S3 Inventory	Annotation Table

Object tags integrate with lifecycle rules and IAM condition keys, and Annotations do not. Annotations are therefore not a superset of tags.
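The limits quoted above can be checked on the client before calling the API. The following is a minimal sketch and was not part of the verification in this article. The function name `validate_annotation` and both constants are hypothetical, and reading "1 MB" as 1,048,576 bytes is an assumption, since the exact byte limit is not stated here.

```python
MAX_ANNOTATIONS_PER_OBJECT = 1_000   # from the comparison table
MAX_ANNOTATION_BYTES = 1024 * 1024   # assumption: "1 MB" read as 1 MiB


def validate_annotation(payload: bytes, existing_count: int) -> None:
    """Raise ValueError if adding a NEW annotation would exceed the quoted limits."""
    if len(payload) > MAX_ANNOTATION_BYTES:
        raise ValueError(
            f"payload is {len(payload)} bytes; limit is {MAX_ANNOTATION_BYTES}"
        )
    if existing_count >= MAX_ANNOTATIONS_PER_OBJECT:
        raise ValueError(
            f"object already has {existing_count} annotations; "
            f"limit is {MAX_ANNOTATIONS_PER_OBJECT}"
        )


# A 100 KB payload is far beyond what a tag value can hold, but passes here.
validate_annotation(b"x" * 100_000, existing_count=2)
```

The count check only applies when the annotation name is new; overwriting an existing name would presumably not add to the count.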

In this article, we will verify CRUD operations for Annotations using boto3 and AWS CLI, enable the Annotation Table, and perform cross-object searches using Athena.

Verification Environment

Item	Value
Region	ap-northeast-1
Bucket type	General-purpose S3 bucket (versioning disabled)
Python	3.14 (rc1)
boto3	1.43.31 (released 2026-06-16)
botocore	1.43.31
AWS CLI	v2.35.6 (released 2026-06-17)
Athena	Engine version 3

The operations in this article are performed using boto3 and AWS CLI.

Annotation CRUD Operations (boto3)

Annotation-related methods available in boto3 1.43.31:

import boto3

s3 = boto3.client('s3', region_name='ap-northeast-1')
[m for m in dir(s3) if 'annot' in m.lower()]
# ['delete_object_annotation', 'get_object_annotation', 'list_object_annotations',
#  'put_object_annotation', 'update_bucket_metadata_annotation_table_configuration']

The following operations are performed with a test object test-object.txt already placed in the bucket.

import json

BUCKET = "my-annotation-demo-bucket"
KEY = "test-object.txt"

PutObjectAnnotation (JSON)

annotation_json = json.dumps({
    "project": "annotation-test",
    "owner": "demo-user",
    "created": "2026-06-17"
})

resp = s3.put_object_annotation(
    Bucket=BUCKET,
    Key=KEY,
    AnnotationName='test-metadata',
    AnnotationPayload=annotation_json.encode()
)

{
    'ETag': '"1fa459dad748f9fcc3be1e3dcc50ea82"',
    'Key': 'test-object.txt',
    'AnnotationName': 'test-metadata',
    'ResponseMetadata': {
        'RequestId': 'XXXXXXXXXXXX',
        'HostId': 'XXXXXXXXXXXX',
        'HTTPStatusCode': 200
    }
}

ResponseMetadata will be omitted from here on.

PutObjectAnnotation (Plain Text)

resp = s3.put_object_annotation(
    Bucket=BUCKET,
    Key=KEY,
    AnnotationName='ai-summary',
    AnnotationPayload=b'AI-generated summary: A test file for demonstrating S3 Annotations.'
)

{
    'ETag': '"403c26f2a55cdc54cf931b03be006b75"',
    'AnnotationName': 'ai-summary'
}
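Both Put examples pass bytes to AnnotationPayload, once from a JSON string and once from a bytes literal. A small wrapper can hide that conversion. This is a sketch under the assumption that the client exposes `put_object_annotation` with the parameter names shown in this article; `to_annotation_payload` and `put_annotation` are hypothetical helper names.

```python
import json
from typing import Union


def to_annotation_payload(value: Union[dict, list, str, bytes]) -> bytes:
    """Convert a Python value into bytes suitable for AnnotationPayload."""
    if isinstance(value, bytes):
        return value
    if isinstance(value, str):
        return value.encode("utf-8")
    # dict / list: serialize as compact JSON
    return json.dumps(value, separators=(",", ":"), ensure_ascii=False).encode("utf-8")


def put_annotation(client, bucket: str, key: str, name: str, value) -> dict:
    """Call put_object_annotation with the parameter names used in this article."""
    return client.put_object_annotation(
        Bucket=bucket,
        Key=key,
        AnnotationName=name,
        AnnotationPayload=to_annotation_payload(value),
    )
```

With the client from earlier, `put_annotation(s3, BUCKET, KEY, 'ai-summary', 'some text')` and `put_annotation(s3, BUCKET, KEY, 'test-metadata', {"project": "annotation-test"})` would go through the same path.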

ListObjectAnnotations

resp = s3.list_object_annotations(Bucket=BUCKET, Key=KEY)

{
    'AnnotationCount': 2,
    'Annotations': [
        {
            'AnnotationName': 'ai-summary',
            'Size': 67,
            'ETag': '"403c26f2a55cdc54cf931b03be006b75"',
            'LastModified': datetime(2026, 6, 17, 1, 37, 36, tzinfo=tzutc()),
            'ChecksumAlgorithm': ['CRC32']
        },
        {
            'AnnotationName': 'test-metadata',
            'Size': 78,
            'ETag': '"1fa459dad748f9fcc3be1e3dcc50ea82"',
            'LastModified': datetime(2026, 6, 17, 1, 37, 36, tzinfo=tzutc()),
            'ChecksumAlgorithm': ['CRC32']
        }
    ]
}

The ETag returned by List is the ETag of each annotation, not the ETag of the object itself.
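Because each entry carries its own name, size, and ETag, the List response can be turned into a small per-annotation index, for example to compare two listings and see which annotations changed. The sketch below uses a trimmed copy of the response shown above; `summarize_annotations` is a hypothetical helper name.

```python
def summarize_annotations(list_response: dict) -> dict:
    """Map each annotation name to its size and unquoted ETag."""
    return {
        item["AnnotationName"]: {"size": item["Size"], "etag": item["ETag"].strip('"')}
        for item in list_response.get("Annotations", [])
    }


# Trimmed copy of the List response shown above.
sample = {
    "AnnotationCount": 2,
    "Annotations": [
        {"AnnotationName": "ai-summary", "Size": 67,
         "ETag": '"403c26f2a55cdc54cf931b03be006b75"'},
        {"AnnotationName": "test-metadata", "Size": 78,
         "ETag": '"1fa459dad748f9fcc3be1e3dcc50ea82"'},
    ],
}

summary = summarize_annotations(sample)
total_bytes = sum(entry["size"] for entry in summary.values())
print(sorted(summary), total_bytes)
# ['ai-summary', 'test-metadata'] 145
```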

GetObjectAnnotation

resp = s3.get_object_annotation(
    Bucket=BUCKET,
    Key=KEY,
    AnnotationName='test-metadata'
)
body = resp['AnnotationPayload'].read().decode()

# body
'{"project": "annotation-test", "owner": "demo-user", "created": "2026-06-17"}'

# resp (excluding AnnotationPayload)
{
    'ETag': '"1fa459dad748f9fcc3be1e3dcc50ea82"',
    'ContentLength': 78
}

AnnotationPayload is of type StreamingBody, and the body is retrieved with .read().
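For JSON annotations, the read and the parse can be combined. The following is a sketch that assumes the response shape shown above (a stream under `AnnotationPayload`); `read_json_annotation` is a hypothetical helper name.

```python
import json


def read_json_annotation(client, bucket: str, key: str, name: str) -> dict:
    """Fetch one annotation and parse its payload as JSON."""
    resp = client.get_object_annotation(
        Bucket=bucket, Key=key, AnnotationName=name
    )
    # The payload is a stream, so read it once and keep the bytes.
    raw = resp["AnnotationPayload"].read()
    return json.loads(raw.decode("utf-8"))
```

Called as `read_json_annotation(s3, BUCKET, KEY, 'test-metadata')`, this would return a dict with the `project`, `owner`, and `created` fields. A plain-text annotation such as `ai-summary` would raise `json.JSONDecodeError` and should be read with `.read().decode()` instead.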

DeleteObjectAnnotation

resp = s3.delete_object_annotation(
    Bucket=BUCKET,
    Key=KEY,
    AnnotationName='ai-summary'
)

{}  # HTTPStatusCode: 204

After the deletion, list the annotations again to confirm.

resp = s3.list_object_annotations(Bucket=BUCKET, Key=KEY)

{
    'AnnotationCount': 1,
    'Annotations': [
        {
            'AnnotationName': 'test-metadata',
            'Size': 78,
            'ETag': '"1fa459dad748f9fcc3be1e3dcc50ea82"',
            'LastModified': datetime(2026, 6, 17, 1, 37, 36, tzinfo=tzutc()),
            'ChecksumAlgorithm': ['CRC32']
        }
    ]
}

We confirmed that ai-summary has been removed and only test-metadata remains.

Operations via AWS CLI (v2.35.6)

Annotation operation commands were added in AWS CLI v2.35.6 (released 2026-06-17). We introduce them along with the main differences from boto3.

PutObjectAnnotation

--annotation-payload is a streaming blob, so you pass a plain file path. The file:// and fileb:// prefixes cannot be used.

echo -n '{"source":"cli","version":"2.35.6"}' > /tmp/payload.txt

aws s3api put-object-annotation \
  --bucket my-annotation-demo-bucket \
  --key videos/sample.mp4 \
  --annotation-name "cli-test" \
  --annotation-payload /tmp/payload.txt \
  --region ap-northeast-1

{
    "ETag": "\"39ce0435575e8e057d4a919c727ffe0a\"",
    "ChecksumCRC64NVME": "SvqIamuCqI0=",
    "ChecksumType": "FULL_OBJECT",
    "ServerSideEncryption": "AES256",
    "Key": "videos/sample.mp4",
    "AnnotationName": "cli-test"
}
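When driving the CLI from a script, the payload has to exist as a file because only a plain path is accepted. The sketch below writes the payload to a temporary file and builds the argument list for the command shown above. It only constructs the command and does not run it; `build_put_annotation_command` is a hypothetical helper name.

```python
import tempfile
from pathlib import Path


def build_put_annotation_command(bucket, key, name, payload: bytes, region):
    """Write the payload to a temp file and return (argv, payload_path)."""
    # delete=False so the file still exists when the CLI runs; the caller removes it.
    with tempfile.NamedTemporaryFile(delete=False, suffix=".payload") as fh:
        fh.write(payload)
        payload_path = Path(fh.name)
    argv = [
        "aws", "s3api", "put-object-annotation",
        "--bucket", bucket,
        "--key", key,
        "--annotation-name", name,
        "--annotation-payload", str(payload_path),  # plain path, no file:// prefix
        "--region", region,
    ]
    return argv, payload_path


argv, path = build_put_annotation_command(
    "my-annotation-demo-bucket", "videos/sample.mp4", "cli-test",
    b'{"source":"cli","version":"2.35.6"}', "ap-northeast-1",
)
print(" ".join(argv[:3]))  # aws s3api put-object-annotation
path.unlink()  # remove the temp file once the command is no longer needed
```

To execute the command, pass `argv` to `subprocess.run(argv, check=True)` before deleting the file.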

GetObjectAnnotation

The output destination for the payload is specified as a positional argument (same pattern as s3api get-object).

aws s3api get-object-annotation \
  --bucket my-annotation-demo-bucket \
  --key videos/sample.mp4 \
  --annotation-name "cli-test" \
  --region ap-northeast-1 \
  /tmp/output.txt

cat /tmp/output.txt
# {"source":"cli","version":"2.35.6"}

ListObjectAnnotations / DeleteObjectAnnotation

# List
aws s3api list-object-annotations \
  --bucket my-annotation-demo-bucket \
  --key videos/sample.mp4 \
  --region ap-northeast-1

# Delete
aws s3api delete-object-annotation \
  --bucket my-annotation-demo-bucket \
  --key videos/sample.mp4 \
  --annotation-name "cli-test" \
  --region ap-northeast-1

Differences from boto3

Item	boto3	AWS CLI (v2.35.6)
Payload specification	`AnnotationPayload=bytes`	`--annotation-payload <filepath>` (`file://` not allowed)
Payload retrieval	`StreamingBody.read()`	Output destination file specified as positional argument
Checksum	CRC32 in this verification	CRC64NVME in this verification
Annotation on copy	—	Copied during `s3 cp/mv/sync` with `--copy-props all`

Copying with --copy-props all

The --copy-props all option added in v2.35.6 allows you to copy annotations, metadata, and tags together when copying between S3 locations.

aws s3 cp s3://my-annotation-demo-bucket/videos/sample.mp4 \
  s3://my-annotation-demo-bucket/videos/sample-copy.mp4 \
  --copy-props all \
  --region ap-northeast-1

Cross-Object Search with Annotation Table (Athena)

In addition to retrieving annotations for individual objects via API, you can perform cross-object SQL searches across all annotations in a bucket. We verified the procedure for enabling the S3 Metadata Annotation Table and querying it from Athena.

Relationship with S3 Metadata

The Annotation Table is an extension of the "S3 Metadata" infrastructure announced at re:Invent 2024. S3 Metadata previously provided a Journal Table that records object creation and deletion events, and an Inventory Table that is a snapshot of the object listing. With this S3 Annotations release, an Annotation Table has been added for cross-object searching of annotation payloads.

All of these are configured under the same MetadataConfiguration and stored on S3 Tables (Apache Iceberg).

Comparison with Conventional Architecture

Previously, when you wanted to store metadata with S3 objects and perform cross-object searches, the common approach was to save the data to an external database (such as DynamoDB). DevelopersIO has also introduced architectures of this kind.

Comparing these architectures with Annotations yields the following:

Perspective	Conventional architecture (S3 + Lambda + DynamoDB)	S3 Annotations
Metadata storage location	DynamoDB table	Attached directly to the S3 object
Synchronization mechanism	EventBridge → Lambda / Step Functions	No synchronization to external DB required (reflection to Annotation Table is asynchronous)
Additional components	Lambda, DynamoDB, EventBridge, etc.	No custom sync process or external DB required
Cross-object search	DynamoDB Query / Scan / GSI	Annotation Table + Athena
Latency	DynamoDB: millisecond-level	Athena: second-level (suited for batch)
Cost structure	Lambda execution + DynamoDB RCU/WCU	S3 API requests + Athena scan

Annotations allow you to significantly simplify the architecture for metadata management. On the other hand, if millisecond-level low-latency access is required, the conventional architecture is more appropriate.

For the following verification, we added annotations to multiple objects to demonstrate the usefulness of cross-object search. The targets are videos/sample.mp4, videos/another.mp4, and docs/report.pdf.
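The exact code used to attach these annotations is not shown in this article. The following is a reconstruction sketch: the payloads are taken from the Athena results that appear later in this article, `put_all` is a hypothetical helper name, and the client is assumed to expose `put_object_annotation` as in the CRUD section.

```python
import json

# Payloads matching the rows in the Athena results later in this article.
ANNOTATIONS = [
    ("videos/sample.mp4", "mediainfo",
     {"codec": "H.265", "resolution": "3840x2160", "audio_tracks": 12}),
    ("videos/another.mp4", "mediainfo",
     {"codec": "H.264", "resolution": "1920x1080", "audio_tracks": 2}),
    ("docs/report.pdf", "classification",
     {"category": "finance", "sensitivity": "internal"}),
]


def put_all(client, bucket: str, annotations=ANNOTATIONS) -> int:
    """Attach one JSON annotation per (key, name, payload) tuple; return the count."""
    for key, name, payload in annotations:
        client.put_object_annotation(
            Bucket=bucket,
            Key=key,
            AnnotationName=name,
            # Compact separators reproduce the text_value strings in the results.
            AnnotationPayload=json.dumps(payload, separators=(",", ":")).encode(),
        )
    return len(annotations)
```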

Creating an IAM Role

Create a service role that S3 Metadata assumes when reflecting annotation information to the Annotation Table.

Trust policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "metadata.s3.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

Permission policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObjectAnnotation",
        "s3:GetObjectVersionAnnotation",
        "s3:ListBucket",
        "s3:ListBucketVersions"
      ],
      "Resource": [
        "arn:aws:s3:::my-annotation-demo-bucket",
        "arn:aws:s3:::my-annotation-demo-bucket/*"
      ]
    }
  ]
}
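The two policy documents above can be applied with the IAM client. This is a sketch rather than the procedure used in this article: attaching the permissions as an inline policy is an assumption, `AnnotationTableAccess` is a hypothetical policy name, and the role name matches the ARN used in the next step.

```python
import json

TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "metadata.s3.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}


def permission_policy(bucket: str) -> dict:
    """Build the permission policy above for a given bucket name."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": [
                "s3:GetObjectAnnotation",
                "s3:GetObjectVersionAnnotation",
                "s3:ListBucket",
                "s3:ListBucketVersions",
            ],
            "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
        }],
    }


def create_annotation_role(iam, bucket: str,
                           role_name: str = "S3MetadataAnnotationRole") -> str:
    """Create the role, attach the inline policy, and return the role ARN."""
    role = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(TRUST_POLICY),
    )
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName="AnnotationTableAccess",  # hypothetical inline policy name
        PolicyDocument=json.dumps(permission_policy(bucket)),
    )
    return role["Role"]["Arn"]
```

Usage would look like `create_annotation_role(boto3.client('iam'), 'my-annotation-demo-bucket')`, and the returned ARN is the value passed as Role when enabling the Annotation Table.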

Enabling Metadata Configuration

s3.create_bucket_metadata_configuration(
    Bucket=BUCKET,
    MetadataConfiguration={
        'JournalTableConfiguration': {
            'RecordExpiration': {'Expiration': 'DISABLED'}
        },
        'InventoryTableConfiguration': {
            'ConfigurationState': 'DISABLED'
        },
        'AnnotationTableConfiguration': {
            'ConfigurationState': 'ENABLED',
            'Role': 'arn:aws:iam::123456789012:role/S3MetadataAnnotationRole'
        }
    }
)

We ran this with boto3 during the verification, and also confirmed that it can be run with AWS CLI v2.35.6.

Backfill and ACTIVE Confirmation

In this verification, the TableStatus immediately after creation was BACKFILLING. The process of reflecting existing annotations into the table runs at this stage.

resp = s3.get_bucket_metadata_configuration(Bucket=BUCKET)
config = resp['GetBucketMetadataConfigurationResult']['MetadataConfigurationResult']
print(config['AnnotationTableConfigurationResult']['TableStatus'])

BACKFILLING

In this verification, with a small-scale environment of 3 objects and 3 annotations, it took approximately 25 minutes after Metadata Configuration creation to reach ACTIVE.
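Since the backfill took tens of minutes even at this scale, polling the status from a script is more practical than checking by hand. The sketch below reuses the response path shown above. `wait_until_active` is a hypothetical helper name, and treating `FAILED` as a terminal status is an assumption based on the other S3 Metadata tables; it was not observed in this verification.

```python
import time


def wait_until_active(client, bucket: str, poll_seconds: float = 60,
                      timeout_seconds: float = 3600, sleep=time.sleep) -> str:
    """Poll the Annotation Table status until it is ACTIVE or the timeout expires."""
    waited = 0.0
    while True:
        resp = client.get_bucket_metadata_configuration(Bucket=bucket)
        config = resp["GetBucketMetadataConfigurationResult"]["MetadataConfigurationResult"]
        status = config["AnnotationTableConfigurationResult"]["TableStatus"]
        if status == "ACTIVE":
            return status
        if status == "FAILED":  # assumption: mirrors the other S3 Metadata tables
            raise RuntimeError("Annotation Table reported FAILED")
        if waited >= timeout_seconds:
            raise TimeoutError(f"still {status} after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds
```

The default timeout of one hour leaves some margin over the roughly 25 minutes observed here; larger buckets may need more.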

Creating a Federated Catalog in Glue Data Catalog

To query the Annotation Table from Athena, create a federated catalog for S3 Tables in Glue Data Catalog.

import boto3

glue = boto3.client('glue', region_name='ap-northeast-1')
glue.create_catalog(
    Name='s3tablescatalog',
    CatalogInput={
        'FederatedCatalog': {
            'Identifier': 'arn:aws:s3tables:ap-northeast-1:123456789012:bucket/*',
            'ConnectionName': 'aws:s3tables'
        },
        'CreateDatabaseDefaultPermissions': [
            {
                'Principal': {'DataLakePrincipalIdentifier': 'IAM_ALLOWED_PRINCIPALS'},
                'Permissions': ['ALL']
            }
        ],
        'CreateTableDefaultPermissions': [
            {
                'Principal': {'DataLakePrincipalIdentifier': 'IAM_ALLOWED_PRINCIPALS'},
                'Permissions': ['ALL']
            }
        ]
    }
)

Annotation Table Schema

When checking the Annotation Table in Athena after it became ACTIVE, the column structure was as follows:

Column	Description
bucket	Bucket name
object_key	Object key
object_version_id	Version ID (NULL in non-versioned environments)
name	Annotation name
last_modified_date	Annotation last modified date/time
size	Annotation size (bytes)
e_tag	Annotation ETag
checksum_algorithm	Checksum algorithm
text_value	Annotation payload (text)

The JSON/text format annotations created in this verification were stored as strings in the text_value column. For annotations saved as JSON strings, Athena's json_extract_scalar can be used to extract internal fields.

Athena Queries

The table path is "s3tablescatalog/aws-s3"."b_<bucket-name>"."annotation".
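Each part of the path contains characters such as `/` or `-`, so all three identifiers need double quotes in Athena SQL. A small helper, sketched here under the naming pattern stated above, keeps the quoting in one place; `annotation_table_name` is a hypothetical name.

```python
def annotation_table_name(bucket: str,
                          catalog: str = "s3tablescatalog/aws-s3") -> str:
    """Return the quoted three-part table name for a bucket's Annotation Table."""
    def quote(identifier: str) -> str:
        # Double any embedded double quote, then wrap in double quotes.
        return '"' + identifier.replace('"', '""') + '"'

    return ".".join([quote(catalog), quote(f"b_{bucket}"), quote("annotation")])


print(annotation_table_name("my-annotation-demo-bucket"))
# "s3tablescatalog/aws-s3"."b_my-annotation-demo-bucket"."annotation"
```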

Retrieve All Records

SELECT object_key, name, text_value
FROM "s3tablescatalog/aws-s3"."b_my-annotation-demo-bucket"."annotation"
LIMIT 10;

object_key	name	text_value
videos/sample.mp4	mediainfo	`{"codec":"H.265","resolution":"3840x2160","audio_tracks":12}`
videos/another.mp4	mediainfo	`{"codec":"H.264","resolution":"1920x1080","audio_tracks":2}`
docs/report.pdf	classification	`{"category":"finance","sensitivity":"internal"}`

We confirmed that all 3 annotations are stored.
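The same query can also be run from boto3 instead of the Athena console. The following is a sketch using the standard Athena client calls (`start_query_execution`, `get_query_execution`, `get_query_results`). `run_athena_query` is a hypothetical helper name, the output location is a placeholder you must supply, and only the first page of results is read.

```python
import time


def run_athena_query(athena, sql: str, output_location: str,
                     poll_seconds: float = 1.0, sleep=time.sleep) -> list:
    """Run a query and return the result rows as dicts keyed by column name."""
    query_id = athena.start_query_execution(
        QueryString=sql,
        ResultConfiguration={"OutputLocation": output_location},
    )["QueryExecutionId"]

    while True:
        status = athena.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state == "SUCCEEDED":
            break
        if state in ("FAILED", "CANCELLED"):
            raise RuntimeError(f"query {query_id} ended in state {state}")
        sleep(poll_seconds)

    rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
    # The first row holds the column names; the rest are data (first page only).
    header = [cell.get("VarCharValue") for cell in rows[0]["Data"]]
    return [
        dict(zip(header, (cell.get("VarCharValue") for cell in row["Data"])))
        for row in rows[1:]
    ]
```

Because the query uses a fully qualified table name, no default catalog or database needs to be set on the execution context in this sketch.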

Filter by JSON Field

You can use json_extract_scalar to specify conditions on JSON-format annotations. Filter by annotation name first, then extract the JSON field.

SELECT object_key, name, text_value
FROM "s3tablescatalog/aws-s3"."b_my-annotation-demo-bucket"."annotation"
WHERE name = 'mediainfo'
  AND CAST(json_extract_scalar(text_value, '$.audio_tracks') AS INTEGER) > 8;

object_key	name	text_value
videos/sample.mp4	mediainfo	`{"codec":"H.265","resolution":"3840x2160","audio_tracks":12}`

The condition audio_tracks > 8 correctly returned only 1 result. We confirmed that the contents of annotations attached to S3 objects can be cross-searched using SQL.
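To make the filter logic concrete, the same condition can be reproduced locally against the three rows returned earlier. This is only an illustration of what the SQL does, not how Athena evaluates it; `matching_keys` is a hypothetical helper name.

```python
import json

rows = [
    ("videos/sample.mp4", "mediainfo",
     '{"codec":"H.265","resolution":"3840x2160","audio_tracks":12}'),
    ("videos/another.mp4", "mediainfo",
     '{"codec":"H.264","resolution":"1920x1080","audio_tracks":2}'),
    ("docs/report.pdf", "classification",
     '{"category":"finance","sensitivity":"internal"}'),
]


def matching_keys(rows, name: str, field: str, threshold: int) -> list:
    """Return object keys whose annotation `name` has JSON `field` > threshold."""
    result = []
    for object_key, annotation_name, text_value in rows:
        if annotation_name != name:
            continue  # same role as WHERE name = 'mediainfo'
        value = json.loads(text_value).get(field)
        if value is not None and int(value) > threshold:
            result.append(object_key)
    return result


print(matching_keys(rows, "mediainfo", "audio_tracks", 8))
# ['videos/sample.mp4']
```

Filtering by annotation name first keeps payloads with a different schema, such as `classification`, out of the JSON extraction step.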

Summary

We performed Put / Get / List / Delete operations for Annotations and cross-object searches using Annotation Table + Athena, all via boto3 and AWS CLI. With boto3 1.43.31 / botocore 1.43.31 and AWS CLI v2.35.6 used in this verification, we were able to execute the Annotation CRUD operations covered in this article.

S3 Annotations is a mechanism that allows you to attach larger amounts of information to S3 objects than conventional user-defined metadata or object tags. In this verification, we confirmed that JSON strings and text can be saved as annotations and retrieved via individual APIs or CLI.

Additionally, by enabling the Annotation Table, we were able to query saved annotations via SQL from Athena. For annotations saved as JSON strings, it was also possible to search using internal fields as conditions with json_extract_scalar.

Previously, cross-searching metadata associated with S3 objects often meant managing it separately with a combination of Lambda, DynamoDB, and other services. With S3 Annotations and the Annotation Table, depending on the use case, you can go from attaching metadata to cross-object search with Athena without building your own synchronization infrastructure or external DB.

However, reflection to the Annotation Table is asynchronous. Even in this small-scale verification environment, it took approximately 25 minutes from Metadata Configuration creation to reaching ACTIVE. If millisecond-level low-latency access or searches using GSI and similar features are required, conventional architectures such as DynamoDB may still be more appropriate.

Annotation operation commands were added in AWS CLI v2.35.6, and Put / Get / List / Delete could be executed from the CLI as well. Being able to use it not only from the SDK but also from the CLI makes verification and scripting easier.
