Aurora DSQL Supports IDENTITY / SEQUENCE - I Also Verified Migration Methods for Existing Tables

Amazon Aurora DSQL now supports the long-awaited IDENTITY columns and SEQUENCE objects. External numbering with DynamoDB is no longer necessary; ID generation can now be completed entirely on the database side. This article covers the key points of the CACHE setting for maximizing throughput under parallel INSERTs, as well as a migration procedure for existing tables.
2026.02.14


On February 13, 2026, support for IDENTITY columns and SEQUENCE objects in Amazon Aurora DSQL was announced.

https://aws.amazon.com/jp/about-aws/whats-new/2026/02/amazon-aurora-dsql-adds-identity-columns-sequence/

This update enables automatic generation of integer IDs on the database side. Since I was implementing an alternative solution in a system under development, I'll introduce it along with actual use cases.

Update Overview

The following features are now available in Aurora DSQL:

  • GENERATED AS IDENTITY (IDENTITY columns)
  • CREATE SEQUENCE / nextval() (sequence objects)

Standard SQL syntax compatible with PostgreSQL can be used as is. Available in all regions where Aurora DSQL is provided.

Use Case: Numbering Issue in AI Tag Management Table

Table Structure

We're developing a system that uses LLM to automatically generate tags for blog posts. Since multiple tags are linked to one article (1:N), the tag table needed an independent primary key (surrogate key).

Column Name   Type        Description
id            BIGINT      Primary key (auto-numbered)
post_id       VARCHAR     Article ID
tag_id        VARCHAR     Tag ID (corresponds to the tag master)
created_at    TIMESTAMP   Creation date and time

Why Numbering Was an Issue

Aurora DSQL did not previously support SERIAL or SEQUENCE.

The method of "numbering with SELECT MAX(id) + 1" is an anti-pattern. It causes ID duplication with concurrent execution, and the scan cost of MAX(id) increases with the number of records. Inconsistencies can also occur depending on the transaction isolation level.

Previous Solution: Numbering Management with DynamoDB

We had set up a numbering management table in DynamoDB and were issuing IDs exclusively with Atomic Counter (ADD operation in UpdateItem).

DynamoDB was chosen because the ADD action in UpdateItem increments the counter atomically, allowing IDs to be issued safely without putting load on the DSQL side.

However, this increased dependencies on external services and made it complex to ensure transactional consistency between numbering and INSERT. Also, to ensure exclusive control of numbering, the tag assignment process was limited to one concurrent execution, preventing parallel processing and creating throughput challenges for tagging large numbers of articles. This was particularly time-consuming for batch operations like tagging tens of thousands of past articles.
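For reference, the old numbering step looked roughly like the following sketch. The table name `id_counters`, the key `counter_name`, and the attribute `current_value` are hypothetical stand-ins for our actual schema; the call itself is the standard DynamoDB `UpdateItem` with an `ADD` expression.

```python
# Sketch of the previous DynamoDB-based numbering (table and attribute
# names are illustrative). UpdateItem with ADD increments atomically,
# so concurrent callers never receive the same value.
def next_id(dynamodb_client, counter_name="post_tag"):
    """Atomically increment the counter and return the next ID."""
    resp = dynamodb_client.update_item(
        TableName="id_counters",
        Key={"counter_name": {"S": counter_name}},
        UpdateExpression="ADD current_value :one",
        ExpressionAttributeValues={":one": {"N": "1"}},
        ReturnValues="UPDATED_NEW",
    )
    return int(resp["Attributes"]["current_value"]["N"])
```

Every INSERT first had to make this round trip to DynamoDB, which is exactly the cross-service dependency that IDENTITY columns now eliminate.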

Before (DynamoDB numbering method):

After (IDENTITY method):

With numbering completed on the DB side, a numbering table becomes unnecessary. Parallel execution becomes possible, allowing throughput improvement through Step Functions Map processing and eliminating concurrent execution limits.

Practical Verification: Testing IDENTITY Columns

Basic Operation

I created a test table to verify automatic numbering, explicit ID specification, and bulk INSERT.

As an Aurora DSQL-specific constraint, an explicit CACHE size is mandatory; omitting it results in an error (CACHE must be 1 or at least 65536). This is likely designed so that each compute node can reserve sequence values in batches, avoiding global lock contention in the distributed architecture.

Before (previously): ID numbering managed by the application

CREATE TABLE post_tag (
    id BIGINT NOT NULL PRIMARY KEY,
    post_id VARCHAR NOT NULL,
    tag_id VARCHAR NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- INSERT with application-generated id
INSERT INTO post_tag (id, post_id, tag_id)
VALUES (12345, 'post-001', 'aws');

After (IDENTITY): ID automatically numbered by the DB

CREATE TABLE post_tag (
    id BIGINT GENERATED BY DEFAULT AS IDENTITY (CACHE 65536) PRIMARY KEY,
    post_id VARCHAR NOT NULL,
    tag_id VARCHAR NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Simply omit id for automatic numbering
INSERT INTO post_tag (post_id, tag_id)
VALUES ('post-001', 'aws')
RETURNING id;

※ tag_id corresponds to a key in the tag master table

When defining an IDENTITY column, you need to specify the numbering mode and CACHE size. I verified both.

GENERATED BY DEFAULT vs GENERATED ALWAYS

  • GENERATED BY DEFAULT — Automatic numbering, but allows explicit ID specification. Useful for migrating existing data
  • GENERATED ALWAYS — Always numbered by the DB. Specifying an ID results in an error
  • ALTER TABLE ... SET GENERATED ALWAYS / BY DEFAULT allows switching for existing tables

CACHE Performance Comparison

I compared CACHE 1 and CACHE 65536 with a bulk INSERT of 1000 records. The comparison results are as follows:

                      CACHE 1                CACHE 65536
Time required         6.597 seconds          0.061 seconds
Speed ratio           1x                     about 108x
ID sequentiality      Perfectly sequential   Sequential within a single session (gaps possible during failures)

With CACHE 65536, each node independently caches different ranges of IDs, so while IDs are unique, their generation order may not match their values. Records inserted earlier might have larger IDs, and large gaps can occur. If strict creation order sorting is needed, use created_at in conjunction.
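This interleaving can be illustrated with a toy simulation (plain Python, not DSQL internals): each "node" reserves a block of cached values up front, so alternating inserts from two nodes yield IDs that are unique but not monotonic across nodes.

```python
# Toy illustration of cached sequence blocks (not actual DSQL behavior,
# just the general idea): each node reserves a contiguous block and
# hands out values from it independently.
class CachedSequence:
    def __init__(self, cache=5):
        self.cache = cache
        self._next_block = 1

    def node(self):
        """Reserve a block of `cache` values for one compute node."""
        start = self._next_block
        self._next_block += self.cache
        return iter(range(start, start + self.cache))

seq = CachedSequence(cache=5)
a, b = seq.node(), seq.node()  # node A gets 1-5, node B gets 6-10
# Alternating inserts: every ID is unique, but values don't follow
# arrival order across nodes.
ids = [next(a), next(b), next(a), next(b)]
print(ids)  # [1, 6, 2, 7]
```

If a node fails with unused values in its block, those values are simply lost, which is where the gaps come from.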

CACHE 1 was slow because the sequence value is persisted for every single record. For surrogate-key usage where performance is the priority and strictly sequential numbering isn't required, CACHE 65536 is recommended. For massive bulk INSERTs this cache setting significantly impacts processing time, so choose it carefully based on your requirements.

Verification of Adding to Existing Tables

Can ALTER TABLE be Used?

First, I tried the simplest method.

ALTER TABLE post_tag
    ALTER COLUMN id ADD GENERATED BY DEFAULT AS IDENTITY (CACHE 65536);

The result: this is not supported in DSQL.

ERROR: unsupported ALTER TABLE ALTER COLUMN ... ADD IDENTITY statement

While possible in standard PostgreSQL, DSQL currently doesn't support this ALTER TABLE syntax.

Create New Table → Migrate Data → Rename Method

Since ALTER TABLE wasn't available, I verified using the method of creating a new table and migrating data.

-- 1. Create new table with IDENTITY
CREATE TABLE post_tag_new (
    id BIGINT GENERATED BY DEFAULT AS IDENTITY (CACHE 65536) PRIMARY KEY,
    post_id VARCHAR NOT NULL,
    tag_id VARCHAR NOT NULL,
    created_at TIMESTAMP
);

-- 2. Migrate existing data preserving IDs
INSERT INTO post_tag_new (id, post_id, tag_id, created_at)
SELECT id, post_id, tag_id, created_at
FROM post_tag;

-- 3. Set starting value
ALTER TABLE post_tag_new ALTER COLUMN id RESTART WITH <MAX(id) + 1>;

-- 4. Swap tables
ALTER TABLE post_tag RENAME TO post_tag_old;
ALTER TABLE post_tag_new RENAME TO post_tag;

Verification Results

Results from sampling and verifying 1000 production records:

Test Item                                          Result
ALTER TABLE ... ADD IDENTITY                       ✗ Not supported in DSQL
ID-preserving INSERT into the new IDENTITY table   ✓ Succeeded
Setting the start value with RESTART WITH          ✓ Succeeded
Auto-numbering after migration                     ✓ Starts after existing IDs
ID duplication                                     ✓ None
RENAME TABLE                                       ✓ Supported

With GENERATED BY DEFAULT, we could INSERT with explicit existing IDs, and after migration, automatic numbering began when omitting id. Table swapping was completed instantly with RENAME.

One caveat: Aurora DSQL has a hard limit of 3,000 rows mutated per transaction. My sample of 1,000 rows fit in a single transaction, but for large-scale migrations the application must split the data-copy step into commits of fewer than 3,000 rows each, for example by partitioning on ID ranges.

Maximum number of table and index rows that can be mutated in a transaction block

3,000 rows per transaction.

https://docs.aws.amazon.com/aurora-dsql/latest/userguide/CHAP_quotas.html
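The chunked copy could be sketched as follows, assuming a psycopg-style connection object; `BATCH_LIMIT` and the helper names are illustrative, not part of any DSQL API. Each ID range covers at most `BATCH_LIMIT` IDs, and since `id` is the primary key, that bounds the rows mutated per transaction.

```python
# Sketch of application-side chunked migration (assumed psycopg-style
# connection; names are illustrative). DSQL allows at most 3,000 mutated
# rows per transaction, so we copy in ID ranges safely below that.
BATCH_LIMIT = 2500  # safety margin under the 3,000-row quota

def id_ranges(min_id, max_id, batch=BATCH_LIMIT):
    """Yield inclusive (start, end) ID ranges of at most `batch` IDs."""
    start = min_id
    while start <= max_id:
        end = min(start + batch - 1, max_id)
        yield (start, end)
        start = end + 1

def migrate(conn):
    with conn.cursor() as cur:
        cur.execute("SELECT MIN(id), MAX(id) FROM post_tag")
        min_id, max_id = cur.fetchone()
    for start, end in id_ranges(min_id, max_id):
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO post_tag_new (id, post_id, tag_id, created_at) "
                "SELECT id, post_id, tag_id, created_at "
                "FROM post_tag WHERE id BETWEEN %s AND %s",
                (start, end),
            )
        conn.commit()  # one transaction per ID range stays under the quota
```

Gaps in the ID space only make some batches smaller than `BATCH_LIMIT`; they can never push a batch over it.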

Summary

Although adding IDENTITY via ALTER TABLE wasn't possible, migration using the new table + rename method was confirmed viable. Application changes only require omitting the id column from INSERT statements and removing numbering logic. The DynamoDB numbering table and related logic can be completely eliminated, also resolving operational constraints prohibiting concurrent execution.

The performance difference from the CACHE setting is significant, roughly 108x in my test, so it's important to choose an appropriate value for your use case.

This is likely a long-awaited update for those who had to set up external mechanisms for numbering. For the article tag management system introduced as a practical example, we plan to start migration as well.
