Testing Aurora DSQL write and read performance from AWS Lambda
Aurora DSQL is AWS's new serverless distributed database. To understand its performance characteristics in real-world workloads, I measured write and read performance from AWS Lambda.
This article examines the impact of parallelism, of physical distance between regions (cross-region access), and of DB connection reuse. It also verifies the write-performance tradeoff between single-region and multi-region configurations.
1. Test Environment
| Item | Value |
|---|---|
| Lambda runtime | Node.js 22.x / ARM64 (Graviton) |
| Lambda memory | 1,769MB (equivalent to 1 vCPU) |
| Lambda region | ap-northeast-1 (Tokyo) |
| DSQL clusters | ap-northeast-1 (Tokyo) / us-west-2 (Oregon) single-region, Tokyo+Seoul multi-region (witness: Osaka) |
| Parallel execution | Step Functions Map state (parallelism controlled with MaxConcurrency) |
| Connector | @aws/aurora-dsql-postgresjs-connector |
Test table definition:
```sql
CREATE TABLE perf_test (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  data TEXT,
  created_at TIMESTAMPTZ DEFAULT now()
);
```
2. Least Privilege Design with IAM Roles
The Step Functions role has only Lambda invocation permissions, and the Lambda role has only DSQL connection permissions.
Lambda Execution Role Policy
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "dsql:DbConnect",
      "Resource": [
        "arn:aws:dsql:ap-northeast-1:<account-id>:cluster/<cluster-id-tokyo>",
        "arn:aws:dsql:us-west-2:<account-id>:cluster/<cluster-id-oregon>"
      ]
    }
  ]
}
```
dsql:DbConnect is the action that allows connecting as a regular (non-admin) user. Limiting the Resource to specific cluster ARNs prevents connections to any other cluster.
Binding DB Roles to IAM
```sql
-- Connect as admin to execute
CREATE ROLE bench_user WITH LOGIN;
AWS IAM GRANT bench_user TO 'arn:aws:iam::<account-id>:role/<lambda-role-name>';
GRANT USAGE ON SCHEMA public TO bench_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO bench_user;
```
AWS IAM GRANT is DSQL-specific syntax that binds an IAM role ARN to a DB role. This configuration is required for each cluster (automatically propagated in multi-region configuration). GRANT ... ON ALL TABLES is only effective for tables that exist at the time of execution. Since DSQL doesn't support ALTER DEFAULT PRIVILEGES, GRANT needs to be reapplied when tables are added.
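Since GRANT ... ON ALL TABLES only covers tables that exist at execution time, grants must be reapplied whenever a table is added. A minimal sketch of that pattern, where the table names (and the idea of driving it from a deploy script) are hypothetical, not part of the measured setup:

```javascript
// Build per-table GRANT statements to reapply after creating new tables,
// since DSQL does not support ALTER DEFAULT PRIVILEGES.
// The table names below are hypothetical examples.
function grantStatements(role, tables) {
  return tables.map(
    (t) => `GRANT SELECT, INSERT, UPDATE, DELETE ON ${t} TO ${role};`
  );
}

const statements = grantStatements("bench_user", ["perf_test_v2", "audit_log"]);
console.log(statements[0]);
// → GRANT SELECT, INSERT, UPDATE, DELETE ON perf_test_v2 TO bench_user;
```

With an admin connection, each generated statement would then be executed against the cluster, e.g. via the connector's query interface, as part of the table-creation step.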
3. Lambda Function Implementation
```javascript
import { auroraDSQLPostgres } from "@aws/aurora-dsql-postgresjs-connector";

const ENDPOINTS = {
  "ap-northeast-1": process.env.DSQL_ENDPOINT_TYO,
  "us-west-2": process.env.DSQL_ENDPOINT_PDX,
};

// Cached across invocations of the same execution environment (warm starts)
let cachedSql = null;
let cachedRegion = null;

function getSql(region) {
  if (cachedSql && cachedRegion === region) return cachedSql;
  cachedSql = auroraDSQLPostgres({
    host: ENDPOINTS[region],
    username: "bench_user",
  });
  cachedRegion = region;
  return cachedSql;
}

export const handler = async (event) => {
  const op = event.op || "insert";
  const region = event.region || "ap-northeast-1";
  const cache = event.cache !== false;
  const start = Date.now();
  const sql = cache
    ? getSql(region)
    : auroraDSQLPostgres({ host: ENDPOINTS[region], username: "bench_user" });
  try {
    if (op === "insert") {
      const id = crypto.randomUUID(); // global WebCrypto in Node.js 22
      await sql`INSERT INTO perf_test (id, data) VALUES (${id}, ${"bench"})`;
    } else if (op === "select") {
      await sql`SELECT id, data FROM perf_test LIMIT 1`;
    }
    const ms = Date.now() - start;
    if (!cache) await sql.end();
    return { ok: true, op, region, cache, ms };
  } catch (e) {
    const ms = Date.now() - start;
    if (!cache) {
      try { await sql.end(); } catch {}
    }
    return { ok: false, op, region, cache, ms, error: e.message?.slice(0, 200) };
  }
};
```
cachedSql is stored in a global variable to reuse connections during warm starts. Since the official connector handles IAM authentication token generation and refresh transparently, there's no need to manage token expiration when reusing connections.
For comparison, passing false for event.cache makes the function open a new connection on every invocation.
4. Connection Reuse Effect
Parallelism 1
Comparison between without reuse (new DB connection each time) and with reuse (DB connection retained).
| Pattern | Without Reuse | With Reuse |
|---|---|---|
| Tokyo→Tokyo INSERT | 222ms | 21ms |
| Tokyo→Tokyo SELECT | — | 6ms |
| Tokyo→Oregon INSERT | 1,512ms | 106ms |
| Tokyo→Oregon SELECT | — | 102ms |
Without reuse, TCP/TLS handshakes and IAM token generation occur on every invocation. Cross-region, the high RTT is multiplied by the number of handshake round trips, pushing Tokyo→Oregon INSERT to 1,512ms. With reuse, Tokyo→Tokyo INSERT takes only 21ms, and even Oregon operations drop to 102-106ms, close to the pure RTT.
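A back-of-envelope model of this overhead, where the round-trip count is a rough assumption inferred from the measurements rather than something instrumented here:

```javascript
// First-query latency model: connection setup costs roughly
// (handshake round trips × RTT) before the query itself runs.
function estimatedFirstQueryMs(rttMs, handshakeRoundTrips, queryMs) {
  return rttMs * handshakeRoundTrips + queryMs;
}

// Tokyo→Oregon: ~100ms RTT and ~14 assumed setup round trips would account
// for the ~1,500ms observed without reuse, versus ~106ms with a warm connection.
console.log(estimatedFirstQueryMs(100, 14, 106)); // → 1506
```

The same model explains why Tokyo→Tokyo suffers far less without reuse: the per-round-trip cost is only a few milliseconds.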
Parallel Execution (Without Reuse)
Lambda functions executed in parallel using Step Functions Map state, with each Lambda connecting to and operating on DSQL once.
Tokyo→Tokyo
| Parallelism | op | med(ms) | p95(ms) | rps |
|---|---|---|---|---|
| 50 | INSERT | 162 | 612 | 28.3 |
| 100 | INSERT | 105 | 518 | 44.7 |
| 200 | INSERT | 95 | 263 | 48.7 |
| 50 | SELECT | 88 | 118 | 53.8 |
| 100 | SELECT | 88 | 143 | 48.8 |
| 200 | SELECT | 90 | 133 | 53.4 |
Without reuse, rps plateaus around 50.
Tokyo→Oregon
| Parallelism | op | med(ms) | p95(ms) | rps |
|---|---|---|---|---|
| 50 | INSERT | 1,338 | 1,473 | 21.9 |
| 100 | INSERT | 1,188 | 1,473 | 24.3 |
| 200 | INSERT | 1,165 | 1,408 | 27.3 |
| 50 | SELECT | 1,064 | 1,301 | 24.4 |
| 100 | SELECT | 1,062 | 1,302 | 27.8 |
| 200 | SELECT | 1,057 | 1,125 | 28.3 |
For Oregon, the ~100ms Tokyo-Oregon RTT multiplied by the handshake round trips is added to every request, pushing medians above 1,000ms.
Parallel Execution (With Reuse)
Results with connection reuse.
Tokyo→Tokyo
| Parallelism | op | med(ms) | p95(ms) | rps |
|---|---|---|---|---|
| 50 | INSERT | 10 | 12 | 48.4 |
| 100 | INSERT | 10 | 12 | 58.1 |
| 200 | INSERT | 9 | 11 | 63.3 |
| 50 | SELECT | 4 | 5 | 64.0 |
| 100 | SELECT | 4 | 5 | 63.9 |
| 200 | SELECT | 4 | 5 | 66.4 |
Tokyo→Oregon
| Parallelism | op | med(ms) | p95(ms) | rps |
|---|---|---|---|---|
| 50 | INSERT | 106 | 1,487 | 36.4 |
| 100 | INSERT | 106 | 107 | 62.3 |
| 200 | INSERT | 105 | 107 | 60.0 |
| 50 | SELECT | 100 | 102 | 62.3 |
| 100 | SELECT | 100 | 102 | 74.5 |
| 200 | SELECT | 100 | 101 | 69.6 |
With connection reuse, Tokyo→Tokyo INSERT improved to 9-10ms median, and SELECT to 4ms. Oregon operations also decreased to 100-106ms median, reflecting just the pure RTT.
5. Write Throughput
Each Lambda sequentially inserted 1,000 records, with varying degrees of parallelism. Connection reuse enabled, Tokyo→Tokyo configuration. Each parallelism level was executed sequentially to eliminate interference from concurrent execution.
| Parallelism | Total Records | wall(s) | Total rps | Single Lambda rps |
|---|---|---|---|---|
| 1 | 1,000 | 9.7 | 103 | 105 |
| 10 | 10,000 | 10.0 | 1,002 | 111 |
| 20 | 20,000 | 9.8 | 2,048 | 113 |
| 30 | 30,000 | 9.8 | 3,065 | 117 |
| 40 | 40,000 | 10.2 | 3,905 | 116 |
| 50 | 50,000 | 10.0 | 4,996 | 116 |
| 100 | 100,000 | 20.0 | 5,000 | 117 |
Single Lambda INSERT throughput held steady at about 115 rps and scaled almost linearly with parallelism, reaching about 5,000 rps at 50 parallel executions. Total rps not increasing at parallelism 100 is caused by internal batch splitting in the Step Functions Map state, not by a DSQL limit.
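The scaling arithmetic behind the table can be sketched as follows; per-Lambda rps is taken from the measurements above, and the overhead note is an assumption:

```javascript
// Linear-scaling estimate: each Lambda inserts its share of the records
// sequentially at a fixed per-Lambda rate.
function estimateWallSeconds(totalRecords, parallelism, perLambdaRps) {
  return totalRecords / parallelism / perLambdaRps;
}

// 50 Lambdas × 1,000 records at ~115 rps each ≈ 8.7s of pure insert time,
// close to the ~10s wall time observed (the gap being invocation overhead).
console.log(estimateWallSeconds(50000, 50, 115).toFixed(1)); // → 8.7
```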
6. Single vs Multi-Region
Performance differences between single-region and multi-region (Tokyo+Seoul, witness: Osaka) for INSERT/SELECT were verified. Two sets of clusters were created and measured separately to confirm reproducibility. Lambda connected from Tokyo to the Tokyo endpoint.
Cluster Creation Time
| Configuration | Time Required |
|---|---|
| Single-region | About 8 seconds |
| Multi-region | About 100-115 seconds |
Multi-region requires 3 steps (creating clusters in each region + mutual peering) and takes about 2 minutes to become ACTIVE.
INSERT (with reuse, Tokyo→Tokyo)
| Parallelism | Single med(ms) | Multi med(ms) | Ratio |
|---|---|---|---|
| 50 | 10 | 65 | 6.5x |
| 100 | 10 | 64 | 6.4x |
| 200 | 9 | 64 | 7.1x |
SELECT (with reuse, Tokyo→Tokyo)
| Parallelism | Single med(ms) | Multi med(ms) | Ratio |
|---|---|---|---|
| 50 | 4 | 4 | 1.0x |
| 100 | 4 | 4 | 1.0x |
| 200 | 4 | 4 | 1.0x |
INSERT became about 6-7 times slower in multi-region configuration. This is due to the synchronization overhead between regions during commits to ensure strong consistency. On the other hand, SELECT operations showed no difference as they are completed within the local region.
7. Summary
Though this was a small-scale verification with simple INSERT/SELECT operations on a single table, it clarified DSQL's performance characteristics and key points when using it from serverless environments.
The most important point is DB connection reuse. With reuse, INSERT operations take less than 10ms and SELECT operations take 4-6ms, improving by 10-20 times compared to connecting each time. Write throughput was also confirmed to scale almost linearly from a single Lambda (about 115 rps) to 5,000 rps with 50 parallel executions.
In multi-region configurations, INSERT operations become about 6-7 times slower due to synchronization overhead to ensure strong consistency. However, SELECT performance, which is completed locally, remains unchanged. Caution is needed regarding the tradeoff between availability and write performance.
An important note is DSQL's connection rate limit (100 connections/second, a hard limit that cannot be raised). When deploying with Lambda in production, connection reuse is essential, and using Provisioned Concurrency or controlling concurrent execution is crucial to avoid concentrated cold starts during spikes.
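With connection reuse, new connections occur only on cold starts, so the 100 connections/second limit effectively becomes a cold-start budget. A rough sizing check under assumed numbers — the spike size, per-Lambda throughput, and provisioned instance count below are all hypothetical:

```javascript
// Estimate how many cold starts (= new DSQL connections) a traffic spike
// triggers, given already-warm (e.g. provisioned) Lambda instances.
const CONNECTION_LIMIT_PER_SEC = 100; // DSQL hard limit

function coldStartConnections(spikeRps, perLambdaRps, warmInstances) {
  const needed = Math.ceil(spikeRps / perLambdaRps);
  return Math.max(0, needed - warmInstances);
}

// A 5,000 rps spike at ~115 rps per Lambda needs ~44 instances; with 40
// kept warm, only ~4 cold starts open new connections — well under 100.
console.log(coldStartConnections(5000, 115, 40)); // → 4
```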
When high throughput is needed in multi-region or cross-region scenarios, it's important to reduce direct access to DSQL. For writes, asynchronous bulk processing through a buffer such as SQS is effective. For reads, since read replicas can't be relied on, consider an overall cloud-native design using application-side caching or page caching with CloudFront.
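A minimal sketch of the SQS buffering pattern for writes. SendMessageBatch accepts at most 10 entries per call, so records are chunked first; the actual send via @aws-sdk/client-sqs is indicated only in a comment to keep the sketch self-contained, and the record shape is an assumption:

```javascript
// Chunk records into SQS batch-sized groups (max 10 entries per
// SendMessageBatch call).
function toBatches(records, size = 10) {
  const batches = [];
  for (let i = 0; i < records.length; i += size) {
    batches.push(records.slice(i, i + size));
  }
  return batches;
}

const records = Array.from({ length: 23 }, (_, i) => ({ data: `bench-${i}` }));
const batches = toBatches(records);
console.log(batches.length, batches[2].length); // → 3 3

// Producer side (sketch):
//   for (const batch of toBatches(records)) {
//     await sqs.send(new SendMessageBatchCommand({
//       QueueUrl: queueUrl,
//       Entries: batch.map((r, i) => ({ Id: String(i), MessageBody: JSON.stringify(r) })),
//     }));
//   }
// A consumer Lambda then drains the queue and performs multi-row INSERTs
// against DSQL over a reused connection.
```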


