Testing Aurora DSQL write and read performance from AWS Lambda
Aurora DSQL is AWS's new serverless distributed database. To understand its performance characteristics in real-world workloads, I measured write and read performance from AWS Lambda.
This article examines the impact of parallelism, of physical distance between regions (cross-region access), and of DB connection reuse. It also verifies the write-performance tradeoff between single-region and multi-region configurations.
1. Test Environment
| Item | Value |
|---|---|
| Lambda runtime | Node.js 22.x / ARM64 (Graviton) |
| Lambda memory | 1,769MB (equivalent to 1 vCPU) |
| Lambda region | ap-northeast-1 (Tokyo) |
| DSQL clusters | ap-northeast-1 (Tokyo) / us-west-2 (Oregon) single-region, Tokyo+Seoul multi-region (witness: Osaka) |
| Parallel execution | Step Functions Map state (parallelism controlled with MaxConcurrency) |
| Connector | @aws/aurora-dsql-postgresjs-connector |
Test table definition:
```sql
CREATE TABLE perf_test (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  data TEXT,
  created_at TIMESTAMPTZ DEFAULT now()
);
```
2. Least Privilege Design with IAM Roles
The Step Functions role has only Lambda invocation permissions, and the Lambda role has only DSQL connection permissions.
Lambda Execution Role Policy
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "dsql:DbConnect",
      "Resource": [
        "arn:aws:dsql:ap-northeast-1:<account-id>:cluster/<cluster-id-tokyo>",
        "arn:aws:dsql:us-west-2:<account-id>:cluster/<cluster-id-oregon>"
      ]
    }
  ]
}
```
dsql:DbConnect is the action that allows connecting as a regular (non-admin) user. Limiting the Resource to specific cluster ARNs prevents connections to any other cluster.
Binding DB Roles to IAM
```sql
-- Connect as admin to execute
CREATE ROLE bench_user WITH LOGIN;
AWS IAM GRANT bench_user TO 'arn:aws:iam::<account-id>:role/<lambda-role-name>';
GRANT USAGE ON SCHEMA public TO bench_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO bench_user;
```
AWS IAM GRANT is DSQL-specific syntax that binds an IAM role ARN to a DB role. This configuration is required for each cluster (automatically propagated in multi-region configuration). GRANT ... ON ALL TABLES is only effective for tables that exist at the time of execution. Since DSQL doesn't support ALTER DEFAULT PRIVILEGES, GRANT needs to be reapplied when tables are added.
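Since GRANT ... ON ALL TABLES only covers tables that exist at execution time, grants must be reapplied whenever a table is added. A minimal sketch of that pattern, where the table names (and the idea of driving it from a deploy script) are hypothetical, not part of the measured setup:

```javascript
// Build per-table GRANT statements to reapply after creating new tables,
// since DSQL does not support ALTER DEFAULT PRIVILEGES.
// The table names below are hypothetical examples.
function grantStatements(role, tables) {
  return tables.map(
    (t) => `GRANT SELECT, INSERT, UPDATE, DELETE ON ${t} TO ${role};`
  );
}

const statements = grantStatements("bench_user", ["perf_test_v2", "audit_log"]);
console.log(statements[0]);
// → GRANT SELECT, INSERT, UPDATE, DELETE ON perf_test_v2 TO bench_user;
```

With an admin connection, each generated statement would then be executed against the cluster, e.g. via the connector's query interface, as part of the table-creation step.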
3. Lambda Function Implementation
```javascript
import { auroraDSQLPostgres } from "@aws/aurora-dsql-postgresjs-connector";

const ENDPOINTS = {
  "ap-northeast-1": process.env.DSQL_ENDPOINT_TYO,
  "us-west-2": process.env.DSQL_ENDPOINT_PDX,
};

// Cached across invocations of the same execution environment (warm starts)
let cachedSql = null;
let cachedRegion = null;

function getSql(region) {
  if (cachedSql && cachedRegion === region) return cachedSql;
  cachedSql = auroraDSQLPostgres({
    host: ENDPOINTS[region],
    username: "bench_user",
  });
  cachedRegion = region;
  return cachedSql;
}

export const handler = async (event) => {
  const op = event.op || "insert";
  const region = event.region || "ap-northeast-1";
  const cache = event.cache !== false;
  const start = Date.now();
  const sql = cache
    ? getSql(region)
    : auroraDSQLPostgres({ host: ENDPOINTS[region], username: "bench_user" });
  try {
    if (op === "insert") {
      const id = crypto.randomUUID(); // global WebCrypto in Node.js 22
      await sql`INSERT INTO perf_test (id, data) VALUES (${id}, ${"bench"})`;
    } else if (op === "select") {
      await sql`SELECT id, data FROM perf_test LIMIT 1`;
    }
    const ms = Date.now() - start;
    if (!cache) await sql.end();
    return { ok: true, op, region, cache, ms };
  } catch (e) {
    const ms = Date.now() - start;
    if (!cache) {
      try { await sql.end(); } catch {}
    }
    return { ok: false, op, region, cache, ms, error: e.message?.slice(0, 200) };
  }
};
```
cachedSql is stored in a global variable to reuse connections during warm starts. Since the official connector handles IAM authentication token generation and refresh transparently, there's no need to manage token expiration when reusing connections.
For comparison, passing false for event.cache makes the function open a new connection on every invocation.
4. Connection Reuse Effect
Parallelism 1
Comparison between without reuse (new DB connection each time) and with reuse (DB connection retained).
| Pattern | Without Reuse | With Reuse |
|---|---|---|
| Tokyo→Tokyo INSERT | 222ms | 21ms |
| Tokyo→Tokyo SELECT | — | 6ms |
| Tokyo→Oregon INSERT | 1,512ms | 106ms |
| Tokyo→Oregon SELECT | — | 102ms |
Without reuse, TCP/TLS handshakes and IAM token generation occur on every invocation. Cross-region, the high RTT is multiplied by the number of handshake round trips, pushing Tokyo→Oregon INSERT to 1,512ms. With reuse, Tokyo→Tokyo INSERT takes only 21ms, and even Oregon operations drop to 102-106ms, close to the pure RTT.
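A back-of-envelope model of this overhead, where the round-trip count is a rough assumption inferred from the measurements rather than something instrumented here:

```javascript
// First-query latency model: connection setup costs roughly
// (handshake round trips × RTT) before the query itself runs.
function estimatedFirstQueryMs(rttMs, handshakeRoundTrips, queryMs) {
  return rttMs * handshakeRoundTrips + queryMs;
}

// Tokyo→Oregon: ~100ms RTT and ~14 assumed setup round trips would account
// for the ~1,500ms observed without reuse, versus ~106ms with a warm connection.
console.log(estimatedFirstQueryMs(100, 14, 106)); // → 1506
```

The same model explains why Tokyo→Tokyo suffers far less without reuse: the per-round-trip cost is only a few milliseconds.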
Parallel Execution (Without Reuse)
Lambda functions executed in parallel using Step Functions Map state, with each Lambda connecting to and operating on DSQL once.
Tokyo→Tokyo
| Parallelism | op | med(ms) | p95(ms) | rps |
|---|---|---|---|---|
| 50 | INSERT | 162 | 612 | 28.3 |
| 100 | INSERT | 105 | 518 | 44.7 |
| 200 | INSERT | 95 | 263 | 48.7 |
| 50 | SELECT | 88 | 118 | 53.8 |
| 100 | SELECT | 88 | 143 | 48.8 |
| 200 | SELECT | 90 | 133 | 53.4 |
Without reuse, rps plateaus around 50.
Tokyo→Oregon
| Parallelism | op | med(ms) | p95(ms) | rps |
|---|---|---|---|---|
| 50 | INSERT | 1,338 | 1,473 | 21.9 |
| 100 | INSERT | 1,188 | 1,473 | 24.3 |
| 200 | INSERT | 1,165 | 1,408 | 27.3 |
| 50 | SELECT | 1,064 | 1,301 | 24.4 |
| 100 | SELECT | 1,062 | 1,302 | 27.8 |
| 200 | SELECT | 1,057 | 1,125 | 28.3 |
For Oregon, the ~100ms Tokyo-Oregon RTT multiplied by the handshake round trips is added to every request, pushing medians above 1,000ms.
Parallel Execution (With Reuse)
Results with connection reuse.
Tokyo→Tokyo
| Parallelism | op | med(ms) | p95(ms) | rps |
|---|---|---|---|---|
| 50 | INSERT | 10 | 12 | 48.4 |
| 100 | INSERT | 10 | 12 | 58.1 |
| 200 | INSERT | 9 | 11 | 63.3 |
| 50 | SELECT | 4 | 5 | 64.0 |
| 100 | SELECT | 4 | 5 | 63.9 |
| 200 | SELECT | 4 | 5 | 66.4 |
Tokyo→Oregon
| Parallelism | op | med(ms) | p95(ms) | rps |
|---|---|---|---|---|
| 50 | INSERT | 106 | 1,487 | 36.4 |
| 100 | INSERT | 106 | 107 | 62.3 |
| 200 | INSERT | 105 | 107 | 60.0 |
| 50 | SELECT | 100 | 102 | 62.3 |
| 100 | SELECT | 100 | 102 | 74.5 |
| 200 | SELECT | 100 | 101 | 69.6 |
With connection reuse, Tokyo→Tokyo INSERT improved to 9-10ms median, and SELECT to 4ms. Oregon operations also decreased to 100-106ms median, reflecting just the pure RTT.
5. Write Throughput
Each Lambda sequentially inserted 1,000 records, with varying degrees of parallelism. Connection reuse enabled, Tokyo→Tokyo configuration. Each parallelism level was executed sequentially to eliminate interference from concurrent execution.
| Parallelism | Total Records | wall(s) | Total rps | Single Lambda rps |
|---|---|---|---|---|
| 1 | 1,000 | 9.7 | 103 | 105 |
| 10 | 10,000 | 10.0 | 1,002 | 111 |
| 20 | 20,000 | 9.8 | 2,048 | 113 |
| 30 | 30,000 | 9.8 | 3,065 | 117 |
| 40 | 40,000 | 10.2 | 3,905 | 116 |
| 50 | 50,000 | 10.0 | 4,996 | 116 |
| 100 | 100,000 | 20.0 | 5,000 | 117 |
Single Lambda INSERT throughput held steady at about 115 rps and scaled almost linearly with parallelism, reaching about 5,000 rps at 50 parallel executions. Total rps not increasing at parallelism 100 is caused by internal batch splitting in the Step Functions Map state, not by a DSQL limit.
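The scaling arithmetic behind the table can be sketched as follows; per-Lambda rps is taken from the measurements above, and the overhead note is an assumption:

```javascript
// Linear-scaling estimate: each Lambda inserts its share of the records
// sequentially at a fixed per-Lambda rate.
function estimateWallSeconds(totalRecords, parallelism, perLambdaRps) {
  return totalRecords / parallelism / perLambdaRps;
}

// 50 Lambdas × 1,000 records at ~115 rps each ≈ 8.7s of pure insert time,
// close to the ~10s wall time observed (the gap being invocation overhead).
console.log(estimateWallSeconds(50000, 50, 115).toFixed(1)); // → 8.7
```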
6. Single vs Multi-Region
Performance differences between single-region and multi-region (Tokyo+Seoul, witness: Osaka) for INSERT/SELECT were verified. Two sets of clusters were created and measured separately to confirm reproducibility. Lambda connected from Tokyo to the Tokyo endpoint.
Cluster Creation Time
| Configuration | Time Required |
|---|---|
| Single-region | About 8 seconds |
| Multi-region | About 100-115 seconds |
Multi-region requires 3 steps (creating clusters in each region + mutual peering) and takes about 2 minutes to become ACTIVE.
INSERT (with reuse, Tokyo→Tokyo)
| Parallelism | Single med(ms) | Multi med(ms) | Ratio |
|---|---|---|---|
| 50 | 10 | 65 | 6.5x |
| 100 | 10 | 64 | 6.4x |
| 200 | 9 | 64 | 7.1x |
SELECT (with reuse, Tokyo→Tokyo)
| Parallelism | Single med(ms) | Multi med(ms) | Ratio |
|---|---|---|---|
| 50 | 4 | 4 | 1.0x |
| 100 | 4 | 4 | 1.0x |
| 200 | 4 | 4 | 1.0x |
INSERT became about 6-7 times slower in multi-region configuration. This is due to the synchronization overhead between regions during commits to ensure strong consistency. On the other hand, SELECT operations showed no difference as they are completed within the local region.
7. Summary
Though this was a small-scale verification with simple INSERT/SELECT operations on a single table, it clarified DSQL's performance characteristics and key points when using it from serverless environments.
The most important point is DB connection reuse. With reuse, INSERT operations take less than 10ms and SELECT operations take 4-6ms, improving by 10-20 times compared to connecting each time. Write throughput was also confirmed to scale almost linearly from a single Lambda (about 115 rps) to 5,000 rps with 50 parallel executions.
In multi-region configurations, INSERT operations become about 6-7 times slower due to synchronization overhead to ensure strong consistency. However, SELECT performance, which is completed locally, remains unchanged. Caution is needed regarding the tradeoff between availability and write performance.
An important note is DSQL's connection rate limit (100 connections/second, a hard limit that cannot be raised). When deploying with Lambda in production, connection reuse is essential, and using Provisioned Concurrency or controlling concurrent execution is crucial to avoid concentrated cold starts during spikes.
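With connection reuse, new connections occur only on cold starts, so the 100 connections/second limit effectively becomes a cold-start budget. A rough sizing check under assumed numbers — the spike size, per-Lambda throughput, and provisioned instance count below are all hypothetical:

```javascript
// Estimate how many cold starts (= new DSQL connections) a traffic spike
// triggers, given already-warm (e.g. provisioned) Lambda instances.
const CONNECTION_LIMIT_PER_SEC = 100; // DSQL hard limit

function coldStartConnections(spikeRps, perLambdaRps, warmInstances) {
  const needed = Math.ceil(spikeRps / perLambdaRps);
  return Math.max(0, needed - warmInstances);
}

// A 5,000 rps spike at ~115 rps per Lambda needs ~44 instances; with 40
// kept warm, only ~4 cold starts open new connections — well under 100.
console.log(coldStartConnections(5000, 115, 40)); // → 4
```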
When high throughput is needed in multi-region or cross-region scenarios, it's important to reduce direct access to DSQL. For writes, asynchronous bulk processing through a buffer such as SQS is effective. For reads, since read replicas can't be relied on, consider an overall cloud-native design using application-side caching or page caching with CloudFront.
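A minimal sketch of the SQS buffering pattern for writes. SendMessageBatch accepts at most 10 entries per call, so records are chunked first; the actual send via @aws-sdk/client-sqs is indicated only in a comment to keep the sketch self-contained, and the record shape is an assumption:

```javascript
// Chunk records into SQS batch-sized groups (max 10 entries per
// SendMessageBatch call).
function toBatches(records, size = 10) {
  const batches = [];
  for (let i = 0; i < records.length; i += size) {
    batches.push(records.slice(i, i + size));
  }
  return batches;
}

const records = Array.from({ length: 23 }, (_, i) => ({ data: `bench-${i}` }));
const batches = toBatches(records);
console.log(batches.length, batches[2].length); // → 3 3

// Producer side (sketch):
//   for (const batch of toBatches(records)) {
//     await sqs.send(new SendMessageBatchCommand({
//       QueueUrl: queueUrl,
//       Entries: batch.map((r, i) => ({ Id: String(i), MessageBody: JSON.stringify(r) })),
//     }));
//   }
// A consumer Lambda then drains the queue and performs multi-row INSERTs
// against DSQL over a reused connection.
```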


