Aurora DSQL Python Connector has been released, so I measured and compared performance in Lambda (psycopg2 vs asyncpg)

A quick test of the official Aurora DSQL Python connector on Lambda! A performance comparison of psycopg2 and asyncpg, plus real measurement data for x86 vs Graviton. We've also compiled connection-reuse techniques to eliminate connection overhead and strategies to avoid the "connection rate limit" faced in production environments.
2026.02.20

On February 19, 2026, official connectors for Aurora DSQL were released for three languages: Go, Python, and Node.js.

https://aws.amazon.com/jp/about-aws/whats-new/2026/02/aurora-dsql-launches-go-python-nodejs-connectors/

https://dev.classmethod.jp/articles/aurora-dsql-launches-go-python-nodejs-connectors/

The Python package aurora-dsql-python-connector supports three drivers (psycopg / psycopg2 / asyncpg) and is a transparent authentication layer that automates IAM authentication token generation when connecting to DSQL.

Until now, to connect to DSQL from Python, you needed to create a DSQL client with boto3, generate a token with generate_db_connect_admin_auth_token, and pass it as the password to psycopg2.connect.

import psycopg2
import boto3

def get_dsql_connection():
    # generate a short-lived IAM auth token and pass it as the connection password
    client = boto3.client('dsql', region_name='us-west-2')
    token = client.generate_db_connect_admin_auth_token(DSQL_ENDPOINT, 'us-west-2')
    return psycopg2.connect(
        host=DSQL_ENDPOINT, port=5432, database='postgres',
        user='admin', password=token, sslmode='require'
    )

With the new connector, this becomes a single line:

import aurora_dsql_psycopg2 as dsql

conn = dsql.connect(host=DSQL_ENDPOINT, region='us-west-2', user='admin')

The boto3 call, token generation, and sslmode specification are all no longer necessary. Similarly, with asyncpg, you can simply import aurora_dsql_asyncpg to use it.
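
For reference, the asyncpg flavor looks like this (the same connect call appears in the connection-reuse example later in this article; it must run inside a coroutine):

import aurora_dsql_asyncpg as dsql

# inside an async function
conn = await dsql.connect(host=DSQL_ENDPOINT, region='us-west-2', user='admin')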

Motivation: Why Test on Lambda?

Following the release of the new connectors, I wanted to see how much they would improve Lambda workloads that use psycopg2. Our environment had three challenges:

- We relied on a psycopg2-binary Lambda layer built for x86 and couldn't migrate to arm64 (Graviton).
- Some of our Lambdas run for up to 230 minutes via Step Functions, potentially exceeding DSQL's maximum connection lifetime (60 minutes) and hitting token expiration.
- The same get_dsql_connection() function was copied across multiple files, hurting maintainability.

Since the new connector also supports asyncpg, I decided to measure the improvements, including the effect of switching drivers, in a production-like Lambda environment.

Note that while the new connector also supports psycopg (v3), this test focused on psycopg2 and asyncpg, assuming migration from an existing environment.

Testing Method

Test Configuration

I created two Lambda functions using SAM: one for x86_64 and one for arm64 (Graviton), each with layers containing psycopg2 and asyncpg. Runtime was Python 3.12, memory 256 MB, and the DSQL cluster was in us-west-2, same as the Lambda. After testing, the stack could be removed with sam delete.

Test Patterns

Six patterns were measured:

| # | Pattern | Architecture | Each iteration |
|---|---------|--------------|----------------|
| 1 | psycopg2 (connect each time) | x86_64 | Auto token generation → TCP → TLS → PG auth → SQL → disconnect |
| 2 | asyncpg (connect each time) | x86_64 | Same as pattern 1, with the asyncpg driver |
| 3 | asyncpg (connection reuse) | x86_64 | Connect once initially, then reuse the same connection |
| 4 | psycopg2 (connect each time) | arm64 | Pattern 1 run on Graviton |
| 5 | asyncpg (connect each time) | arm64 | Pattern 2 run on Graviton |
| 6 | asyncpg (connection reuse) | arm64 | Pattern 3 run on Graviton |

The "connect each time" patterns are intentional anti-patterns to accurately measure the overhead per connection establishment, not intended for production use.

Measurement Method

Using SELECT 1 as the query minimized processing on the DB side. Each Lambda invoke performed 30 iterations, measuring "connection establishment time" and "query execution time" separately with time.perf_counter(). Multiple invokes were done for each architecture (x86: 6 times, arm64: 3 times), and the median values were used.

Measurements were taken in a warm-start handler loop, excluding Init Duration. "Connection time" includes IAM token generation + TCP connection + TLS handshake + PostgreSQL authentication.
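
Concretely, each iteration splits the two timings like this (a condensed excerpt of the psycopg2 path in the full handler listed at the end of this article, with variables renamed for clarity):

t0 = time.perf_counter()
conn = dsql.connect(host=DSQL_ENDPOINT, region=DSQL_REGION, user='admin')
t1 = time.perf_counter()
connect_times.append(t1 - t0)   # token generation + TCP + TLS + PG auth

cur = conn.cursor()
cur.execute('SELECT 1')
cur.fetchall()
query_times.append(time.perf_counter() - t1)   # query round-trip only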

Reference: Cold Start

| | x86_64 | arm64 |
|---|---|---|
| Init Duration | 149 ms | 123 ms |
| First connection | ~1.0-1.5 s | ~1.0 s |
| Cold start total | ~1.2-1.7 s | ~1.1-1.2 s |

These times provide ample margin within API Gateway's default timeout (29 seconds).

(Conceptual diagram in the original post: Lambda lifecycle and connection establishment timing.)

This test measured the loop inside the handler during warm starts.

Test Results

x86_64 Lambda (median, 6 runs)

| Pattern | Connection (avg) | Query (avg) |
|---|---|---|
| psycopg2 (connect each time) | 873.0 ms | 12.95 ms |
| asyncpg (connect each time) | 807.7 ms | 7.53 ms |
| asyncpg (connection reuse) | 0.02 ms | 2.50 ms |

arm64 (Graviton) Lambda (median, 3 runs)

| Pattern | Connection (avg) | Query (avg) |
|---|---|---|
| psycopg2 (connect each time) | 673.6 ms | 12.56 ms |
| asyncpg (connect each time) | 633.1 ms | 6.67 ms |
| asyncpg (connection reuse) | 0.02 ms | 2.44 ms |

Analysis

Driver Comparison: Small Connection Differences, Clear Query Differences

The difference in connection establishment time between drivers was small: 7% on x86 and 6% on arm64. Most of the connection time goes to work common to both drivers (IAM token generation via SigV4 signing in boto3, the TLS handshake, and PostgreSQL authentication), so driver-specific processing accounts for only a fraction.

Query execution time, however, showed a clear difference: asyncpg was 42% faster on x86 and 47% faster on arm64. This is attributed to asyncpg's use of PostgreSQL's binary protocol and its Cython-based result parsing.

That said, in Lambda environments connection establishment (600-870 ms) dominates total processing time, so the few-millisecond difference in query time matters relatively little.

Connection Reuse: Connection Cost Reduced to Zero

With connection reuse, connection acquisition time dropped from 600-870 ms to 0.02 ms. Reusing an established connection skips token generation, the TCP/TLS handshake, and PostgreSQL authentication entirely.

The implementation used asyncpg.create_pool(min_size=1), but since a Lambda execution environment handles one request at a time, this is essentially equivalent to storing a single connection in a global variable. In Lambda, it's simpler to keep the connection in global scope:

import aurora_dsql_asyncpg as dsql

conn = None

async def handler(event, context):
    global conn
    if conn is None or conn.is_closed():
        conn = await dsql.connect(host=DSQL_ENDPOINT, region='us-west-2', user='admin')
    row = await conn.fetchrow("SELECT ...")
    return row

During warm starts, global variables are preserved, allowing connection establishment to be skipped for subsequent invokes.

Architecture Comparison: arm64 was 22-23% Faster for Connection Establishment

Reductions of 23% (873→674ms) for psycopg2 and 22% (808→633ms) for asyncpg were observed. Graviton likely processes CPU-bound tasks like IAM token generation and signature calculation more efficiently. Query execution times were nearly identical as they are dominated by network I/O.

Operational Considerations for Lambda × DSQL

Connection Rate Limit: The Biggest Pitfall

DSQL has a hard limit of 100 connections per second. When Lambda scales out with spike traffic, each container tries to establish a connection simultaneously, quickly hitting this limit. Since DSQL doesn't have an intermediary like RDS Proxy, control on the Lambda side is essential.

Countermeasures include limiting simultaneous execution with Reserved Concurrency, retrying with exponential backoff on connection errors, and using SQS or EventBridge to absorb spikes.
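
As a sketch of the backoff countermeasure, a retry wrapper might look like the following (error handling is simplified; in real code, catch only your driver's connection errors rather than bare Exception, and DSQL_ENDPOINT is the constant from the earlier examples):

import random
import time

import aurora_dsql_psycopg2 as dsql

def connect_with_backoff(max_attempts=5, base_delay=0.2):
    """Retry DSQL connection establishment with exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return dsql.connect(host=DSQL_ENDPOINT, region='us-west-2', user='admin')
        except Exception:
            if attempt == max_attempts - 1:
                raise
            # full jitter: sleep a random amount in [0, base_delay * 2^attempt]
            time.sleep(random.uniform(0, base_delay * (2 ** attempt)))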

Here's a summary of DSQL constraints and their impact on Lambda:

| DSQL Constraint | Value | Impact on Lambda |
|---|---|---|
| Connection rate | 100 connections/sec | Most dangerous during concurrent scale-out |
| Maximum connection lifetime | 60 minutes | Requires reconnection for long-running batches |
| Maximum transaction time | 5 minutes | Large INSERTs need to be split |
| Maximum rows per transaction | 3,000 rows | Affects batch size design |
| Maximum connections | 10,000 | Needs consideration for large-scale Lambda operations |
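
For the 60-minute connection lifetime in particular, a long-running batch can track connection age and reconnect proactively. A minimal sketch (the 55-minute threshold is an arbitrary safety margin I chose, not an official recommendation):

import time

import aurora_dsql_psycopg2 as dsql

MAX_CONN_AGE_SEC = 55 * 60  # reconnect before DSQL's 60-minute lifetime

_conn = None
_connected_at = 0.0

def get_connection():
    """Return a live connection, replacing it before the lifetime limit hits."""
    global _conn, _connected_at
    if _conn is None or _conn.closed or time.time() - _connected_at > MAX_CONN_AGE_SEC:
        if _conn is not None and not _conn.closed:
            _conn.close()
        _conn = dsql.connect(host=DSQL_ENDPOINT, region='us-west-2', user='admin')
        _connected_at = time.time()
    return _conn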

Should You Adopt asyncpg?

While asyncpg query execution was 42-47% faster, Lambda adoption decisions aren't based on speed alone.

asyncpg is suitable when you want to break free from x86 psycopg2 layer dependencies for arm64 migration, execute multiple queries in parallel with asyncio.gather, or when connection reuse is important for batch processing.
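
For the parallel case, note that a single asyncpg connection cannot run queries concurrently, so each coroutine needs its own connection from a pool. A minimal sketch using create_pool as in the benchmark code (the table names are placeholders):

import asyncio

import aurora_dsql_asyncpg as dsql

async def fetch_dashboard():
    # small pool so that concurrent queries each get their own connection
    pool = await dsql.create_pool(
        host=DSQL_ENDPOINT, region='us-west-2', user='admin',
        min_size=1, max_size=3)
    try:
        async def run(sql):
            async with pool.acquire() as conn:
                return await conn.fetch(sql)

        # the three queries run concurrently on separate connections
        return await asyncio.gather(
            run('SELECT count(*) FROM users'),
            run('SELECT count(*) FROM orders'),
            run('SELECT count(*) FROM events'))
    finally:
        await pool.close()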

On the other hand, for simple processes with 1-2 queries per request, the psycopg2 connector is sufficient. It has lower adoption costs as SQL doesn't need to be rewritten, and you still get the benefits of eliminating manual token generation and reducing boilerplate code.

Considerations When Migrating to asyncpg

When adopting asyncpg, several code changes are required.

SQL placeholders change from %s to $1, $2, ..., and parameters are passed as positional arguments rather than tuples:

# psycopg2
cur.execute("SELECT * FROM t WHERE id = %s AND status = %s", (article_id, 'pending'))
row = cur.fetchone()

# asyncpg
row = await conn.fetchrow("SELECT * FROM t WHERE id = $1 AND status = $2", article_id, 'pending')

Method names change accordingly: fetchone → fetchrow and fetchall → fetch.

Transaction handling also differs. psycopg2 defaults to autocommit=False, implicitly starting a transaction, while asyncpg defaults to autocommit. To use transactions explicitly with asyncpg, wrap operations in async with conn.transaction():

# psycopg2
conn.commit()

# asyncpg
async with conn.transaction():
    await conn.execute("INSERT ...")
    await conn.execute("UPDATE ...")

asyncpg's default autocommit behavior is safer for DSQL's 5-minute transaction time limit.

Conclusion

Through testing on Lambda, I confirmed the effectiveness of the new connectors.

Connection establishment costs in Lambda environments are significant (600-870ms) and can be reduced to zero with connection reuse. arm64 speeds up connection establishment by 22-23% compared to x86, with an expected 20% cost reduction. The connector's automatic IAM token generation also eliminates authentication expiration risks in long-running batches.

If you're looking for a way to connect to DSQL, I recommend starting with the psycopg2 connector, which can be introduced without changing existing code. Consider upgrading to the asyncpg connector if you're planning an arm64 migration or batch processing optimization.

Note that these measurements were taken in a warm-start environment within the same region (us-west-2) and represent median values from 30 iterations × multiple invokes. Results may vary depending on network environment and DSQL cluster load conditions.

Reference: Test Code

Here are the SAM template and Lambda handler used for testing. You can deploy with sam build && sam deploy, run with sam remote invoke, and clean up with sam delete.

template.yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: Aurora DSQL Connector Benchmark on Lambda

Parameters:
  DsqlEndpoint:
    Type: String
  DsqlRegion:
    Type: String
    Default: us-west-2
  BenchIterations:
    Type: String
    Default: '30'

Globals:
  Function:
    Runtime: python3.12
    Timeout: 300
    MemorySize: 256
    Environment:
      Variables:
        DSQL_ENDPOINT: !Ref DsqlEndpoint
        DSQL_REGION: !Ref DsqlRegion
        BENCH_ITERATIONS: !Ref BenchIterations

Resources:
  DepsX86Layer:
    Type: AWS::Serverless::LayerVersion
    Properties:
      LayerName: dsql-bench-deps-x86
      # layers/deps-x86/ bundles both connector flavors used by the handler
      # (aurora_dsql_psycopg2 and aurora_dsql_asyncpg), built for x86_64
      ContentUri: layers/deps-x86/
      CompatibleRuntimes: [python3.12]
      CompatibleArchitectures: [x86_64]
    Metadata:
      BuildMethod: python3.12
      BuildArchitecture: x86_64

  DepsArm64Layer:
    Type: AWS::Serverless::LayerVersion
    Properties:
      LayerName: dsql-bench-deps-arm64
      # same dependencies, built for arm64
      ContentUri: layers/deps-arm64/
      CompatibleRuntimes: [python3.12]
      CompatibleArchitectures: [arm64]
    Metadata:
      BuildMethod: python3.12
      BuildArchitecture: arm64

  BenchX86:
    Type: AWS::Serverless::Function
    Properties:
      FunctionName: dsql-bench-x86
      Handler: handler.lambda_handler
      CodeUri: src/
      Architectures: [x86_64]
      Layers: [!Ref DepsX86Layer]
      Policies:
        - Statement:
            - Effect: Allow
              Action: dsql:DbConnectAdmin
              Resource: !Sub arn:aws:dsql:${DsqlRegion}:${AWS::AccountId}:cluster/*

  BenchArm64:
    Type: AWS::Serverless::Function
    Properties:
      FunctionName: dsql-bench-arm64
      Handler: handler.lambda_handler
      CodeUri: src/
      Architectures: [arm64]
      Layers: [!Ref DepsArm64Layer]
      Policies:
        - Statement:
            - Effect: Allow
              Action: dsql:DbConnectAdmin
              Resource: !Sub arn:aws:dsql:${DsqlRegion}:${AWS::AccountId}:cluster/*
handler.py
import asyncio
import json
import os
import platform
import statistics
import time

DSQL_ENDPOINT = os.environ['DSQL_ENDPOINT']
DSQL_REGION = os.environ.get('DSQL_REGION', 'us-west-2')
ITERATIONS = int(os.environ.get('BENCH_ITERATIONS', '30'))
QUERY = 'SELECT 1'
ARCH = platform.machine()

def fmt(s):
    return round(s * 1000, 2)

def calc_stats(connect_times, query_times):
    n = len(connect_times)
    total = sum(connect_times) + sum(query_times)
    return {
        'connect_avg_ms': fmt(statistics.mean(connect_times)),
        'connect_p50_ms': fmt(statistics.median(connect_times)),
        'connect_p95_ms': fmt(sorted(connect_times)[int(n * 0.95)]),
        'query_avg_ms': fmt(statistics.mean(query_times)),
        'query_p50_ms': fmt(statistics.median(query_times)),
        'total_ms': fmt(total),
        'qps': round(n / total, 1),
    }

def bench_psycopg2():
    import aurora_dsql_psycopg2 as dsql
    ct, qt = [], []
    for _ in range(ITERATIONS):
        t0 = time.perf_counter()
        conn = dsql.connect(host=DSQL_ENDPOINT, region=DSQL_REGION, user='admin')
        t1 = time.perf_counter()
        ct.append(t1 - t0)
        cur = conn.cursor()
        cur.execute(QUERY)
        cur.fetchall()
        qt.append(time.perf_counter() - t1)
        cur.close()
        conn.close()
    return {'pattern': 'psycopg2 (every connect)', **calc_stats(ct, qt)}

async def bench_asyncpg_no_pool():
    import aurora_dsql_asyncpg as dsql
    ct, qt = [], []
    for _ in range(ITERATIONS):
        t0 = time.perf_counter()
        conn = await dsql.connect(host=DSQL_ENDPOINT, region=DSQL_REGION, user='admin')
        t1 = time.perf_counter()
        ct.append(t1 - t0)
        await conn.fetch(QUERY)
        qt.append(time.perf_counter() - t1)
        await conn.close()
    return {'pattern': 'asyncpg (every connect)', **calc_stats(ct, qt)}

async def bench_asyncpg_reuse():
    import aurora_dsql_asyncpg as dsql
    t0 = time.perf_counter()
    pool = await dsql.create_pool(
        host=DSQL_ENDPOINT, region=DSQL_REGION, user='admin',
        min_size=1, max_size=3)
    pool_create = time.perf_counter() - t0
    ct, qt = [], []
    for _ in range(ITERATIONS):
        t1 = time.perf_counter()
        async with pool.acquire() as conn:
            t2 = time.perf_counter()
            ct.append(t2 - t1)
            await conn.fetch(QUERY)
            qt.append(time.perf_counter() - t2)
    await pool.close()
    stats = calc_stats(ct, qt)
    stats['pool_create_ms'] = fmt(pool_create)
    return {'pattern': 'asyncpg (connection reuse)', **stats}

def lambda_handler(event, context):
    results = []
    results.append(bench_psycopg2())
    # asyncio.run creates (and tears down) a fresh event loop per benchmark;
    # asyncio.get_event_loop() is deprecated on Python 3.12 without a running loop
    results.append(asyncio.run(bench_asyncpg_no_pool()))
    results.append(asyncio.run(bench_asyncpg_reuse()))
    body = {
        'arch': ARCH,
        'runtime': f"python{platform.python_version()}",
        'iterations': ITERATIONS,
        'query': QUERY,
        'endpoint': DSQL_ENDPOINT,
        'results': results,
    }
    print(json.dumps(body, ensure_ascii=False, indent=2))
    return body
