
Bedrock Knowledge Bases Retrieve API Does Not Work with OpenSearch Serverless NextGen — Cause and Workaround
Introduction
While exploring cost reduction using the scale-to-zero feature of OpenSearch Serverless NextGen, I fell into an unexpected pitfall.
The issue: when a NextGen collection is used as the vector store for Bedrock Knowledge Bases, syncing (ingesting) data succeeds, but calling the Retrieve API (search) returns a 403 error.
To get straight to the conclusion: at the time of writing, Bedrock KB's Retrieve API is not compatible with OpenSearch Serverless NextGen. The problem is hard to notice because ingestion works and only retrieval is broken.

In this article, I'll share the background of how I noticed this problem, my investigation into the cause, and the workaround I ultimately adopted (direct querying via opensearch-py).
Setup Flow — How Far Things Go Smoothly
Here's a summary of the steps to use a NextGen collection as a vector store for Bedrock KB. Since things proceed smoothly up to a point, the problem is even harder to notice.
1. Creating an OpenSearch Serverless NextGen Collection
In the OpenSearch Serverless console, create a collection group.
- Collection type: Vector search
- Collection creation method: Express create


Create a collection within the collection group. In NextGen, collections are placed under a "collection group" — this differs from Classic.

2. Manually Creating a Vector Index
Once the NextGen collection is created, go to the collection detail screen and select Indexes > Create index.
Switch to JSON mode and paste the following.
{
  "settings": {
    "index": {
      "knn": true
    }
  },
  "mappings": {
    "properties": {
      "bedrock-knowledge-base-default-vector": {
        "type": "knn_vector",
        "dimension": 1024,
        "method": {
          "name": "hnsw"
        }
      },
      "AMAZON_BEDROCK_METADATA": {
        "type": "text",
        "index": false
      },
      "AMAZON_BEDROCK_TEXT_CHUNK": {
        "type": "text",
        "index": true
      }
    }
  }
}
The index name can be anything, but make note of it since you'll need to specify it in Bedrock KB later (e.g., bedrock-kb-index).

Note: Adjust dimension to match the embedding model you use (1024 for Titan Embeddings V2). Also, specifying the engine parameter in method will cause an error in OpenSearch Serverless. "name": "hnsw" alone is sufficient.
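The same mapping can also be built in code, which makes it easier to keep the dimension in sync with the embedding model. The following is a minimal sketch, not the procedure used in this article: `build_index_body` and `create_vector_index` are hypothetical helper names, and the sketch assumes an already authenticated opensearch-py client whose role is allowed to create indexes in the collection.

```python
def build_index_body(dimension: int = 1024) -> dict:
    """Return the settings/mappings from step 2 for a given embedding dimension."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "bedrock-knowledge-base-default-vector": {
                    "type": "knn_vector",
                    "dimension": dimension,
                    # Only the method name is set; the article reports that
                    # adding "engine" causes an error.
                    "method": {"name": "hnsw"},
                },
                "AMAZON_BEDROCK_METADATA": {"type": "text", "index": False},
                "AMAZON_BEDROCK_TEXT_CHUNK": {"type": "text", "index": True},
            }
        },
    }


def create_vector_index(client, index_name: str, dimension: int = 1024) -> dict:
    """Create the index through an opensearch-py client (assumed authenticated)."""
    return client.indices.create(index=index_name, body=build_index_body(dimension))
```

Calling `build_index_body(1024)` yields the same JSON as the console example, so either route should produce an index that Bedrock KB can be pointed at.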
3. Connecting from Bedrock Knowledge Base Using "Use Existing OpenSearch Collection"
When creating a knowledge base in the Bedrock KB console, on the vector store selection screen choose "Use existing OpenSearch Serverless collection", and specify the ARN of the NextGen collection you just created along with the index name you created in step 2. For field mappings, specify the field names exactly as used when creating the index.
- Configure the data source (S3 bucket, etc.)
- Embedding model and vector store
  - Embedding model: Select the same one as your existing KB (its output dimension must match the index created in Step 2)
  - Vector store: "Use existing vector store"
    - Select OpenSearch Serverless and enter the following
      - Collection ARN: The one you noted in Step 1
      - Vector index name: the name created in Step 2 (bedrock-kb-index in the example above)
      - Vector field name: bedrock-knowledge-base-default-vector
      - Text field name: AMAZON_BEDROCK_TEXT_CHUNK
      - Metadata field name: AMAZON_BEDROCK_METADATA
- Review and create

4. Data Source Sync (Ingest) — This Succeeds
Add an S3 data source and run a sync. Chunking, vectorization, and index registration all complete successfully.
Checking the index from the OpenSearch dashboard, you can confirm that documents are actually registered under the name bedrock-kb-*-index. The fields include Bedrock KB standard fields such as AMAZON_BEDROCK_TEXT_CHUNK and bedrock-knowledge-base-default-vector.
5. Retrieve API (Search) — This Fails
Since everything had gone smoothly so far, you'd expect search to work too — but calling Bedrock KB's Retrieve API returns the following error.
Request failed: [security_exception] Request Content Checksum Verification Failed

The same error occurs in Bedrock KB's Playground (test search feature). The data is definitely there, but search alone doesn't work.
Cause: SigV4 Signature Content Checksum Mismatch
After investigation, the root cause turned out to be a mismatch between the SigV4 signature made by Bedrock KB's internal client and what OpenSearch Serverless NextGen expects.
Specifically, there is a discrepancy in how the x-amz-content-sha256 header (the SHA256 hash of the request body) is computed for POST requests, and the NextGen side rejects it during checksum verification.
Helpful references:
- Next-generation Amazon OpenSearch Serverless is now generally available with zero standby costs — trying out vector search — Investigation into the combination of Bedrock KB and OpenSearch Serverless
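To make the x-amz-content-sha256 header concrete: it carries the hex-encoded SHA-256 digest of the exact bytes sent as the request body. The sketch below only illustrates why a server-side checksum check fails when the digest was computed over different bytes than the ones transmitted. It does not claim to reproduce what Bedrock KB's internal client does, and the query shown is a made-up example.

```python
import hashlib
import json


def content_sha256(body: bytes) -> str:
    """Hex-encoded SHA-256 of the raw request body, as carried in x-amz-content-sha256."""
    return hashlib.sha256(body).hexdigest()


# The same logical query serialized two different ways (hypothetical example).
query = {"size": 5, "query": {"match_all": {}}}
compact = json.dumps(query, separators=(",", ":")).encode("utf-8")
spaced = json.dumps(query).encode("utf-8")

# The bytes differ, so the digests differ, even though the JSON is equivalent.
print(content_sha256(compact) == content_sha256(spaced))  # False
```

Any step that re-serializes or otherwise alters the body after the digest was computed would produce this kind of mismatch.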
Why Does Ingestion Work?
This is speculation, but ingestion processing goes through a different internal path on the Bedrock KB side (likely a separate client for batch processing), where SigV4 signing is done correctly. On the other hand, the Retrieve API accesses OpenSearch via a separate real-time search client, and that client's SigV4 signing implementation doesn't match NextGen's verification.
Since this issue stems from AWS's internal implementation, it may be resolved in a future AWS update.
Workaround: Direct Querying via opensearch-py for Retrieval Only
Since Bedrock KB's Retrieve API doesn't work with NextGen, the workaround I adopted is to query OpenSearch directly for the search part only.
The key point is continuing to use Bedrock KB as-is for ingestion (data sync). Since only the Retrieve API is broken, only the retrieval part needs to change. There's no need to re-ingest documents; you can search the data that Bedrock KB already put into the NextGen collection as-is.
Architecture Change
Ingestion (no change):
  S3 → Bedrock KB Ingest → OpenSearch NextGen   ← This still works
Retrieval (only this changes):
  Before: App → Bedrock KB Retrieve API → OpenSearch NextGen   ← ❌ Broken
  After:  App → [embed query (Titan V2)] → opensearch-py (SigV4 / aoss) → OpenSearch NextGen   ← ✅
You reproduce in your own application what Bedrock KB's Retrieve API was doing internally (embed the query → KNN search → return results). There's no need to re-embed all documents or recreate indexes — you only need to vectorize each search query once with Titan V2.
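The three steps (embed the query, run a KNN search, return results) can be sketched as one function. This is an illustrative outline under stated assumptions: `retrieve` is a hypothetical name, the embedding function and the search client are passed in as arguments, and the field name is the Bedrock KB default used throughout this article.

```python
from typing import Any, Callable


def retrieve(
    query: str,
    embed: Callable[[str], list[float]],
    search_client: Any,
    index_name: str,
    k: int = 5,
) -> list[dict]:
    """Embed the query, run a k-NN search, and return the raw hits."""
    vector = embed(query)  # step 1: vectorize the query text
    body = {
        "size": k,
        "query": {
            "knn": {
                "bedrock-knowledge-base-default-vector": {"vector": vector, "k": k}
            }
        },
    }
    # step 2: k-NN search against the index that Bedrock KB populated
    response = search_client.search(index=index_name, body=body)
    # step 3: hand the hits back to the caller
    return response["hits"]["hits"]
```

Passing the embedder and client as parameters keeps the function easy to test with stand-ins before it is wired to Bedrock and OpenSearch.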
Implementation Details
1. Embedding the Search Query
Call the same model that Bedrock KB was using internally (Amazon Titan Embeddings V2, 1024 dimensions) directly via the bedrock-runtime Invoke Model API. Vectorize the query text on each search (one API call, taking tens of milliseconds).
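import json

import boto3

# Named "bedrock" so it does not clash with the OpenSearch client created in the next step
bedrock = boto3.client("bedrock-runtime", region_name="ap-northeast-1")


def embed_text(text: str) -> list[float]:
    """Return the Titan Embeddings V2 vector (1024 dimensions) for the given text."""
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",
        body=json.dumps({
            "inputText": text,
            "dimensions": 1024,
            "normalize": True,
        }),
    )
    # The response body is a stream; the vector is under the "embedding" key
    return json.loads(response["body"].read())["embedding"]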
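Because the query vector must have the same dimension as the knn_vector field, it can help to validate both the request and the response early. The following is a small sketch with hypothetical helper names (`titan_v2_body`, `check_dimension`); the set of allowed sizes reflects the output dimensions documented for Titan Text Embeddings V2.

```python
import json

# Output sizes documented for Titan Text Embeddings V2.
TITAN_V2_DIMENSIONS = {256, 512, 1024}


def titan_v2_body(text: str, dimensions: int = 1024, normalize: bool = True) -> str:
    """Build the JSON request body for amazon.titan-embed-text-v2:0."""
    if dimensions not in TITAN_V2_DIMENSIONS:
        raise ValueError(f"unsupported dimensions: {dimensions}")
    if not text.strip():
        raise ValueError("query text must not be empty")
    return json.dumps(
        {"inputText": text, "dimensions": dimensions, "normalize": normalize}
    )


def check_dimension(embedding: list[float], index_dimension: int = 1024) -> list[float]:
    """Fail early if the vector would not fit the knn_vector field."""
    if len(embedding) != index_dimension:
        raise ValueError(
            f"expected {index_dimension} values, got {len(embedding)}"
        )
    return embedding
```

A mismatch caught here is easier to diagnose than a search request that is rejected later.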
2. OpenSearch Connection (SigV4 Authentication)
Use opensearch-py's AWSV4SignerAuth with the service name aoss for signing.
import boto3
from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth

# Sign requests with the caller's AWS credentials; the service name must be "aoss"
credentials = boto3.Session().get_credentials()
auth = AWSV4SignerAuth(credentials, "ap-northeast-1", "aoss")

client = OpenSearch(
    hosts=[{"host": "<collection-id>.aoss.ap-northeast-1.on.aws", "port": 443}],
    http_auth=auth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection,
)
The key point here is that opensearch-py's SigV4 signing works correctly with NextGen. Only Bedrock KB's internal client signing is broken, so signing correctly with SigV4 yourself solves the problem.
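One small detail when filling in `hosts`: the entry takes a bare hostname, not a URL. If the collection endpoint is copied as a full URL (an assumption about how it is obtained), a helper like the hypothetical `host_from_endpoint` below can strip the scheme and path. The endpoint in the usage example is a made-up value.

```python
from urllib.parse import urlparse


def host_from_endpoint(endpoint: str) -> str:
    """Return the bare hostname from a collection endpoint, with or without a scheme."""
    candidate = endpoint if "://" in endpoint else f"https://{endpoint}"
    host = urlparse(candidate).hostname
    if not host:
        raise ValueError(f"cannot parse endpoint: {endpoint!r}")
    return host
```

For example, `host_from_endpoint("https://abc123.aoss.ap-northeast-1.on.aws")` returns `abc123.aoss.ap-northeast-1.on.aws`, which can be passed as the `host` value.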
3. KNN Search (Using the Existing Index's Field Names as-is)
The index that Bedrock KB populated during ingestion uses Bedrock KB's standard field names. When querying directly, specify these field names as-is.
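# Vectorize the query with embed_text() from step 1
query_embedding = embed_text("your search query")

search_body = {
    "size": 5,
    "_source": [
        "AMAZON_BEDROCK_TEXT_CHUNK",
        "AMAZON_BEDROCK_METADATA",
        "x-amz-bedrock-kb-source-uri",
    ],
    "query": {
        "knn": {
            "bedrock-knowledge-base-default-vector": {
                "vector": query_embedding,
                "k": 5,
            }
        }
    },
}
# Replace the placeholder with the index name specified in Bedrock KB
response = client.search(index="bedrock-kb-<your>-index", body=search_body)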
| Bedrock KB Field Name | Purpose |
|---|---|
| bedrock-knowledge-base-default-vector | Vector embedding (KNN search target) |
| AMAZON_BEDROCK_TEXT_CHUNK | Chunk text content |
| AMAZON_BEDROCK_METADATA | Metadata |
| x-amz-bedrock-kb-source-uri | Source file's S3 URI |
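Once the search returns, the fields above can be flattened into records that are convenient for the application. The following is a sketch with a hypothetical `parse_hits` helper. It assumes the standard OpenSearch response shape (`hits.hits[]._source`) and that AMAZON_BEDROCK_METADATA holds a JSON string; if it does not, the raw value is kept.

```python
import json


def parse_hits(response: dict) -> list[dict]:
    """Flatten an OpenSearch search response into text/score/source records."""
    results = []
    for hit in response.get("hits", {}).get("hits", []):
        source = hit.get("_source", {})
        raw_meta = source.get("AMAZON_BEDROCK_METADATA")
        try:
            # Assumption: the metadata field holds a JSON string.
            metadata = json.loads(raw_meta) if isinstance(raw_meta, str) else raw_meta
        except json.JSONDecodeError:
            metadata = raw_meta  # keep the raw value if it is not valid JSON
        results.append(
            {
                "text": source.get("AMAZON_BEDROCK_TEXT_CHUNK", ""),
                "score": hit.get("_score"),
                "source_uri": source.get("x-amz-bedrock-kb-source-uri"),
                "metadata": metadata,
            }
        )
    return results
```

The returned list keeps the order of the search response, so the first record is the highest-scoring chunk.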
Pitfall: OpenSearch Serverless Two-Layer Authentication
After implementing direct querying, the first test returned 403 Forbidden. I had already added the EC2 role to the OpenSearch data access policy, so I was confused about why I was still getting a 403.
The cause was that OpenSearch Serverless requires two layers of authentication. Without configuring both, you get a 403.
| Layer | Where to Configure | Required Permissions |
|---|---|---|
| IAM policy | Attached to IAM role (EC2, etc.) | aoss:APIAccessAll |
| Data access policy | OpenSearch Serverless console | aoss:ReadDocument, aoss:DescribeIndex, etc. |
Layer 1: IAM Policy
Attach the following inline policy to the EC2 instance role (or the IAM role used by your application).
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "aoss:APIAccessAll",
      "Resource": "arn:aws:aoss:ap-northeast-1:<account-id>:collection/*"
    }
  ]
}
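If you prefer to attach this policy from a script instead of the console, a sketch along the following lines may work. The helper names and the policy name `aoss-api-access` are hypothetical, and the caller is assumed to have permission to call `iam:PutRolePolicy` on the target role.

```python
import json


def aoss_api_access_policy(region: str, account_id: str) -> str:
    """Return the inline IAM policy document shown above as a JSON string."""
    return json.dumps(
        {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Action": "aoss:APIAccessAll",
                    "Resource": f"arn:aws:aoss:{region}:{account_id}:collection/*",
                }
            ],
        }
    )


def attach_policy(role_name: str, region: str, account_id: str) -> None:
    """Attach the policy inline to the given role."""
    import boto3  # imported here so the builder above works without boto3

    boto3.client("iam").put_role_policy(
        RoleName=role_name,
        PolicyName="aoss-api-access",  # hypothetical policy name
        PolicyDocument=aoss_api_access_policy(region, account_id),
    )
```

The builder is kept separate from the API call so the generated document can be reviewed before anything is changed in IAM.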
Layer 2: Data Access Policy
In the OpenSearch Serverless console, configure the data access policy for the collection.
[
  {
    "Rules": [
      {
        "ResourceType": "index",
        "Resource": ["index/<collection-name>/*"],
        "Permission": [
          "aoss:ReadDocument",
          "aoss:DescribeIndex"
        ]
      }
    ],
    "Principal": [
      "arn:aws:iam::<account-id>:role/<your-ec2-role>"
    ]
  }
]
Note: When you create a knowledge base in Bedrock KB, a data access policy for Bedrock is automatically configured — but that's for the Bedrock KB service role. When querying directly from your application, you need to separately add your application's IAM role as a Principal.
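The data access policy can likewise be generated and registered from code. This is a minimal sketch: `data_access_policy` and `create_read_policy` are hypothetical names, the policy is created as a new, separate policy rather than merged into the one Bedrock set up, and it is an assumption that NextGen collections accept data access policies created through the `opensearchserverless` API in the same way as the console.

```python
import json


def data_access_policy(collection_name: str, principal_arns: list[str]) -> str:
    """Return a read-only data access policy for all indexes in a collection."""
    return json.dumps(
        [
            {
                "Rules": [
                    {
                        "ResourceType": "index",
                        "Resource": [f"index/{collection_name}/*"],
                        "Permission": ["aoss:ReadDocument", "aoss:DescribeIndex"],
                    }
                ],
                "Principal": principal_arns,
            }
        ]
    )


def create_read_policy(
    policy_name: str, collection_name: str, principal_arns: list[str]
) -> dict:
    """Register the policy with OpenSearch Serverless as a data access policy."""
    import boto3  # imported here so the builder above works without boto3

    return boto3.client("opensearchserverless").create_access_policy(
        name=policy_name,
        type="data",
        policy=data_access_policy(collection_name, principal_arns),
    )
```

Granting only `aoss:ReadDocument` and `aoss:DescribeIndex` keeps the application role read-only, which is all the retrieval path needs.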
Debugging Tips
When a 403 is returned, you can get details from the opensearch-py error object.
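import logging
logger = logging.getLogger(__name__)
try:
    # index_name and search_body are the values used for the KNN search
    client.search(index=index_name, body=search_body)
except Exception as e:
    detail = getattr(e, "info", None) or getattr(e, "body", None)
    logger.error("OpenSearch failed: %s | detail=%s", e, detail)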
This lets you retrieve details like {'status': 403, 'error': {'reason': '403 Forbidden', 'type': 'Forbidden'}}. Unfortunately, OpenSearch Serverless 403 errors don't distinguish between IAM-side issues and data access policy issues, so check both.
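Since a 403 does not say which layer rejected the request, it can be handy to attach a reminder of both checks to the logged error. The sketch below uses a hypothetical `summarize_error` helper. It reads `status_code` and `info` with `getattr`, so it works with opensearch-py exceptions that expose those attributes as well as with plain exceptions.

```python
# Both layers described in this article; a 403 can come from either one.
CHECKLIST = (
    "IAM policy on the caller's role allows aoss:APIAccessAll",
    "Data access policy lists the caller's role as a Principal",
)


def summarize_error(exc: Exception) -> dict:
    """Collect status, detail, and (for a 403) the two layers to verify."""
    status = getattr(exc, "status_code", None)
    detail = getattr(exc, "info", None) or getattr(exc, "body", None)
    summary = {"status": status, "detail": detail, "checks": []}
    if status == 403:
        summary["checks"] = list(CHECKLIST)
    return summary
```

Logging the returned dictionary in the `except` block puts the two things to verify next to the error itself.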
Summary
| Item | Status |
|---|---|
| Bedrock KB → NextGen ingestion | ✅ Works |
| Bedrock KB → NextGen retrieval | ❌ Fails with SigV4 checksum error |
| opensearch-py → NextGen direct query | ✅ Works |
Bedrock Knowledge Bases' Retrieve API does not currently work with OpenSearch Serverless NextGen. Since data sync (ingestion) works normally, you end up in a confusing state where "data went in but can't be searched."
The key point of the workaround is that only the retrieval part needs to change.
- Ingestion: Keep using Bedrock KB as-is (works normally with NextGen too)
- Retrieval: Vectorize the search query with Titan V2 (via bedrock-runtime), then run a KNN search via opensearch-py
Since the index and field names created by Bedrock KB are used as-is, no re-ingestion of data is required. You only need to add query vectorization (a few tens of milliseconds per query) to take full advantage of the existing data.
Note that when querying directly from opensearch-py, you need to configure OpenSearch Serverless's two-layer authentication (IAM policy + data access policy). Either one alone will result in a 403, so be careful.
This issue may be resolved in the future with an AWS update, but as of the current date (June 1, 2026), direct retrieval querying is necessary when using Bedrock KB with NextGen.
