[Update] Amazon S3 Vectors now supports retrieval of up to 10,000 similar search results per query (100x increase over previous limit)

Amazon S3 Vectors has expanded the maximum number of search results retrieved per query from 100 to 10,000, a 100-fold increase. This article introduces the impact this change brings to multi-stage search pipelines and how to implement it.
2026.06.17

This is Ishikawa from the Cloud Business Division. With Amazon S3 Vectors, the maximum number of results retrievable in a single similarity search query has been expanded 100x, from the previous limit of 100 to a maximum of 10,000.

If you build multi-stage search pipelines that rely on downstream processing such as re-ranking and deduplication, this is a welcome update: a single query can now return a much broader candidate set.

https://aws.amazon.com/jp/about-aws/whats-new/2026/06/s3-vectors-supports-10000-search-results-per-query/

What does retrieving up to 10,000 similarity search results per query mean?

With this update, you can now specify up to 10,000 for topK (the number of nearest neighbor vectors to retrieve) in the QueryVectors API.

The main changes are as follows:

  • The maximum number of results per query has been expanded 100x, from 100 to 10,000
  • You can specify up to 10,000 for topK (top-K nearest neighbors) in QueryVectors API requests
  • Query results are returned split across multiple pages, allowing you to immediately process the first page while retrieving subsequent pages as needed
  • Because you can retrieve a broader, more comprehensive candidate set, the results are easier to feed into additional processing such as re-ranking, aggregation, and deduplication

Retrieving more candidates in a single query lets you build "multi-stage search pipelines", which re-rank and filter results after retrieval, more efficiently and with fewer API calls.
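As a minimal sketch of such a request, the helper below wraps a single QueryVectors call. It assumes the boto3 `s3vectors` client and its `query_vectors` parameter names (`vectorBucketName`, `indexName`, `queryVector`, `topK`, `returnDistance`, `returnMetadata`); please verify them against the SDK reference for your version. The function name `query_candidates` and the bucket and index names are hypothetical, and fetching subsequent pages is left out because it depends on the SDK version.

```python
MAX_TOP_K = 10_000  # upper limit for topK stated in this update


def query_candidates(client, bucket, index, embedding, top_k=1_000):
    """Request up to top_k nearest neighbors from an S3 Vectors index.

    `client` is assumed to behave like boto3.client("s3vectors").
    Only the first response is returned; retrieving further pages is
    SDK-version dependent and deliberately omitted from this sketch.
    """
    if not 1 <= top_k <= MAX_TOP_K:
        raise ValueError(f"top_k must be between 1 and {MAX_TOP_K}")
    response = client.query_vectors(
        vectorBucketName=bucket,
        indexName=index,
        queryVector={"float32": list(embedding)},
        topK=top_k,
        returnDistance=True,   # needed for distance-based post-processing
        returnMetadata=True,   # needed for filtering and deduplication
    )
    return response.get("vectors", [])


# Hypothetical usage (requires AWS credentials and an existing index):
#   import boto3
#   client = boto3.client("s3vectors")
#   hits = query_candidates(client, "my-vector-bucket", "my-index",
#                           embedding, top_k=5_000)
```

Validating `top_k` on the client side is a design choice here; it surfaces a mistaken value before a request is sent.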

Impact on pricing

A "data-returned fee" is charged based on the amount of data returned by the query.

  • The first 512 KB of returned data per query is free
  • For data exceeding 512 KB, charges are based on the total size of returned results (at the time of writing, $0.01/GB)
  • Each result is charged at a minimum of 256 bytes, reflecting the cost of lookup and return

Since exact pricing may vary by region and time, please check the S3 pricing page for the latest information.

https://aws.amazon.com/jp/s3/pricing/
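To get a feel for the numbers, here is a rough estimate of the data-returned fee for one query. This is only a sketch under assumptions that are not confirmed by the announcement: the fee applies only to the portion beyond the free 512 KB, 1 KB is 1,024 bytes, 1 GB is 1,024³ bytes, and the rate is the $0.01/GB quoted above. The function names are hypothetical.

```python
FREE_BYTES = 512 * 1024        # free allowance per query (assumes 1 KB = 1,024 bytes)
MIN_BYTES_PER_RESULT = 256     # minimum billed size per result
PRICE_PER_GB = 0.01            # USD, rate at the time of writing
BYTES_PER_GB = 1024 ** 3       # assumption: binary gigabyte


def billable_bytes(result_sizes):
    """Bytes beyond the free allowance, applying the per-result minimum."""
    total = sum(max(size, MIN_BYTES_PER_RESULT) for size in result_sizes)
    return max(0, total - FREE_BYTES)


def data_returned_fee(result_sizes):
    """Estimated data-returned fee in USD for a single query."""
    return billable_bytes(result_sizes) / BYTES_PER_GB * PRICE_PER_GB


# 100 small results stay inside the free allowance,
# while 10,000 results of 100 bytes each are billed at 256 bytes apiece.
small_query = billable_bytes([100] * 100)        # 0
large_query = billable_bytes([100] * 10_000)     # 2,035,712
```

Under these assumptions, even a 10,000-result query with small payloads costs a tiny fraction of a cent, so the fee mainly matters at high query volumes or with large metadata.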

What are the benefits of expanding top-k to 10,000?

While vector search (approximate nearest neighbor search: ANN) is fast, it is ultimately an "approximation," and the most relevant results are not necessarily contained within the top 100. When the topK limit was 100, that limit tended to become a ceiling on search quality. Being able to retrieve up to 10,000 results helps multi-stage search pipelines in the following ways:

  1. More candidates are available for re-ranking, which improves final accuracy
  2. Larger candidate sets can be used for hybrid search and for fusing multiple result sets
  3. Enough results remain even after metadata filtering and deduplication
  4. Use cases that require returning many results, such as aggregation, recommendations, and exhaustive search, are supported directly
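Two of these downstream steps, deduplication and fusion of multiple result sets, can be sketched as follows. The sketch assumes each hit is a dict with `key`, `distance`, and `metadata` entries; the metadata field `doc_id` in the usage comment is hypothetical. Reciprocal rank fusion is used here as one common fusion technique, not as something S3 Vectors provides.

```python
from collections import defaultdict


def dedupe_by(hits, field):
    """Keep only the closest hit per metadata[field] value.

    Hits without that field are treated as their own group.
    """
    best = {}
    for hit in hits:
        group = hit.get("metadata", {}).get(field, ("__key__", hit["key"]))
        if group not in best or hit["distance"] < best[group]["distance"]:
            best[group] = hit
    return sorted(best.values(), key=lambda h: h["distance"])


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of keys; each list adds 1 / (k + rank) per key."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, key in enumerate(ranking, start=1):
            scores[key] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical usage: collapse chunks of the same document, then fuse the
# vector ranking with a keyword ranking produced elsewhere.
#   unique_hits = dedupe_by(hits, "doc_id")
#   fused = reciprocal_rank_fusion([[h["key"] for h in unique_hits],
#                                   keyword_ranking])
```

Because deduplication can discard a large share of the candidates, it is one of the steps that benefits most from a larger topK.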

Note that the larger topK is, the greater the returned data volume, latency, and cost. It is important to identify the number of candidates that is necessary and sufficient for your use case, based on the principle of "cast a wide net, then narrow down intelligently in downstream stages."

Because large result sets are returned page by page, you can process the first page while retrieving subsequent pages. The specific implementation method for retrieving all pages varies depending on the SDK version you are using, so please follow the latest SDK reference.
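The idea of processing pages as they arrive can be sketched independently of the SDK. The function below is a hypothetical helper that accepts any iterable of pages (each page being a list of hits, however your SDK version delivers them) and keeps only the best-scoring hits seen so far, so memory stays small even when 10,000 candidates are streamed through it. The `score` callable stands in for whatever re-ranking function you use.

```python
import heapq


def top_n_over_pages(pages, score, n=10):
    """Consume pages lazily and return the n highest-scoring hits."""
    heap = []   # min-heap of (score, arrival_order, hit)
    seen = 0    # arrival order breaks ties so hits are never compared
    for page in pages:
        for hit in page:
            item = (score(hit), seen, hit)
            seen += 1
            if len(heap) < n:
                heapq.heappush(heap, item)
            elif item[0] > heap[0][0]:
                # Better than the current worst kept hit: swap it in.
                heapq.heapreplace(heap, item)
    # Best score first; earlier arrival wins ties.
    return [hit for _, _, hit in sorted(heap, key=lambda t: (-t[0], t[1]))]


# Hypothetical usage: prefer smaller distances while pages are still arriving.
#   best = top_n_over_pages(page_iterator, lambda h: -h["distance"], n=20)
```

Passing a generator as `pages` means the first page is scored before the next one is requested, which matches the "process while retrieving" pattern described above.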

Notes on usage

  • Updating to the latest AWS SDK is required to use this feature
  • Increasing the number of results also increases the amount of returned data, and any amount exceeding the 512 KB free tier will be subject to data-returned fees. Setting an appropriate topK is key to cost optimization
  • S3 Vectors is a service optimized for infrequent queries. For workloads requiring high QPS and low latency, consider integrating with Amazon OpenSearch Service

Conclusion

The per-query result retrieval limit for Amazon S3 Vectors has been expanded 100x, from the previous 100 to a maximum of 10,000. You can specify up to 10,000 for topK in the QueryVectors API, and results are returned in paginated form. Pricing is based on the amount of returned data, with the first 512 KB free.

If you are building multi-stage search pipelines or RAG applications that involve re-ranking and deduplication, why not update to the latest AWS SDK and consider improving search accuracy by leveraging a broader candidate set?
