I tried feeding DevelopersIO blog articles into Cloudflare AI Search to see if I could search them using natural language

2026.06.01

This is Konishi from the Berlin office.
Cloudflare has released a managed search service called AI Search (currently in open beta). When you pass content to it, it automatically handles chunking, vectorization, and index construction, making it searchable in natural language.
Our company blog, DevelopersIO, manages approximately 60,000 articles in Contentful. This time, I fed a portion of these into AI Search. Here is a summary of what I tested, from setup to Japanese search accuracy.
What is Cloudflare AI Search

It is a managed service that, when you upload content or have it crawl a site, handles chunking, vectorization, and index construction for you.
Key features:
Data sources come in 3 types: website crawling, R2 buckets, and direct upload via the Items API
Hybrid search support: runs vector (semantic) search and BM25 keyword search in parallel and merges the results
Search interfaces: can be called from Wrangler CLI, REST API, Workers binding, and MCP server
Pricing: free within limits during the beta period (Workers AI and AI Gateway usage is billed separately)
What I Did This Time

I retrieved articles from the DevIO backend, fed them into AI Search, and made them searchable in natural language.
Tools used:
Cloudflare account (Free plan)
Wrangler CLI
Node.js script (for data retrieval and upload)
※ Details of the Node.js script are omitted this time.

Setting Up the AI Search Instance

First, create an instance with the Wrangler CLI. You can also create one from the dashboard or the REST API.
$ wrangler ai-search create devio-search
You will be asked a few questions interactively. Since I upload data pulled from the CMS directly from my local machine via the Items API, I selected Builtin (Cloudflare-managed storage) as the source type.
✔ Select the source type: › Builtin
✔ Configure custom metadata fields? (optional) … no
Creating AI Search instance "devio-search" in namespace "konishi-test"...
Successfully created AI Search instance "devio-search"
  Name:       devio-search
  Namespace:  konishi-test
  Type:       builtin
  Source:     -
  Model:
  Embedding:  @cf/qwen/qwen3-embedding-0.6b
The embedding model @cf/qwen/qwen3-embedding-0.6b was automatically assigned.
Check the status with stats.
$ wrangler ai-search stats devio-search --namespace konishi-test
┌────────┬────────────┬─────────┬─────────┬──────────┬────────┐
│ Queued │ Processing │ Indexed │ Skipped │ Outdated │ Errors │
├────────┼────────────┼─────────┼─────────┼──────────┼────────┤
│ 0      │ 0          │ 0       │ 0       │ 0        │ 0      │
└────────┴────────────┴─────────┴─────────┴──────────┴────────┘
Everything is zero right after creation. Pass the namespace name specified during instance creation to --namespace.
About Metadata

In AI Search, you can attach metadata to indexed documents. Metadata can be used for filtering results and adjusting rankings.

Default Metadata

The following three fields are automatically attached.

| Field | Content |
| --- | --- |
| `filename` | File name |
| `folder` | Folder path |
| `timestamp` | Last updated datetime |

Custom Metadata
In addition to the above, you can define up to 5 custom fields per instance. The supported types are text (up to 500 characters), number, boolean, and datetime (ISO 8601).
These can also be used for weighting during search.
This time, I added 3 fields to match the CMS article data: slug, first-published-at, and author.

You can define them interactively during instance creation, add them later through the REST API's update operation, or add them from the dashboard.
custom_metadata:
  - { field_name: slug, data_type: text }
  - { field_name: first-published-at, data_type: datetime }
  - { field_name: author, data_type: text }
With these fields in place, a search can:
Return the slug of a matched article
Filter by "author": "konishi-ryo"
Narrow down by a range of first-published-at
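Before uploading in bulk, it can help to check each article's metadata against the limits described above (up to 5 custom fields, text up to 500 characters, ISO 8601 datetimes). The following is a minimal local sketch, not part of AI Search itself; `SCHEMA` and `validate_metadata` are hypothetical names introduced only for illustration.

```python
from datetime import datetime

# Limits quoted in this article: up to 5 custom fields, text up to 500 characters.
MAX_CUSTOM_FIELDS = 5
MAX_TEXT_LENGTH = 500

# Hypothetical local mirror of the instance's custom metadata definition.
SCHEMA = {
    "slug": "text",
    "first-published-at": "datetime",
    "author": "text",
}


def validate_metadata(metadata, schema=SCHEMA):
    """Return a list of problems found in one article's metadata (empty if none)."""
    problems = []
    if len(schema) > MAX_CUSTOM_FIELDS:
        problems.append(f"schema defines more than {MAX_CUSTOM_FIELDS} fields")
    for name, value in metadata.items():
        data_type = schema.get(name)
        if data_type is None:
            problems.append(f"{name}: not defined on the instance")
        elif data_type == "text":
            if not isinstance(value, str) or len(value) > MAX_TEXT_LENGTH:
                problems.append(f"{name}: must be a string of at most {MAX_TEXT_LENGTH} characters")
        elif data_type == "number":
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                problems.append(f"{name}: must be a number")
        elif data_type == "boolean":
            if not isinstance(value, bool):
                problems.append(f"{name}: must be true or false")
        elif data_type == "datetime":
            try:
                # Older Python versions reject a trailing "Z" in fromisoformat().
                datetime.fromisoformat(str(value).replace("Z", "+00:00"))
            except ValueError:
                problems.append(f"{name}: must be an ISO 8601 datetime")
    return problems
```

An empty list means the metadata fits the local schema; whether the API applies exactly the same checks is an assumption.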
Uploading Article Data

I used the Items API to upload article data to AI Search. The CLI has no upload command, so I called the REST API directly.
A wide range of file formats are supported; plain text formats (.md, .txt, .log, etc.) are indexed as-is, while rich formats (.pdf, .html, .docx, .xlsx, .csv, etc.) are automatically converted to Markdown by Cloudflare before being indexed.
DevelopersIO's backend stores articles in Markdown, which is convenient.
First, I download the following article (title and body) from the CMS in Markdown format, and then upload it directly to Cloudflare.
https://dev.classmethod.jp/articles/n26-aws-summit/
After downloading the MD file, uploading to Cloudflare looks something like this:
$ curl -X POST "https://api.cloudflare.com/client/v4/accounts/<ACCOUNT_ID>/ai-search/namespaces/konishi-test/instances/devio-search/items" \
  -H "Authorization: Bearer <API_TOKEN>" \
  -F "file=@article.md" \
  -F 'metadata={"slug":"n26-chargeback-ai","first-published-at":"2026-05-28T12:00:00Z","author":"konishi-ryo"}'
Note: The API token requires both AI Search: Edit and AI Search: Run permissions. Create one from API Tokens in the dashboard.
Response:
{
  "success": true,
  "result": {
    "id": "928ad9f2e3ed4535a81f7377d3731aa6",
    "key": "article.md",
    "status": "queued",
    "next_action": "INDEX",
    "namespace": "konishi-test",
    "source_id": "builtin",
    "created_at": "2026-05-29 15:24:09"
  }
}
The upload succeeded with status: "queued". Index processing starts automatically. Looking at the stats, you can see it has entered Processing.
$ wrangler ai-search stats devio-search --namespace konishi-test
┌────────┬────────────┬─────────┬─────────┬──────────┬────────┐
│ Queued │ Processing │ Indexed │ Skipped │ Outdated │ Errors │
├────────┼────────────┼─────────┼─────────┼──────────┼────────┤
│ 0      │ 1          │ 0       │ 0       │ 0        │ 0      │
└────────┴────────────┴─────────┴─────────┴──────────┴────────┘
After waiting a little, you can see it has been indexed.
┌────────┬────────────┬─────────┬─────────┬──────────┬────────┐
│ Queued │ Processing │ Indexed │ Skipped │ Outdated │ Errors │
├────────┼────────────┼─────────┼─────────┼──────────┼────────┤
│ 0      │ 0          │ 1       │ 0       │ 0        │ 0      │
└────────┴────────────┴─────────┴─────────┴──────────┴────────┘
Directly uploaded files also appear in the dashboard.
The article was split into 5 chunks and indexed (the chunking parameters can be changed, as described later). The dashboard showed a file size of 14.59 kB and the status "Indexed", so index processing completed successfully.

The custom metadata was visible there as well.
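For more than a handful of articles, the curl call above is easier to drive from a script. The sketch below builds the same multipart POST with Python's standard library. The URL pattern and form field names are taken from the curl example, while `build_upload_request`, the placeholder credentials, and the `text/markdown` content type are assumptions made for illustration.

```python
import json
import uuid
import urllib.request

API_BASE = "https://api.cloudflare.com/client/v4"


def build_upload_request(account_id, namespace, instance, api_token,
                         filename, content, metadata):
    """Build (but do not send) a multipart POST mirroring the curl call above."""
    url = (f"{API_BASE}/accounts/{account_id}/ai-search/namespaces/"
           f"{namespace}/instances/{instance}/items")
    boundary = uuid.uuid4().hex
    parts = [
        # "file" part: the Markdown body (content type is an assumption).
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        'Content-Type: text/markdown\r\n\r\n'
        f'{content}\r\n',
        # "metadata" part: custom metadata serialized as a JSON string.
        f'--{boundary}\r\n'
        'Content-Disposition: form-data; name="metadata"\r\n\r\n'
        f'{json.dumps(metadata)}\r\n',
        # Closing boundary.
        f'--{boundary}--\r\n',
    ]
    body = "".join(parts).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Placeholder account ID and token; replace with real values before sending.
request = build_upload_request(
    "ACCOUNT_ID", "konishi-test", "devio-search", "API_TOKEN",
    "article.md", "# Title\n\nBody text",
    {"slug": "n26-chargeback-ai", "author": "konishi-ryo"},
)
# To actually upload, pass the request to urllib.request.urlopen(request).
```

Keeping request construction separate from sending makes it easy to inspect or log each request before anything goes over the network.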
Trying a Search

Once indexing is complete, let's try searching, starting with the CLI.
$ wrangler ai-search search devio-search --namespace konishi-test --query "N26 決済トラブル AI"
Search query: "N26 決済トラブル AI"  (6 results)

┌───┬────────┬───────────────────────────┬───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┬──────┐
│ # │ score  │ key                       │ text                                                                                                                      │ type │
├───┼────────┼───────────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┼──────┤
│ 1 │ 0.7265 │ 5q7lkIqA71nFKRxjHxNl8t.md │ # ドイツのネット銀行が、AIをUIに「組み込まずに」決済トラブルSLAを48→85%に改善した話 - AWS Summit Hamburg 2026 レポート... │ text │
├───┼────────┼───────────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┼──────┤
│ 2 │ 0.5867 │ 5q7lkIqA71nFKRxjHxNl8t.md │ ## 安全性の担保                                                                                                           │ text │
│   │        │                           │                                                                                                                           │      │
│   │        │                           │ 規制業界のため安全性は3つのレイヤーで担保しています。                                                                     │      │
│   │        │                           │                                                                                                                           │      │
│   │        │                           │ **AI Guardrails**: 判定リクエストがAIに到達する前にバリデー...                                                            │      │
├───┼────────┼───────────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┼──────┤
│ 3 │ 0.5754 │ 5q7lkIqA71nFKRxjHxNl8t.md │ ## 課題: チャージバック処理がパンクしていた                                                                               │ text │
│   │        │                           │                                                                                                                           │      │
│   │        │                           │ ### チャージバックとは                                                                                                    │      │
│   │        │                           │                                                                                                                           │      │
│   │        │                           │ チャージバック（Chargeback）はカード決済に対する異議申し立て（dis...                                                      │      │
├───┼────────┼───────────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┼──────┤
│ 4 │ 0.5632 │ 5q7lkIqA71nFKRxjHxNl8t.md │ ## 学んだこと                                                                                                             │ text │
│   │        │                           │                                                                                                                           │      │
│   │        │                           │ Alex氏が共有してくれた3つの学びが下記です:                                                                                │      │
│   │        │                           │                                                                                                                           │      │
│   │        │                           │ **Lesson 1: まずインプットを直せ。** 導入前は自由テキスト1つとファイルア...                                               │      │
├───┼────────┼───────────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┼──────┤
│ 5 │ 0.5580 │ 5q7lkIqA71nFKRxjHxNl8t.md │ ## ドメイン理解から始めた                                                                                                 │ text │
│   │        │                           │                                                                                                                           │      │
│   │        │                           │ Alex氏のチームはもともとチャージバックのUI提出フローと一部の自動化ルール（不正利用系）を担当していました。ただ、Autho...  │      │
├───┼────────┼───────────────────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┼──────┤
│ 6 │ 0.4349 │ 1Asus6wtB8Q0HKXEjGclyH.md │ ## 注意点                                                                                                                 │ text │
│   │        │                           │                                                                                                                           │      │
│   │        │                           │ - AI Actionsはフィールド単位で実行されます。テキストの部分選択や、エントリー全体に対する適用は現時点ではできません。      │      │
│   │        │                           │ - 権限管...                                                                                                               │      │
└───┴────────┴───────────────────────────┴───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
Five of the 6 results are chunks of the N26 article in descending score order, which shows that results are returned per chunk rather than per article. The 6th result, with a score of 0.43, comes from a different test article and was apparently pulled in by the term "AI."
The search command has the following options.

| Option | Description |
| --- | --- |
| `--max-num-results` | Maximum number of results |
| `--score-threshold` | Minimum score (0 to 1) |
| `--reranking` | Enable/disable reranking |
| `--filter` | Metadata filter (`key=value`, multiple can be specified) |
| `--json` | Output in JSON format |
Using --filter, you can narrow down by the custom metadata set earlier. For example, to search only articles by a specific author:
$ wrangler ai-search search devio-search --namespace konishi-test --query "AI" --filter author=konishi-ryo
Adding --json returns the response including metadata (such as slug). Since it is returned in score order, the Top-K can be used as-is.
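Because results come back per chunk, the same article can occupy several of the top slots, as in the output above. If you want an article-level Top-K, one option is to keep only the best-scoring chunk per `key`. This is a minimal sketch: the `key` and `score` names follow the CLI table columns, but the exact shape of the JSON output is an assumption.

```python
def top_articles(chunks, k=5):
    """Collapse chunk-level hits into the k best articles, keeping each article's best chunk."""
    best = {}
    for chunk in chunks:
        key = chunk["key"]
        if key not in best or chunk["score"] > best[key]["score"]:
            best[key] = chunk
    ranked = sorted(best.values(), key=lambda c: c["score"], reverse=True)
    return ranked[:k]


# Keys and scores taken from the CLI output above; the dict shape is an assumption.
chunks = [
    {"key": "5q7lkIqA71nFKRxjHxNl8t.md", "score": 0.7265},
    {"key": "5q7lkIqA71nFKRxjHxNl8t.md", "score": 0.5867},
    {"key": "5q7lkIqA71nFKRxjHxNl8t.md", "score": 0.5754},
    {"key": "5q7lkIqA71nFKRxjHxNl8t.md", "score": 0.5632},
    {"key": "5q7lkIqA71nFKRxjHxNl8t.md", "score": 0.5580},
    {"key": "1Asus6wtB8Q0HKXEjGclyH.md", "score": 0.4349},
]
articles = top_articles(chunks)
print([(a["key"], a["score"]) for a in articles])
# [('5q7lkIqA71nFKRxjHxNl8t.md', 0.7265), ('1Asus6wtB8Q0HKXEjGclyH.md', 0.4349)]
```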
Quick Evaluation of Japanese Search Accuracy

Let me do a simple evaluation.

Evaluation Method

I prepared pairs of test queries and correct articles and measured Hit Rate@5, which simply checks whether the correct article appears in the Top-5 search results.

$$
\text{Hit Rate@5} = \frac{\text{Number of queries where the correct answer is in Top-5}}{\text{Total number of queries}}
$$

Test Query Design

The index contained approximately 1,000 articles, and I used 50 test queries. To approximate real search behavior, each query is a keyword-style string of up to 4 words separated by half-width spaces.
Results

Here are the test execution results.
Overall Hit Rate@5: 46/49 (93.9%)

| Query | Expected Article | Rank |
| --- | --- | --- |
| ✅ Terraform RDS log management | terraform-rds-manage-cloudwatch-logs | #1 |
| ✅ CloudFormation Lambda inline code | cloudformation-lambda-inline-code-s3-zip | #1 |
| ✅ Multiple ALB logs Athena query | athena-partition-projection-enum | #1 |
| ✅ PostgreSQL index-only response | postgresql-include-index-only-scan | #2 |
| ✅ pnpm v11 postinstall broken | pnpm-v10-to-v11-migration-docker-ci | #1 |
| ✅ Claude Code Codex cross-review | claude-code-codex-cross-review | #1 |
| ✅ Twilio Studio weather forecast | twilio-weather-sms-bot | #1 |
| ✅ WAF ALB direct CloudFront via difference | tsnote-waf-attach-alb-or-cloudfront | #1 |
| ✅ Sales CLAUDE.md usage | claude-code-claude-md-for-non-engineer-sales | #1 |
| ✅ EC2 forgetting to stop LINE notification | ec2-line-notify-stop-reminder | #1 |
| ✅ LLM conversation memory persistence | mem0-llm-memory | #2 |
| ✅ StackSets IAM user bulk creation | create-handson-iam-users-with-stacksets | #1 |
| ✅ EC2 multiple instances AMI bulk creation | ec2-ami-backup-parameter-store-lambda | #1 |
| ✅ S3 pre-signed URL via VPC endpoint | s3-presigned-url-via-vpc-endpoint | #1 |
| ✅ ECS pause resume manual | amazon-ecs-pause-continue-deployments | #1 |
| ✅ EFS replication impact | tsnote-efs-replication-workload-impact-001 | #1 |
| ✅ Route 53 subdomain Cloud DNS delegation | delegate-zone-from-amazon-route53-to-cloud-dns | #1 |
| ✅ NLB S3 fixed IP | nlb-s3-interface-endpoint-fixed-ip-access | #1 |
| ✅ RDS Multi-AZ ENA Express | rds-multi-az-ena-express-srd | #1 |
| ❌ Incident automation Bedrock | aws-devops-agent-error-investigation | - |
| ✅ RAG wrong answer why paper | why-rag-fails-graph-perspective | #1 |
| ✅ Slack workflow DynamoDB | slack-bolt-workflow | #1 |
| ✅ Google Chat Bot Cloud Functions | google-chat-bot-cloud-functions-python | #3 |
| ✅ Kiro headless Lambda | kiro-cli-headless-lambda-guardduty-triage | #6 |
| ✅ Google Cloud Log Sink copy | gcp-folder-log-sink-dual-storage | #3 |
| ❌ Bedrock Claude boto3 Anthropic SDK migration | migrate-boto3-to-anthropic-sdk-bedrock | - |
| ✅ Strands Agents Bedrock agent-sre | strands-bedrock-agent-sre | #1 |
| ✅ VPC PrivateLink cross-VPC | vpc-private-link-vpc-provider-customer-ks | #1 |
| ✅ SAM CLI Fn::ForEach | sam-cli-1-160-foreach | #1 |
| ✅ aws-vault 1Password Desktop | aws-vault-op-desktop-1password-desktop | #1 |
| ✅ Amazon Quick SAML Microsoft Entra ID | entra-id-amazon-quick-saml-sso | #1 |
| ✅ Regional NAT Gateway TGW cost | regional-nat-gateway-tgw-cost-analysis | #1 |
| ✅ PowerPoint translation automatic | pptx-auto-translation | #1 |
| ✅ SCP Deny exclusion | iam-identity-center-scp-awsreservedsso-role-arn | #7 |
| ✅ CDK BucketDeployment Inspector suppression | suppress-amazon-inspector-findings-for-aws-cdk-bucketdeployment-lambda | #1 |
| ✅ Databricks AI Forecast | databricks-ai-forecast-bi | #1 |
| ✅ In-house RAG accuracy not improving | enterprise-rag-deep-search-herb | #1 |
| ❌ System prompt improvement | try-promptfoo | - |
| ✅ Chronos-2 time series forecasting | dgx-spark-chronos2-plc-sim-llm-maintenance | #2 |
| ✅ Aurora DSQL Toasty usable | toasty-amazon-aurora-dsql | #1 |
| ✅ Snowflake Adaptive Refresh | snowflake-dynamic-tables-adaptive-refresh-mode | #1 |
| ✅ Google Cloud Binary Authorization | binary-authorization-cloud-build-cloud-run | #1 |
| ✅ Cloudflare Rate Limiting workers | cloudflare-rate-limit-experiments | #1 |
| ✅ Cloud Logging BigQuery | cloud-logging-observability-analytics-linked-dataset-bigquery | #1 |
| ✅ Snowflake Fivetran manual setup | snowflake-fivetran-manual-setup-handson | #1 |
| ✅ Redshift RG RA3 benchmark | 20260517-amazon-redshift-rg-vs-ra3 | #1 |
| ✅ Langfuse LLM accuracy | langfuse-experiment-action-ci | #6 |
| ✅ AWS Security Agent penetration | security-agent-owasp-juice-shop | #2 |
| ✅ Twilio voice AI phone | twilio-openai-realtime-ai-call-flyio | #2 |

Of the 49 queries scored in the table, 3 were complete misses. Another 3 returned the correct article but ranked it outside the Top-5 (#6, #6, and #7). Those three are still marked ✅, so the 46/49 figure counts any appearance in the returned results; counting strictly within the Top-5 gives 43/49 (87.8%).
Looking at the 3 misses, the queries were too generic to pinpoint a specific article among the 1,000. For example, "Incident automation Bedrock" was expected to return a DevOps Agent article, but it got buried among the many Bedrock-related articles. "System prompt improvement" was aimed at the promptfoo article, but since multiple articles mention prompt improvement, the article could not be singled out without a more specific term such as the tool name.
The larger the corpus, the more semantically similar articles there are, and the harder it becomes to get a specific article into the Top-5 with generic queries.
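The hit-rate formula from the Evaluation Method section can be recomputed directly from the Rank column of the table above. Tallying that column gives 36 queries at #1, 5 at #2, 2 at #3, 2 at #6, 1 at #7, and 3 misses. The sketch below is only a cross-check of the arithmetic; `hit_rate_at_k` is a hypothetical helper, and the k=10 case assumes the default maximum of 10 returned results.

```python
def hit_rate_at_k(ranks, k):
    """Fraction of queries whose expected article ranked within the top k.

    ranks holds one entry per query: the 1-based rank of the expected article,
    or None when it was not returned at all.
    """
    hits = sum(1 for rank in ranks if rank is not None and rank <= k)
    return hits / len(ranks)


# Tally of the Rank column in the results table above.
ranks = [1] * 36 + [2] * 5 + [3] * 2 + [6] * 2 + [7] * 1 + [None] * 3

print(len(ranks))                                 # 49
print(round(hit_rate_at_k(ranks, 5) * 100, 1))    # 87.8
print(round(hit_rate_at_k(ranks, 10) * 100, 1))   # 93.9
```

The reported 93.9% therefore matches the looser cut-off, while a strict Top-5 reading is a few points lower.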
My Impressions After Using It

Setup is easy. Just feed it data and search works. Decisions you would otherwise have to make when building vector search yourself, such as DB selection, embedding model selection, and chunking strategy, are handled for you.
Japanese accuracy. In the evaluation over roughly 1,000 articles, 46 of the 49 scored queries (93.9%) found the correct article. For keyword queries targeting a specific article, most come back at #1. Generic queries can get buried among articles on the same topic, but that is partly a query-side problem.
Agent integration. Since it also supports an MCP server and Workers bindings, it seems feasible to build chatbots that answer based on blog content or external sources.
How to Further Improve Accuracy

This evaluation used the default settings (vector search only), but AI Search has other options. There is also room for improvement through data preprocessing.

AI Search Settings

Hybrid Search: Combining keyword search adds weight to exact matches on specific terms like "boto3" and "postinstall". This could help in cases like the missed "Bedrock Claude boto3 Anthropic SDK migration" query.
Query Rewrite: An LLM rewrites the query into a search-appropriate expression before searching. Accuracy may improve as ambiguous queries become more specific, but how the query is rewritten depends on Cloudflare's LLM.
Reranking: An LLM re-evaluates and reorders the top results from the initial search. This is effective when the correct result is in the top 10 but outside the top 5.
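To get a feel for why hybrid search can rescue a keyword-heavy query, here is a small sketch of one common way to merge two ranked lists, Reciprocal Rank Fusion (RRF). This article does not describe how AI Search actually fuses vector and BM25 results, so RRF is used here purely as an illustration. The rankings are made up, and every slug except `migrate-boto3-to-anthropic-sdk-bedrock` is hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document keys into one list via RRF.

    Each document scores 1 / (k + rank) in every list it appears in;
    the scores are summed and the documents sorted by the total.
    """
    scores = {}
    for ranking in rankings:
        for position, key in enumerate(ranking, start=1):
            scores[key] = scores.get(key, 0.0) + 1.0 / (k + position)
    return sorted(scores, key=scores.get, reverse=True)


# Made-up rankings for the query "Bedrock Claude boto3 Anthropic SDK migration".
vector_ranking = [
    "bedrock-overview",
    "migrate-boto3-to-anthropic-sdk-bedrock",
    "claude-on-bedrock",
]
keyword_ranking = [
    "migrate-boto3-to-anthropic-sdk-bedrock",
    "boto3-tips",
]

print(reciprocal_rank_fusion([vector_ranking, keyword_ranking]))
# ['migrate-boto3-to-anthropic-sdk-bedrock', 'bedrock-overview', 'boto3-tips', 'claude-on-bedrock']
```

A document that appears in both lists accumulates score from each, which is how an exact match on a term like "boto3" can lift an article above semantically similar neighbours.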
Data-side Improvements

Add title to custom metadata and weight with Boost by: Add the article title as a text field in one of the 5 available custom metadata slots and set it as a Boost by target, so that title matches add to the score.
Add article summaries to metadata: Have an LLM summarize points like "the problem this article solves" before uploading, and attach the summary as metadata. If the summary includes expressions not written directly in the body (e.g., "automate initial incident investigation"), generic queries hit more easily.
Chunk size adjustment: The default is 1024 tokens. A larger chunk carries more context, which can improve semantic search accuracy, but the tradeoff is that it also brings in more noise.
Controllable Values

AI Search automatically handles chunk splitting and index building, but several parameters can be adjusted per instance.
You can check and change them from the instance settings screen in the dashboard. The main items are as follows.

Embedding

| Parameter | Default Value | Description |
| --- | --- | --- |
| Embedding model | `@cf/qwen/qwen3-embedding-0.6b` | Model used for vectorization |
| Chunk size | 1024 tokens | Number of tokens per chunk |
| Chunk overlap | 10% | Overlap rate between adjacent chunks (0%–30%) |
The chunk splitting algorithm is recursive chunking (recursively splitting at natural boundaries in the order of paragraphs → sentences), and cannot be changed. It splits with awareness of structural boundaries such as Markdown headings, so it works well with blog articles.
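The general idea of recursive chunking can be sketched in a few lines. This is not Cloudflare's implementation: it measures length in characters rather than tokens, omits chunk overlap, drops separators that fall on a chunk boundary, and uses a separator list chosen only for illustration.

```python
def recursive_split(text, max_len, separators=("\n\n", "\n", "。", ". ")):
    """Split text into pieces of at most max_len characters.

    Coarse separators (paragraph breaks) are tried first; a piece that is still
    too long is split again with the next, finer separator.
    """
    if len(text) <= max_len:
        return [text] if text.strip() else []
    for index, sep in enumerate(separators):
        if sep in text:
            pieces = []
            for part in text.split(sep):
                pieces.extend(recursive_split(part, max_len, separators[index + 1:]))
            return pack(pieces, max_len, sep)
    # No separator left: fall back to a hard cut.
    return [text[i:i + max_len] for i in range(0, len(text), max_len)]


def pack(pieces, max_len, sep):
    """Greedily join adjacent pieces while the result stays within max_len."""
    chunks, current = [], ""
    for piece in pieces:
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= max_len:
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "## Heading\n\nFirst paragraph.\n\nSecond paragraph."
for chunk in recursive_split(sample, max_len=30):
    print(repr(chunk))
```

With the sample text, the heading stays attached to the first paragraph because the two fit together within the limit, which is the behavior that makes this style of chunking a good match for Markdown articles.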
Retrieval

| Parameter | Default Value | Description |
| --- | --- | --- |
| Match threshold | 0.4 | Minimum score. Chunks below this are not included in results |
| Maximum number of results | 10 | Maximum number of chunks to return |
| Boost by | - | Weight scores by specific metadata |
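The effect of the match threshold and the maximum number of results can be illustrated with the scores from the CLI search earlier in this article. This is a client-side sketch of the two settings, not the service's code; `apply_retrieval_settings` is a hypothetical helper, and treating a score exactly equal to the threshold as a match is an assumption.

```python
def apply_retrieval_settings(chunks, match_threshold=0.4, max_results=10):
    """Drop chunks scoring below the threshold, then keep the best max_results."""
    kept = [c for c in chunks if c["score"] >= match_threshold]
    kept.sort(key=lambda c: c["score"], reverse=True)
    return kept[:max_results]


# Scores from the CLI search output earlier in this article.
scores = [0.7265, 0.5867, 0.5754, 0.5632, 0.5580, 0.4349]
chunks = [{"score": s} for s in scores]

print(len(apply_retrieval_settings(chunks)))                        # 6
print(len(apply_retrieval_settings(chunks, match_threshold=0.5)))   # 5
```

With the default threshold of 0.4 all six chunks are returned, while raising it to 0.5 would have dropped the unrelated 0.4349 chunk from the other test article.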

Other Options

| Item | Description |
| --- | --- |
| Hybrid search | Combines vector search with BM25 (full-text keyword search based on word frequency) (default: OFF) |
| Query rewrite | Rewrites ambiguous or colloquial queries into search-appropriate expressions using an LLM before searching (default: OFF) |
| Reranking | Re-ranks search results using an LLM (default: OFF) |
| Similarity caching | Caches results of similar queries for faster responses (default: ON, TTL 48 hours) |
| Generation model | `@cf/meta/llama-3.3-70b-instruct-fp8-fast`. Model used to generate answers based on search results |

Settings can be changed even after instance creation. They can be changed from the dashboard or via update in the REST API.
About Limits

Main limits of the Free and Paid plans.

| Item | Free plan | Paid plan |
| --- | --- | --- |
| Files/instance | 100,000 | 1,000,000 (500,000 for hybrid search) |
| File size limit | 4 MB | 4 MB |
| Queries/month | 20,000 | Unlimited |
| Instances/account | 100 | 5,000 |

Since the CMS in this case is Contentful with a 2 MB limit per entry, the 4 MB file size limit was not an issue.
20,000 queries per month is sufficient for verification purposes. A Paid plan is needed if incorporating into a product.
Summary

I tried enabling natural language search across approximately 1,000 DevIO articles using Cloudflare AI Search.
For the data source, I tested direct upload from my local machine this time, but at larger scale, going through an R2 bucket would likely be more efficient.
What I did:
Created an AI Search instance with Wrangler CLI
Retrieved articles from the CMS in Markdown and uploaded via the Items API
Tested search with CLI and REST API, and quickly evaluated Japanese accuracy with Hit Rate@5
Given the ease of setup, I think this accuracy with only default vector search settings is sufficient. Further improvements can be expected by combining options like Hybrid Search and Query Rewrite with data preprocessing.
It's free while in beta, so if you're curious, give it a try.
Reference Documentation

Cloudflare AI Search Documentation
AI Search: the search primitive for your agents (Cloudflare Blog)
Items API (REST API)
Wrangler CLI - AI Search
Data source configuration
Limits & Pricing
