I tried Vercel AI Gateway as an OpenRouter user

Having worked with Claude models on OpenRouter, I tried Vercel AI Gateway with the same prompts. This article records the differences between the two from four angles: response metadata structure, routing information granularity, cache layers, and dashboard positioning.

越井琢巳 (Koshii Takumi)

2026.05.01


Introduction

I had been using OpenRouter in my day-to-day development to call Claude-based models, but I wanted to see for myself how Vercel AI Gateway differs, so I tried both services side by side with the same prompts.
This article is aimed at readers who have experience with OpenRouter but have never used Vercel AI Gateway. It summarizes the results of comparing the two services side by side. Rather than ranking one above the other, I will lay out the facts of what differs and how.

What is Vercel AI Gateway?

Vercel AI Gateway is an AI model integration API provided by Vercel. In addition to integration with the Vercel AI SDK, it exposes both an OpenAI Chat Completions-compatible endpoint and an Anthropic Messages-compatible endpoint, and provides observability features within the Vercel dashboard.

What is OpenRouter?

OpenRouter is a unified gateway service that lets you work with hundreds of AI models published by multiple providers through a single API. It has an OpenAI Chat Completions-compatible endpoint, and allows you to describe per-model fallbacks and provider selection within a single request.

Testing Environment

Test date: May 1, 2026
Model used: anthropic/claude-haiku-4.5 (the caching section also uses anthropic/claude-sonnet-4.5)

Target Audience

Developers who have experience calling Claude-based models via OpenRouter and are trying Vercel AI Gateway for the first time
People who want to understand the differences in response structure and observability features when choosing an AI model gateway
People considering how to integrate AI calls with their in-house monitoring infrastructure

References

Vercel AI Gateway Documentation
Vercel AI Gateway: Provider Options
Vercel AI Gateway: Observability
OpenRouter Documentation
OpenRouter: Model Fallbacks
OpenRouter: Input/Output Logging

Basic Chat Completions calls can be tested by simply swapping the baseURL

The first thing I found is that, for basic Chat Completions calls, very few changes are needed to point an existing OpenRouter client at Vercel AI Gateway. Both services expose an OpenAI Chat Completions-compatible endpoint, and both use Authorization: Bearer <token> for authentication.
When I actually called GET /v1/models and compared Claude-based model IDs, 12 model IDs were common to both services. For example, anthropic/claude-haiku-4.5, which I used in this article, can be called with the same ID on both the Vercel side and the OpenRouter side.
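The model ID comparison can be reproduced with a small helper. This is a minimal sketch, assuming both GET /v1/models responses follow the OpenAI-style shape `{ data: [{ id }] }`. The function name `commonModelIds` and the `anthropic/` prefix filter are illustrative choices, not part of either service's API.

```typescript
// Only the fields this sketch reads from a model list response (assumed shape).
interface ModelEntry {
  id: string;
}

interface ModelList {
  data: ModelEntry[];
}

// Return the sorted, de-duplicated model IDs that appear in both lists
// and start with the given prefix.
function commonModelIds(a: ModelList, b: ModelList, prefix = 'anthropic/'): string[] {
  const idsInB = new Set(b.data.map((m) => m.id));
  const seen = new Set<string>();
  const result: string[] = [];
  for (const { id } of a.data) {
    if (id.startsWith(prefix) && idsInB.has(id) && !seen.has(id)) {
      seen.add(id);
      result.push(id);
    }
  }
  return result.sort();
}
```

With the OpenAI-compatible SDK, the two inputs could come from each client's `models.list()` call. The count you get will depend on the model catalogs at the time you run it.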
When writing with the OpenAI-compatible SDK, you can switch simply by replacing baseURL.
```typescript
import OpenAI from 'openai';

// For Vercel AI Gateway
const vercelClient = new OpenAI({
  apiKey: process.env.AI_GATEWAY_API_KEY,
  baseURL: 'https://ai-gateway.vercel.sh/v1',
});

// For OpenRouter
const openrouterClient = new OpenAI({
  apiKey: process.env.OPENROUTER_API_KEY,
  baseURL: 'https://openrouter.ai/api/v1',
});
```
With the Vercel AI SDK, simply specifying the model as a string makes AI Gateway the default provider. As long as AI_GATEWAY_API_KEY is set as an environment variable, no additional client configuration is needed.
```typescript
import { generateText } from 'ai';

const { text } = await generateText({
  model: 'anthropic/claude-haiku-4.5',
  prompt: 'Reply with exactly: ok',
});
```
Note: when using features specific to each service, such as caching or routing, you will need to handle differences in headers, providerOptions, and so on individually. In the rest of this article, I will look at those differences from various angles.

Differences in response metadata structure

The client differences are minimal, but when you read the response contents, the differences between the two services become clear. When sending the same prompt, same model, and same parameters, the three most notable differences were as follows.
■ Display of the selected provider

Vercel: choices[0].message.provider_metadata.gateway.routing.resolvedProvider

OpenRouter: Top-level provider field
■ Additional fields in choices[0].message

Vercel: Includes provider_metadata.{anthropic, gateway}

OpenRouter: Has refusal and reasoning, and also lists choices[0].native_finish_reason
■ Cache-related fields in usage

Vercel: cache_creation_input_tokens and market_cost appear directly under it

OpenRouter: Does not have the above, but has prompt_tokens_details.cache_write_tokens
Vercel maintains the standard OpenAI-compatible format while consolidating the selected provider and routing details under choices[0].message.provider_metadata.gateway. OpenRouter, on the other hand, indicates the selected provider via the top-level provider field. The structure of placing cost and cost_details directly under usage is similar between both services. When migrating, the main area where you need to change how you retrieve data is around the selected provider name.
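Since the selected provider name is the main thing that moves, a single accessor can hide the difference. The following is a minimal sketch that reads only the two paths described above. The `GatewayResponse` type and the `selectedProvider` name are hypothetical, and real responses contain many more fields.

```typescript
// Only the fields this sketch reads; both shapes are merged into one loose type.
interface GatewayResponse {
  // OpenRouter: top-level provider field
  provider?: string;
  choices?: Array<{
    message?: {
      // Vercel AI Gateway: nested under provider_metadata.gateway.routing
      provider_metadata?: {
        gateway?: { routing?: { resolvedProvider?: string } };
      };
    };
  }>;
}

// Return the provider that served the request, or undefined if neither path is present.
function selectedProvider(res: GatewayResponse): string | undefined {
  const fromVercel =
    res.choices?.[0]?.message?.provider_metadata?.gateway?.routing?.resolvedProvider;
  return fromVercel ?? res.provider;
}
```

Checking the Vercel path first and falling back to the top-level field keeps the call site identical for both gateways.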

Routing metadata is always included in responses

One thing I noticed on the Vercel side is that routing metadata is included in every response. Even without triggering a fallback, originalModelId, resolvedProvider, fallbacksAvailable, planningReasoning, and modelAttempts are returned inside provider_metadata.gateway.routing every time. For example, in this verification, for a call to anthropic/claude-haiku-4.5, a single response revealed that the available fallback candidates were ["bedrock", "vertexAnthropic"], that it ultimately succeeded via the direct Anthropic route, and that the provider attempt completed in one try.
Full text of Vercel AI Gateway's provider_metadata.gateway:

```json
{
  "routing": {
    "originalModelId": "anthropic/claude-haiku-4.5",
    "resolvedProvider": "anthropic",
    "resolvedProviderApiModelId": "claude-haiku-4-5-20251001",
    "fallbacksAvailable": ["bedrock", "vertexAnthropic"],
    "planningReasoning": "System credentials planned for: anthropic, bedrock, vertexAnthropic. Total execution order: anthropic(system) → bedrock(system) → vertexAnthropic(system)",
    "canonicalSlug": "anthropic/claude-haiku-4.5",
    "finalProvider": "anthropic",
    "modelAttemptCount": 1,
    "modelAttempts": [
      {
        "modelId": "anthropic:claude-haiku-4-5-20251001",
        "canonicalSlug": "anthropic/claude-haiku-4.5",
        "success": true,
        "providerAttemptCount": 1,
        "providerAttempts": [
          {
            "provider": "anthropic",
            "providerApiModelId": "claude-haiku-4-5-20251001",
            "credentialType": "system",
            "success": true,
            "startTime": 3274894.19934,
            "endTime": 3275274.606483,
            "providerRequestId": "req_011XXXXXXXXXXXXXXXXXXXXX",
            "statusCode": 200,
            "providerResponseId": "msg_01XXXXXXXXXXXXXXXXXXXXX"
          }
        ]
      }
    ],
    "totalProviderAttemptCount": 1
  },
  "cost": "0.000087",
  "marketCost": "0.000087",
  "inferenceCost": "0.000087",
  "inputInferenceCost": "0.000022",
  "outputInferenceCost": "0.000065",
  "generationId": "gen_01XXXXXXXXXXXXXXXXXXXXXXXX"
}
```
When receiving via streaming, the same structure arrives in the delta.provider_metadata of the final chunk. There is no need to query a separate endpoint to retrieve routing information.
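Because the routing object arrives with every response, it can be condensed into a log-friendly summary on the client. The sketch below is illustrative: the `summarizeRouting` name and the summary shape are hypothetical, and it assumes startTime and endTime are in the same unit (the sample values look like milliseconds, but that is not confirmed here).

```typescript
// Only the fields of provider_metadata.gateway.routing that this sketch reads.
interface ProviderAttempt {
  provider: string;
  success: boolean;
  startTime: number;
  endTime: number;
}

interface ModelAttempt {
  providerAttempts: ProviderAttempt[];
}

interface Routing {
  resolvedProvider: string;
  fallbacksAvailable: string[];
  modelAttempts: ModelAttempt[];
}

interface RoutingSummary {
  provider: string;
  attempts: number;
  failedProviders: string[];
  // endTime - startTime of the last attempt, in whatever unit the gateway uses.
  finalAttemptElapsed: number | undefined;
}

function summarizeRouting(routing: Routing): RoutingSummary {
  // Flatten provider attempts across all model attempts, preserving order.
  const attempts: ProviderAttempt[] = [];
  for (const model of routing.modelAttempts) {
    for (const attempt of model.providerAttempts) attempts.push(attempt);
  }
  const last = attempts.length > 0 ? attempts[attempts.length - 1] : undefined;
  return {
    provider: routing.resolvedProvider,
    attempts: attempts.length,
    failedProviders: attempts.filter((a) => !a.success).map((a) => a.provider),
    finalAttemptElapsed: last ? last.endTime - last.startTime : undefined,
  };
}
```

A non-empty failedProviders list is one way to notice that a fallback happened without calling any other endpoint.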
On the other hand, in OpenRouter responses, while the selected provider itself can be confirmed via the provider field, fallback candidates and attempt history are not visible in the response. Only when a fallback is triggered does the model field in the response change to the name of the model that was actually used.
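On the OpenRouter side, the only fallback signal in the response is therefore the model field. A rough way to check it is sketched below, under a loud assumption: it expects the response to echo one of the requested model IDs verbatim. If the service returns a normalized or dated model name, the comparison cannot decide and the helper reports `unknown`. The `detectFallback` name is hypothetical.

```typescript
type FallbackStatus = 'primary' | 'fallback' | 'unknown';

// candidates: the model IDs sent in the request, primary first.
// responseModel: the model field of the response.
function detectFallback(candidates: string[], responseModel: string): FallbackStatus {
  const index = candidates.indexOf(responseModel);
  if (index === 0) return 'primary';
  if (index > 0) return 'fallback';
  // The returned name matched none of the requested IDs, so nothing can be concluded.
  return 'unknown';
}
```

Treating an unmatched name as `unknown` rather than as a fallback avoids false alarms when the two sides merely spell the model name differently.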

Caching operates at different layers

Both services offer a feature called caching, but the two features work at different layers.
Within the scope of the current official documentation, Vercel AI Gateway does not appear to have a cache that holds entire responses at the gateway layer. Instead, it is designed to transparently use prompt caching provided by providers such as Anthropic. When you specify providerOptions.gateway.caching: 'auto', AI Gateway automatically inserts cache_control at the end of static content for providers that require explicit cache markers, such as Anthropic and MiniMax.
When I actually sent system: [{ type: 'text', text: <long text>, cache_control: { type: 'ephemeral' } }] to the Anthropic Messages-compatible endpoint, the first response for claude-sonnet-4.5 returned cache_creation_input_tokens: 3,202, confirming that a cache write had occurred on the Anthropic side. However, when I sent the same prompt again in a subsequent request, cache_creation_input_tokens remained at 3,202 and cache_read_input_tokens was 0, meaning no cache read was observed. I was unable to identify the cause from this verification alone, but it is possible that factors such as the location of the cache_control specification, differences in request content, provider-side caching conditions, or AI Gateway's routing path may have had an effect.
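When repeating this kind of check, it helps to classify each response's usage block instead of reading the counters by eye. The sketch below uses only the two counters quoted above. The `cachedSystemBlock` and `cacheOutcome` helpers are hypothetical names, and the system block mirrors the shape sent in this test.

```typescript
// System block with an explicit cache marker, in the shape used in this test.
function cachedSystemBlock(text: string) {
  return [{ type: 'text' as const, text, cache_control: { type: 'ephemeral' as const } }];
}

// Only the cache-related usage counters this sketch reads.
interface CacheUsage {
  cache_creation_input_tokens?: number;
  cache_read_input_tokens?: number;
}

type CacheOutcome = 'read' | 'write' | 'read+write' | 'none';

// Classify a response by whether the provider wrote to or read from its prompt cache.
function cacheOutcome(usage: CacheUsage): CacheOutcome {
  const wrote = (usage.cache_creation_input_tokens ?? 0) > 0;
  const read = (usage.cache_read_input_tokens ?? 0) > 0;
  if (wrote && read) return 'read+write';
  if (read) return 'read';
  if (wrote) return 'write';
  return 'none';
}
```

For the two requests described above, both responses would classify as `write`, which is exactly the symptom: a second write where a read was expected.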
OpenRouter, on the other hand, has a gateway-layer response cache that can be enabled with the X-OpenRouter-Cache: true header. The cache key is constructed from the API key, model, endpoint type, whether streaming is enabled, and the SHA-256 hash of the request body. While an entry is valid, an identical request is answered from the cache. The default lifetime is 5 minutes, adjustable from 1 second to 86,400 seconds via X-OpenRouter-Cache-TTL.
Note: The response cache is a Beta feature at the time of writing. As the API and behavior may change, please check the latest documentation before using it in production.
In testing, for the same request to claude-haiku-4.5, the first call took 1,127 ms and the second was reduced to 33 ms. On a cache hit, token consumption is 0 and the charge is recorded as 0.
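The opt-in headers described above can be built by a small helper that also guards the TTL range. This is a sketch under assumptions: the `openRouterCacheHeaders` name is hypothetical, and sending the TTL as a whole number of seconds is my reading of the documented 1 to 86,400 second range.

```typescript
// Build the headers that opt a request into OpenRouter's response cache.
// ttlSeconds is optional; when omitted, the service default applies.
function openRouterCacheHeaders(ttlSeconds?: number): Record<string, string> {
  const headers: Record<string, string> = { 'X-OpenRouter-Cache': 'true' };
  if (ttlSeconds !== undefined) {
    // Assumption: the TTL must be an integer within the documented range.
    if (!Number.isInteger(ttlSeconds) || ttlSeconds < 1 || ttlSeconds > 86400) {
      throw new RangeError('ttlSeconds must be an integer between 1 and 86400');
    }
    headers['X-OpenRouter-Cache-TTL'] = String(ttlSeconds);
  }
  return headers;
}
```

With the OpenAI-compatible SDK, the result could be passed as `defaultHeaders` when constructing the OpenRouter client, or as per-request headers if only some calls should be cached.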
What emerged from the verification is that Vercel AI Gateway plays the role of passing through the provider's cache, while OpenRouter has a path where the gateway itself returns a cached response for cases where the same request arrives repeatedly. If you want to cache entire responses on the application side, with Vercel you will need to set up a separate caching layer.
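For the Vercel case, such a separate caching layer can start very small. The following is a minimal in-memory sketch, not a production design. `requestKey` and `ResponseCache` are hypothetical names, and the key scheme (SHA-256 over the model and the JSON-serialized body) is an illustration loosely modeled on the OpenRouter description above, not either service's actual implementation.

```typescript
import { createHash } from 'node:crypto';

// Derive a cache key from the model and request body.
// Note: JSON.stringify is sensitive to property order, so build bodies consistently.
function requestKey(model: string, body: unknown): string {
  return createHash('sha256')
    .update(model)
    .update('\n')
    .update(JSON.stringify(body))
    .digest('hex');
}

// In-memory cache whose entries expire after ttlMs milliseconds.
class ResponseCache<V> {
  private store = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  // The clock is injectable so that expiry can be tested deterministically.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const hit = this.store.get(key);
    if (!hit) return undefined;
    if (hit.expiresAt <= this.now()) {
      // Expired entries are dropped lazily on read.
      this.store.delete(key);
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: V): void {
    this.store.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

A caller would look up `requestKey(model, body)` before calling the gateway and store the response after a miss. Anything shared across processes would need an external store instead of a Map.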
Note: both services do have a path for using the provider-side prompt cache. This section focuses specifically on the point that a cache which returns the entire response to an identical request at the gateway layer exists in OpenRouter but was not confirmed in Vercel AI Gateway.

What you can see in the dashboard

The information observable in the dashboards also differs between the two services.

| Information | Vercel | OpenRouter |
| --- | --- | --- |
| Time-series graphs | Usage (Spend by Model, P50 TTFT by Model, Requests by Model, All Tokens) | Activity (Spend, Requests, Tokens) |
| Aggregation by API key | API key view in Requests | Handled by filters on the Logs page |
| Aggregation by project | Project view in Requests | Not supported |
| Individual logs and request details | Individual logs in Requests | Logs page and detail view (prompt and completion content is displayed when input/output logging is turned ON in Observability) |

Vercel AI Gateway Dashboard

The AI Gateway tab in the Vercel dashboard has two views: Usage and Requests. Usage displays four graphs: Spend by Model, P50 TTFT by Model, Requests by Model, and All Tokens.
The Requests view allows you to drill down into requests from three perspectives: by project, by API key, and as individual logs.
■ By project
■ By API key
■ Individual logs

OpenRouter Dashboard

The OpenRouter Activity page displays three graphs: Spend, Requests, and Tokens.
The Logs page lets you view a list of requests. Model, provider, token count, cost, and other details are listed row by row.
Clicking on each request lets you view the prompt and completion contents.
However, input/output logging is disabled by default and must be explicitly toggled from Observability.

Conclusion

Coming to Vercel AI Gateway from OpenRouter, I found clear differences between the two services in where observability information is stored and which layer handles caching. Vercel always includes routing metadata in responses and integrates its dashboard into the Vercel project structure, while OpenRouter reports cost and cost_details.upstream_inference_* under usage, and provides gateway-layer response caching and broadcast to external observability tools.

Rather than asking which is better, I think the practical approach is to choose based on where you want to retrieve observability information, which layer you want to handle caching, and which monitoring infrastructure you want to integrate with. Since the effort to redirect your local code to Vercel AI Gateway is as simple as swapping the baseURL, I recommend trying it first and confirming the differences for yourself.
