
Inserting a Custom Gateway into Claude Code Communications — Implementation with WASI 0.3 and WAC
This page has been translated by machine translation. View original
Inserting Your Own Gateway into Claude Code — Building a Minimal LLM Gateway with WASI 0.3 and WAC
Introduction
Recently, more and more companies are adopting Claude Code in enterprise settings.
When that happens, security becomes an unavoidable concern.
For example, suppose you have a problem like "are arbitrary keywords or confidential information being sent to the LLM?"
This kind of issue, especially in industries like finance, healthcare, and public sector,
makes it important to be able to technically control "whether information is appropriate to send" before transmission,
and to maintain an audit trail showing that policies have been applied.
This article targets Claude Code. Note that Claude Desktop (Enterprise) uses managed settings and Cowork uses dedicated configuration keys, which are separate from the
ANTHROPIC_BASE_URLprocedure described below.
Claude Code has built-in governance features such as permissions, hooks, and managed-settings.
For blocking at the input (prompt) level, the UserPromptSubmit hook can also inspect content before sending and reject it.
However, for centrally handling the "final HTTP request" and "response stream" including conversation history and tool results,
hooks are insufficient and a gateway or proxy is required.
(You can see all model API requests/responses in one place and intervene in streaming as well)
Since Claude Code allows you to point the connection to your own endpoint using the ANTHROPIC_BASE_URL environment variable,
you can insert arbitrary guardrails into the communication without touching Claude Code itself, as shown below:
Claude Code → Your own gateway (inspection, blocking, masking, etc.) → api.anthropic.com
In this article, we'll try composing WASM to build our own gateway/proxy.
About Claude Code
Changing Claude Code's Connection Destination
Claude Code allows you to control where requests are sent via environment variables.
The main environment variables are as follows:
ANTHROPIC_BASE_URL: The destination URL. Point this to your own gatewayANTHROPIC_API_KEY(x-api-keyheader) / subscription OAuth tokenCLAUDE_CODE_OAUTH_TOKEN(Authorization: Bearer) … authenticationANTHROPIC_CUSTOM_HEADERS… attach arbitrary custom headers (inName: Valueformat, multiple entries separated by newlines)
Note: If you point
ANTHROPIC_BASE_URLto a host other thanapi.anthropic.com, MCP tool search will be disabled by default (if your gateway can forwardtool_reference, you can re-enable it withENABLE_TOOL_SEARCH=true).
Since the gateway is an official feature, simply forwarding the authentication headers sent by the client
(x-api-key / Authorization: Bearer, etc.) allows it to work through the gateway
whether using an API key or a Max/Team/Enterprise plan.
The gateway needs to relay /v1/messages and /v1/messages/count_tokens,
and must pass through anthropic-version / anthropic-beta headers as-is.
(/v1/models is optional and only needed if supporting model discovery)
For more details, refer to the official documentation: Environment Variable Reference, LLM gateway configuration.
About Built-in Security Features
In enterprise/team operations, roughly the following features are available out of the box:
- Enforcement of settings via managed-settings.json (MDM distribution, OS-protected paths)
- Permission rules / hooks (command execution permissions and hooks)
※ e.g., theUserPromptSubmithook can inspect prompt input and reject sending - Model allow lists, telemetry output via OpenTelemetry
As mentioned earlier, hooks can be used to block at the input (prompt) level.
However, hooks cannot intervene in the HTTP request body including conversation history and tool results,
or in the SSE stream of responses themselves.
To inspect/block/modify this in one place, even during streaming,
you need to set up a custom gateway via ANTHROPIC_BASE_URL.
Enforcing Gateway Use via MDM
If ANTHROPIC_BASE_URL is fixed in an OS-level
managed-settings.json distributed via MDM, users cannot change it.
※ This allows you to force Claude Code to go through your own gateway
The managed scope takes top priority among all settings, and adding env.ANTHROPIC_BASE_URL
prevents overriding even with a shell export, but this only constrains Claude Code itself.
To guarantee that "model API communications always go through the gateway," combine this with network egress control.
※ Using a custom ANTHROPIC_BASE_URL means server-managed settings distributed from the management console are unavailable. Distribute these via OS-level managed-settings.json instead.
Create Gateway (WASM, WAC)
For the gateway, simply "implementing one Anthropic Messages API-compatible reverse proxy" is fine.
However, if you want to add more guardrails or change rules per group,
things get a bit complex, so in this article we've composed guardrails as WASM components.
What is WASM/WASI?
WASM (WebAssembly)
WASM is a portable binary instruction format.
It runs at near-native speed within a sandbox and is not tied to a specific language.
Originally for browsers, it now runs on servers as well.
WASI (WebAssembly System Interface)
This is the standard system interface for running WASM outside of browsers.
It provides features like file, network, and HTTP access in a capability-based manner (designed to explicitly grant only the required capabilities).
Component Model / wac
This is a mechanism for modularizing WASM with typed interfaces (WIT) and composing multiple components into one.
wac plug is the composition tool.
How It Works
It operates with the following mechanism:
- Each guardrail becomes one WASM component
- e.g.,
meter(measurement) /secret-scan(secret detection) /output-mask(response redaction) …
- e.g.,
- These are composed into 1 binary using
wac plugand launched withwasmtime serve - The client (Claude Code) points
ANTHROPIC_BASE_URLhere
Claude Code ─▶ [meter]─▶[secret-scan]─▶[output-mask]─▶[anthropic-out]─▶ api.anthropic.com
measure secret detect redact response relay with fixed destination
◀──────────── response returns passing through streaming (SSE) ────────────◀
The diagram above shows an expanded configuration example. In the Try section below, we implement two components: log → anthropic-out.
Why WASM?
With WASI 0.3.0, WebAssembly components can now handle asynchronous (streaming) processing,
with stream for flowing data piece by piece and future for values that arrive later now available as standard.
This makes it possible to inspect, block, and replace data mid-stream (at the SSE chunk level) while the response is flowing.
For example, processing like "check for specific terms or formats and error out if found" can be implemented directly.
secret-scan: If API keys, private keys, or token-like formats are mixed into the prompt (conversation history), block before sending upstreamoutput-mask: Redact email addresses and known confidential information in the response stream (e.g.,hoge@example.com→[redacted:email])
Note: The
output-maskexample shown here focuses on email addresses and known confidential information, and does not detect everything. (Images, base64, and binary are not inspection targets) "What to detect and what to ignore" depends on the implementation, so be careful.
The advantages of building with WASM composition are as follows:
- Single process, static composition means no interruption to streaming
- Any language is fine as long as it has a toolchain that can output
wasi:http - Data non-retention can be a design principle (the gateway itself does not store/log request bodies or API keys. Bodies that are not blocked are sent directly to Anthropic)
Adding components, changing their order, and varying configurations per team can all be done just by changing composition settings,
giving it high flexibility.
Environment
Verification was conducted in the following environment:
| Item | Version |
|---|---|
| OS | macOS 26.4 (Apple Silicon / arm64) |
| Claude Code | 2.1.185 |
| Rust | 1.95.0 (target wasm32-wasip2) |
| wasmtime | 45.0.1 (WASI 0.3 / -S p3 -W component-model-async compatible version) |
| wasm-tools | 1.251.0 |
| wac (wac-cli) | 0.10.0 |
| wit-bindgen | 0.58 |
| Node.js | 20.19.0 |
| mise | 2026.6.0 macos-arm64 |
Setup
Install the necessary tools to run the demo.
- Rust + WASM target (rustup)
% rustup toolchain install 1.95.0
% rustup target add wasm32-wasip2
- wasmtime / wasm-tools
A version supporting WASI 0.3 -S p3 -W component-model-async.
% mise use -g wasmtime@45.0.1 wasm-tools@1.251.0
- wac (WebAssembly composition tool)
% cargo install --locked --version 0.10.0 wac-cli
- wkg (WASI WIT package retrieval tool. Used only when fetching WIT)
% cargo install --locked --version 0.10.0 wkg
- Claude Code
Prepare an API key issued from the Console (ANTHROPIC_API_KEY), or for subscribers, an OAuth token issued with claude setup-token
(CLAUDE_CODE_OAUTH_TOKEN).
Try
Let's try the following minimal configuration:
- Start a gateway locally
- Point
ANTHROPIC_BASE_URLto it - Use Claude Code normally, confirm it works as usual through the gateway, and check the log stage output
0. Minimal Proxy
We talked about WASM composition, but fundamentally it's a reverse proxy.
Here's an image of it. (TypeScript sample)
import http from "node:http";
const UPSTREAM = "https://api.anthropic.com";
http.createServer(async (req, res) => {
// Read the request body first
const chunks: Buffer[] = [];
for await (const c of req) chunks.push(c as Buffer);
const body = Buffer.concat(chunks).toString("utf8");
// ① Inspection: block if confidential information starting with "sk-ant-" is mixed in
if (/sk-ant-[A-Za-z0-9_\-]{20,}/.test(body)) {
res.writeHead(400, { "content-type": "application/json" });
res.end(JSON.stringify({
type: "error",
error: { type: "invalid_request_error",
message: "Secret detected in the request. Remove it and retry." },
}));
return;
}
/**
* ② Authentication headers (x-api-key / authorization / anthropic-version / anthropic-beta)
* are forwarded upstream, but host, connection, content-length, etc. are not unconditionally forwarded (left to fetch)
*/
const fwd = new Headers();
for (const [k, v] of Object.entries(req.headers)) {
const key = k.toLowerCase();
// Don't forward hop-by-hop headers like host, and accept-encoding (to avoid compressed responses)
if (["host", "connection", "content-length", "transfer-encoding", "accept-encoding"].includes(key)) continue;
if (typeof v === "string") fwd.set(k, v);
}
// fetch throws an exception if a body is attached to GET/HEAD, so branch accordingly
const method = req.method ?? "GET";
const init: RequestInit = { method, headers: fwd };
if (method !== "GET" && method !== "HEAD") {
init.body = body;
// @ts-ignore Receive streaming response with Node's fetch
init.duplex = "half";
}
const upstream = await fetch(`${UPSTREAM}${req.url}`, init);
res.writeHead(upstream.status, Object.fromEntries(upstream.headers));
// Stream SSE chunks as-is (you can also inspect and redact the response here)
for await (const chunk of upstream.body as any) res.write(chunk);
res.end();
// ★ Bind to loopback
}).listen(8080, "127.0.0.1", () => console.log("gateway on 127.0.0.1:8080"));
This is also a gateway implementation that "returns 400 without sending upstream if a secret is included."
(This minimal version only checks the body with a regex, so it's easy to circumvent)
Also, by rewriting chunks on the response side, streaming redaction and similar operations are possible.
1. Create the Sample Project
The TypeScript above was a conceptual explanation.
Next, we'll actually create two WASM components — log (an example of inserting your own logic) and the exit point (relaying to api.anthropic.com) —
compose them with wac, launch with wasmtime, and verify operation with actual Claude Code.
The directory structure is as follows. (Place WASI 0.3 WIT packages in wit/)
※ The demo implementation is in Rust
gateway-demo/
├── rust-toolchain.toml
├── wit/
│ └── wasi_http@0.3.0-rc-2026-03-15.wasm # WASI 0.3 WIT
├── log/ # Log component (middleware)
│ ├── Cargo.toml
│ └── src/lib.rs
└── anthropic-out/ # Exit component: relay to api.anthropic.com (service)
├── Cargo.toml
└── src/lib.rs
# rust-toolchain.toml
[toolchain]
channel = "1.95.0"
targets = ["wasm32-wasip2"]
2. Log
A middleware that passes requests to the inner (upstream) side while outputting the method and path to stderr.
Replace this part with inspection, aggregation, blocking, etc. to create your own guardrail.
# log/Cargo.toml
[package]
name = "log"
version = "0.1.0"
edition = "2021"
[lib]
crate-type = ["cdylib"]
[dependencies]
wit-bindgen = { version = "0.58", features = ["async-spawn"] }
[workspace]
// log/src/lib.rs
wit_bindgen::generate!({
path: "../wit/wasi_http@0.3.0-rc-2026-03-15.wasm",
world: "wasi:http/middleware@0.3.0-rc-2026-03-15",
generate_all,
async: true,
});
use exports::wasi::http::handler::Guest;
use wasi::http::handler as inner;
use wasi::http::types::{ErrorCode, Request, Response};
struct Log;
impl Guest for Log {
async fn handle(request: Request) -> Result<Response, ErrorCode> {
let method = request.get_method().await;
let path = request.get_path_with_query().await;
eprintln!("[log] {method:?} {path:?}"); // Insert your own logic here
inner::handle(request).await // Pass the response downstream as-is
}
}
export!(Log);
3. Exit: Relay to api.anthropic.com
A service component that sends received requests to api.anthropic.com and returns
the response (including SSE) as-is.
If you leave the host (the gateway-destined value set by the client) and forward it,
you'll get a 403 because the Host mismatches with the destination api.anthropic.com,
so strip only the headers that must not be forwarded and change the destination.
(x-api-key / Authorization / anthropic-* are passed through as-is)
# anthropic-out/Cargo.toml
[package]
name = "anthropic-out"
version = "0.1.0"
edition = "2021"
[lib]
crate-type = ["cdylib"]
[dependencies]
wit-bindgen = { version = "0.58", features = ["async-spawn"] }
[workspace]
// anthropic-out/src/lib.rs
wit_bindgen::generate!({
path: "../wit/wasi_http@0.3.0-rc-2026-03-15.wasm",
world: "wasi:http/service@0.3.0-rc-2026-03-15",
generate_all,
async: true,
});
use exports::wasi::http::handler::Guest;
use wasi::http::client;
use wasi::http::types::{ErrorCode, Fields, Request, Response, Scheme};
struct AnthropicOut;
// Headers that must not be forwarded upstream.
// Leaving host causes a 403 due to Host/authority mismatch with the destination
const STRIP: &[&str] = &[
"host", ":authority", "connection", "keep-alive",
"transfer-encoding", "te", "trailer", "upgrade",
"proxy-connection", "accept-encoding",
];
impl Guest for AnthropicOut {
async fn handle(request: Request) -> Result<Response, ErrorCode> {
let method = request.get_method().await;
let path = request.get_path_with_query().await;
// Pass x-api-key / authorization / anthropic-* etc. through as-is, strip only host etc.
let kept: Vec<(String, Vec<u8>)> = request
.get_headers()
.await
.copy_all()
.await
.into_iter()
.filter(|(n, _)| !STRIP.contains(&n.to_ascii_lowercase().as_str()))
.collect();
let headers = Fields::from_list(kept)
.await
.map_err(|e| ErrorCode::InternalError(Some(format!("{e:?}"))))?;
let (res_tx, res_rx) = wit_future::new(|| Ok(()));
let (body, trailers_rx) = Request::consume_body(request, res_rx).await;
let (req, _t) = Request::new(headers, Some(body), trailers_rx, None).await;
let _res_tx = res_tx;
let _ = req.set_method(method).await;
let _ = req.set_path_with_query(path).await;
let _ = req.set_scheme(Some(Scheme::Https)).await;
let _ = req.set_authority(Some("api.anthropic.com".to_string())).await;
client::send(req).await // Return api.anthropic.com's response as-is
}
}
export!(AnthropicOut);
4. Build → Compose → Launch
Run from inside gateway-demo/.
# 1) WASM build
cargo build --release --target wasm32-wasip2 --manifest-path log/Cargo.toml
cargo build --release --target wasm32-wasip2 --manifest-path anthropic-out/Cargo.toml
# 2) Compose into 1 binary with wac plug (log=outer / anthropic-out=inner=exit)
wac plug log/target/wasm32-wasip2/release/log.wasm \
--plug anthropic-out/target/wasm32-wasip2/release/anthropic_out.wasm \
-o gateway.wasm
# 3) Launch with wasmtime
wasmtime serve -S p3,cli -W component-model-async --addr 127.0.0.1:8080 gateway.wasm
5. Route Actual Claude Code Through It
In a separate terminal, launch claude with ANTHROPIC_BASE_URL pointing to this gateway.
For MAX and other subscriptions, issue an OAuth token with the claude setup-token command.
※ The following is an example for the Max plan
# 1) Issue an OAuth token
% claude setup-token
# → Use the displayed sk-ant-oat01-...
# 2) Set gateway and token information (unset other authentication first)
% unset ANTHROPIC_API_KEY ANTHROPIC_AUTH_TOKEN
% export ANTHROPIC_BASE_URL=http://localhost:8080
% export CLAUDE_CODE_OAUTH_TOKEN=sk-ant-oat01-...
# 3) Launch claude
% claude
When run, Claude responds as usual (since the gateway is transparent, the user experience doesn't change).
❯ hello claude code.
⏺ Hello! Is there anything I can help you with?
At this point, on the terminal where the gateway was launched (standard error), the log stage output flows,
confirming that requests are going through the gateway.
(Communications for /v1/messages/count_tokens and /v1/messages occur)
[log] Method::Post Some("/v1/messages/count_tokens")
[log] Method::Post Some("/v1/messages?beta=true")
The gateway passes authentication headers (Authorization: Bearer …) directly to api.anthropic.com.
※ For developer API keys, export ANTHROPIC_API_KEY=sk-ant-... also works
※ If both ANTHROPIC_API_KEY and CLAUDE_CODE_OAUTH_TOKEN are set, the API key takes priority. Note that ANTHROPIC_AUTH_TOKEN takes priority over the API key
※ Priority order: ANTHROPIC_AUTH_TOKEN > ANTHROPIC_API_KEY > CLAUDE_CODE_OAUTH_TOKEN
6. Measuring Gateway Overhead
Let's look at how much overhead is introduced by inserting the proxy.
Since actual response times are dominated by Anthropic's generation time and vary greatly,
here we eliminate LLM variability and look only at the gateway's own processing cost.
We replaced the upstream with a local mock LLM (a mock that returns SSE at a constant rate) and measured.
We sent the same request multiple times via (a) without gateway (direct connection to mock LLM) and (b) through the gateway,
and compared medians.
| Route | Median Time to First Token (TTFB) |
|---|---|
| Direct mock LLM (no gateway) | 0.54 ms |
| Via gateway | 1.07 ms |
| Difference (gateway processing cost) | +approx. 0.5 ms |
The difference was about +approx. 1.2 ms even when looking at the total time until stream completion (the value for WASM processing alone locally).
We also measured actual execution time with and without the gateway using actual Claude Code (claude -p, MAX plan, haiku).
The median was 9.82 seconds without gateway / 9.76 seconds via gateway (the gateway route was slightly faster),
with each run falling within the range of variability (7.2 to 14.9 seconds), and no clear difference could be confirmed in this measurement.
For a More Serious Gateway Setup
We've quickly verified the mechanism locally, so let's try a more serious configuration.
This time, we used AWS Fargate as follows and verified with actual Claude Code.
[MDM distribution] ANTHROPIC_BASE_URL = https://gw.internal
│
▼ ← Reachable only from internal network (internal ALB / source IP restriction / mTLS)
┌──────────────────┐ ┌───────────────────────────────┐ egress allowed ┌────────────────────┐
│ Claude Code(CLI) │ ───▶ │ Gateway (Fargate) │ only to this │ api.anthropic.com │
│ managed device │ │ wasmtime + composed │ ───────────────▶│ │
└──────────────────┘ │ meter→secret-scan→output-mask │ └────────────────────┘
└───────────────────────────────┘
╳ managed device → direct access to api.anthropic.com is blocked at the network level (egress control)
- Containerize the gateway for persistent deployment (Fargate / Cloud Run etc., environments where wasmtime can be bundled)
- Health checks don't go upstream (
/healthzis short-circuited locally) - Make the public boundary fail-closed (avoid unrestricted public access; require internal ALB, source IP restriction, mTLS, etc.)
- Restrict outbound HTTP destinations to
api.anthropic.comusing network-level egress control (you can also restrict permissions per stage with WASI, but it depends on the world design, so secure it at the network level first) - Distribute
ANTHROPIC_BASE_URLvia MDM + enforce with network egress control
When actually thinking about operations:
(1) Network design that truly blocks egress
(2) MDM distribution and override prevention
(3) Audit log design
(4) Team-based guardrail operations
These are all related, and the design will vary depending on the situation, organization size, and existing rules.
With careful consideration of these factors and appropriate design, "safely using Claude Code within your organization" should be achievable.
Summary
Claude Code has an official LLM gateway feature that allows you to replace the connection destination with ANTHROPIC_BASE_URL.
Using this, without modifying Claude Code itself, you can insert your own gateway to apply
content-based controls not available in built-in features — such as "blocking confidential information before sending" and "redacting responses" —
while preserving streaming.
In this article, we built that gateway using WASM component composition (WASI 0.3).
By creating guardrails as components, they can be swapped out or added just by changing the composition settings,
and can be implemented in any language as long as it has a toolchain supporting wasi:http.
That said, the gateway only needs to be an HTTP service compatible with the Anthropic Messages API,
and WASM composition is just one example.
Implementing it as a regular reverse proxy is perfectly fine, and it can run either in the cloud or on-premises.
For organizational use, we recommend first applying the built-in controls such as managed-settings under the Enterprise plan,
and then combining that with a gateway like the one in this article.
By fixing ANTHROPIC_BASE_URL as non-overridable via MDM and combining that with network egress control,
you can force model API communications to go through the gateway.
This allows you to cover processing based on communication content that built-in features alone cannot reach.
References
- Claude Code: LLM gateway configuration
- Claude Code: Hooks
- Claude Code: Set up Claude Code for your organization
- Claude Code: Environment Variable Reference
- Claude Code: Settings
- Claude Code: Server-managed settings
- Claude Code: Desktop application
- Claude Code: Authentication
- Claude Code: Legal and compliance
- Anthropic: Handle streaming refusals
- WASI 0.3.0 announcement (Bytecode Alliance)
- Component Model
- wac (WebAssembly Composition)
- Wasmtime

