Intercepting Claude Code Communications with a Custom Gateway — Implementation Using WASI 0.3 and WAC

2026.06.23
Introduction

Recently, a growing number of enterprises have been adopting Claude Code.

With that adoption, security becomes an unavoidable concern. One example is the question of whether arbitrary keywords or sensitive information are being sent to the LLM.

Such issues matter most in industries like finance, healthcare, and the public sector, where it is critical to technically control what information may be sent before transmission and to leave an audit trail showing that policies were applied.
This article targets Claude Code. Note that Claude Desktop (Enterprise) uses managed settings, and Cowork uses dedicated configuration keys, which differ from the ANTHROPIC_BASE_URL procedure described below.
Claude Code has built-in governance features such as permissions, hooks, and managed-settings.

For blocking at the input (prompt) level, the UserPromptSubmit hook can inspect content before submission and reject it.

However, hooks fall short when you want to centrally handle the final HTTP request and the response stream, which include conversation history and tool results. For that, a gateway or proxy is needed: a single place to view all model API requests and responses, with the ability to intervene in streaming as well.
In Claude Code, the ANTHROPIC_BASE_URL environment variable lets you point the connection at your own endpoint. As shown below, this lets you insert arbitrary guardrails into the communication without modifying Claude Code itself.
Claude Code → Custom gateway (inspection, blocking, masking, etc.) → api.anthropic.com
In this article, we'll try building our own gateway/proxy by composing WASM components.
About Claude Code

Changing Claude Code's connection destination

Claude Code can control the destination of requests via environment variables. The main ones are as follows.
ANTHROPIC_BASE_URL: The destination URL. Point this at your own gateway
ANTHROPIC_API_KEY (sent as the x-api-key header) / CLAUDE_CODE_OAUTH_TOKEN (subscription OAuth token, sent as Authorization: Bearer): Authentication
ANTHROPIC_CUSTOM_HEADERS: Adds arbitrary custom headers (Name: Value format, multiple entries separated by newlines)
Note: If you point ANTHROPIC_BASE_URL at a host other than api.anthropic.com, MCP tool search is disabled by default. If the gateway can forward tool_reference, it can be re-enabled with ENABLE_TOOL_SEARCH=true.
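To make the Name: Value format of ANTHROPIC_CUSTOM_HEADERS concrete, here is a minimal Rust sketch of how a gateway-side helper could split such a value into header pairs. The function name parse_custom_headers is hypothetical, and this is only an illustration of the format, not how Claude Code itself parses the variable.

```rust
/// Parses a value in the "Name: Value" format, one header per line.
/// Lines without a colon, or with an empty name, are skipped.
fn parse_custom_headers(raw: &str) -> Vec<(String, String)> {
    raw.lines()
        .filter_map(|line| {
            // Split at the first colon only, so values may contain colons.
            let (name, value) = line.split_once(':')?;
            let name = name.trim();
            if name.is_empty() {
                return None;
            }
            Some((name.to_string(), value.trim().to_string()))
        })
        .collect()
}
```

With a made-up value such as "X-Team: platform" and "X-Cost-Center: 1234" on separate lines, this yields two pairs. A gateway could read such a header to choose per-team rules.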
Because the gateway is an officially supported feature, it works whether you use an API key or a Max/Team/Enterprise plan, as long as the gateway forwards the authentication headers sent by the client (x-api-key / Authorization: Bearer, etc.) as-is.
The gateway needs to relay /v1/messages and /v1/messages/count_tokens and pass through the anthropic-version / anthropic-beta headers as-is. (/v1/models is optional and only needed when supporting model discovery.)
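The relay requirement above can be sketched as a small path allowlist. This is a minimal Rust sketch under assumptions: the names REQUIRED_PATHS, OPTIONAL_PATHS, and should_relay are hypothetical, and rejecting every other path is our own design choice, not something the official documentation mandates.

```rust
/// Paths a gateway must relay, plus the optional model-discovery path.
const REQUIRED_PATHS: &[&str] = &["/v1/messages", "/v1/messages/count_tokens"];
const OPTIONAL_PATHS: &[&str] = &["/v1/models"];

/// Returns true when the request path (query string ignored) should be relayed upstream.
fn should_relay(path_with_query: &str, allow_model_discovery: bool) -> bool {
    // Claude Code may append a query such as "?beta=true", so compare the path part only.
    let path = path_with_query.split('?').next().unwrap_or("");
    REQUIRED_PATHS.contains(&path)
        || (allow_model_discovery && OPTIONAL_PATHS.contains(&path))
}
```

Keeping the list explicit makes it easy to see exactly which endpoints the gateway exposes.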
For more details, refer to the official documentation: Environment Variables Reference, LLM gateway configuration.
About built-in security features

For enterprise and team operations, roughly the following features are built in.
Enforcing settings via managed-settings.json (MDM distribution / OS-protected paths)
Permission rules / hooks (command execution permissions and hooks)
※ e.g., the UserPromptSubmit hook can inspect prompt input and reject submission
Model allowlist, telemetry output via OpenTelemetry
As mentioned earlier, hooks can be used for blocking at the input (prompt) level.

However, hooks cannot intervene in the HTTP request body, which includes conversation history and tool results, or in the SSE stream of the response itself.

To inspect, block, or modify these in a single place, even during streaming, you need to set up a custom gateway via ANTHROPIC_BASE_URL.
Enforcing the gateway via MDM

ANTHROPIC_BASE_URL can be pinned in the OS-level managed-settings.json distributed via MDM, so that users cannot change it.

※ This allows you to force Claude Code to go through your own gateway
The managed scope takes the highest priority among all settings, so adding env.ANTHROPIC_BASE_URL there means it cannot be overridden even with a shell export. However, this only constrains Claude Code itself.

To ensure that model API traffic is always routed through the gateway, combine it with network egress controls.
※ Using a custom ANTHROPIC_BASE_URL means server-managed settings distributed from the management console cannot be used. Distribute via the OS-level managed-settings.json instead.
Note: Target is Claude Code CLI
As mentioned above, this article targets the Claude Code CLI.

Claude Desktop / Cowork (GUI apps) have a different configuration entry point from the CLI, and running export ANTHROPIC_BASE_URL in the shell has no effect on them.

Desktop (Enterprise) is configured via managed settings, and Cowork (on 3rd-party inference) via dedicated settings keys.
If you want to route Desktop through your own gateway, specify it in the Desktop settings rather than in shell environment variables:
Enterprise uses the Gateway backend in the settings screen
Local Code sessions launched by Desktop use env in ~/.claude/settings.json
Create Gateway (WASM / WAC)

For the gateway, simply implementing a single reverse proxy compatible with the Anthropic Messages API is enough.

However, things get a bit complicated once you want to add more guardrails or change rules per group, so in this article we tried composing guardrails as WASM components.
What is WASM/WASI?

WASM (WebAssembly)

WASM is a portable binary instruction format.

It runs at near-native speed in a sandbox and is not tied to any specific language.

Originally designed for browsers, it now also runs on servers.
WASI (WebAssembly System Interface)

WASI is a standard system interface for running WASM outside the browser.

It provides features like files, networking, and HTTP in a capability-based design (where only explicitly required capabilities are granted).
Component Model / wac

The Component Model is a mechanism for modularizing WASM with typed interfaces (WIT) and composing multiple components into one.

wac is the composition tool; this article uses its wac plug subcommand.
How it works

It operates with the following mechanism.
Each guardrail is made into a single WASM component
e.g., meter (measurement) / secret-scan (secret detection) / output-mask (response redaction) …
These are composed into one binary with wac plug and launched with wasmtime serve
The client (Claude Code) points ANTHROPIC_BASE_URL here
Claude Code ─▶ [meter]─▶[secret-scan]─▶[output-mask]─▶[anthropic-out]─▶ api.anthropic.com
               Measure    Secret detect  Redact response  Fixed relay to destination
            ◀──────────── Response is streamed back (SSE) ────────────◀
The diagram above shows an extended configuration example. In the Try section below, we'll implement two components: log → anthropic-out.
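The chain in the diagram can be pictured as nested handlers, where each stage wraps the next one. The following is a minimal Rust sketch of that idea only. It is synchronous, buffers whole bodies, and uses hypothetical Req / Res / Handler types as stand-ins for the wasi:http types, so it is not how the actual components are written.

```rust
/// Simplified stand-ins for the wasi:http request and response types.
#[derive(Debug, Clone, PartialEq)]
struct Req {
    body: String,
}

#[derive(Debug, Clone, PartialEq)]
struct Res {
    status: u16,
    body: String,
}

/// A handler turns a request into a response.
type Handler = Box<dyn Fn(Req) -> Res>;

/// meter: records the request size, then delegates to the inner handler.
fn meter(inner: Handler) -> Handler {
    Box::new(move |req: Req| {
        eprintln!("[meter] request bytes: {}", req.body.len());
        inner(req)
    })
}

/// secret-scan: blocks before the inner handler is ever called.
fn secret_scan(inner: Handler) -> Handler {
    Box::new(move |req: Req| {
        if req.body.contains("sk-ant-") {
            return Res { status: 400, body: "blocked".to_string() };
        }
        inner(req)
    })
}

/// output-mask: rewrites the response on its way back.
fn output_mask(inner: Handler) -> Handler {
    Box::new(move |req: Req| {
        let mut res = inner(req);
        res.body = res.body.replace("hoge@example.com", "[redacted:email]");
        res
    })
}

/// Stand-in for anthropic-out: echoes locally instead of calling upstream.
fn fake_egress() -> Handler {
    Box::new(|req: Req| Res {
        status: 200,
        body: format!("echo: {} (from hoge@example.com)", req.body),
    })
}

/// Same order as the diagram: meter → secret-scan → output-mask → egress.
fn compose() -> Handler {
    meter(secret_scan(output_mask(fake_egress())))
}
```

Reordering or adding a stage only changes compose, which mirrors how the composition settings are the only thing that changes in the WASM version.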
Why WASM?

With WASI 0.3.0, WebAssembly components can handle asynchronous (streaming) processing: stream, for passing data a piece at a time, and future, for values that arrive later, are now available as standard types.

This makes it possible to inspect, block, and replace content while a response is being streamed (at the SSE chunk level).

For example, processing like "check for specific terms or formats, and return an error if found" can be implemented directly.
secret-scan: If API keys, private keys, or token-like formats are mixed into the prompt (conversation history), block it before sending upstream
output-mask: Redact email addresses and known sensitive information in the response stream (e.g., hoge@example.com → [redacted:email])
Note: The output-mask example shown here focuses on email addresses and known sensitive information and does not detect everything (images, base64, and binary are not subject to inspection). What is and isn't detected depends on the implementation.
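As a rough idea of what the secret-scan check could look like, here is a minimal Rust sketch using only the standard library. It looks for the sk-ant- prefix followed by at least 20 token-like characters, the same pattern the TypeScript sample in this article tests with a regex. The names PREFIX, MIN_TAIL, and contains_secret are hypothetical, and a real guardrail would need many more patterns.

```rust
const PREFIX: &str = "sk-ant-";
const MIN_TAIL: usize = 20;

/// Characters allowed in the token part: A-Z, a-z, 0-9, '_' and '-'.
fn is_token_char(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_' || b == b'-'
}

/// Returns true if `body` contains PREFIX followed by at least MIN_TAIL token characters.
fn contains_secret(body: &str) -> bool {
    let bytes = body.as_bytes();
    let mut from = 0;
    while let Some(pos) = body[from..].find(PREFIX) {
        let tail_start = from + pos + PREFIX.len();
        // Count how many token characters directly follow the prefix.
        let tail_len = bytes[tail_start..]
            .iter()
            .take_while(|b| is_token_char(**b))
            .count();
        if tail_len >= MIN_TAIL {
            return true;
        }
        // Too short to count as a key; keep searching after this prefix.
        from = tail_start;
    }
    false
}
```

A component would call a check like this on the request body and return an error response instead of forwarding when it returns true.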
The advantages of building with WASM composition are as follows.
Single process, static composition means no interruption to streaming
Any language works as long as there's a toolchain that can output wasi:http
Data non-retention can be a design principle (the gateway itself does not save/log request bodies or API keys. Content that isn't blocked is sent to Anthropic as-is)
Adding components, changing their order, or changing the configuration per team only requires changes to the composition settings, which makes it highly flexible.
Environment

Verification was performed in the following environment.

| Item | Version |
| --- | --- |
| OS | macOS 26.4 (Apple Silicon / arm64) |
| Claude Code | 2.1.185 |
| Rust | 1.95.0 (target wasm32-wasip2) |
| wasmtime | 45.0.1 (WASI 0.3 / -S p3 -W component-model-async compatible version) |
| wasm-tools | 1.251.0 |
| wac (wac-cli) | 0.10.0 |
| wit-bindgen | 0.58 |
| Node.js | 20.19.0 |
| mise | 2026.6.0 macos-arm64 |

Setup

Let's install the tools needed to run the demo.
Rust + WASM target (rustup)
% rustup toolchain install 1.95.0
% rustup target add wasm32-wasip2
wasmtime / wasm-tools
Use a version that supports WASI 0.3 via -S p3 -W component-model-async.
% mise use -g wasmtime@45.0.1 wasm-tools@1.251.0
wac (WebAssembly composition tool)
% cargo install --locked --version 0.10.0 wac-cli
wkg (WASI WIT package retrieval tool. Used only when fetching WIT)
% cargo install --locked --version 0.10.0 wkg
Claude Code
Prepare an API key issued from the Console (ANTHROPIC_API_KEY), or for subscriptions, an OAuth token (CLAUDE_CODE_OAUTH_TOKEN) issued with the claude setup-token command.
Try

Let's try with the following minimal configuration.
Start the gateway locally
Point ANTHROPIC_BASE_URL to it
Use Claude Code normally, confirm it works as usual via the gateway, and verify the log stage output
0. Minimal Proxy

We have been talking about WASM composition, but in essence the gateway is a reverse proxy.

Conceptually, it looks like this. (TypeScript sample)
import http from "node:http";

const UPSTREAM = "https://api.anthropic.com";

http.createServer(async (req, res) => {
  // Read the request body first
  const chunks: Buffer[] = [];
  for await (const c of req) chunks.push(c as Buffer);
  const body = Buffer.concat(chunks).toString("utf8");

  // ① Inspection: block if sensitive information (here, starting with "sk-ant-") is mixed in
  if (/sk-ant-[A-Za-z0-9_\-]{20,}/.test(body)) {
    res.writeHead(400, { "content-type": "application/json" });
    res.end(JSON.stringify({
      type: "error",
      error: { type: "invalid_request_error",
        message: "Secret detected in the request. Remove it and retry." },
    }));
    return;
  }

  /**
   * ② Authentication and API headers (x-api-key / authorization / anthropic-version / anthropic-beta)
   * are forwarded upstream. host, connection, content-length, etc. are never forwarded (fetch sets them itself).
   */
  const fwd = new Headers();
  for (const [k, v] of Object.entries(req.headers)) {
    const key = k.toLowerCase();
    // Skip host and hop-by-hop headers, plus accept-encoding (so that upstream does not send a compressed response)
    if (["host", "connection", "content-length", "transfer-encoding", "accept-encoding"].includes(key)) continue;
    if (typeof v === "string") fwd.set(k, v);
  }
  // Branch because fetch throws an exception if body is attached to GET/HEAD
  const method = req.method ?? "GET";
  const init: RequestInit = { method, headers: fwd };
  if (method !== "GET" && method !== "HEAD") {
    init.body = body;
    // @ts-ignore duplex is not in every RequestInit typing; Node's fetch needs it only for streaming request bodies
    init.duplex = "half";
  }
  const upstream = await fetch(`${UPSTREAM}${req.url}`, init);
  res.writeHead(upstream.status, Object.fromEntries(upstream.headers));
  // Stream SSE chunks as-is (you can also peek at the response here for redaction)
  for await (const chunk of upstream.body as any) res.write(chunk);
  res.end();
  // ★ Bind to loopback
}).listen(8080, "127.0.0.1", () => console.log("gateway on 127.0.0.1:8080"));
This is already a working gateway that returns 400 without sending anything upstream when a secret is included.

(This minimal version only applies a regex to the body, so it is easy to bypass.)

By rewriting response chunks, redaction during streaming is also possible.
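One pitfall of rewriting chunks is that a sensitive string can be split across two chunks, so a per-chunk replace would miss it. The following minimal Rust sketch shows one way to handle that by holding back a short tail of each chunk. It only masks one known literal (not arbitrary email addresses), and StreamMasker with its methods is a hypothetical helper, not part of the demo code in this article.

```rust
/// Replaces every occurrence of `needle` in a byte stream, even when an
/// occurrence is split across two chunks.
struct StreamMasker {
    needle: Vec<u8>,
    replacement: Vec<u8>,
    /// Unflushed tail that could still be the start of a match.
    carry: Vec<u8>,
}

impl StreamMasker {
    fn new(needle: &str, replacement: &str) -> Self {
        assert!(!needle.is_empty(), "needle must not be empty");
        Self {
            needle: needle.as_bytes().to_vec(),
            replacement: replacement.as_bytes().to_vec(),
            carry: Vec::new(),
        }
    }

    /// Feeds one chunk and returns the bytes that are safe to forward now.
    fn push(&mut self, chunk: &[u8]) -> Vec<u8> {
        self.carry.extend_from_slice(chunk);
        let buf = std::mem::take(&mut self.carry);
        let n = self.needle.len();
        let mut out = Vec::with_capacity(buf.len());
        let mut i = 0;
        while i + n <= buf.len() {
            if &buf[i..i + n] == self.needle.as_slice() {
                out.extend_from_slice(&self.replacement);
                i += n;
            } else {
                out.push(buf[i]);
                i += 1;
            }
        }
        // Fewer than n bytes remain: hold them back in case the next chunk completes a match.
        self.carry = buf[i..].to_vec();
        out
    }

    /// Flushes whatever is still held back at the end of the stream.
    fn finish(&mut self) -> Vec<u8> {
        std::mem::take(&mut self.carry)
    }
}
```

The trade-off is that up to needle length minus one bytes are always delayed until the next chunk or the end of the stream, which keeps the logic simple at the cost of a small latency.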
1. Create a sample project

The TypeScript above was a conceptual explanation.

Next, we'll actually create two WASM components, log (an example of inserting your own logic) and anthropic-out (the egress component that relays to api.anthropic.com), compose them with wac, launch the result with wasmtime, and verify operation with actual Claude Code.
The directory structure is as follows. (Place the WASI 0.3 WIT packages in wit/)

※ The demo implementation is in Rust
gateway-demo/
├── rust-toolchain.toml
├── wit/
│   └── wasi_http@0.3.0-rc-2026-03-15.wasm   # WASI 0.3 WIT
├── log/            # Log component (middleware)
│   ├── Cargo.toml
│   └── src/lib.rs
└── anthropic-out/  # Egress component: relay to api.anthropic.com (service)
    ├── Cargo.toml
    └── src/lib.rs
# rust-toolchain.toml
[toolchain]
channel = "1.95.0"
targets = ["wasm32-wasip2"]
Note: WASI 0.3.0 itself was officially released on 2026-06-11, but the WASI 0.3 support in the wasmtime 45 used for verification is at a preview stage, and the WIT uses an RC version (wasi:http@0.3.0-rc-2026-03-15).
This WIT package can be retrieved with wkg (WebAssembly package tool) by running wkg get wasi:http@0.3.0-rc-2026-03-15 --format wasm -o wit/.
2. Log

A middleware that passes requests through to the inner (upstream) side while printing the method and path to stderr.

Replace this part with inspection, aggregation, blocking, etc. to create your own guardrails.
# log/Cargo.toml
[package]
name = "log"
version = "0.1.0"
edition = "2021"

[lib]
crate-type = ["cdylib"]

[dependencies]
wit-bindgen = { version = "0.58", features = ["async-spawn"] }

[workspace]
// log/src/lib.rs
wit_bindgen::generate!({
    path: "../wit/wasi_http@0.3.0-rc-2026-03-15.wasm",
    world: "wasi:http/middleware@0.3.0-rc-2026-03-15",
    generate_all,
    async: true,
});

use exports::wasi::http::handler::Guest;
use wasi::http::handler as inner;
use wasi::http::types::{ErrorCode, Request, Response};

struct Log;

impl Guest for Log {
    async fn handle(request: Request) -> Result<Response, ErrorCode> {
        let method = request.get_method().await;
        let path = request.get_path_with_query().await;
        eprintln!("[log] {method:?} {path:?}"); // Insert your own logic here
        inner::handle(request).await // Pass the response downstream as-is
    }
}

export!(Log);
3. Egress: Relay to api.anthropic.com

A service component that sends received requests to api.anthropic.com and returns the response (including SSE) as-is.

If the host header (whose value points to the gateway, as set by the client) is forwarded unchanged, it mismatches the Host of the destination api.anthropic.com and results in a 403. We therefore remove only the headers that must not be forwarded and change the destination.

(x-api-key / Authorization / anthropic-* are passed through as-is)
# anthropic-out/Cargo.toml
[package]
name = "anthropic-out"
version = "0.1.0"
edition = "2021"

[lib]
crate-type = ["cdylib"]

[dependencies]
wit-bindgen = { version = "0.58", features = ["async-spawn"] }

[workspace]
// anthropic-out/src/lib.rs
wit_bindgen::generate!({
    path: "../wit/wasi_http@0.3.0-rc-2026-03-15.wasm",
    world: "wasi:http/service@0.3.0-rc-2026-03-15",
    generate_all,
    async: true,
});

use exports::wasi::http::handler::Guest;
use wasi::http::client;
use wasi::http::types::{ErrorCode, Fields, Request, Response, Scheme};

struct AnthropicOut;

// Headers that must not be forwarded upstream.
// Leaving host will cause a mismatch with the destination's Host/authority, resulting in 403
const STRIP: &[&str] = &[
    "host", ":authority", "connection", "keep-alive",
    "transfer-encoding", "te", "trailer", "upgrade",
    "proxy-connection", "accept-encoding",
];

impl Guest for AnthropicOut {
    async fn handle(request: Request) -> Result<Response, ErrorCode> {
        let method = request.get_method().await;
        let path = request.get_path_with_query().await;

        // Keep x-api-key / authorization / anthropic-* etc. as-is, remove only host etc.
        let kept: Vec<(String, Vec<u8>)> = request
            .get_headers()
            .await
            .copy_all()
            .await
            .into_iter()
            .filter(|(n, _)| !STRIP.contains(&n.to_ascii_lowercase().as_str()))
            .collect();
        let headers = Fields::from_list(kept)
            .await
            .map_err(|e| ErrorCode::InternalError(Some(format!("{e:?}"))))?;

        let (res_tx, res_rx) = wit_future::new(|| Ok(()));
        let (body, trailers_rx) = Request::consume_body(request, res_rx).await;
        let (req, _t) = Request::new(headers, Some(body), trailers_rx, None).await;
        let _res_tx = res_tx;

        let _ = req.set_method(method).await;
        let _ = req.set_path_with_query(path).await;
        let _ = req.set_scheme(Some(Scheme::Https)).await;
        let _ = req.set_authority(Some("api.anthropic.com".to_string())).await;

        client::send(req).await // Return the api.anthropic.com response as-is
    }
}

export!(AnthropicOut);
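The header filtering step is the part of the egress component that is easiest to get wrong, so it can help to check it in isolation. Below is a minimal Rust sketch that lifts the same STRIP list into a plain function that needs no WASI bindings. The function name filter_headers is hypothetical and is not part of the component above.

```rust
/// Same list as in anthropic-out: headers that must not be forwarded upstream.
const STRIP: &[&str] = &[
    "host", ":authority", "connection", "keep-alive",
    "transfer-encoding", "te", "trailer", "upgrade",
    "proxy-connection", "accept-encoding",
];

/// Keeps every header whose lowercased name is not in STRIP.
/// Names are compared case-insensitively; values are passed through untouched.
fn filter_headers(headers: Vec<(String, Vec<u8>)>) -> Vec<(String, Vec<u8>)> {
    headers
        .into_iter()
        .filter(|(name, _)| !STRIP.contains(&name.to_ascii_lowercase().as_str()))
        .collect()
}
```

Given Host, x-api-key, Accept-Encoding, and anthropic-version as input, only x-api-key and anthropic-version remain, which matches the intent of keeping authentication and anthropic-* headers.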
4. Build → Compose → Launch

Run the following from within gateway-demo/.
# 1) WASM build
cargo build --release --target wasm32-wasip2 --manifest-path log/Cargo.toml
cargo build --release --target wasm32-wasip2 --manifest-path anthropic-out/Cargo.toml

# 2) Compose into 1 binary with wac plug (log=outer / anthropic-out=inner=egress)
wac plug log/target/wasm32-wasip2/release/log.wasm \
  --plug anthropic-out/target/wasm32-wasip2/release/anthropic_out.wasm \
  -o gateway.wasm

# 3) Launch with wasmtime
wasmtime serve -S p3,cli -W component-model-async --addr 127.0.0.1:8080 gateway.wasm
5. Route through actual Claude Code

In a separate terminal, start claude with ANTHROPIC_BASE_URL pointing to this gateway.

For Max and other subscriptions, issue an OAuth token using the claude setup-token command.

※ The following is an example for the Max plan
# 1) Issue an OAuth token
% claude setup-token
#   → Use the displayed sk-ant-oat01-...

# 2) Set gateway info and token info (unset other authentication first)
% unset ANTHROPIC_API_KEY ANTHROPIC_AUTH_TOKEN
% export ANTHROPIC_BASE_URL=http://localhost:8080
% export CLAUDE_CODE_OAUTH_TOKEN=sk-ant-oat01-...

# 3) Launch claude
% claude
When executed, Claude responds as usual (the gateway is transparent, so the user experience doesn't change).
❯ hello claude code.

⏺ Hello! Is there anything I can help you with?
At this point, the terminal where the gateway was launched shows output from the log stage on standard error, confirming that requests are passing through the gateway. (Traffic to /v1/messages/count_tokens and /v1/messages will appear.)
[log] Method::Post Some("/v1/messages/count_tokens")
[log] Method::Post Some("/v1/messages?beta=true")
The gateway forwards authentication headers (Authorization: Bearer …) to api.anthropic.com as-is.

※ For developer API keys, export ANTHROPIC_API_KEY=sk-ant-... also works

※ If both ANTHROPIC_API_KEY and CLAUDE_CODE_OAUTH_TOKEN are set, the API key takes priority. Note that ANTHROPIC_AUTH_TOKEN takes priority over the API key

※ Priority order: ANTHROPIC_AUTH_TOKEN > ANTHROPIC_API_KEY > CLAUDE_CODE_OAUTH_TOKEN
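The priority order above can be written down as a small selection function. This is a minimal Rust sketch that only models the rule as stated in this article; Credential and resolve_credential are hypothetical names, and mapping ANTHROPIC_AUTH_TOKEN to an Authorization: Bearer header is our understanding of how that variable is sent, not something this article verified.

```rust
#[derive(Debug, PartialEq)]
enum Credential {
    /// Sent as `Authorization: Bearer <token>`.
    Bearer(String),
    /// Sent as `x-api-key: <key>`.
    ApiKey(String),
}

/// Picks one credential using the order
/// ANTHROPIC_AUTH_TOKEN > ANTHROPIC_API_KEY > CLAUDE_CODE_OAUTH_TOKEN.
fn resolve_credential(
    auth_token: Option<&str>,
    api_key: Option<&str>,
    oauth_token: Option<&str>,
) -> Option<Credential> {
    if let Some(token) = auth_token {
        return Some(Credential::Bearer(token.to_string()));
    }
    if let Some(key) = api_key {
        return Some(Credential::ApiKey(key.to_string()));
    }
    oauth_token.map(|token| Credential::Bearer(token.to_string()))
}
```

This is why the steps above unset ANTHROPIC_API_KEY and ANTHROPIC_AUTH_TOKEN first: if either is still set, the OAuth token is never used.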
Note:

Please limit this verification to your own or your organization's Claude Code traffic.

What Anthropic prohibits is:
Third-party developers providing Claude.ai login to users
Services that proxy Free, Pro, or Max credentials

For internal organizational use as well, please check your contract terms and organizational policies.
6. Measuring gateway overhead

Let's see how much overhead adding a proxy introduces.

Actual response times are dominated by Anthropic's generation time and vary greatly, so here we eliminate LLM variability and look only at the processing cost of the gateway itself.
We replaced the upstream with a local mock LLM (a mock that returns SSE at a fixed pace) and measured.

We sent the same request multiple times both (a) without a gateway (directly connected to the mock LLM) and (b) via the gateway, and compared the medians.


| Route | Time to First Byte (TTFB), median |
| --- | --- |
| Mock LLM direct connection (no gateway) | 0.54 ms |
| Via gateway | 1.07 ms |
| Difference (gateway processing cost) | approx. +0.5 ms |

Even for the total stream completion time, the difference was only about +1.2 ms (a local measurement of the WASM processing alone).
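For reference, the comparison of medians can be sketched as follows. This is a minimal Rust sketch of the arithmetic only; the function names median and overhead_ms are hypothetical, and the actual measurement script is not shown in this article.

```rust
/// Median of a sample; averages the two middle values for even sizes.
/// Returns None for an empty sample.
fn median(samples: &[f64]) -> Option<f64> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    Some(if sorted.len() % 2 == 1 {
        sorted[mid]
    } else {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    })
}

/// Gateway overhead as the difference between the two medians, in milliseconds.
fn overhead_ms(direct: &[f64], via_gateway: &[f64]) -> Option<f64> {
    Some(median(via_gateway)? - median(direct)?)
}
```

Plugging in the reported TTFB medians gives 1.07 − 0.54 = 0.53 ms, which is the "approx. +0.5 ms" in the table. The median is used rather than the mean so that a few slow outliers do not dominate the result.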
We also measured execution times with and without the gateway for actual Claude Code (claude -p, Max plan, haiku).

The median was 9.82 seconds without the gateway and 9.76 seconds via the gateway (the gateway was nominally faster). Each run fell within the range of variation (7.2–14.9 seconds), so this measurement showed no clear difference.
If you want to set up a more production-grade gateway

Having quickly confirmed the mechanism locally, let's try a more production-grade configuration.

This time, we used AWS Fargate as shown below and verified it with actual Claude Code.
 [MDM distribution] ANTHROPIC_BASE_URL = https://gw.internal
     │
     ▼   ← Reachable only from internal network (internal ALB / source IP restriction / mTLS)
 ┌──────────────────┐      ┌───────────────────────────────┐  Egress allowed   ┌────────────────────┐
 │ Claude Code(CLI) │ ───▶ │ Gateway (Fargate)              │  to this dest only│  api.anthropic.com │
 │  Managed device  │      │ wasmtime + composed           │ ───────────────▶│                    │
 └──────────────────┘      │ meter→secret-scan→output-mask │                 └────────────────────┘
                           └───────────────────────────────┘
   ╳ Direct access from managed devices → api.anthropic.com is blocked at the network level (egress control)
Containerize the gateway and run it persistently (Fargate / Cloud Run etc., environments where wasmtime can be bundled)
Health checks do not go to upstream (short-circuit /healthz locally)
Make the public boundary fail-closed (avoid unrestricted public access; require internal ALB or source IP restriction, mTLS, etc.)
Limit outbound HTTP destinations to api.anthropic.com via network-level egress control (permissions can also be restricted per stage in WASI, but since it depends on world design, ensure it at the network level first)
Distribute ANTHROPIC_BASE_URL via MDM + enforce with network egress control
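The health check item in the list above can be sketched as a routing decision made before any upstream call. This is a minimal Rust sketch under assumptions: Route, route, and local_response are hypothetical names, and the "ok" body is an arbitrary choice.

```rust
#[derive(Debug, PartialEq)]
enum Route {
    /// Answered by the gateway itself; upstream is never contacted.
    LocalHealth,
    /// Forwarded to the upstream API.
    Upstream,
}

/// Decides where a request goes, ignoring any query string.
fn route(path_with_query: &str) -> Route {
    let path = path_with_query.split('?').next().unwrap_or("");
    if path == "/healthz" {
        Route::LocalHealth
    } else {
        Route::Upstream
    }
}

/// Returns the local (status, body) for health checks,
/// or None when the request must be relayed.
fn local_response(path_with_query: &str) -> Option<(u16, &'static str)> {
    match route(path_with_query) {
        Route::LocalHealth => Some((200, "ok")),
        Route::Upstream => None,
    }
}
```

Answering /healthz locally keeps load balancer probes from generating traffic to api.anthropic.com and from failing when upstream is slow.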
When actually thinking about operations, the following are all relevant, and the design will vary depending on the situation, organizational size, and existing rules.
(1) Network design that genuinely blocks egress
(2) MDM distribution and override prevention
(3) Log design for auditing
(4) Operating team-specific guardrails

By considering these factors and designing appropriately, it should be possible to use Claude Code safely in an organization.
Summary

Claude Code has an official LLM gateway feature that lets you swap out the connection destination with ANTHROPIC_BASE_URL.

Using this, you can insert content-based controls that aren't built in, such as stopping secrets before they're sent and redacting responses, via your own gateway, without modifying Claude Code itself and while maintaining streaming.
In this article, we built that gateway using WASM component composition (WASI 0.3).

By creating guardrails as components, they can be swapped out or added with just composition configuration, and they can be implemented in any language that has a toolchain supporting wasi:http.
That said, the gateway only needs to be an HTTP service compatible with the Anthropic Messages API, and WASM composition is just one approach.

There's no issue with implementing it as a regular reverse proxy, and it can run in the cloud or on-premises.
For organizational use, the recommended approach is to first leverage built-in controls such as managed-settings in the Enterprise plan, and then combine them with a gateway like the one in this article.

By pinning ANTHROPIC_BASE_URL as non-overridable via MDM and combining it with network egress controls, you can enforce that all model API traffic goes through the gateway.

This lets you cover content-based processing that built-in features alone can't reach.
References

Claude Code: LLM gateway configuration
Claude Code: Hooks
Claude Code: Set up Claude Code for your organization
Claude Code: Environment variables reference
Claude Code: Settings
Claude Code: Server-managed settings
Claude Code: Desktop application
Claude Code: Authentication
Claude Code: Legal and compliance
Anthropic: Handle streaming refusals
WASI 0.3.0 announcement (Bytecode Alliance)
Component Model
wac (WebAssembly Composition)
Wasmtime