🤖 fix: enable prompt caching for mux-gateway Anthropic models #816
Conversation
The gateway provider sends requests in the SDK internal format (`json.prompt`) rather than the Anthropic API format (`json.messages`). Additionally, the gateway server only translates `providerOptions.anthropic.cacheControl` at the message level, not at the content part level.

This fix:

1. Detects gateway format by checking for a `json.prompt` array
2. Adds `providerOptions.anthropic.cacheControl` at the message level for gateway requests
3. Keeps `cache_control` injection at the content part level for direct Anthropic

Without this fix, only system messages were cached (~5.4k tokens) because they already had message-level `providerOptions`. Conversation history was not cached because `applyCacheControl()` sets `providerOptions` at the content part level, which the gateway server ignores.
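As a rough sketch of the detection and injection described above (the standalone helper, its name, and the `Message` type are assumptions for illustration; only the field names `json.prompt`, `json.messages`, `providerOptions.anthropic.cacheControl`, and `cache_control` come from the PR):

```typescript
// Hypothetical sketch, not the actual mux source: gateway-aware
// cache-control injection on a parsed request body.
type Message = {
  role: string;
  content: unknown;
  providerOptions?: Record<string, unknown>;
};

function applyCacheControlToBody(json: Record<string, unknown>): void {
  // Gateway requests arrive in AI SDK internal format: a `prompt` array.
  const isGateway = Array.isArray(json.prompt);
  const messages = (isGateway ? json.prompt : json.messages) as
    | Message[]
    | undefined;
  if (!messages || messages.length === 0) return;

  const lastMsg = messages[messages.length - 1];
  if (isGateway) {
    // The gateway server only translates message-level providerOptions,
    // so mark the whole message.
    lastMsg.providerOptions = {
      ...(lastMsg.providerOptions ?? {}),
      anthropic: { cacheControl: { type: "ephemeral" } },
    };
  } else {
    // Direct Anthropic API: mark the last content part instead.
    const content = lastMsg.content;
    if (Array.isArray(content) && content.length > 0) {
      const lastPart = content[content.length - 1] as Record<string, unknown>;
      lastPart.cache_control ??= { type: "ephemeral" };
    }
  }
}
```

In the real wrapper this logic would run inside a `fetch` override that parses and re-serializes the request body before forwarding it.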
💡 Codex Review
Here are some automated review suggestions for this pull request.
```ts
// For direct Anthropic: add cache_control to last content part
const content = lastMsg.content;
if (Array.isArray(content) && content.length > 0) {
  // Array content: add cache_control to last part
  const lastPart = content[content.length - 1] as Record<string, unknown>;
  lastPart.cache_control ??= { type: "ephemeral" };
}
```
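For contrast, the two message shapes at stake differ in where the cache marker lives. These literals are illustrative only (values invented; field names taken from the PR description):

```typescript
// Gateway (json.prompt): cache marker at the message level via providerOptions.
const gatewayMessage = {
  role: "user",
  content: "Summarize the conversation so far.",
  providerOptions: { anthropic: { cacheControl: { type: "ephemeral" } } },
};

// Direct Anthropic (json.messages): cache marker on the last content part.
const directMessage = {
  role: "user",
  content: [
    {
      type: "text",
      text: "Summarize the conversation so far.",
      cache_control: { type: "ephemeral" },
    },
  ],
};
```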
Avoid adding cache_control to gateway prompts
Gateway Anthropic requests use the AI SDK `json.prompt` schema, and the function notes that the gateway rejects raw `cache_control` fields. Yet this block still injects `cache_control` into the last prompt content part when `messages` resolves from `json.prompt`. As a result, gateway chat requests with array content would carry Anthropic-specific fields the gateway schema doesn't accept, leading to 400/validation errors instead of enabling caching for those calls.
Not true: this function is only called for Anthropic.
@codex review
Codex Review: Didn't find any major issues. Already looking forward to the next diff.
Problem
Prompt caching wasn't working for Anthropic models accessed via mux-gateway. Only the system message (~5.4k tokens) was being cached, but conversation history was not.
Root Cause
The gateway provider uses a different request format and translation mechanism:

| Provider | Request format | `providerOptions` translation |
| --- | --- | --- |
| Direct Anthropic | `json.messages` | at all levels |
| Gateway | `json.prompt` | at message level only |

Our `applyCacheControl()` was setting `providerOptions.anthropic.cacheControl` at the content part level, which the gateway server ignores. Only `createCachedSystemMessage()` was setting it at the message level (hence system prompt caching worked).

Fix
Update `wrapFetchWithAnthropicCacheControl` to:

1. Detect gateway format by checking for a `json.prompt` array
2. Add `providerOptions.anthropic.cacheControl` at the message level for gateway requests
3. Keep `cache_control` injection at the content part level for direct Anthropic

Testing
Verified caching works by checking that cache read tokens increase on subsequent messages.
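A minimal sketch of that check, assuming the response exposes Anthropic's usage counters (`cache_read_input_tokens` and `cache_creation_input_tokens` are real Messages API usage fields; the helper name and threshold are invented):

```typescript
// Shape of Anthropic's Messages API usage block; the cache fields are
// optional and only appear when prompt caching is active.
interface AnthropicUsage {
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens?: number;
  cache_read_input_tokens?: number;
}

// Hypothetical helper: caching is working if a follow-up request reads a
// non-trivial number of tokens back from the cache.
function cachingVerified(followUp: AnthropicUsage, minTokens = 1024): boolean {
  return (followUp.cache_read_input_tokens ?? 0) >= minTokens;
}
```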
Generated with mux