Merged
28 changes: 23 additions & 5 deletions src/node/services/aiService.ts
@@ -125,16 +125,34 @@ function wrapFetchWithAnthropicCacheControl(baseFetch: typeof fetch): typeof fetch {

// Inject cache_control on last message's last content part
// This caches the entire conversation
if (Array.isArray(json.messages) && json.messages.length >= 1) {
const lastMsg = json.messages[json.messages.length - 1] as Record<string, unknown>;
const content = lastMsg.content;
// Handle both formats:
// - Direct Anthropic provider: json.messages (Anthropic API format)
// - Gateway provider: json.prompt (AI SDK internal format)
const messages = Array.isArray(json.messages)
? json.messages
: Array.isArray(json.prompt)
? json.prompt
: null;

if (messages && messages.length >= 1) {
const lastMsg = messages[messages.length - 1] as Record<string, unknown>;

// For gateway: add providerOptions.anthropic.cacheControl at message level
// (gateway validates schema strictly, doesn't allow raw cache_control on messages)
if (Array.isArray(json.prompt)) {
const providerOpts = (lastMsg.providerOptions ?? {}) as Record<string, unknown>;
const anthropicOpts = (providerOpts.anthropic ?? {}) as Record<string, unknown>;
anthropicOpts.cacheControl ??= { type: "ephemeral" };
providerOpts.anthropic = anthropicOpts;
lastMsg.providerOptions = providerOpts;
}

// For direct Anthropic: add cache_control to last content part
const content = lastMsg.content;
if (Array.isArray(content) && content.length > 0) {
// Array content: add cache_control to last part
const lastPart = content[content.length - 1] as Record<string, unknown>;
lastPart.cache_control ??= { type: "ephemeral" };
Comment on lines +150 to 154
P1: Avoid adding cache_control to gateway prompts

Gateway Anthropic requests use the AI SDK json.prompt schema and the function notes the gateway rejects raw cache_control fields, yet this block still injects cache_control into the last prompt content part when messages resolves from json.prompt. That means gateway chat requests with array content will now carry Anthropic-specific fields the gateway schema doesn’t accept, leading to 400/validation errors instead of enabling caching for those calls.


@ethanndickson (Member, Author) Dec 1, 2025
Not true; this function is only called for Anthropic.

}
// Note: String content messages are rare after SDK conversion; skip for now
}

// Update body with modified JSON
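For reference, the injection logic in the diff can be condensed into a standalone sketch. This assumes the request body has already been parsed into a plain object; the helper name `injectCacheControl` is hypothetical (in the PR, this logic lives inline inside `wrapFetchWithAnthropicCacheControl`), and it mirrors the diff's behavior, including running the part-level branch regardless of which shape matched.

```typescript
type Json = Record<string, unknown>;

function injectCacheControl(json: Json): Json {
  // Accept both shapes: json.messages (direct Anthropic API format)
  // and json.prompt (gateway / AI SDK internal format).
  const messages = Array.isArray(json.messages)
    ? (json.messages as Json[])
    : Array.isArray(json.prompt)
      ? (json.prompt as Json[])
      : null;
  if (!messages || messages.length === 0) return json;

  const lastMsg = messages[messages.length - 1];

  // Gateway: message-level providerOptions.anthropic.cacheControl,
  // since the gateway schema rejects raw cache_control on messages.
  if (Array.isArray(json.prompt)) {
    const providerOpts = (lastMsg.providerOptions ?? {}) as Json;
    const anthropicOpts = (providerOpts.anthropic ?? {}) as Json;
    anthropicOpts.cacheControl ??= { type: "ephemeral" };
    providerOpts.anthropic = anthropicOpts;
    lastMsg.providerOptions = providerOpts;
  }

  // Direct Anthropic: part-level cache_control on the last content part.
  const content = lastMsg.content;
  if (Array.isArray(content) && content.length > 0) {
    const lastPart = content[content.length - 1] as Json;
    lastPart.cache_control ??= { type: "ephemeral" };
  }

  return json;
}
```

Because only the last message is marked, the cache breakpoint moves forward on each turn, which is what caches the entire conversation prefix.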