perf: support extended thinking for Anthropic, DeepSeek reasoning mode, and Gemini text part thought signatures to improve multi-turn reasoning performance. #4240
Conversation
Hey - I've found 4 issues and left some high-level feedback:
- There is a stray `print(f"event: {event}")` in the Anthropic streaming loop that will spam stdout in production; consider replacing it with debug-level logging or removing it entirely.
- The Anthropic `anth_thinking_config.budget` value is passed through to `payloads["thinking"]` without validation, even though the config comment states it must be ≥1024; it would be safer to clamp or validate the value and log/ignore invalid values rather than forwarding them as-is.
- The TTS trigger logic has changed semantics (e.g. removal of the `>= 1.0` always-trigger shortcut, and different logging when TTS is skipped); please double-check this matches the expected behavior under existing configurations, so users don't see unexpected changes in when TTS fires.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- There is a stray `print(f"event: {event}")` in the Anthropic streaming loop which will spam stdout in production; consider replacing it with a debug logger or removing it entirely.
- The Anthropic `anth_thinking_config.budget` value is passed through directly to `payloads["thinking"]` without validation, even though the config hint states it must be ≥1024; it would be safer to clamp or validate this and log/ignore invalid values rather than forwarding them as-is.
- The TTS trigger logic has changed semantics (e.g. removal of the `>= 1.0` always-trigger shortcut and different logging on skipped TTS); please double‑check this matches the intended behavior across existing configurations so users don't see unexpected changes in when TTS fires.
## Individual Comments
### Comment 1
<location> `astrbot/core/pipeline/result_decorate/stage.py:267-269` </location>
<code_context>
+ and random.random() <= self.tts_trigger_probability
+ and tts_provider
+ )
+ if not tts_provider:
+ logger.warning(
+ f"会话 {event.unified_msg_origin} 未配置文本转语音模型。",
)
</code_context>
<issue_to_address>
**issue:** TTS provider warning will be emitted even when TTS is disabled or inapplicable.
With this structure, the `if not tts_provider:` branch runs even when `should_tts` is false (TTS disabled, non-LLM result, or probability check fails), so we log warnings in cases where TTS wouldn’t be used at all. This adds noise and obscures real misconfigurations. Please gate the warning behind the same condition as the TTS execution, e.g. `if should_tts and not tts_provider:`.
</issue_to_address>
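The suggested restructuring can be sketched as follows; the function name and parameters are hypothetical stand-ins for the stage's attributes:

```python
# Illustrative sketch: the missing-provider warning is gated behind the same
# condition that gates TTS execution, per the review suggestion.
import logging
import random

logger = logging.getLogger(__name__)

def should_run_tts(tts_enabled: bool, trigger_probability: float,
                   tts_provider, origin: str) -> bool:
    should_tts = tts_enabled and random.random() <= trigger_probability
    if should_tts and not tts_provider:
        # Warn only when TTS would actually have fired, so disabled sessions
        # and probability misses stay silent.
        logger.warning("Session %s has no text-to-speech provider configured.", origin)
        return False
    return bool(should_tts and tts_provider)
```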
### Comment 2
<location> `astrbot/core/provider/sources/anthropic_source.py:236` </location>
<code_context>
) as stream:
assert isinstance(stream, anthropic.AsyncMessageStream)
async for event in stream:
+ print(f"event: {event}")
if event.type == "message_start":
# the usage contains input token usage
</code_context>
<issue_to_address>
**issue:** Using `print` in async streaming loop can flood stdout and bypass the logger.
If you still need this for troubleshooting, switch to `logger.debug` (or similar) or guard it behind a debug flag so it’s disabled in normal deployments.
</issue_to_address>
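A minimal sketch of the debug-logging replacement: the event dump becomes a no-op unless DEBUG logging is enabled, instead of always writing to stdout. Here `fake_stream` is a stand-in for `anthropic.AsyncMessageStream`:

```python
import asyncio
import logging

logger = logging.getLogger("anthropic_source")

async def consume_events(stream) -> list:
    """Drain an async stream, logging each event at DEBUG level."""
    seen = []
    async for event in stream:
        # Silent unless DEBUG is configured, unlike the stray print().
        logger.debug("event: %s", event)
        seen.append(event)
    return seen

async def fake_stream():
    # Stand-in for the real Anthropic message stream.
    for e in ("message_start", "content_block_delta", "message_stop"):
        yield e

events = asyncio.run(consume_events(fake_stream()))
```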
### Comment 3
<location> `astrbot/core/provider/sources/gemini_source.py:338` </location>
<code_context>
+ reasoning_content = part.get("think") or None
+ thinking_signature = part.get("encrypted") or None
+ else:
+ text += part.get("text")
+
+ if thinking_signature and isinstance(thinking_signature, str):
</code_context>
<issue_to_address>
**issue (bug_risk):** Unsafe string concatenation may fail if `part.get("text")` is missing or non-string.
Since `part.get("text")` can be `None` or a non-string, `text += part.get("text")` may raise `TypeError`. Please guard this, e.g.:
```python
v = part.get("text") or ""
if isinstance(v, str):
text += v
```
This keeps the code resilient to unexpected or evolving upstream payloads.
</issue_to_address>
### Comment 4
<location> `astrbot/core/provider/sources/openai_source.py:376` </location>
<code_context>
+ new_content = [] # not including think part
+ for part in message["content"]:
+ if part.get("type") == "think":
+ reasoning_content += part.get("think")
+ else:
+ new_content.append(part)
</code_context>
<issue_to_address>
**issue (bug_risk):** `reasoning_content += part.get("think")` can raise if the key is missing or not a string.
`dict.get` returns `None` by default, so `reasoning_content += part.get("think")` can raise `TypeError` if the key is missing or the value isn’t a string. Consider guarding the value or providing a safe default, e.g.:
```python
val = part.get("think")
if isinstance(val, str):
reasoning_content += val
```
or `reasoning_content += str(part.get("think", ""))` if coercion is acceptable.
</issue_to_address>
One risk: `openai_source` currently defaults `reasoning_key` to `"reasoning_content"`. So far, DeepSeek and Moonshot AI are both known to use `reasoning_content`.

`astrbot\core\provider\sources\openai_source.py:113-118` should be removed; that block is outdated.

In my experience these can only be handled one at a time as they come up: the major providers have no unified standard, and request body formats change frequently.
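Given that providers diverge on the field name, one defensive pattern is a configurable key with a safe fallback, sketched here (the helper name is illustrative, not an AstrBot API):

```python
# Hedged sketch of a configurable reasoning-key lookup, following the
# observation that DeepSeek and Moonshot AI both use "reasoning_content".
def extract_reasoning(message: dict, reasoning_key: str = "reasoning_content") -> str:
    """Return the reasoning text for the given key, or "" if absent or non-string."""
    val = message.get(reasoning_key)
    return val if isinstance(val, str) else ""
```

New provider-specific keys can then be added via configuration as they are encountered, rather than by changing code.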
fixes: #4119 #4174
Modifications
Screenshots or Test Results
Checklist
I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in `requirements.txt` and `pyproject.toml`.
Summary by Sourcery
Improve multi-turn reasoning support and observability across providers and the agent pipeline.
New Features:
Add `ThinkPart` support in agent messages and persist LLM reasoning and signatures through the agent run context and history.
Bug Fixes:
Enhancements: