feat: extra text block #4189

kawayiYokami · 2025-12-24T19:57:44Z

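🎯 Motivation

Problems addressed:

- No multi-text-block support: AstrBot currently supports only a single user message and cannot carry multiple text blocks in one request.
- System information pollutes user input: system reminders such as the user identifier, time, and group information are inserted directly into the user's input, which goes against OpenAI best practices.
- Inelegant format: system reminders are scattered across multiple text blocks, which wastes tokens and is hard to parse.

Features added:

- Multi-text-block messages: multiple text blocks are supported through the extra_content_blocks attribute.
- Streamlined system reminder format: reminders are wrapped in a single <system_reminder> tag, following OpenAI best practices.
- Semantic tags: system reminders, image captions, and quoted messages are distinguished from one another.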

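📝 Modifications

Core file changes:

core/provider/entities.py
- Added the extra_content_blocks attribute (defaults to an empty list)
- Refactored assemble_context() to support multiple text blocks
- Graceful downgrade: a single text block keeps the backward-compatible format
core/provider/sources/openai_source.py
- Updated assemble_context() to support multiple text blocks and extra content
core/provider/sources/gemini_source.py
- Updated assemble_context() to support multiple text blocks and extra content
core/provider/sources/anthropic_source.py
- Updated assemble_context() to support multiple text blocks and extra content
packages/astrbot/process_llm_request.py
- Reworked how system reminders are collected: collect first, then wrap once
- Image captions now use the <image_caption> tag
- System reminders are unified into the <system_reminder> format

Implemented features:

✅ Multi-text-block support: one user message can contain multiple text blocks
✅ OpenAI best practice: the user's utterance comes first, system reminders after
✅ Token savings: redundant line breaks removed, system reminder format unified
✅ Backward compatible: existing code needs no changes
✅ Clear semantics: different kinds of content are explicitly distinguished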
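
As a rough illustration of the assembly behavior listed above, the sketch below mirrors it on a small stand-in class. `RequestSketch`, the `"[image]"` placeholder text, the block order (prompt, extra blocks, images), and the OpenAI-style `image_url` block shape are assumptions made for this example; this is not the actual `ProviderRequest` code from the PR.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class RequestSketch:
    """Hypothetical stand-in for ProviderRequest, for illustration only."""

    prompt: str = ""
    image_urls: list[str] = field(default_factory=list)
    extra_content_blocks: list[dict[str, Any]] = field(default_factory=list)

    def assemble_context(self) -> dict[str, Any]:
        # Legacy path: no extra blocks and no images, so keep plain string content.
        if not self.extra_content_blocks and not self.image_urls:
            return {"role": "user", "content": self.prompt}

        blocks: list[dict[str, Any]] = []
        if self.prompt:
            # The user's own utterance always comes first.
            blocks.append({"type": "text", "text": self.prompt})
        elif not self.extra_content_blocks:
            # Only images are present: add a text placeholder (wording assumed).
            blocks.append({"type": "text", "text": "[image]"})

        # Extra blocks (reminders, captions, quotes) follow the utterance.
        blocks.extend(self.extra_content_blocks)

        # Images go last in this sketch; the real ordering may differ.
        for url in self.image_urls:
            blocks.append({"type": "image_url", "image_url": {"url": url}})
        return {"role": "user", "content": blocks}


simple = RequestSketch(prompt="hello").assemble_context()
multi = RequestSketch(
    prompt="hello",
    extra_content_blocks=[
        {"type": "text", "text": "<system_reminder>User ID: 123</system_reminder>"}
    ],
).assemble_context()
```

Here `simple["content"]` stays a plain string, while `multi["content"]` becomes a list with the prompt block first.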

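🖼️ Test Results

Test scenarios:

- Backward compatibility: with only a prompt, the message correctly downgrades to the simple format
- Multiple text blocks: prompt + extra_content_blocks correctly produces the array format
- System reminder format: verifies the <system_reminder> wrapper
- Image caption: verifies the <image_caption> tag format

Sample test output: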

    {
      "role": "user",
      "content": [
        {"type": "text", "text": "你好世界"},
        {"type": "text", "text": "<system_reminder>User ID: 123, Nickname: TestUserGroup name: 测试群Current datetime: 2025-12-25 10:30</system_reminder>"},
        {"type": "text", "text": "<image_caption>一张美丽的风景图</image_caption>"}
      ]
    }

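✅ Checklist

😊 Feature discussion: multi-text-block support is an architectural improvement and needs no further discussion
👀 Well tested: passed full functional testing, covering backward compatibility and the new features
🤓 No new dependencies: no new dependency libraries were introduced
😮 Code safety: the change only affects message formatting and poses no security risk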

🚀 Usage Example

    # Use the new feature from a plugin
    req = event.request_llm(prompt="Hello")
    req.extra_content_blocks.extend([
        {"type": "text", "text": "<system_reminder>Please answer in a friendly tone</system_reminder>"}
    ])
    yield req

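Note that this PR does not optimize any features I have never used myself, such as the knowledge base.

✦ This PR lays the groundwork for multimodal message handling in AstrBot while remaining fully backward compatible! 🎯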

Summary by Sourcery

Add structured support for multiple content blocks in user messages while keeping existing single-text behavior backward compatible.

New Features:

Introduce an extra_content_blocks field on ProviderRequest to carry additional user message segments such as system reminders, image captions, and quoted messages.
Support multi-block user content in OpenAI, Gemini, and Anthropic providers, combining primary text, extra content blocks, and images into a unified message payload.

Enhancements:

Refine context assembly across providers to prioritize the main user utterance, add a text placeholder when only images are present, and fall back to the legacy single-text format when applicable.
Standardize system reminder formatting by collecting metadata (user, group, time) and wrapping it in a single <system_reminder> text block instead of injecting it directly into the prompt.
Change image caption and quoted message handling to emit semantically tagged text blocks (e.g., <image_caption>, <Quoted Message>) appended after the user message instead of prefixed to the prompt.
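
The "collect first, wrap once" approach described in these enhancements might look roughly like the sketch below. `build_extra_blocks` and its parameters are hypothetical names chosen for this example, the newline separator is an assumption, and the block order shown is illustrative rather than taken from `process_llm_request.py`.

```python
from typing import Optional


def build_extra_blocks(
    user_line: Optional[str] = None,
    group_line: Optional[str] = None,
    datetime_line: Optional[str] = None,
    image_caption: Optional[str] = None,
) -> list[dict[str, str]]:
    """Hypothetical helper: collect reminder parts, then wrap them once."""
    # Gather whichever metadata lines are available; the prompt is never touched.
    system_parts = [p for p in (user_line, group_line, datetime_line) if p]

    blocks: list[dict[str, str]] = []
    if system_parts:
        # One wrapper for all reminders (separator choice is an assumption).
        body = "\n".join(system_parts)
        blocks.append(
            {"type": "text", "text": f"<system_reminder>{body}</system_reminder>"}
        )
    if image_caption:
        # Captions get their own semantic tag instead of being prefixed to the prompt.
        blocks.append(
            {"type": "text", "text": f"<image_caption>{image_caption}</image_caption>"}
        )
    return blocks


blocks = build_extra_blocks(
    user_line="User ID: 123, Nickname: TestUser",
    datetime_line="Current datetime: 2025-12-25 10:30",
    image_caption="a scenic landscape",
)
```

The returned list is meant to be appended to `extra_content_blocks`, so the user's own text stays free of injected metadata.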

sourcery-ai

Hey - I've found 3 issues, and left some high level feedback:

The backward‑compatibility downgrade logic in assemble_context is inconsistent between ProviderRequest and the provider sources: OpenAI/Gemini/Anthropic downgrade to a plain text string whenever there is a single text block (even if it came from extra_content_blocks), whereas ProviderRequest.assemble_context only downgrades when there are no extra blocks or images; consider aligning these conditions so that multi‑block usage behaves predictably across providers.
When building system_content in process_llm_request, system_parts are concatenated with "".join(...), which will produce a single run‑on string (e.g., User ID...Nickname...Group name...Current datetime...) without separators; consider joining with newlines or a clear delimiter (e.g., "\n".join(system_parts)) to match the intended readable format.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- The backward‑compatibility downgrade logic in `assemble_context` is inconsistent between `ProviderRequest` and the provider sources: OpenAI/Gemini/Anthropic downgrade to a plain text string whenever there is a single text block (even if it came from `extra_content_blocks`), whereas `ProviderRequest.assemble_context` only downgrades when there are no extra blocks or images; consider aligning these conditions so that multi‑block usage behaves predictably across providers.
- When building `system_content` in `process_llm_request`, `system_parts` are concatenated with `"".join(...)`, which will produce a single run‑on string (e.g., `User ID...Nickname...Group name...Current datetime...`) without separators; consider joining with newlines or a clear delimiter (e.g., `"\n".join(system_parts)`) to match the intended readable format.

## Individual Comments

### Comment 1
<location> `packages/astrbot/process_llm_request.py:243-244` </location>
<code_context>
+            req.extra_content_blocks.append({"type": "text", "text": quoted_text})
+
+        # 统一包裹所有系统提醒
+        if system_parts:
+            system_content = (
+                "<system_reminder>" + "".join(system_parts) + "</system_reminder>"
+            )
</code_context>

<issue_to_address>
**issue (bug_risk):** System reminder pieces are concatenated without separators, which harms readability and may change semantics.

Previously these system details were separated by newlines, but `"".join(system_parts)` now produces `<system_reminder>User ID: ...Group name: ...Current datetime: ...</system_reminder>` with no delimiters. This reduces readability and may break existing prompts that depend on line breaks. Consider joining with newlines (e.g. `"\n".join(system_parts)` or prefixing each part with `"\n"`) to preserve the prior structure.
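
To make the difference concrete, here is a small sketch comparing the two separators. The sample parts are modeled on the PR's test output, and `wrap_reminder` is a hypothetical helper for this example, not a function from the codebase.

```python
# Sample reminder parts (illustrative values, modeled on the PR's test output).
system_parts = [
    "User ID: 123, Nickname: TestUser",
    "Group name: TestGroup",
    "Current datetime: 2025-12-25 10:30",
]


def wrap_reminder(parts: list[str], sep: str) -> str:
    """Join the parts with the given separator and wrap them in one tag."""
    return "<system_reminder>" + sep.join(parts) + "</system_reminder>"


# Empty separator: adjacent parts fuse together ("TestUserGroup name: ...").
run_on = wrap_reminder(system_parts, "")

# Newline separator: each part stays on its own line.
separated = wrap_reminder(system_parts, "\n")

print(run_on)
print(separated)
```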
</issue_to_address>

### Comment 2
<location> `astrbot/core/provider/sources/openai_source.py:665-667` </location>
<code_context>
-            return user_content
-        return {"role": "user", "content": text}
+
+        # 如果只有文本且没有额外内容块，返回简单格式以保持向后兼容
+        if len(content_blocks) == 1 and content_blocks[0]["type"] == "text":
+            return {"role": "user", "content": content_blocks[0]["text"]}
+
</code_context>

<issue_to_address>
**suggestion (bug_risk):** The downgrade-to-plain-text condition ignores `extra_content_blocks` and may misclassify messages.

The comment says “只有文本且没有额外内容块” ("text only and no extra content blocks"), but the code only checks `len(content_blocks) == 1` and `type == "text"`. As a result:
- A message with empty `text` and a single text `extra_content_block` (no images) will be downgraded to `{"content": <extra_text>}`, losing the original block structure.
- This differs from `ProviderRequest.assemble_context`, which only downgrades when `not self.extra_content_blocks and not self.image_urls`.
To keep block structure when content comes from `extra_content_blocks` or multimodal input, consider applying the same guard (also ensuring there are no images and no extra blocks beyond the main prompt text).

```suggestion
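        # If there is only the main text, with no extra content blocks or images,
        # return the simple format for backward compatibility.
        # Note: this matches ProviderRequest.assemble_context, which downgrades to
        # plain text only when there are no extra_content_blocks and no image_urls,
        # so content from extra_content_blocks or multimodal messages is not
        # mistaken for a simple text message.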
        if (
            not extra_content_blocks
            and not image_urls
            and len(content_blocks) == 1
            and content_blocks[0]["type"] == "text"
        ):
            return {"role": "user", "content": content_blocks[0]["text"]}
```
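
The guard can also be exercised on its own. The following is a minimal sketch, assuming a free function named `finalize_user_message` (hypothetical; in the PR the check sits inside each provider's `assemble_context`) that receives the already-built `content_blocks` along with the original inputs.

```python
from typing import Any


def finalize_user_message(
    content_blocks: list[dict[str, Any]],
    extra_content_blocks: list[dict[str, Any]],
    image_urls: list[str],
) -> dict[str, Any]:
    """Downgrade to plain string content only for a lone main-text block."""
    if (
        not extra_content_blocks
        and not image_urls
        and len(content_blocks) == 1
        and content_blocks[0]["type"] == "text"
    ):
        # Legacy shape: {"role": "user", "content": "<text>"}
        return {"role": "user", "content": content_blocks[0]["text"]}
    # Anything else keeps the structured block list.
    return {"role": "user", "content": content_blocks}


# Plain prompt only: downgraded to a string.
plain = finalize_user_message([{"type": "text", "text": "hello"}], [], [])

# A single block that came from extra_content_blocks: structure is preserved.
reminder = {"type": "text", "text": "<system_reminder>be brief</system_reminder>"}
extra_only = finalize_user_message([reminder], [reminder], [])
```

With the looser `len(content_blocks) == 1` check, `extra_only` would have been collapsed to a string as well, which is the misclassification described above.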
</issue_to_address>

### Comment 3
<location> `astrbot/core/provider/sources/gemini_source.py:842-843` </location>
<code_context>
-            return user_content
-        return {"role": "user", "content": text}
+
+        # 如果只有文本且没有额外内容块，返回简单格式以保持向后兼容
+        if len(content_blocks) == 1 and content_blocks[0]["type"] == "text":
+            return {"role": "user", "content": content_blocks[0]["text"]}
+
</code_context>

<issue_to_address>
**suggestion:** Same downgrade condition issue as OpenAI: extra-only content can be collapsed to plain text unexpectedly.

This uses the same overly-broad downgrade rule as `openai_source.assemble_context`: any single text `content_block` is converted to `{"content": <text>}`, even when it comes solely from `extra_content_blocks`. In cases where callers only pass an extra block (e.g., system reminder, quoted message) and no prompt/images, this collapses the structure and loses semantics. Please tighten the condition (as in `ProviderRequest.assemble_context`) so that only a plain user prompt with no extras/images is downgraded.

Suggested implementation:

```python
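        # If there is only text, with no images or extra content blocks,
        # return the simple format for backward compatibility.
        # Note: downgrade only when the caller supplied no extra_content_blocks and
        # no image_urls, so that an "extra blocks only" request (e.g. a system
        # reminder or quoted message) is not wrongly collapsed to plain text.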
        if (
            len(content_blocks) == 1
            and content_blocks[0]["type"] == "text"
            and not image_urls
            and not extra_content_blocks
        ):
            return {"role": "user", "content": content_blocks[0]["text"]}

```

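1. Confirm that callers of `assemble_context` still place these blocks in `extra_content_blocks`, rather than mixing them into the main text block, when they pass only extra content blocks and no main `text`; otherwise the logic that builds `content_blocks` needs a split similar to the one in `ProviderRequest.assemble_context` (for example, distinguishing blocks that come from the main user prompt from those that come from the extra blocks).
2. Compare this against the current downgrade condition in `ProviderRequest.assemble_context` to make sure the downgrade semantics match in both places (the simplified structure is returned only for a single main text with no images and no extra blocks).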
</issue_to_address>

kawayiYokami · 2025-12-25T11:04:48Z

This PR will be the cornerstone for optimizing other modules.

feat: multi-text-block support

c5a2827

auto-assign bot requested review from Fridemn and Raven95676 December 24, 2025 19:57

sourcery-ai bot reviewed Dec 24, 2025

FIX

9449ff6
