@kawayiYokami kawayiYokami commented Dec 24, 2025

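🎯 Motivation

Problems addressed:

  • No multi-block text support: AstrBot currently supports only a single user message, so one request cannot carry multiple text blocks
  • System information pollutes user input: system reminders such as user ID, time, and group information are inserted directly into the user input, which goes against OpenAI best practices
  • Inelegant format: system reminders are scattered across multiple text blocks, which wastes tokens and makes them hard to parse

Features added:

  • Multi-block message support: multiple text blocks are supported through the extra_content_blocks attribute
  • Improved system reminder format: reminders are uniformly wrapped in <system_reminder>, following OpenAI best practices
  • Semantic tags: system reminders, image captions, and quoted messages are distinguished from one another
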
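📝 Modifications

Core file changes:

  1. core/provider/entities.py

    • Added the extra_content_blocks attribute (defaults to an empty list)
    • Refactored assemble_context() to support multiple text blocks
    • Graceful downgrade: a single text block keeps the legacy format for backward compatibility
  2. core/provider/sources/openai_source.py

    • Updated assemble_context() to support multiple text blocks and extra content
  3. core/provider/sources/gemini_source.py

    • Updated assemble_context() to support multiple text blocks and extra content
  4. core/provider/sources/anthropic_source.py

    • Updated assemble_context() to support multiple text blocks and extra content
  5. packages/astrbot/process_llm_request.py

    • Reworked system reminder collection: collect first, then wrap everything together
    • Image captions now use the <image_caption> tag
    • System reminders are unified into the <system_reminder> format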
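
The "collect first, then wrap" flow listed for process_llm_request.py can be sketched roughly as below. This is a minimal illustration, not the PR's actual code: the helper name `build_reminder_blocks`, its parameters, and the newline separator are assumptions made for the example. Only the `<system_reminder>` and `<image_caption>` tags and the `{"type": "text", ...}` block shape are taken from the description above.

```python
from typing import Optional


def build_reminder_blocks(
    user_id: Optional[str] = None,
    nickname: Optional[str] = None,
    group_name: Optional[str] = None,
    now: Optional[str] = None,
    image_caption: Optional[str] = None,
) -> list[dict]:
    """Hypothetical helper: collect metadata first, then wrap it once."""
    system_parts: list[str] = []
    if user_id is not None:
        line = f"User ID: {user_id}"
        if nickname is not None:
            line += f", Nickname: {nickname}"
        system_parts.append(line)
    if group_name is not None:
        system_parts.append(f"Group name: {group_name}")
    if now is not None:
        system_parts.append(f"Current datetime: {now}")

    blocks: list[dict] = []
    if system_parts:
        # Assumption: parts are newline-separated; the separator is a choice.
        text = "<system_reminder>" + "\n".join(system_parts) + "</system_reminder>"
        blocks.append({"type": "text", "text": text})
    if image_caption:
        # Captions get their own semantically tagged block.
        caption = f"<image_caption>{image_caption}</image_caption>"
        blocks.append({"type": "text", "text": caption})
    return blocks
```

The returned list could then be passed to `req.extra_content_blocks.extend(...)`, so all metadata ends up in a single reminder block instead of being scattered through the prompt.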

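Implemented features:

  • ✅ Multi-block support: one user message can contain multiple text blocks
  • ✅ OpenAI best practices: the user's utterance comes first, system reminders follow
  • ✅ Token optimization: redundant line breaks removed and the system reminder format unified
  • ✅ Backward compatible: existing code needs no changes
  • ✅ Clear semantics: different content types are explicitly distinguished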
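
To make the multi-block and backward-compatibility behavior concrete, here is a small text-only sketch. `ProviderRequestSketch` is a hypothetical stand-in, not the real `ProviderRequest` class; only the `extra_content_blocks` attribute, the `assemble_context()` method name, and the "user text first, extras after" ordering come from the description above. Image handling is left out on purpose.

```python
from dataclasses import dataclass, field


@dataclass
class ProviderRequestSketch:
    """Hypothetical stand-in for ProviderRequest; not the real class."""

    prompt: str = ""
    extra_content_blocks: list[dict] = field(default_factory=list)

    def assemble_context(self) -> dict:
        # Backward-compatible path: a plain prompt with no extra blocks
        # keeps the legacy string-valued "content".
        if not self.extra_content_blocks:
            return {"role": "user", "content": self.prompt}

        content: list[dict] = []
        if self.prompt:
            # The user's own utterance goes first.
            content.append({"type": "text", "text": self.prompt})
        # Reminders, captions, and quoted messages follow it.
        content.extend(self.extra_content_blocks)
        return {"role": "user", "content": content}
```

With only `prompt` set, the result is the old `{"role": "user", "content": "<text>"}` shape, which is why existing callers should not need changes. Adding any extra block switches the content to the list form.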

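🖼️ Test Results

Test scenarios:

  1. Backward compatibility: with only a prompt, the message correctly downgrades to the simple format
  2. Multiple text blocks: prompt + extra_content_blocks correctly produces the array format
  3. System reminder format: verifies the <system_reminder> wrapper format
  4. Image caption: verifies the <image_caption> tag format

Sample test output: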

    {
      "role": "user",
      "content": [
        {"type": "text", "text": "你好世界"},
        {"type": "text", "text": "<system_reminder>User ID: 123, Nickname: TestUserGroup name: 测试群Current datetime: 2025-12-25 10:30</system_reminder>"},
        {"type": "text", "text": "<image_caption>一张美丽的风景图</image_caption>"}
      ]
    }

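✅ Checklist

  • 😊 Feature discussion: multi-block support is an architectural improvement and needs no further discussion
  • 👀 Well tested: full functional testing passed, covering backward compatibility and the new features
  • 🤓 No new dependencies: no new dependency libraries were introduced
  • 😮 Code safety: the change only touches message formatting and poses no security risk

🚀 Usage Example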

    # Using the new feature inside a plugin handler
    req = event.request_llm(prompt="Hello")
    req.extra_content_blocks.extend([
        {"type": "text", "text": "<system_reminder>Please answer in a friendly tone</system_reminder>"}
    ])
    yield req

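Note that this PR does not optimize any features I have never used myself, the knowledge base for example.

✦ This PR lays the foundation for multimodal message handling in AstrBot while remaining fully backward compatible! 🎯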

Summary by Sourcery

Add structured support for multiple content blocks in user messages while keeping existing single-text behavior backward compatible.

New Features:

  • Introduce an extra_content_blocks field on ProviderRequest to carry additional user message segments such as system reminders, image captions, and quoted messages.
  • Support multi-block user content in OpenAI, Gemini, and Anthropic providers, combining primary text, extra content blocks, and images into a unified message payload.

Enhancements:

  • Refine context assembly across providers to prioritize the main user utterance, add a text placeholder when only images are present, and fall back to the legacy single-text format when applicable.
  • Standardize system reminder formatting by collecting metadata (user, group, time) and wrapping it in a single <system_reminder> text block instead of injecting it directly into the prompt.
  • Change image captions and quoted message handling to emit semantically tagged text blocks (e.g., <image_caption>, <Quoted Message>) appended after the user message instead of prefixing the prompt.
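
The "text placeholder when only images are present" rule might look like the sketch below. The function name `assemble_user_content`, the placeholder string `"[image]"`, and the OpenAI-style `image_url` part are assumptions chosen for illustration; the summary does not say what placeholder text the providers actually use.

```python
def assemble_user_content(
    text: str,
    extra_content_blocks: list[dict],
    image_urls: list[str],
    placeholder: str = "[image]",  # hypothetical placeholder text
) -> list[dict]:
    """Hypothetical helper: main text, then extras, then images."""
    content: list[dict] = []
    if text:
        content.append({"type": "text", "text": text})
    elif image_urls and not extra_content_blocks:
        # Only images were supplied: add a text part so the message
        # is not image-only.
        content.append({"type": "text", "text": placeholder})
    content.extend(extra_content_blocks)
    content.extend(
        {"type": "image_url", "image_url": {"url": url}} for url in image_urls
    )
    return content
```

Keeping the main text at the front means the model reads the user's utterance before any reminder, caption, or image part.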

@auto-assign auto-assign bot requested review from Fridemn and Raven95676 December 24, 2025 19:57
@sourcery-ai sourcery-ai bot left a comment

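Hey - I've found 3 issues, and left some high level feedback:

  • The backward‑compatibility downgrade logic in assemble_context is inconsistent between ProviderRequest and the provider sources: OpenAI/Gemini/Anthropic downgrade to a plain text string whenever there is a single text block (even if it came from extra_content_blocks), whereas ProviderRequest.assemble_context only downgrades when there are no extra blocks or images; consider aligning these conditions so that multi‑block usage behaves predictably across providers.
  • When building system_content in process_llm_request, system_parts are concatenated with "".join(...), which will produce a single run‑on string (e.g., User ID...Nickname...Group name...Current datetime...) without separators; consider joining with newlines or a clear delimiter (e.g., "\n".join(system_parts)) to match the intended readable format.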

## Individual Comments

### Comment 1
<location> `packages/astrbot/process_llm_request.py:243-244` </location>
<code_context>
+            req.extra_content_blocks.append({"type": "text", "text": quoted_text})
+
+        # Wrap all system reminders together
+        if system_parts:
+            system_content = (
+                "<system_reminder>" + "".join(system_parts) + "</system_reminder>"
+            )
</code_context>

<issue_to_address>
**issue (bug_risk):** System reminder pieces are concatenated without separators, which harms readability and may change semantics.

Previously these system details were separated by newlines, but `"".join(system_parts)` now produces `<system_reminder>User ID: ...Group name: ...Current datetime: ...</system_reminder>` with no delimiters. This reduces readability and may break existing prompts that depend on line breaks. Consider joining with newlines (e.g. `"\n".join(system_parts)` or prefixing each part with `"\n"`) to preserve the prior structure.
</issue_to_address>
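
The difference between the two join styles is easy to see in isolation. The snippet below is a standalone illustration with made-up sample values modeled on the test output in the PR description; `wrap` is a hypothetical helper, not a function from the codebase.

```python
# Sample values are hypothetical, shaped like the PR's test output.
system_parts = [
    "User ID: 123, Nickname: TestUser",
    "Group name: TestGroup",
    "Current datetime: 2025-12-25 10:30",
]


def wrap(parts: list[str], sep: str) -> str:
    """Wrap the joined parts in a single <system_reminder> tag."""
    return "<system_reminder>" + sep.join(parts) + "</system_reminder>"


run_on = wrap(system_parts, "")     # current behavior: no delimiter
readable = wrap(system_parts, "\n")  # reviewer's suggestion

print(run_on)
print(readable)
```

With the empty separator, "TestUser" runs straight into "Group name", which is the same run-on visible in the sample test output. With the newline separator each field stays on its own line.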

### Comment 2
<location> `astrbot/core/provider/sources/openai_source.py:665-667` </location>
<code_context>
-            return user_content
-        return {"role": "user", "content": text}
+
+        # If there is only text and no extra content blocks, return the simple format for backward compatibility
+        if len(content_blocks) == 1 and content_blocks[0]["type"] == "text":
+            return {"role": "user", "content": content_blocks[0]["text"]}
+
</code_context>

<issue_to_address>
**suggestion (bug_risk):** The downgrade-to-plain-text condition ignores `extra_content_blocks` and may misclassify messages.

The code comment says "only text and no extra content blocks", but the code only checks `len(content_blocks) == 1` and `type == "text"`. As a result:
- A message with empty `text` and a single text `extra_content_block` (no images) will be downgraded to `{"content": <extra_text>}`, losing the original block structure.
- This differs from `ProviderRequest.assemble_context`, which only downgrades when `not self.extra_content_blocks and not self.image_urls`.
To keep block structure when content comes from `extra_content_blocks` or multimodal input, consider applying the same guard (also ensuring there are no images and no extra blocks beyond the main prompt text).

```suggestion
        # If there is only the main text, with no extra content blocks and no images,
        # return the simple format for backward compatibility.
        # Note: this matches ProviderRequest.assemble_context, which downgrades to plain text
        # only when there are no extra_content_blocks and no image_urls, so that content from
        # extra_content_blocks or multimodal messages is not mistaken for a simple text message.
        if (
            not extra_content_blocks
            and not image_urls
            and len(content_blocks) == 1
            and content_blocks[0]["type"] == "text"
        ):
            return {"role": "user", "content": content_blocks[0]["text"]}
```
</issue_to_address>
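
The misclassification described in this comment can be reproduced with two small predicates. This is a sketch for illustration only: `downgrade_broad` and `downgrade_strict` are hypothetical names, and the way `content_blocks` is built from the extras here is an assumption about the provider code, not a copy of it.

```python
def downgrade_broad(content_blocks: list[dict]) -> bool:
    """Condition as quoted in the code context above."""
    return len(content_blocks) == 1 and content_blocks[0]["type"] == "text"


def downgrade_strict(
    content_blocks: list[dict],
    extra_content_blocks: list[dict],
    image_urls: list[str],
) -> bool:
    """Condition from the suggestion: also require no extras and no images."""
    return (
        not extra_content_blocks
        and not image_urls
        and len(content_blocks) == 1
        and content_blocks[0]["type"] == "text"
    )


# Empty prompt plus a single extra block: the case the review describes.
extra = [{"type": "text", "text": "<system_reminder>be brief</system_reminder>"}]
content_blocks = list(extra)  # assumption: extras are copied into content_blocks

print(downgrade_broad(content_blocks))              # True: collapsed to a string
print(downgrade_strict(content_blocks, extra, []))  # False: block list is kept
```

A plain prompt with no extras and no images still satisfies the strict predicate, so the legacy single-string format is preserved for existing callers.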

### Comment 3
<location> `astrbot/core/provider/sources/gemini_source.py:842-843` </location>
<code_context>
-            return user_content
-        return {"role": "user", "content": text}
+
+        # If there is only text and no extra content blocks, return the simple format for backward compatibility
+        if len(content_blocks) == 1 and content_blocks[0]["type"] == "text":
+            return {"role": "user", "content": content_blocks[0]["text"]}
+
</code_context>

<issue_to_address>
**suggestion:** Same downgrade condition issue as OpenAI: extra-only content can be collapsed to plain text unexpectedly.

This uses the same overly-broad downgrade rule as `openai_source.assemble_context`: any single text `content_block` is converted to `{"content": <text>}`, even when it comes solely from `extra_content_blocks`. In cases where callers only pass an extra block (e.g., system reminder, quoted message) and no prompt/images, this collapses the structure and loses semantics. Please tighten the condition (as in `ProviderRequest.assemble_context`) so that only a plain user prompt with no extras/images is downgraded.

Suggested implementation:

```python
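        # If there is only text, with no images or extra content blocks, return the simple format for backward compatibility.
        # Note: downgrade only when the caller supplied neither extra_content_blocks nor image_urls,
        # so that an "extra blocks only" request (e.g. system reminder, quoted message) is not wrongly collapsed into plain text.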
        if (
            len(content_blocks) == 1
            and content_blocks[0]["type"] == "text"
            and not image_urls
            and not extra_content_blocks
        ):
            return {"role": "user", "content": content_blocks[0]["text"]}

```

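1. Confirm that when a caller of `assemble_context` passes only extra content blocks and no main `text`, those blocks still end up in `extra_content_blocks` rather than being mixed into the main text block; otherwise the logic that builds `content_blocks` needs a split like the one in `ProviderRequest.assemble_context` (for example, distinguishing where the "main user prompt" and the "extra blocks" come from).
2. Compare against the current downgrade condition in `ProviderRequest.assemble_context` and make sure the downgrade semantics here match it (both should return the simplified structure only for "a single main text, no images, no extra blocks").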
</issue_to_address>

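Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.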

@kawayiYokami
This PR will be the cornerstone for optimizing other modules.
