@Soulter Soulter commented Dec 30, 2025

fixes: #3207

Motivation

  • Fix an issue where `chunk_overlap=0` was treated as falsy and fell back to the default overlap, which prevented users from setting zero overlap.
  • The bug appeared in the recursive character chunker, which used `or` to select fallback values.

Description

  • Replace `chunk_size = chunk_size or self.chunk_size` and `overlap = overlap or self.chunk_overlap` with explicit `if ... is None` checks in `astrbot/core/knowledge_base/chunking/recursive.py`.
  • This ensures that `chunk_overlap=0` is accepted as a valid value and that only `None` triggers the default.
  • The only modified file is `astrbot/core/knowledge_base/chunking/recursive.py`; the change is a small defensive check.
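
The difference between the two fallback patterns described above can be sketched as follows. This is a minimal illustration, not the code from `recursive.py`: the function names and the default value of 50 are assumptions made for the example.

```python
# Hypothetical default, chosen only for illustration; not taken from the PR.
DEFAULT_CHUNK_OVERLAP = 50


def resolve_with_or(overlap=None, default=DEFAULT_CHUNK_OVERLAP):
    # Old pattern: `or` returns the right-hand side for ANY falsy left-hand
    # value, so an explicit 0 is silently replaced by the default.
    return overlap or default


def resolve_with_none_check(overlap=None, default=DEFAULT_CHUNK_OVERLAP):
    # New pattern: only a missing value (None) selects the default,
    # so an explicit 0 is preserved.
    if overlap is None:
        return default
    return overlap


print(resolve_with_or(0))            # 0 is lost, the default is used
print(resolve_with_none_check(0))    # 0 is kept
print(resolve_with_none_check(None)) # default is used
```

Both helpers behave identically for non-zero values; they differ only when the caller passes `0` (or another falsy value) explicitly.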

Testing

  • No automated tests were executed as part of this rollout.
  • The change is limited to value selection logic and does not add new behavior beyond allowing zero overlap.

Codex Task

Summary by Sourcery

Bug Fixes:

  • Fix handling of `chunk_size` and `chunk_overlap` so that explicitly passing `0` no longer triggers use of the default values.

@dosubot dosubot bot added the size:XS label (this PR changes 0-9 lines, ignoring generated files) Dec 30, 2025
@sourcery-ai sourcery-ai bot left a comment

Hey - I've left some high-level feedback:

  • Consider adding defensive checks for invalid combinations such as `chunk_size <= 0` or `overlap >= chunk_size`, so that this loop cannot end up with a non-positive step or other unexpected behavior when users pass edge-case values.
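
One way to read this suggestion is sketched below. It is a hedged illustration only: `validate_chunk_params` and `split_fixed` are hypothetical names, and the simple sliding-window splitter stands in for the real recursive chunker, whose loop may look different. The point is that the step `chunk_size - overlap` must stay positive; in Python, `range` raises `ValueError` for a step of zero and yields nothing for a negative step when start is below stop.

```python
def validate_chunk_params(chunk_size: int, overlap: int) -> None:
    """Reject parameter combinations that would give a non-positive step."""
    if chunk_size <= 0:
        raise ValueError(f"chunk_size must be positive, got {chunk_size}")
    if overlap < 0:
        raise ValueError(f"overlap must be non-negative, got {overlap}")
    if overlap >= chunk_size:
        raise ValueError(
            f"overlap ({overlap}) must be smaller than chunk_size ({chunk_size})"
        )


def split_fixed(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Hypothetical sliding-window splitter used only to show the step logic."""
    validate_chunk_params(chunk_size, overlap)
    step = chunk_size - overlap  # guaranteed >= 1 after validation
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


# Zero overlap is valid and produces contiguous, non-overlapping chunks.
print(split_fixed("abcdefghij", chunk_size=4, overlap=0))
# A positive overlap repeats the tail of each chunk at the start of the next.
print(split_fixed("abcdefghij", chunk_size=4, overlap=2))
```

Validating up front keeps `overlap=0` legal, which is the case this PR enables, while still rejecting values that would stall or skip the loop.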

@dosubot dosubot bot added the size:S label (this PR changes 10-29 lines, ignoring generated files) and removed the size:XS label (this PR changes 0-9 lines, ignoring generated files) Dec 30, 2025
@Soulter Soulter merged commit 792fb69 into master Dec 30, 2025
6 checks passed
@Soulter Soulter deleted the codex/fix-chunk-overlap-parameter-to-allow-zero branch December 30, 2025 07:23
Labels

codex size:S This PR changes 10-29 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

[Bug] New knowledge base version limits chunk overlap to a minimum of 50