fix(#4214): optimize pip install output decoding for cross-platform encoding compatibility #4249
Merged
Contributor
Hey - I've found 1 issue, and left some high-level feedback:
- Consider caching `locale.getpreferredencoding(False)` and the `sys.platform.startswith("win")` check at module level so `_robust_decode` doesn't repeat these relatively static lookups for every output line.
- When falling back to `errors="replace"`, you might want to include a brief marker or debug log when replacements occur so it's easier to diagnose unexpected encoding issues in the pip output.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider caching `locale.getpreferredencoding(False)` and the `sys.platform.startswith("win")` check at module level so `_robust_decode` doesn’t repeat these relatively static lookups for every output line.
- When falling back to `errors="replace"`, you might want to include a brief marker or debug log when replacements occur so it’s easier to diagnose unexpected encoding issues in the pip output.
## Individual Comments
### Comment 1
<location> `astrbot/core/utils/pip_installer.py:11-24` </location>
<code_context>
+def _robust_decode(line: bytes) -> str:
+    """Decode a byte stream, handling the encodings used on different platforms."""
+    try:
+        return line.decode("utf-8").strip()
+    except UnicodeDecodeError:
+        pass
+    try:
+        return line.decode(locale.getpreferredencoding(False)).strip()
+    except UnicodeDecodeError:
+        pass
+    if sys.platform.startswith("win"):
+        try:
+            return line.decode("gbk").strip()
+        except UnicodeDecodeError:
+            pass
+    return line.decode("utf-8", errors="replace").strip()
+
+
</code_context>
<issue_to_address>
**suggestion:** Consider using `errors="replace"` (or similar) for the locale/Windows-specific decodes as well to retain more readable output.
Currently only the final UTF-8 decode uses `errors="replace"`; the locale and Windows GBK decodes are strict and then fall back to UTF-8 with replacement. For non-UTF-8 output this can create unnecessary mojibake. Consider allowing the locale/GBK decodes to use `errors="replace"` (or `"ignore"`) so they succeed when mostly valid, and keep the final UTF-8 fallback for cases where all other decodes fail.
```suggestion
    try:
        return line.decode("utf-8").strip()
    except UnicodeDecodeError:
        pass
    try:
        # Use the system's preferred encoding; replace invalid bytes to keep as much readable output as possible
        return line.decode(locale.getpreferredencoding(False), errors="replace").strip()
    except UnicodeDecodeError:
        pass
    if sys.platform.startswith("win"):
        try:
            # GBK is common on Windows; use the same replacement strategy
            return line.decode("gbk", errors="replace").strip()
        except UnicodeDecodeError:
            pass
    # The final fallback is still UTF-8 with replacement, so no exception is ever raised
    return line.decode("utf-8", errors="replace").strip()
```
</issue_to_address>
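One point that may be worth checking before applying the suggestion: `bytes.decode(..., errors="replace")` does not raise `UnicodeDecodeError`, so the first attempt that uses it always returns and any later candidate is never tried. The sketch below illustrates the difference in ordering. The helper names `first_lossy_wins` and `strict_then_lossy` are hypothetical, and `ascii` stands in for a preferred encoding that does not match the data.

```python
def first_lossy_wins(line: bytes, encodings: list[str]) -> tuple[str, str]:
    """Mirror the suggested ordering: every attempt uses errors='replace'."""
    for encoding in encodings:
        try:
            return encoding, line.decode(encoding, errors="replace")
        except UnicodeDecodeError:
            continue  # not reached: errors="replace" suppresses the exception
    raise ValueError("no encodings given")


def strict_then_lossy(
    line: bytes, encodings: list[str], fallback: str = "utf-8"
) -> tuple[str, str]:
    """Try every candidate strictly; only the final fallback is lossy."""
    for encoding in encodings:
        try:
            return encoding, line.decode(encoding)
        except UnicodeDecodeError:
            continue
    return fallback, line.decode(fallback, errors="replace")


raw = "安装".encode("gbk")  # GBK bytes that are not valid ASCII
print(first_lossy_wins(raw, ["ascii", "gbk"]))   # stops at the first codec
print(strict_then_lossy(raw, ["ascii", "gbk"]))  # reaches the GBK candidate
```

With strict attempts first, a line that is valid GBK is still decoded correctly even when an earlier candidate does not fit, and replacement characters only appear when no candidate matches.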
Soulter
approved these changes
Dec 29, 2025
Fixes #4214
Summary by Sourcery
Bug Fixes:
UnicodeDecodeError.