@simplify123 simplify123 commented Dec 27, 2025

Fix the amr/silk auto-detection and conversion logic, resolving the bug reported in #4225.

fixes: #4225

Modifications

Voice messages sent directly from QQ can now be recognized and transcribed to text by SenseVoice running in Xinference, instead of failing with the "Xinference STT failed: INVALID" error.

  • This is NOT a breaking change.

Screenshots or Test Results

[Screenshot: 408706F5-5C8F-4F95-9972-B99C99236D6B]

Checklist

  • 😊 If there are new features added in the PR, I have discussed them with the authors through issues, emails, etc.
  • 👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
  • 🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.
  • 😮 My changes do not introduce malicious code.

Summary by Sourcery

Improve audio format detection and conversion in the Xinference STT provider to correctly handle QQ voice messages and avoid INVALID errors.

Bug Fixes:

  • Fix incorrect handling of amr/silk QQ voice messages that caused Xinference STT INVALID errors.

Enhancements:

  • Refine audio format detection by distinguishing SILK and AMR via magic bytes and file extensions, routing each format through the appropriate conversion path before transcription.
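The detection described above could look roughly like the following. This is a minimal sketch, not the merged code: the function name `detect_voice_format`, the constants, and the extension list are hypothetical. It assumes the commonly seen headers `#!AMR` for AMR files and `#!SILK_V3` for SILK, with Tencent's SILK variant carrying one extra leading `0x02` byte.

```python
from pathlib import Path
from typing import Optional

# Assumed magic prefixes (hypothetical constant names).
AMR_MAGIC = b"#!AMR"                   # matches "#!AMR\n" and "#!AMR-WB\n"
SILK_MAGIC = b"#!SILK_V3"              # standard SILK v3 header
TENCENT_SILK_MAGIC = b"\x02#!SILK_V3"  # Tencent variant with a leading 0x02


def detect_voice_format(audio_bytes: bytes, filename: str = "") -> Optional[str]:
    """Return "silk", "amr", or None, preferring magic bytes over the extension."""
    # startswith() simply returns False on short or empty input,
    # so no explicit length check is required here.
    if audio_bytes.startswith((SILK_MAGIC, TENCENT_SILK_MAGIC)):
        return "silk"
    if audio_bytes.startswith(AMR_MAGIC):
        return "amr"

    # Fall back to the file extension when the header is not recognized.
    ext = Path(filename).suffix.lower()
    if ext in (".silk", ".slk"):
        return "silk"
    if ext == ".amr":
        return "amr"
    return None
```

Checking the header first matters because a file's extension may not match its actual contents, so the extension is only consulted when the bytes are inconclusive.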

@auto-assign auto-assign bot requested review from Raven95676 and Soulter December 27, 2025 17:09
@dosubot dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Dec 27, 2025

@sourcery-ai sourcery-ai bot left a comment

Hey - I've left some high level feedback:

  • When checking magic bytes with audio_bytes[:8] and audio_bytes[:6], consider guarding for very short audio_bytes (e.g., if len(audio_bytes) >= 8 and ...) to avoid edge-case errors on malformed or truncated inputs.
  • Now that AMR and SILK are handled differently, it may be safer to handle a failed convert_to_pcm_wav/tencent_silk_to_wav (e.g., catching exceptions and logging a clear message or falling back) so that STT failures are easier to diagnose when conversion breaks.
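One way to act on the second suggestion is to wrap the conversion step so that a failure is logged with context instead of surfacing later as an opaque STT error. The sketch below is an illustration under assumptions: the real signatures of `convert_to_pcm_wav` and `tencent_silk_to_wav` are not shown in this thread, so the converter is passed in as a plain synchronous callable taking `(src_path, dst_path)`, and the helper name `convert_or_none` is hypothetical.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger(__name__)


def convert_or_none(
    converter: Callable[[str, str], object],  # assumed (src_path, dst_path) shape
    src_path: str,
    dst_path: str,
    label: str,
) -> Optional[str]:
    """Run a converter; return dst_path on success, or None after logging a failure."""
    try:
        converter(src_path, dst_path)
    except Exception as exc:
        # Record which format and file failed so a broken conversion is easy to trace.
        logger.error("Failed to convert %s audio %r to WAV: %s", label, src_path, exc)
        return None
    return dst_path
```

A caller can then decide whether to skip transcription or fall back to sending the original bytes when `None` comes back. As for the first suggestion, slicing such as `audio_bytes[:8]` does not raise on short input in Python (it returns a shorter slice), so an explicit length guard mainly documents intent.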

@dosubot dosubot bot added the area:provider The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner. label Dec 27, 2025
@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Dec 28, 2025
@Soulter Soulter changed the title from "Update xinference_stt_provider.py" to "fix: Xinference STT failed: INVALID" Dec 28, 2025
@Soulter Soulter merged commit f518109 into AstrBotDevs:master Dec 28, 2025
5 checks passed

Development

Successfully merging this pull request may close these issues.

[Bug] Voice messages sent from the QQ client fail at "Converting silk/amr file to wav"

2 participants