fix: Xinference STT failed: INVALID #4231

simplify123 · 2025-12-27T17:09:40Z

Fixes the automatic detection and conversion logic for amr/silk audio, resolving the bug reported in #4225.

fixes: #4225

Modifications

With this change, voice messages sent directly from QQ are correctly recognized and transcribed to text by SenseVoice in Xinference, and the "Xinference STT failed: INVALID" error no longer occurs.

This is NOT a breaking change.

Screenshots or Test Results

Checklist

😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.
😮 My changes do not introduce malicious code.

Summary by Sourcery

Improve audio format detection and conversion in the Xinference STT provider to correctly handle QQ voice messages and avoid INVALID errors.

Bug Fixes:

Fix incorrect handling of amr/silk QQ voice messages that caused Xinference STT INVALID errors.

Enhancements:

Refine audio format detection by distinguishing SILK and AMR via magic bytes and file extensions, routing each format through the appropriate conversion path before transcription.
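The detection step described in the summary could look roughly like the sketch below. It is not the code from this PR: the function name `detect_voice_format`, the extension list, and the return labels are hypothetical. The magic values are the usual file headers (`#!AMR` for AMR, `#!SILK_V3` for SILK, with Tencent's variant carrying a leading `0x02` byte). Using `bytes.startswith` means very short or empty input simply fails to match, so no explicit length check is needed.

```python
from pathlib import Path

# Well-known file headers. "#!AMR" also covers the "#!AMR-WB" wideband header.
AMR_MAGIC = b"#!AMR"
SILK_MAGIC = b"#!SILK_V3"
TENCENT_SILK_MAGIC = b"\x02#!SILK_V3"  # Tencent/QQ variant with a leading 0x02 byte


def detect_voice_format(audio_bytes: bytes, filename: str = "") -> str:
    """Return "silk", "amr", or "other" (hypothetical labels for this sketch)."""
    # Magic bytes take priority, since QQ voice files can carry a misleading extension.
    if audio_bytes.startswith((SILK_MAGIC, TENCENT_SILK_MAGIC)):
        return "silk"
    if audio_bytes.startswith(AMR_MAGIC):
        return "amr"

    # Fall back to the file extension when the header is missing or unrecognized.
    ext = Path(filename).suffix.lower()
    if ext in (".silk", ".slk"):
        return "silk"
    if ext == ".amr":
        return "amr"
    return "other"
```

With this ordering, a file named `voice.amr` whose content begins with the SILK header is still routed to the SILK conversion path.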

sourcery-ai

Hey - I've left some high level feedback:

- When checking magic bytes with `audio_bytes[:8]` and `audio_bytes[:6]`, consider guarding for very short `audio_bytes` (e.g., `if len(audio_bytes) >= 8 and ...`) to avoid edge-case errors on malformed or truncated inputs.
- Now that AMR and SILK are handled differently, it may be safer to handle a failed `convert_to_pcm_wav`/`tencent_silk_to_wav` (e.g., catching exceptions and logging a clear message or falling back) so that STT failures are easier to diagnose when conversion breaks.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- When checking magic bytes with `audio_bytes[:8]` and `audio_bytes[:6]`, consider guarding for very short `audio_bytes` (e.g., `if len(audio_bytes) >= 8 and ...`) to avoid edge-case errors on malformed or truncated inputs.
- Now that AMR and SILK are handled differently, it may be safer to handle a failed `convert_to_pcm_wav`/`tencent_silk_to_wav` (e.g., catching exceptions and logging a clear message or falling back) so that STT failures are easier to diagnose when conversion breaks.
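One way to act on the second comment is sketched below. It is an illustration only: `convert_or_none` and the `converters` mapping are hypothetical, and the sketch assumes the converters are async callables taking a source path and a destination path, which may not match the real signatures of `convert_to_pcm_wav` and `tencent_silk_to_wav`. A failed conversion is logged with the format and path, and `None` is returned so the caller can stop early instead of sending unusable audio to the STT endpoint.

```python
import asyncio
import logging
from typing import Awaitable, Callable, Dict, Optional

logger = logging.getLogger("stt_sketch")

# Assumed converter shape: async (src_path, dst_path) -> anything.
Converter = Callable[[str, str], Awaitable[object]]


async def convert_or_none(
    fmt: str, src: str, dst: str, converters: Dict[str, Converter]
) -> Optional[str]:
    """Convert src to WAV with the converter registered for fmt.

    Returns the path to use for transcription, or None if conversion failed.
    """
    converter = converters.get(fmt)
    if converter is None:
        # No converter registered for this format: pass the file through unchanged.
        return src
    try:
        await converter(src, dst)
    except Exception as exc:
        # Log a clear message so a later STT failure can be traced to conversion.
        logger.error("Failed to convert %s audio %s to WAV: %s", fmt, src, exc)
        return None
    return dst
```

In the provider, the mapping would presumably be something like `{"amr": convert_to_pcm_wav, "silk": tencent_silk_to_wav}`, provided those functions fit the assumed shape.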

Update xinference_stt_provider.py

290b71b

auto-assign bot requested review from Raven95676 and Soulter December 27, 2025 17:09

dosubot bot added the size:M label (This PR changes 30-99 lines, ignoring generated files.) Dec 27, 2025

sourcery-ai bot reviewed Dec 27, 2025

dosubot bot added the area:provider label (The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner.) Dec 27, 2025

simplify123 mentioned this pull request Dec 27, 2025

[Bug] Voice sent from the QQ client: "Converting silk/amr file to wav" conversion fails #4225

Closed

2 tasks

Soulter approved these changes Dec 28, 2025

dosubot bot added the lgtm label (This PR has been approved by a maintainer) Dec 28, 2025

Soulter changed the title ~~Update xinference_stt_provider.py~~ fix: Xinference STT failed: INVALID Dec 28, 2025

chore: ruff format

6ec2778

Soulter merged commit f518109 into AstrBotDevs:master Dec 28, 2025
5 checks passed
