-
Notifications
You must be signed in to change notification settings - Fork 818
Consolidate AsrModule API: single synchronous transcribe() method #17059
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17059
Note: Links to docs will display an error until the docs builds have been completed. ❌ 3 New Failures, 22 Pending, 2 Unrelated FailuresAs of commit e9b55ad with merge base 1df4dac ( NEW FAILURES - The following jobs have failed:
BROKEN TRUNK - The following jobs failed but were present on the merge base:👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
This PR needs a
|
…hronous Co-authored-by: kirklandsign <107070759+kirklandsign@users.noreply.github.com>
Co-authored-by: kirklandsign <107070759+kirklandsign@users.noreply.github.com>
Co-authored-by: kirklandsign <107070759+kirklandsign@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR simplifies the AsrModule API by consolidating the transcribe and transcribeBlocking methods into a single synchronous method. The refactored transcribe method now returns the complete transcription string directly and throws an exception for non-zero result codes. Additionally, the onComplete callback mechanism has been removed from both JNI and Kotlin layers, as the function's return now indicates completion.
Changes:
- Merged
transcribeBlockingintotranscribe, making it returnStringinstead ofIntand throw exceptions on errors - Removed
onCompletecallback from AsrCallback interface and related JNI infrastructure - Simplified
transcribemethod signature with nullableconfigparameter and default value
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| extension/android/jni/jni_layer_asr.cpp | Removed onCompleteMethod from AsrCallbackCache and eliminated the onComplete callback invocation after transcription |
| extension/android/executorch_android/src/main/java/org/pytorch/executorch/extension/asr/AsrModule.kt | Merged transcribe and transcribeBlocking into single synchronous method that returns String and throws exceptions; simplified signature with nullable config parameter |
| extension/android/executorch_android/src/main/java/org/pytorch/executorch/extension/asr/AsrCallback.kt | Simplified interface to only contain onToken method; removed onComplete method and updated documentation |
Comments suppressed due to low confidence (1)
extension/android/executorch_android/src/main/java/org/pytorch/executorch/extension/asr/AsrModule.kt:172
- The PR description mentions updating unit tests to cover the new transcribe method behavior (throwing exceptions for non-zero result codes and returning the complete transcription string), but no test file for AsrModule exists. Other modules in the codebase have test coverage (e.g., ModuleInstrumentationTest.kt, LlmModuleInstrumentationTest.kt, TrainingModuleE2ETest.kt). Consider adding tests to verify:
- The transcribe method throws RuntimeException for non-zero result codes
- The transcribe method returns the complete transcription string correctly
- The callback receives tokens correctly when provided
- The method works correctly when callback is null
fun transcribe(
wavPath: String,
config: AsrTranscribeConfig? = null,
callback: AsrCallback? = null,
): String {
val handle = nativeHandle.get()
check(handle != 0L) { "AsrModule has been destroyed" }
val wavFile = File(wavPath)
require(wavFile.canRead() && wavFile.isFile) { "Cannot read WAV file: $wavPath" }
val effectiveConfig = config ?: AsrTranscribeConfig()
val result = StringBuilder()
val status =
nativeTranscribe(
handle,
wavPath,
effectiveConfig.maxNewTokens,
effectiveConfig.temperature,
effectiveConfig.decoderStartTokenId,
object : AsrCallback {
override fun onToken(token: String) {
result.append(token)
callback?.onToken(token)
}
},
)
if (status != 0) {
throw RuntimeException("Transcription failed with error code: $status")
}
return result.toString()
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Summary
Consolidates
transcribe()andtranscribeBlocking()into a single synchronous method. Returns transcription string directly; throws on error. Removes redundantonCompletecallback—function return indicates completion.API Changes:
Kotlin: Single
transcribe(wavPath, config?, callback?)methodString(wasIntstatus code)RuntimeExceptionon non-zero native resulttranscribeBlocking()methodAsrCallback interface: Removed
onComplete(String)methodonToken(String)remains for streamingJNI layer: Removed
onCompleteMethodinvocation and cachingUsage:
Breaking changes (intentional):
transcribeBlocking()removed → usetranscribe()transcribe()returnsStringinstead ofIntAsrCallback.onComplete()removedTest plan
API contract verified via code inspection. Full integration testing requires Android SDK/NDK environment with model artifacts.
Original prompt
This pull request was created from Copilot chat.
💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.