WhisperBridge provides on-device speech-to-text using Whisper models.
Its API is intentionally small: initialize a model, transcribe a WAV file, and release resources when done.
Platform support#
Current support in the library code:
- Android: supported
- iOS: supported
- JVM / Desktop: supported
- WASM: currently not implemented
Public API#
expect object WhisperBridge {
    fun getModelPath(modelFileName: String): String
    fun initModel(modelPath: String): Boolean
    fun transcribeWav(
        wavPath: String,
        language: String? = null,
        initialPrompt: String? = null,
    ): String
    fun release()
}

getModelPath(modelFileName)#
Returns a platform-usable path for the Whisper model file.
val modelPath = WhisperBridge.getModelPath("ggml-base.en.bin")

initModel(modelPath)#
Loads the Whisper model and prepares the transcription context.
val ok = WhisperBridge.initModel(modelPath)
check(ok) { "Failed to initialize Whisper model" }

Call this once before transcription.
transcribeWav(wavPath, language, initialPrompt)#
Transcribes a WAV audio file into text.
val text = WhisperBridge.transcribeWav(
    wavPath = "/path/to/sample.wav",
    language = "en",
    initialPrompt = "The following is a technical discussion about Kotlin."
)
println(text)

Parameters#
- wavPath: path to the WAV file to transcribe.
- language: optional language hint such as "en", "es", or "fr". Providing it improves predictability and reduces ambiguity when you already know the input language.
- initialPrompt: optional text to prime the model before transcription begins. Use this to bias the output toward specific vocabulary, domain terms, or formatting conventions. The model treats this as prior context without transcribing it literally.
Return value#
Returns the transcription as a String.
If transcription fails, the result may be empty.
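Since a failed transcription may come back as an empty string instead of an exception, it can help to normalize the result before using it. The sketch below is one way to do that; `transcriptOrNull` is a hypothetical helper for illustration and is not part of WhisperBridge.

```kotlin
// Hypothetical helper (not part of WhisperBridge): maps an empty or
// whitespace-only transcription result to null so callers can handle
// the "nothing transcribed" case explicitly.
fun transcriptOrNull(raw: String): String? =
    raw.trim().takeIf { it.isNotEmpty() }

// Possible usage with the bridge:
// val text = transcriptOrNull(WhisperBridge.transcribeWav(wavPath))
//     ?: error("Transcription returned no text for $wavPath")
```

Note that an empty result does not necessarily mean an error; silent audio could plausibly produce no text as well, so treat `null` as "no usable transcript" and not as a definite failure.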
Supported input expectations#
The bridge API is WAV-oriented, so the simplest and most reliable flow is:
- record or load audio
- convert it to WAV if necessary
- call transcribeWav(...)
If your app records audio in another format, perform a conversion step before calling the bridge.
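Before handing a file to the bridge, a cheap sanity check can catch input that was never converted. The sketch below inspects the first 12 bytes for the standard RIFF/WAVE container markers. `looksLikeWav` is a hypothetical helper, not part of WhisperBridge, and it does not validate sample rate, channel count, or bit depth, which this page does not specify.

```kotlin
// Hypothetical helper (not part of WhisperBridge): checks whether the
// first bytes of a file carry the RIFF/WAVE container markers.
// A WAV file starts with "RIFF", a 4-byte chunk size, then "WAVE".
fun looksLikeWav(header: ByteArray): Boolean {
    if (header.size < 12) return false
    val riff = header.copyOfRange(0, 4)
    val wave = header.copyOfRange(8, 12)
    return riff.contentEquals("RIFF".encodeToByteArray()) &&
        wave.contentEquals("WAVE".encodeToByteArray())
}
```

How you read those first 12 bytes depends on the platform's file API, so that part is left to the caller. If the check fails, run the conversion step before calling transcribeWav(...).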
release()#
Releases native Whisper resources.
WhisperBridge.release()

Practical recommendations#
- Keep audio clean and reasonably short when testing.
- Normalize your pipeline around WAV files for fewer surprises.
- Reuse the initialized model when transcribing multiple files.
- Release resources when leaving the speech feature or shutting the app down.
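The last two recommendations (initialize once, reuse, then release) can be sketched as a small scoping helper. `withModel` is a hypothetical function, not part of WhisperBridge. It takes the init and release steps as lambdas so the pattern stays independent of the library, and it assumes release should only run after a successful init, which this page does not confirm.

```kotlin
// Hypothetical helper (not part of WhisperBridge): runs `block` with an
// initialized model and always releases it afterwards, even if `block` throws.
fun <T> withModel(
    init: () -> Boolean,
    release: () -> Unit,
    block: () -> T,
): T {
    // Fail fast if the model could not be loaded; release is not called here.
    check(init()) { "Failed to initialize Whisper model" }
    try {
        return block()
    } finally {
        release()
    }
}

// Possible usage with the bridge, transcribing several files with one model:
// val texts = withModel(
//     init = { WhisperBridge.initModel(WhisperBridge.getModelPath("ggml-base.en.bin")) },
//     release = { WhisperBridge.release() },
// ) {
//     wavPaths.map { WhisperBridge.transcribeWav(it) }
// }
```

In a long-lived app you would more likely tie init and release to the lifetime of the speech feature or the app itself; the helper fits batch jobs and tests.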