Streaming lets you receive tokens as they are generated, so the UI can render partial output instead of waiting for the full response.

Plain streaming#

LlamaBridge.generateStream(
  prompt = "Stream a short poem.",
  callback = object : GenStream {
    override fun onDelta(text: String) = print(text)
    override fun onComplete() = println("\n✅ done")
    override fun onError(message: String) = println("❌ $message")
  }
)
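If you prefer coroutines, the callback can be adapted into a cold `Flow` of token deltas with `callbackFlow`. This is a sketch, not part of the LlamaBridge API: it assumes the `GenStream` interface and `generateStream` signature shown above, and that `nativeCancelGenerate()` (from the lifecycle tips below) lives on `LlamaBridge`.

```kotlin
import kotlinx.coroutines.channels.awaitClose
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.callbackFlow

// Sketch: wrap the callback API in a cold Flow of token deltas.
// Assumes the GenStream interface and generateStream signature above;
// nativeCancelGenerate() is assumed to be exposed on LlamaBridge.
fun streamTokens(prompt: String): Flow<String> = callbackFlow {
    LlamaBridge.generateStream(
        prompt = prompt,
        callback = object : GenStream {
            override fun onDelta(text: String) { trySend(text) }
            override fun onComplete() { close() }
            override fun onError(message: String) { close(RuntimeException(message)) }
        }
    )
    // Stop native generation when the collector is cancelled.
    awaitClose { LlamaBridge.nativeCancelGenerate() }
}
```

Collecting the flow (for example with `collect { delta -> ... }` in a lifecycle-aware scope) then gives you cancellation for free: leaving the scope cancels the collector, which triggers `awaitClose` and stops the native work.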

Streaming with context#

LlamaBridge.generateStreamWithContext(
  systemPrompt = "You are concise.",
  contextBlock = "Topic: on-device LLMs.",
  userPrompt = "Give me 3 bullet points.",
  callback = object : GenStream { /* ... */ }
)
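The elided callback is the same `GenStream` as in the plain-streaming example. A minimal sketch that accumulates the streamed text before using it:

```kotlin
// Sketch: a GenStream that buffers deltas into a single string.
// Assumes the GenStream interface from the plain-streaming example.
val buffer = StringBuilder()
val accumulating = object : GenStream {
    override fun onDelta(text: String) { buffer.append(text) }
    override fun onComplete() = println(buffer.toString())
    override fun onError(message: String) = println("❌ $message")
}
```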

Lifecycle tips#

  • Call nativeCancelGenerate() when the user leaves the screen so native generation does not keep running in the background.
  • The callback you pass is held until generation finishes. An anonymous callback declared inside an Activity, Fragment, or ViewModel captures its enclosing instance, so drop the reference (or cancel generation) on teardown to avoid leaking it.
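Both tips above can be applied in one place. This sketch assumes a hypothetical `ChatFragment` and that `nativeCancelGenerate()` is exposed on `LlamaBridge`; the Fragment placement is an illustration, not a requirement of the API.

```kotlin
import androidx.fragment.app.Fragment

class ChatFragment : Fragment() {
    // Hold the callback in a nullable field so it can be dropped on teardown.
    private var stream: GenStream? = null

    fun startGeneration(prompt: String) {
        stream = object : GenStream {
            override fun onDelta(text: String) { /* append to UI state */ }
            override fun onComplete() { stream = null }
            override fun onError(message: String) { stream = null }
        }.also { LlamaBridge.generateStream(prompt = prompt, callback = it) }
    }

    override fun onDestroyView() {
        // Stop native work and release the callback so the Fragment isn't leaked.
        LlamaBridge.nativeCancelGenerate()
        stream = null
        super.onDestroyView()
    }
}
```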