Stable Diffusion image generation • Llamatik Documentation

StableDiffusionBridge lets you generate images locally from text prompts. The bridge returns raw RGBA pixel data so your app can convert it into the most appropriate image type for the current platform.

Initialize the model#

val modelPath = StableDiffusionBridge.getModelPath("dreamshaper.safetensors")
val ok = StableDiffusionBridge.initModel(modelPath, threads = 4)
check(ok)

Generate an image#

val rgba = StableDiffusionBridge.txt2img(
    prompt = "A cinematic illustration of a llama explorer",
    negativePrompt = "blurry, low quality, extra limbs",
    width = 512,
    height = 512,
    steps = 20,
    cfgScale = 7.0f,
    seed = 1234L,
)

Understanding the result#

The output is an RGBA ByteArray. To display it, convert the bytes into your platform image type using the same width and height you requested.

Image-to-image generation#

img2img generates a new image starting from an existing one, guided by a text prompt. The source image must be provided as a raw RGBA byte array.

val sourceRgba: ByteArray = /* existing image bytes as RGBA */
val result = StableDiffusionBridge.img2img(
    initImageRgba = sourceRgba,
    initImageW = 512,
    initImageH = 512,
    prompt = "The same scene as a watercolor painting",
    negativePrompt = "low quality, blurry",
    width = 512,
    height = 512,
    steps = 20,
    cfgScale = 7.0f,
    strength = 0.75f,
    seed = 1234L,
)

The strength parameter controls how much the source image is preserved:

0.0 leaves the image almost unchanged
1.0 ignores the source entirely and acts like txt2img
values around 0.6–0.8 are a good starting point for style transfers

Make sure the initImageRgba array contains exactly initImageW * initImageH * 4 bytes.

Parameter guidance#

width and height: start with 512x512 unless you have benchmarked larger values
steps: around 20 is a good mobile-friendly default
cfgScale: around 7.0 to 8.0 is a common starting range
seed: use a fixed value for reproducible results
strength (img2img only): start around 0.75 and adjust to taste

Performance advice#

Image generation is significantly heavier than text generation. On mobile devices:

prefer moderate output sizes
avoid excessive step counts for the first iteration
reuse one initialized model for many generations

Cleanup#

StableDiffusionBridge.release()