stabilityai/stable-diffusion-3.5-medium
Stability AI's Stable Diffusion 3.5 text-to-image family (medium 2.5B, large 8.1B, large-turbo) via vLLM-Omni with Cache-DiT acceleration
Overview
Stable Diffusion 3.5 text-to-image generation models, served via vLLM-Omni with optional Cache-DiT acceleration.
Supported variants:
- `stabilityai/stable-diffusion-3.5-medium` — 2.5B params
- `stabilityai/stable-diffusion-3.5-large` — 8.1B params
- `stabilityai/stable-diffusion-3.5-large-turbo` — 8.1B params (timestep-distilled for few-step inference)
Prerequisites
- vLLM-Omni on top of vLLM 0.12.0
- `diffusers` library
Installation
```shell
uv venv
source .venv/bin/activate
uv pip install vllm==0.12.0
uv pip install git+https://github.com/vllm-project/vllm-omni.git
```
Python Usage
```python
from vllm_omni.entrypoints.omni import Omni

omni = Omni(model="stabilityai/stable-diffusion-3.5-medium")
images = omni.generate(
    prompt="a cat wearing sunglasses, cyberpunk style",
    negative_prompt="blurry, low quality",
    height=1024, width=1024,
    num_inference_steps=28,
    guidance_scale=7.5,
    num_outputs_per_prompt=2,
)
```
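Once images come back, you typically want deterministic filenames for each output. A small helper like the following (hypothetical, not part of the vLLM-Omni API) derives one path per image from the prompt:

```python
import re


def output_paths(prompt: str, num_outputs: int, ext: str = "png") -> list[str]:
    """Build one filename per generated image from a slugified prompt."""
    slug = re.sub(r"[^a-z0-9]+", "-", prompt.lower()).strip("-")[:40]
    return [f"{slug}-{i}.{ext}" for i in range(num_outputs)]


# For the request above with num_outputs_per_prompt=2:
paths = output_paths("a cat wearing sunglasses, cyberpunk style", 2)
# then, assuming the returned objects expose a PIL-style save():
# for img, path in zip(images, paths):
#     img.save(path)
```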
CLI Usage
```shell
python examples/offline_inference/text_to_image/text_to_image.py \
    --model stabilityai/stable-diffusion-3.5-medium \
    --prompt "a cat wearing sunglasses, cyberpunk style" \
    --negative-prompt "blurry, low quality" \
    --height 1024 --width 1024 \
    --num-inference-steps 28 \
    --guidance-scale 7.5
```
Cache-DiT Acceleration
Enable caching for significant speed-ups:
```python
omni = Omni(
    model="stabilityai/stable-diffusion-3.5-medium",
    cache_backend="cache_dit",
    cache_config={
        "Fn_compute_blocks": 8,
        "Bn_compute_blocks": 0,
        "max_warmup_steps": 4,
        "residual_diff_threshold": 0.12,
    },
)
```
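The intuition behind `residual_diff_threshold`: when a transformer block's residual barely changes between consecutive denoising steps, its cached output can be reused instead of recomputed. The toy function below illustrates that decision rule under our own assumptions (a relative L1 comparison); it is not the actual Cache-DiT implementation:

```python
def can_reuse_cache(prev_residual: list[float],
                    curr_residual: list[float],
                    threshold: float = 0.12) -> bool:
    """Illustrative sketch: reuse the cached block output when the
    relative L1 change between consecutive residuals is below threshold."""
    diff = sum(abs(a - b) for a, b in zip(prev_residual, curr_residual))
    norm = sum(abs(a) for a in prev_residual) or 1.0
    return diff / norm < threshold
```

A lower threshold caches less aggressively (higher fidelity, less speed-up); a higher one caches more often at some quality cost.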
Key Parameters
| Parameter | Default | Description |
|---|---|---|
| `height` | 1024 | Image height in pixels (multiple of 16) |
| `width` | 1024 | Image width in pixels (multiple of 16) |
| `num_inference_steps` | 28 | Denoising steps |
| `guidance_scale` | 1.0 | Classifier-free guidance scale |
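Since `height` and `width` must be multiples of 16, a small helper (hypothetical, for illustration) can snap an arbitrary requested size to the nearest valid value before calling `generate`:

```python
def snap_to_multiple(value: int, multiple: int = 16) -> int:
    """Round a requested image dimension to the nearest multiple of 16,
    never going below one full multiple."""
    return max(multiple, round(value / multiple) * multiple)


print(snap_to_multiple(1023))  # -> 1024
```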