DiffusionGemma Offers Up to 4× Faster Text Generation
Google’s new experimental model, DiffusionGemma, uses a diffusion‑based approach that generates blocks of text in parallel rather than one token at a time. On dedicated GPUs it can produce up to four times as many tokens per second as the autoregressive Gemma 4 models, which makes it attractive for low‑latency, interactive local applications.
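
To put the throughput claim in latency terms, here is a rough back-of-the-envelope sketch. The baseline rate and reply length are hypothetical placeholders chosen for illustration, not measurements of either model, and the 4× factor is the upper bound quoted above, so real gains may be smaller.

```python
def generation_time_s(num_tokens: int, tokens_per_second: float) -> float:
    """Seconds needed to emit num_tokens at a steady decoding rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


# Hypothetical numbers for illustration only; not measured results.
BASELINE_TPS = 50.0  # assumed autoregressive throughput, tokens per second
SPEEDUP = 4.0        # the "up to 4x" upper bound from the announcement
REPLY_TOKENS = 400   # assumed length of one reply

baseline_s = generation_time_s(REPLY_TOKENS, BASELINE_TPS)
best_case_s = generation_time_s(REPLY_TOKENS, BASELINE_TPS * SPEEDUP)

print(f"baseline: {baseline_s:.1f} s, best case: {best_case_s:.1f} s")
```

Under these assumed numbers, a reply that takes 8 seconds at the baseline rate would take 2 seconds at the full 4× rate, which is the kind of difference that matters for interactive use.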