Google has announced Gemma 4 12B, its latest open-source generative AI model, which the firm says is small enough to run locally while still capable of performing complex tasks.
The model is intended to strike a balance between performance and size, with a reduced memory footprint that Google claims will “bring agentic multimodal intelligence directly to laptops”.
“Gemma 4 12B delivers performance nearing our larger 26B Mixture of Experts model on standard benchmarks, but at less than half the total memory footprint,” Google said in a statement. “Small enough to run locally on consumer laptops with 16GB of RAM, it unlocks powerful multimodal and agentic experiences right on your machine.”
In AI benchmarks such as MMLU Pro and GPQA Diamond, which test language processing and graduate-level scientific knowledge, Gemma 4 12B achieves 77.2 per cent and 78.8 per cent accuracy respectively. On these measures it outperforms some open models intended to run on-device, such as Meta’s Llama 4 Scout, which scored 74.3 per cent and 57.2 per cent on the same benchmarks.
Developers on Reddit have already drawn comparisons between Gemma 4 12B and Alibaba’s Qwen3.5-9B, which beats Google’s new model on the above benchmarks, scoring 82.5 per cent on MMLU Pro and 81.7 per cent on GPQA Diamond, but performs worse in coding tests such as LiveCodeBench v6. In the latter, Gemma 4 12B scores 72 per cent versus Qwen3.5-9B’s 65.6 per cent.
One Reddit user said that Gemma 4 12B is “hands down better” for text work, while another added that the model is superior even to the newer Qwen3.6 models when it comes to following specific instructions via retrieval-augmented generation.
Users looking to run Gemma 4 12B on their laptops will need at least 16GB of unified memory or VRAM, which is beyond the reach of entry-level laptops but well within the range of devices aimed at professionals.
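As a rough illustration of why the 16GB figure matters, the back-of-the-envelope sketch below estimates the memory needed for the model's weights alone at different numeric precisions. It assumes that "12B" means roughly 12 billion parameters and ignores activations, context cache and operating system overhead. The precisions shown are common choices for local inference in general, not formats Google has confirmed for this model, and the function name is purely illustrative.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


# Assumption: "12B" in the model name means about 12 billion parameters.
PARAMS_BILLIONS = 12
# The laptop memory figure quoted in the article.
RAM_GB = 16

# Assumption: 16-, 8- and 4-bit weights are typical options, not confirmed ones.
for bits in (16, 8, 4):
    need = weight_memory_gb(PARAMS_BILLIONS, bits)
    verdict = "fits" if need < RAM_GB else "does not fit"
    print(f"{bits}-bit weights: ~{need:.0f} GB, {verdict} in {RAM_GB} GB")
```

Under these assumptions, full 16-bit weights would need about 24GB, more than a 16GB laptop offers, while 8-bit and 4-bit versions would need about 12GB and 6GB respectively. Real-world usage would be higher once working memory is included.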
The Gemma family of AI models is available under the open-source Apache 2.0 licence, unlike Google’s frontier Gemini models, which are proprietary and only accessible via Google Cloud.




