
7 Tiny AI Models for Raspberry Pi


πŸ“ 7 Tiny AI Models That Run on a Raspberry Pi

Thanks to quantization and modern architectures, AI no longer requires expensive servers. These seven models, ranging from 1B to 4B parameters, prove that small doesn't mean weak. All you need is llama.cpp and a quantized model from Hugging Face.
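As a sketch, that workflow might look like the following on a Raspberry Pi. The Hugging Face repo name is an illustrative assumption (browse the model's page for the real GGUF repo), and build steps may differ between llama.cpp releases, so check its README:

```shell
# Build llama.cpp from source (assumes git and cmake are installed).
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build && cmake --build build --config Release

# Run a quantized model. The -hf flag tells llama-cli to fetch the GGUF
# directly from Hugging Face; the repo name below is an assumption.
./build/bin/llama-cli -hf Qwen/Qwen3-4B-Instruct-2507-GGUF -p "Hello from my Pi"
```

On a 4GB board, prefer a 4-bit quant (e.g. a Q4 file) so the weights leave room for the OS and the context cache.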

πŸ€– The models:

  • πŸ₯‡ Qwen3 4B 2507 β€” Reasoning, math, code, tool calling, 256K context. Most recommended.
  • πŸ‘οΈ Qwen3 VL 4B β€” Multimodal with vision: images, video, and long text. Can act as an autonomous UI agent.
  • 🌍 EXAONE 4.0 1.2B β€” Just 1.2B params, reasoning mode, supports English, Korean, and Spanish.
  • πŸ”§ Ministral 3B β€” From Mistral AI, vision + structured JSON outputs.
  • 🧠 Jamba 3B β€” Hybrid Transformer-Mamba architecture, high efficiency, 256K context.
  • 🏒 Granite 4.0 Micro β€” IBM, enterprise-focused, RAG and function calling, Apache 2.0.
  • πŸ’Ž Phi-4 Mini β€” Microsoft, 3.8B params, high-quality synthetic training data, 128K context.

πŸ’‘ Quick explanation

Quantization “compresses” AI models by storing their weights at lower numerical precision, typically converting 16- or 32-bit floats to 8- or even 4-bit integers. A model that would need 16GB of RAM can fit into a Raspberry Pi’s 4GB, with minimal quality loss. That’s edge AI, literally!
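The numbers above are easy to check with back-of-the-envelope arithmetic. A sketch (real runtimes add overhead such as the KV cache and activations, so treat these as lower bounds):

```python
def model_bytes(params: int, bits_per_weight: int) -> float:
    """Approximate storage for the weights alone, ignoring runtime overhead."""
    return params * bits_per_weight / 8

params = 4_000_000_000  # a 4B-parameter model
gb = 1024 ** 3

print(f"32-bit: {model_bytes(params, 32) / gb:.1f} GB")  # ~14.9 GB
print(f" 8-bit: {model_bytes(params, 8) / gb:.1f} GB")   # ~3.7 GB
print(f" 4-bit: {model_bytes(params, 4) / gb:.1f} GB")   # ~1.9 GB
```

So a 4-bit quant of a 4B model fits comfortably in a 4GB Raspberry Pi, which is why the models listed above top out around that size.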

More information at the link πŸ‘‡

Also published on LinkedIn.
Juan Pedro Bretti Mandarano