LM Playground lets you run large language models directly on your Android device. Download models, load them in one tap, and chat — all offline, all private. No cloud servers, no API keys, no data leaving your device.
KEY FEATURES
On-device inference — all processing happens locally on your device. Your conversations stay private and never leave your phone.
Chat history — all your conversations are saved and organized. Pin, rename, or delete sessions from the sidebar. Resume any conversation right where you left off.
Rich chat experience — responses are rendered with full markdown support including headers, code blocks, lists, bold, italic, and more.
Reasoning models — see the thinking process of models like DeepSeek R1, Nemotron, and LFM2.5 Thinking displayed in a styled, collapsible section with an adjustable thinking budget.
Tools — capable models can search the web, fetch a page, and run JavaScript right inside a reply. Each tool is off by default; turn on only the ones you want, per model, in Settings → Tools. Web search and fetch reach the internet only when you switch them on — everything else stays on your device.
Background generation — start a reply, then leave the app and it keeps running. A live notification shows generation status and token count, lets you copy or share the result without reopening the app, and chimes when it finishes.
Generation speed tracking — see token count, generation time, and tok/s speed for every response.
Per-model parameters — each model remembers its own generation settings. Fine-tune context size, thinking budget, temperature, Top-P, Top-K, Min-P, repetition penalty, and seed.
System prompts — save reusable instructions once and pick the right one for any model. Keep tone, role, or output format consistent across sessions.
Custom models — load your own GGUF model files from any source alongside the built-in catalog.
Reliable downloads — custom download engine with progress notifications, speed and ETA display, and automatic resume on network interruptions.
Flexible storage — choose where to store multi-GB model files using Android's Storage Access Framework. Easily move models between locations.
Optimized performance — ARM-optimized with KleidiAI kernels and OpenMP for faster generation on arm64 devices.
Models start at just 267 MB. Larger models (4B–8B) benefit from 8+ GB of RAM, and the 20B model needs a high-end device with plenty of memory. You can also load any custom GGUF model.
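For readers curious about what the sampling settings under "Per-model parameters" do, here is a minimal Python sketch of how temperature, Top-K, Top-P, Min-P, and seed are commonly defined. It is an illustration only, not the app's or llama.cpp's actual code. The function names `filter_probs` and `sample`, the order in which the filters run, and the toy logits are all assumptions made for the example. Repetition penalty is left out.

```python
import math
import random


def filter_probs(logits, temperature=1.0, top_k=0, top_p=1.0, min_p=0.0):
    """Return {token_index: probability} after the filters are applied.

    Hypothetical helper for illustration; assumes temperature > 0.
    """
    # Temperature: divide logits before softmax. Lower values sharpen
    # the distribution, higher values flatten it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = {i: e / total for i, e in enumerate(exps)}

    def renormalize(p):
        s = sum(p.values())
        return {i: v / s for i, v in p.items()}

    # Top-K: keep only the K most likely tokens (0 disables the filter).
    order = sorted(probs, key=probs.get, reverse=True)
    if top_k > 0:
        order = order[:top_k]
    probs = renormalize({i: probs[i] for i in order})

    # Top-P: keep the smallest prefix whose cumulative probability
    # reaches top_p.
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    probs = renormalize({i: probs[i] for i in kept})

    # Min-P: drop tokens less likely than min_p times the top token.
    threshold = min_p * probs[kept[0]]
    return renormalize({i: p for i, p in probs.items() if p >= threshold})


def sample(probs, seed):
    """Draw one token index; the same seed gives the same draw."""
    rng = random.Random(seed)
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens])[0]


# Toy example: four candidate tokens with made-up logits.
candidates = filter_probs([2.0, 1.0, 0.0, -1.0], top_k=3, top_p=0.9, min_p=0.05)
token = sample(candidates, seed=42)
```

With these toy values, Top-K drops the least likely token and Top-P then drops the third, leaving two candidates to sample from. Reusing the same seed reproduces the same pick.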
OPEN SOURCE
LM Playground is open source under the MIT License. Powered by llama.cpp with models from Hugging Face.