LlamaPal — Private Offline AI for Android · The ChatGPT alternative that runs on your phone
LlamaPal is a free Android app that runs open-source large language
models — Llama 3, Llama 3.2, Mistral 7B, Phi-3, Phi-3.5, Gemma 2,
Qwen 2.5, Hermes 3, Nous Hermes, TinyLlama — entirely on your phone.
After a one-time model download, every chat happens 100% offline using
the open-source llama.cpp inference engine. No accounts, no cloud, no
telemetry on chat content, no internet required for inference.
LlamaPal is the private, offline alternative to cloud chatbots such as
ChatGPT, Google Gemini, Claude, Perplexity, Pi, and Character.AI. It
is ideal for private journaling, flights and airplane mode, travel
without roaming, brainstorming without sending ideas to a cloud
provider, offline coding help, language practice, and studying.
Why people choose LlamaPal
100% offline AI chat. Works on flights, in airplane mode, and in dead zones after the one-time model download.
Fully private. Inference happens on-device using llama.cpp. Conversations never leave your phone.
No accounts. No signup, no email, no cloud sync, no telemetry on chat content.
Free. Core chat is free forever. Optional Pro subscription removes ads and unlocks premium themes and custom personas.
Open-source models. Curated catalog of GGUF models from Hugging Face. Each model is labeled for speed and quality, so you can pick one that suits your phone.
Voice input. Speech recognition runs on-device.
Comfortable UI. Material 3 dark theme designed for long sessions.
How LlamaPal compares
ChatGPT runs in the cloud and requires an OpenAI account. Google Gemini
is tied to your Google account and runs on Google servers. Anthropic's
Claude runs on Anthropic servers. Character.AI, Pi, and Perplexity are all cloud
services. LlamaPal is different — it runs open-source models locally
on your Android phone with no account, no cloud, and no internet
required for chat.
Supported open-source LLMs
Llama 3.2 (1B, 3B) · Llama 3.1 (8B) · Mistral 7B · Phi-3 · Phi-3.5 ·
Gemma 2 (2B, 9B) · Qwen 2.5 · Hermes 3 · Nous Hermes · TinyLlama —
plus other community GGUF models from Hugging Face.
Device requirements
Android 9 (API 28) or higher, 64-bit ARM (arm64-v8a). 6 GB+ RAM
recommended. Each model needs roughly 4–8 GB of free storage. Internet
is only needed for browsing the model catalog and downloading models.
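For a rough sense of where a storage figure like this comes from, a quantized model file is approximately its parameter count times the bits stored per weight. The sketch below is a back-of-the-envelope estimate only, not how LlamaPal sizes its catalog: the function names, the 4.5 bits-per-weight default, and the 1 GB headroom are illustrative assumptions, and the parameter counts are the nominal sizes from the model list.

```python
def estimate_gguf_size_gb(params_billions: float, bits_per_weight: float = 4.5) -> float:
    """Rough file size in GB (10^9 bytes) for a quantized model.

    bits_per_weight is an assumed average for a 4-bit style quantization;
    real GGUF files vary with the quantization scheme and metadata.
    """
    if params_billions <= 0 or bits_per_weight <= 0:
        raise ValueError("params_billions and bits_per_weight must be positive")
    total_bits = params_billions * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9  # bits -> bytes -> GB


def fits_in_storage(params_billions: float, free_gb: float,
                    bits_per_weight: float = 4.5, headroom_gb: float = 1.0) -> bool:
    """True if the estimated file plus some assumed headroom fits in free storage."""
    needed = estimate_gguf_size_gb(params_billions, bits_per_weight) + headroom_gb
    return needed <= free_gb


if __name__ == "__main__":
    # Nominal sizes taken from the supported-model list.
    for label, size in [("1B", 1.0), ("3B", 3.0), ("8B", 8.0)]:
        print(f"{label}: ~{estimate_gguf_size_gb(size):.1f} GB")
```

By this estimate an 8B model at about 4.5 bits per weight lands near the low end of the 4–8 GB range, and higher-precision quantizations of the same model push toward the upper end.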
Common questions
Does LlamaPal work offline? Yes. After a model is downloaded once, the app works completely offline, including in airplane mode, on flights, and in dead zones.
Is LlamaPal really private? Yes. Inference runs on-device using llama.cpp. Conversations never leave the phone. There are no accounts and no telemetry on chat content.
Is LlamaPal a ChatGPT alternative? Yes — a private, offline alternative. Instead of sending messages to a remote server, LlamaPal runs the model locally on your phone.
Is LlamaPal free? Yes. Core chat is free forever. An optional Pro subscription removes ads and unlocks premium themes and custom personas.
How do I run Llama 3 on my Android phone? Install LlamaPal from Google Play, open the model catalog, and download a Llama 3 GGUF model sized for your device.
Is there a ChatGPT app that works in airplane mode? ChatGPT itself runs in the cloud, but LlamaPal offers a similar chat experience that is built for airplane mode. Inference runs locally, so no internet is needed for chat.
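What is GGUF? GGUF is the model file format used by llama.cpp. It stores a model's weights, usually quantized to reduce size, along with its metadata in a single file. LlamaPal downloads GGUF models from Hugging Face and runs them on-device.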
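For the curious, a GGUF file starts with a small fixed header: the four ASCII bytes "GGUF", a 32-bit version number, then 64-bit counts of tensors and metadata entries. The sketch below reads just that header. It is an illustration, not LlamaPal's own code: the names `GGUFHeader` and `read_gguf_header` are hypothetical, and it assumes a little-endian file of GGUF version 2 or later (version 1 used narrower count fields).

```python
import io
import struct
from typing import BinaryIO, NamedTuple

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


class GGUFHeader(NamedTuple):
    version: int
    tensor_count: int
    metadata_kv_count: int


def read_gguf_header(stream: BinaryIO) -> GGUFHeader:
    """Read the fixed-size GGUF header from a binary stream.

    Assumes little-endian byte order and GGUF version 2 or later.
    """
    magic = stream.read(4)
    if magic != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file (magic bytes were {magic!r})")

    raw_version = stream.read(4)
    if len(raw_version) != 4:
        raise ValueError("truncated GGUF header")
    (version,) = struct.unpack("<I", raw_version)
    if version < 2:
        raise ValueError("this sketch only handles GGUF version 2 or later")

    raw_counts = stream.read(16)
    if len(raw_counts) != 16:
        raise ValueError("truncated GGUF header")
    tensor_count, kv_count = struct.unpack("<QQ", raw_counts)
    return GGUFHeader(version, tensor_count, kv_count)


if __name__ == "__main__":
    # Synthetic header bytes for demonstration, not taken from a real model.
    fake = io.BytesIO(GGUF_MAGIC + struct.pack("<IQQ", 3, 291, 24))
    print(read_gguf_header(fake))
```

To inspect a real download, open the file with `open(path, "rb")` and pass the file object to `read_gguf_header`.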