LlamaPal is an Android app that runs open-source large language models (LLMs) like Llama, Mistral, Phi, and Gemma directly on your phone. After a model is downloaded, all chat happens 100% offline on your device — no cloud servers, no accounts, no internet required.

Which models can I run?

LlamaPal supports popular open-source GGUF models including Llama 3, Mistral, Phi, Gemma, Qwen, and Hermes. Each model is labeled by speed and quality so you can pick the right one for your device.

Is LlamaPal free?

Yes. Core chat is free forever. Lightweight ads support development. An optional Pro subscription removes ads and unlocks premium themes and custom personas.

What devices are supported?

Android 9 (API 28) or higher on 64-bit ARM devices (arm64-v8a). 6 GB+ RAM is recommended for the best experience. Each model needs roughly 4–8 GB of free storage.
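For developers, the requirements above can be written as a small check. This is a minimal Python sketch, not LlamaPal's actual code: the Device class, the check_device function, and the split into blockers and warnings are all hypothetical. It assumes the API level, ABI, and storage limits are hard requirements, while the 6 GB RAM figure is only a recommendation.

```python
from dataclasses import dataclass

MIN_API_LEVEL = 28           # Android 9
REQUIRED_ABI = "arm64-v8a"   # 64-bit ARM
RECOMMENDED_RAM_GB = 6.0     # recommended, not required


@dataclass
class Device:
    """Hypothetical description of a phone, for illustration only."""
    api_level: int
    abis: list[str]
    ram_gb: float
    free_storage_gb: float


def check_device(device: Device, model_size_gb: float) -> tuple[list[str], list[str]]:
    """Return (blockers, warnings); an empty blockers list means the model can be installed."""
    blockers: list[str] = []
    warnings: list[str] = []
    if device.api_level < MIN_API_LEVEL:
        blockers.append("Android 9 (API 28) or higher is required")
    if REQUIRED_ABI not in device.abis:
        blockers.append("a 64-bit ARM (arm64-v8a) device is required")
    if device.free_storage_gb < model_size_gb:
        blockers.append("not enough free storage for this model")
    if device.ram_gb < RECOMMENDED_RAM_GB:
        warnings.append("below the recommended 6 GB of RAM")
    return blockers, warnings
```

A phone with API level 34, arm64-v8a, 8 GB of RAM, and 20 GB free would return no blockers and no warnings for a 5 GB model.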

How is LlamaPal different from ChatGPT?

ChatGPT runs in the cloud and requires an OpenAI account; your messages are sent to OpenAI servers. LlamaPal runs open-source models on your phone with no account and no cloud. After a one-time model download, LlamaPal works fully offline.

How is LlamaPal different from Google Gemini?

Google Gemini is a cloud-hosted assistant tied to your Google account. LlamaPal runs entirely on-device using open-source models from Hugging Face. No Google account, no cloud, no telemetry on chat content.

How is LlamaPal different from Character.AI?

Character.AI is a cloud chatbot service. LlamaPal runs open-source LLMs locally on Android, with built-in characters and Pro custom personas, and zero cloud round-trips for chat.

Can I use LlamaPal on a flight or in airplane mode?

Yes. Once a model is downloaded, LlamaPal works fully in airplane mode, on flights, in dead zones, and without any SIM card.

Does LlamaPal collect or upload my conversations?

No. Chat content stays on your device. There are no accounts, no chat-data telemetry, and no cloud sync. The only network calls are catalog browsing, model downloads from Hugging Face, anonymous crash reports, and ads on the free tier.

What is GGUF and why does LlamaPal use it?

GGUF is the standard quantized model format used by llama.cpp. It packs open-source LLMs into compact files that can run efficiently on phones. LlamaPal downloads GGUF models from Hugging Face and runs them on-device.
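To make the format a little more concrete: a GGUF file starts with the four magic bytes "GGUF", followed by a version number, a tensor count, and a metadata key/value count. The Python sketch below reads only that fixed header. It assumes a little-endian file at format version 2 or 3, and the function name read_gguf_header is made up for this example; it is not how LlamaPal or llama.cpp loads models.

```python
import struct

GGUF_MAGIC = b"GGUF"
HEADER_SIZE = 24  # 4 magic bytes + uint32 version + two uint64 counts


def read_gguf_header(data: bytes) -> dict:
    """Parse the fixed-size header at the start of a little-endian GGUF v2/v3 file."""
    if len(data) < HEADER_SIZE:
        raise ValueError("too short to contain a GGUF header")
    if data[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file: bad magic bytes")
    # "<IQQ" = little-endian uint32 version, uint64 tensor count, uint64 metadata count.
    version, tensor_count, kv_count = struct.unpack_from("<IQQ", data, 4)
    if version < 2:
        raise ValueError("this sketch only handles GGUF version 2 or later")
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }
```

Passing the first 24 bytes of a downloaded model file is enough to confirm it is a GGUF file before handing it to an inference engine.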

Can I run Llama 3 on my Android phone with LlamaPal?

Yes. LlamaPal supports Llama 3.2 (1B, 3B) and Llama 3.1 (8B), plus Mistral, Phi, Gemma, Qwen, and Hermes. The catalog labels each model by speed and quality so you can pick one that fits your device.

Does LlamaPal need an account or signup?

No. LlamaPal works without any account, login, or signup. Just install, download a model, and chat.

Is LlamaPal open source?

LlamaPal uses the open-source llama.cpp inference engine and runs open-source community models from Hugging Face. The on-device LLM runtime will be released as an open-source library on Maven Central.

How much storage does LlamaPal need?

The app itself is small. Each model adds roughly 4–8 GB depending on size and quantization. You can keep multiple models or delete ones you don't use.
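As a rough rule of thumb, a model file's size is its parameter count times the bits stored per weight, divided by 8. The Python sketch below shows that arithmetic. The function name and the example bit widths are illustrative assumptions; real GGUF files also carry metadata and often mix quantization levels, so actual sizes will differ somewhat.

```python
def estimate_model_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate file size in GB (10^9 bytes): parameters x bits per weight / 8."""
    if params_billions <= 0 or bits_per_weight <= 0:
        raise ValueError("parameter count and bits per weight must be positive")
    # (params_billions * 1e9 weights) * (bits / 8 bytes per weight) / 1e9 bytes per GB
    return params_billions * bits_per_weight / 8


# An 8B-parameter model at about 4.5 bits per weight:
print(estimate_model_size_gb(8, 4.5))  # 4.5
```

By the same arithmetic, halving the bits per weight halves the download, which is why quantization matters so much on a phone.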

Can I use voice input with LlamaPal?

Yes. LlamaPal supports voice input via on-device speech recognition, so even voice stays private.

Is LlamaPal available on iPhone or iOS?

LlamaPal is an Android app today. iOS is not supported at the moment.

LlamaPal — Private Offline AI for Android · The ChatGPT alternative that runs on your phone

LlamaPal is a free Android app that runs open-source large language models (Llama 3.1, Llama 3.2, Mistral 7B, Phi-3, Phi-3.5, Gemma 2, Qwen 2.5, Hermes 3, Nous Hermes, TinyLlama) entirely on your phone. After a one-time model download, every chat happens 100% offline using the open-source llama.cpp inference engine. No accounts, no cloud, no telemetry on chat content, no internet required for inference.

LlamaPal is the private, offline alternative to cloud chatbots such as ChatGPT, Google Gemini, Claude, Perplexity, Pi, and Character.AI. It is ideal for private journaling, flights and airplane mode, travel without roaming, brainstorming without sending ideas to a cloud provider, offline coding help, language practice, and studying.

Why people choose LlamaPal

100% offline AI chat. Works on flights, in airplane mode, and in dead zones after the one-time model download.
Fully private. Inference happens on-device using llama.cpp. Conversations never leave your phone.
No accounts. No signup, no email, no cloud sync, no telemetry on chat content.
Free. Core chat is free forever. Optional Pro subscription removes ads and unlocks premium themes and custom personas.
Open-source models. Curated catalog of GGUF models from Hugging Face. Pick one labeled for your phone's speed and quality.
Voice input. On-device speech recognition. Material 3 dark UI designed for long sessions.

How LlamaPal compares

ChatGPT runs in the cloud and requires an OpenAI account. Google Gemini is tied to your Google account and runs on Google servers. Anthropic's Claude is a cloud-hosted assistant. Character.AI, Pi, and Perplexity are all cloud services. LlamaPal is different: it runs open-source models locally on your Android phone with no account, no cloud, and no internet required for chat.

Supported open-source LLMs

Llama 3.2 (1B, 3B) · Llama 3.1 (8B) · Mistral 7B · Phi-3 · Phi-3.5 · Gemma 2 (2B, 9B) · Qwen 2.5 · Hermes 3 · Nous Hermes · TinyLlama — plus other community GGUF models from Hugging Face.

Device requirements

Android 9 (API 28) or higher, 64-bit ARM (arm64-v8a). 6 GB+ RAM recommended. Each model needs roughly 4–8 GB of free storage. Internet is only needed for browsing the model catalog and downloading models.

Common questions

Does LlamaPal work offline? Yes. After a model is downloaded once, the app works completely offline, including airplane mode, flights, and dead zones.

Is LlamaPal really private? Yes. Inference runs on-device using llama.cpp. Conversations never leave the phone. There are no accounts and no telemetry on chat content.

Is LlamaPal a ChatGPT alternative? Yes — a private, offline alternative. Instead of sending messages to a remote server, LlamaPal runs the model locally on your phone.

Is LlamaPal free? Yes. Core chat is free forever. An optional Pro subscription removes ads and unlocks premium themes and custom personas.

How do I run Llama 3 on my Android phone? Install LlamaPal from Google Play, open the model catalog, and download a Llama 3 GGUF model sized for your device.

Is there a ChatGPT app that works in airplane mode? LlamaPal is not ChatGPT, but it is a ChatGPT-style chat app built for airplane mode. Inference runs locally, so no internet is needed for chat.

What is GGUF? GGUF is the quantized model format used by llama.cpp. LlamaPal downloads GGUF models from Hugging Face and runs them on-device.

Download LlamaPal free on the Google Play Store.

AI you own — not AI you rent.

LlamaPal runs open-source AI models — Llama 3, Mistral, Phi, Gemma, Qwen, Hermes — directly on your Android phone. 100% offline, fully private, no accounts, no cloud, no internet required after the one-time model download.

Why LlamaPal

100% offline. Works in airplane mode after the first model download. No internet required for chat.
Fully private. Inference runs on-device via llama.cpp. Conversations never leave your phone.
No accounts. No signup, no cloud, no telemetry on chat content.
Open-source models. Curated catalog of community GGUF models from Hugging Face.
Free forever. Optional Pro subscription removes ads and unlocks premium themes and custom personas.
Voice input. On-device speech recognition. Material 3 dark UI designed for long sessions.

Supported open-source LLMs

Llama 3.2 · Llama 3.1 · Mistral 7B · Phi-3 · Phi-3.5 · Gemma 2 · Qwen 2.5 · Hermes 3 · TinyLlama · Nous Hermes — and more community GGUF models from Hugging Face.

How it works

1. Install LlamaPal free from Google Play (Android 9+, 64-bit ARM, 6 GB RAM recommended).
2. Browse the model catalog and download a GGUF model sized for your device (typically 4–8 GB).
3. Chat fully offline, even in airplane mode. Your messages stay on-device.
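The "sized for your device" step above can be sketched in Python as picking the largest catalog entry that fits in free storage, then building its download link. Everything here is an assumption for illustration: the CatalogEntry fields, the repository and file names, and the "largest that fits" rule are hypothetical, and LlamaPal's real catalog may choose differently. The URL shape is the standard Hugging Face pattern for fetching a file from a repository.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import quote


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical catalog record; field names are illustrative."""
    name: str
    repo_id: str     # Hugging Face repository id (placeholder values below)
    filename: str    # GGUF file name inside that repository
    size_gb: float


def pick_model(catalog: list[CatalogEntry], free_storage_gb: float) -> Optional[CatalogEntry]:
    """Return the largest entry that fits in free storage, or None if nothing fits."""
    fitting = [m for m in catalog if m.size_gb <= free_storage_gb]
    return max(fitting, key=lambda m: m.size_gb, default=None)


def download_url(entry: CatalogEntry, revision: str = "main") -> str:
    """Build a Hugging Face file URL: /<repo_id>/resolve/<revision>/<filename>."""
    return (
        f"https://huggingface.co/{entry.repo_id}"
        f"/resolve/{quote(revision)}/{quote(entry.filename)}"
    )


# Placeholder catalog; these repositories do not refer to real models.
CATALOG = [
    CatalogEntry("Small", "example-org/small-GGUF", "small.gguf", 2.0),
    CatalogEntry("Medium", "example-org/medium-GGUF", "medium.gguf", 4.7),
    CatalogEntry("Large", "example-org/large-GGUF", "large.gguf", 8.5),
]
```

With 6 GB free, pick_model(CATALOG, 6.0) selects the "Medium" entry; with 1 GB free it returns None, which an app could surface as a prompt to free up space.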

How LlamaPal compares

vs. ChatGPT: ChatGPT runs in the cloud and needs an OpenAI account. LlamaPal runs open-source models on your phone with no account and no cloud.
vs. Google Gemini: Gemini is tied to your Google account and runs on Google servers. LlamaPal is on-device and account-free.
vs. Claude: Claude is a cloud assistant from Anthropic. LlamaPal runs locally — no API keys, no usage limits, no internet for chat.
vs. Character.AI / Pi / Perplexity: All cloud-hosted. LlamaPal is local-first with built-in characters and offline operation.

Great use cases

Private journaling and venting
AI on flights and in airplane mode
Travel without roaming data
Brainstorming without sending ideas to a cloud
Coding help when offline
Studying / language practice on the go
Running Llama 3, Mistral, Phi, Gemma, Qwen on a phone

Is LlamaPal a ChatGPT alternative?

Yes. LlamaPal is a private, offline alternative to cloud chatbots like ChatGPT, Gemini, and Claude. Instead of sending your messages to a remote server, LlamaPal runs the model locally on your phone using the open-source llama.cpp inference engine.

Privacy

Chat content never leaves the device. There are no accounts and no chat-data telemetry. The only network calls are: catalog browsing, model downloads from Hugging Face, anonymous crash reports, and ads on the free tier.
