AI Tools #jan-ai #local-llm #open-source

Jan.ai Guide: Open-Source Local AI Assistant for 2026

Jan.ai is a fully open-source ChatGPT alternative that runs locally. Learn how to install Jan, run local models, and use it as a privacy-first AI assistant.

Jan.ai is a fully open-source desktop application for running AI models locally. Built on top of llama.cpp, it offers a clean ChatGPT-like interface while keeping all your data on your machine. Unlike LM Studio, whose desktop app is closed-source, Jan publishes its full source code and is community-driven. It also ships with a built-in model hub, an OpenAI-compatible API server, and experimental multi-engine support.

Why Jan.ai?

Jan stands out in the crowded local LLM space for a few reasons:

  • Fully open-source — Full source on GitHub at janhq/jan (the license has changed between releases, so check the repository's LICENSE file for current terms)
  • Cross-platform — macOS, Windows, and Linux with native GPU support
  • Extensions system — Plugin architecture for adding new engines, themes, and tools
  • Remote model support — Connect to OpenAI, Anthropic, or Groq alongside local models
  • Jan Assistant — Built-in AI assistant with persistent memory and tool use

System Requirements

Jan runs on modest hardware, though larger models need more RAM and VRAM:

Hardware    Minimum            For good performance
RAM         8 GB               16 GB+
GPU VRAM    4 GB (optional)    8 GB+
Storage     5 GB + models      50 GB+
CPU         Any x64 or ARM     Recent multi-core

Apple Silicon (M1/M2/M3/M4) users get excellent Metal acceleration. NVIDIA and AMD GPUs are supported on Windows and Linux.

Installation

Download the latest release from jan.ai or the GitHub releases page:

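# Linux: download the .deb or .AppImage from the releases page.
# Release asset names may include a version number; adjust the filenames below to match.
# For Debian/Ubuntu:
wget https://github.com/janhq/jan/releases/latest/download/jan-linux-amd64.deb
sudo apt install ./jan-linux-amd64.deb   # also installs missing dependencies, unlike plain dpkg -i

# AppImage alternative
wget https://github.com/janhq/jan/releases/latest/download/jan-linux-x86_64.AppImage
chmod +x jan-linux-x86_64.AppImage
./jan-linux-x86_64.AppImage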

On Windows, run the .exe installer. On macOS, mount the .dmg and drag Jan to Applications. Apple Silicon and Intel Macs each have separate installers — pick the right one.

Downloading and Running Models

Jan includes a built-in Hub for discovering models. Click the Hub icon in the left sidebar to browse:

  • Llama 3.2 3B Instruct — Great default for everyday tasks, 2 GB
  • Llama 3.1 8B Instruct — Stronger reasoning, 5 GB
  • Mistral 7B Instruct — Fast and instruction-tuned, 4 GB
  • Gemma 2 9B IT — Google’s instruction model, 5 GB
  • DeepSeek-R1 7B — Reasoning and math tasks, 4 GB
  • Qwen 2.5 Coder 7B — Excellent for code generation

Click Download next to any model. Jan shows the file size, quantization level, and VRAM estimate before you commit.
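
To sanity-check the sizes listed above before downloading, a rough back-of-the-envelope estimate can help. The sketch below is only an approximation: the figure of about 4.85 bits per weight for Q4_K_M and the parameter counts are assumptions, and real GGUF files vary with architecture and metadata.

```python
def estimate_gguf_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate file size in GB (10^9 bytes) of a quantized model.

    params_billion * 1e9 weights * bits_per_weight / 8 bits per byte,
    divided by 1e9 bytes per GB, simplifies to the expression below.
    """
    return params_billion * bits_per_weight / 8


# Assumed average bits per weight for Q4_K_M; treat it as a ballpark figure.
Q4_K_M_BPW = 4.85

# Parameter counts are approximate and given for illustration only.
for name, params in [("3B-class model", 3.2), ("7B-class model", 7.2), ("8B-class model", 8.0)]:
    size = estimate_gguf_size_gb(params, Q4_K_M_BPW)
    print(f"{name}: roughly {size:.1f} GB at Q4_K_M")
```

The model also needs working memory on top of the file size, so leave headroom beyond these numbers when comparing against your RAM or VRAM.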

Adding Custom GGUF Models

Jan can load any GGUF model file you already have on disk:

  1. Open Jan and go to Settings > Models
  2. Click Import Model
  3. Select your .gguf file
  4. Jan registers it and makes it available in the model picker

You can also drop GGUF files directly into Jan’s model folder:

  • macOS/Linux: ~/jan/models/
  • Windows: C:\Users\<username>\jan\models\

Create a subfolder for each model and include a model.json metadata file, or let Jan auto-detect it.
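
If you prefer to script the manual route, the following is a minimal sketch that copies a GGUF file into its own subfolder. The function name install_gguf and the choice to name the subfolder after the file are assumptions for illustration; it does not write a model.json and relies on Jan's auto-detection described above.

```python
import shutil
from pathlib import Path


def install_gguf(gguf_path: Path, models_dir: Path) -> Path:
    """Copy a GGUF file into a per-model subfolder of models_dir.

    Hypothetical helper: the subfolder is named after the file (without
    the .gguf extension). Returns the path of the copied file.
    """
    if gguf_path.suffix.lower() != ".gguf":
        raise ValueError(f"not a GGUF file: {gguf_path}")
    target_dir = models_dir / gguf_path.stem
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / gguf_path.name
    shutil.copy2(gguf_path, target)  # copy2 also preserves timestamps
    return target


# Example call (both paths are illustrative):
# install_gguf(Path("~/Downloads/my-model.gguf").expanduser(),
#              Path("~/jan/models").expanduser())
```

Restart Jan or reopen the model picker afterwards so the new folder is picked up.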

Using the Chat Interface

Jan’s chat interface is intentionally minimal:

  1. Click New Thread (pencil icon)
  2. Select a model from the dropdown at the top
  3. Type your message and press Enter

The right panel shows model settings you can adjust per-conversation:

  • System Prompt — Define the AI’s role and behavior
  • Temperature — Response randomness (0–1)
  • Max Tokens — Response length limit
  • Context Length — How much conversation history the model sees
  • Top P / Top K — Sampling parameters
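
To build intuition for how Temperature, Top K, and Top P interact, here is a generic sketch of a token sampler. It is not Jan's or llama.cpp's actual code; the function name, the default values, and the order in which the filters are applied are assumptions chosen for clarity.

```python
import math
import random


def sample_token(logits, temperature=0.7, top_k=40, top_p=0.95, rng=random):
    """Pick a token index from raw logits (illustrative sampler only)."""
    if temperature <= 0:
        # Zero temperature is treated as greedy decoding: always the top token.
        return max(range(len(logits)), key=lambda i: logits[i])

    # Temperature rescales logits: lower values sharpen the distribution.
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    exps = [math.exp(value - peak) for value in scaled]  # stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top K: keep only the k most likely tokens (at least one).
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    ranked = ranked[:max(1, top_k)]

    # Top P: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for index in ranked:
        kept.append(index)
        cumulative += probs[index]
        if cumulative >= top_p:
            break

    # Draw from the surviving tokens in proportion to their probabilities.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


print(sample_token([1.0, 5.0, 2.0], temperature=0))
```

Lowering Temperature or tightening Top K and Top P narrows the set of candidate tokens, which makes responses more predictable at the cost of variety.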

Creating Assistants

Jan’s Assistants feature lets you save custom configurations:

  1. Go to Assistants in the left sidebar
  2. Click Create Assistant
  3. Give it a name, system prompt, and default model
  4. Save — it appears in your assistant list for quick access

You can build specialized assistants for coding, writing, security research, or any other domain, each with its own system prompt and model settings.

Jan’s Local API Server

Jan runs an OpenAI-compatible API server that your applications can use:

  1. Go to Settings > Advanced
  2. Enable Jan API Server
  3. The server starts on http://localhost:1337 by default

Connecting to Jan’s API

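from openai import OpenAI

# Point the OpenAI client at Jan's local server instead of api.openai.com
client = OpenAI(
    base_url="http://localhost:1337/v1",
    api_key="jan-local"  # any value
)

# The model name must match an ID that Jan reports at /v1/models
response = client.chat.completions.create(
    model="llama3.2-3b-instruct",
    messages=[
        {"role": "user", "content": "What is the difference between TCP and UDP?"}
    ]
)
# Print the text of the first (and here only) completion
print(response.choices[0].message.content)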

List available models via the API:

curl http://localhost:1337/v1/models
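
The same lookup can be done from Python with only the standard library. This is a sketch that assumes the response follows the OpenAI list format, a JSON object with a "data" array whose entries carry an "id" field; the helper names are illustrative.

```python
import json
import urllib.request


def extract_model_ids(payload: dict) -> list[str]:
    """Return the model IDs from an OpenAI-style /v1/models response."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_models(base_url: str = "http://localhost:1337/v1") -> list[str]:
    """Query a running Jan API server and return its model IDs."""
    with urllib.request.urlopen(f"{base_url}/models", timeout=5) as resp:
        return extract_model_ids(json.load(resp))


# With the Jan API server enabled, this prints the available IDs:
# print(list_models())
```

Use one of the returned IDs as the model argument in the chat completion call so the name matches exactly.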

Connecting Remote AI Providers

Jan is not just for local models. You can configure remote providers alongside your local ones:

  1. Go to Settings > Model Providers
  2. Add your OpenAI API key for GPT-4o access
  3. Add your Anthropic API key for Claude access
  4. Add your Groq API key for fast cloud inference

This lets you switch between a local Llama model for private tasks and GPT-4o for tasks requiring more power — all from the same interface.

Extensions and Customization

Jan’s extension system is one of its differentiating features. Access it via Settings > Extensions:

  • TensorRT-LLM Engine — NVIDIA-optimized inference engine for RTX GPU owners (can be faster than llama.cpp on supported cards)
  • Nitro Engine — Jan’s own optimized inference backend
  • Themes — Switch between light and dark themes

The extension API is documented at jan.ai/docs, and the community has built additional engines, tools, and integrations.

File Storage and Privacy

All Jan data lives in ~/jan/ on your system:

~/jan/
├── models/          # Downloaded GGUF files
├── threads/         # Chat history (JSON files)
├── assistants/      # Saved assistant configs
├── extensions/      # Installed extensions
└── settings.json    # App configuration

Nothing is sent externally when using local models. Chat history, prompts, and responses are stored in plain JSON in ~/jan/threads/. You can back these up, move them between machines, or delete them freely.
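
Because the threads are ordinary files, a backup can be as simple as zipping the folder. The sketch below assumes the ~/jan/threads layout shown above; the function name and archive name are illustrative.

```python
import shutil
from pathlib import Path


def backup_threads(jan_dir: Path, dest_dir: Path, name: str = "jan-threads-backup") -> Path:
    """Zip the threads folder inside jan_dir and return the archive path."""
    threads = jan_dir / "threads"
    if not threads.is_dir():
        raise FileNotFoundError(f"no threads folder at {threads}")
    dest_dir.mkdir(parents=True, exist_ok=True)
    # make_archive appends ".zip" to the base name and returns the full path.
    archive = shutil.make_archive(str(dest_dir / name), "zip", root_dir=threads)
    return Path(archive)


# Example call (destination is illustrative):
# backup_threads(Path("~/jan").expanduser(), Path("~/backups").expanduser())
```

Keep the destination outside the threads folder so the archive does not end up inside its own source directory.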

Jan vs. LM Studio vs. Ollama

Feature            Jan                     LM Studio    Ollama
Open source        Yes                     Partial      Yes (MIT)
GUI                Yes                     Yes          No (needs add-on)
API server         Yes                     Yes          Yes
Remote providers   Yes                     Limited      No
Extensions         Yes                     No           No
Best for           Privacy + flexibility   Beginners    Developers

Performance Tips

  1. Enable GPU acceleration — Check Settings > GPU Settings. Jan shows how many layers are GPU-accelerated.
  2. Reduce context length — If responses are slow, lower the context window from 4096 to 2048 tokens.
  3. Use smaller quantizations for speed — Q4_K_M is the sweet spot for most users.
  4. Unload unused models — Each loaded model occupies VRAM, so unload the ones you’re not using.
  5. Use the TensorRT extension on NVIDIA — It can be 2–3x faster than the default engine for RTX 3000/4000 series cards.

Jan.ai delivers a complete, privacy-first AI assistant experience with no telemetry, no subscriptions, and no vendor lock-in. For users who want the full ChatGPT experience but entirely on their own hardware, Jan is among the best options available in 2026.
