AI Models & Privacy
Choose between local models (privacy) and frontier models (power)
Why Use Local Models?
✓ Privacy & Security
Your code and conversations never leave your machine
✓ One-Click Management
Download models and start servers from Cue's GUI - no terminal commands required
✓ Offline Access
Work without an internet connection once models are downloaded
✓ No API Costs
No usage limits, monthly fees, or per-token charges
Setup Options
Ollama (Recommended)
An easy-to-use desktop application for running local AI models
Install Ollama:
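One common route is the official install script on Linux; macOS and Windows users can also download the desktop app from ollama.com. A minimal sketch:

```bash
# Linux: official one-line installer from ollama.com
curl -fsSL https://ollama.com/install.sh | sh

# macOS alternative, if you use Homebrew
brew install ollama
```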
Then start the Ollama service:
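If you installed the desktop app, the service typically starts on its own; otherwise you can run it from a terminal (it listens on localhost:11434 by default):

```bash
# Start the Ollama server in the foreground
ollama serve
```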
Model Discovery in Cue
Once Ollama is installed, Cue's built-in model discovery feature makes it easy to browse and download models directly from the app.

How it works:
- Open Cue and go to Settings → Providers
- Enable the Ollama provider
- Browse available models in the Model Discovery section
- Select a model and click the Download button
- Start chatting once the download completes!
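If you'd like to double-check a download outside the GUI, Ollama's CLI can list and test local models (optional; assumes `ollama` is on your PATH, and `llama3.2` is just an example model name):

```bash
# Show models Ollama has downloaded locally
ollama list

# Quick smoke test: chat with a model from the terminal
ollama run llama3.2
```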
llama.cpp (Advanced)
A high-performance C++ inference engine for technical users
Install llama.cpp:
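Two common routes, depending on your platform: the Homebrew formula on macOS, or a source build anywhere with git and CMake. A sketch:

```bash
# macOS: Homebrew formula installs llama-server and llama-cli
brew install llama.cpp

# Any platform: build from source
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release
```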
That's it! Cue handles the rest.
One-Click Model Management
Once llama.cpp is installed, Cue provides a complete GUI for downloading GGUF models and managing servers - no command line needed!

What Cue does for you:
- Browse and download GGUF models from HuggingFace
- Configure server settings (port, context size, Jinja templates), as sketched after this list
- Start/stop servers with one click
- Monitor running servers and their status
- Automatically manage model paths and configurations
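For context, this is roughly the kind of llama-server invocation that Cue's one-click start replaces. The flags are standard llama.cpp server options; the model path and port are placeholders, not values Cue necessarily uses:

```bash
# -m:      path to a downloaded GGUF model (placeholder path)
# --port:  port the server listens on
# -c:      context window size in tokens
# --jinja: use the model's built-in Jinja chat template
llama-server -m ~/models/your-model.gguf --port 8080 -c 4096 --jinja
```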