Local (Ollama)

Built-inLocal / offline

Run open models locally with Ollama. Blueprint Stack talks to your local Ollama server directly — there is no API key, no Node.js, and no extra adapter to install, and your prompts and project data never leave your machine. You pick which model runs; Blueprint Stack sends it the same editor tools it gives every other agent, so choose a model that supports tool / function calling.

Setup

1. Install Ollama

Download and install Ollama for your platform from ollama.com/download (Windows, macOS, and Linux).

2. Pull a model

# a good general-purpose, tool-capable model

ollama pull llama3.1

# other tool-capable options

ollama pull qwen2.5
ollama pull mistral-nemo

Blueprint Stack drives the Unreal Editor by calling tools, so pick a model that supports tool / function calling. Larger instruction-tuned models follow tool calls more reliably; very small models (1B–3B) often struggle and may loop or ignore tools.

3. Make sure the Ollama server is running

Open the Ollama desktop app, or run:

ollama serve

The server listens on http://127.0.0.1:11434 by default. To check it's reachable, open http://127.0.0.1:11434/api/tags in a browser — you should see a JSON list of your installed models.

4. Configure it in Blueprint Stack

Open the Blueprint Stack panel in Unreal, go to Settings > Local LLM (Ollama), and set:

-Host — leave blank to use http://127.0.0.1:11434, or enter your address if you run Ollama on a different port or another machine on your network.
-Default model — the model name to use when starting a new chat (for example llama3.1). You can also switch the model from the dropdown at any time, including mid-conversation.

5. Use it

Pick Local (Ollama) from the agent picker, choose a model from the dropdown (the list is populated from your Ollama library), and start chatting. The agent reads your project, calls editor tools, and reports back exactly like the cloud agents — just running on your own hardware.

Models

Any model in your local Ollama library appears in the model dropdown. For driving the editor, prefer instruction-tuned models with solid tool-calling support, such as:

-
Llama 3.1 (8B / 70B)Good general-purpose default with reliable tool calls.
-
Qwen2.5 (7B / 14B / 32B)Strong instruction following and tool use across sizes.
-
Mistral NemoCapable mid-size option with function calling.

The agent quality depends entirely on the model you choose and the hardware you run it on. If a model keeps misusing tools or going in circles, try a larger model or a different one.

Requirements

-Ollama installed and running
-At least one model pulled (ideally one that supports tool / function calling)
-Enough RAM / VRAM for the model you choose
-No API key, no Node.js, no internet connection (once the model is pulled)

If it can't connect

If your first prompt returns “Couldn't reach Ollama at http://127.0.0.1:11434”, the server isn't running or Blueprint Stack is pointed at the wrong address:

-Start the Ollama app, or run ollama serve, then resend the prompt.
-Confirm http://127.0.0.1:11434/api/tags loads in a browser. If you changed the port, update the Host field in Blueprint Stack settings.
-If you haven't pulled any models yet, run ollama pull llama3.1 first.