LLM Setup

TracePcap’s AI features (Story Mode and AI Filter Generator) require an OpenAI-compatible inference server. Any server that implements the /v1/chat/completions endpoint works.

Supported Servers

LM Studio (recommended for offline use)

Download and install LM Studio.
Download a model (e.g. Qwen2.5-14B-Coder-Instruct).
Start the local server on port 1234 (default).

In .env:

LLM_API_BASE_URL=http://localhost:1234/v1
LLM_API_KEY=
LLM_MODEL=Qwen2.5-14B-Coder-Instruct

Ollama

Install Ollama.
Pull a model: ollama pull qwen2.5-coder:14b
Ollama serves on port 11434 by default.

In .env:

LLM_API_BASE_URL=http://localhost:11434/v1
LLM_API_KEY=ollama
LLM_MODEL=qwen2.5-coder:14b

OpenAI API (not suitable for offline deployments)

LLM_API_BASE_URL=https://api.openai.com/v1
LLM_API_KEY=sk-...
LLM_MODEL=gpt-4o

Model Recommendations

For best results with filter generation and narrative analysis, use a model with strong instruction-following and code understanding:

Qwen2.5-14B-Coder-Instruct — good balance of quality and speed on CPU.
Qwen2.5-7B-Coder-Instruct — lighter option for machines with less RAM.
Llama-3.1-8B-Instruct — general-purpose alternative.

Testing the Connection

After configuring .env and restarting the stack, open the AI Filter tab. If the LLM server is reachable, the text input will be enabled. If it shows a warning, check:

The inference server is running.
LLM_API_BASE_URL is correct and reachable from inside the Docker network (use the host’s LAN IP, not localhost, if the LLM runs on the same machine but outside Docker).
The model name in LLM_MODEL matches a loaded model on the server.

Note

If TracePcap runs in Docker and the LLM server runs on the host machine, use http://host.docker.internal:1234/v1 (Docker Desktop) or the host’s LAN IP address instead of localhost.