
Agents

How agents work in Ninetrix — system prompts, metadata, runtime config, and execution.

An agent is a Docker container running a generated Python runtime. It reads a system prompt built from metadata, calls an LLM via the configured provider, executes tool calls, and loops until the task is complete.

System prompt

The system prompt is automatically generated from your metadata block:

```yaml
metadata:
  role: Senior software engineer
  goal: Write production-quality code and tests
  instructions: |
    You write clean, well-documented Python code.
    Always add type hints. Always write tests.
    Ask clarifying questions before starting complex tasks.
  constraints:
    - Never modify files outside the project directory
    - Always run tests before declaring a task complete
```
> **Instructions are powerful.** The `instructions` field is the most impactful field for agent quality. Be explicit about what the agent should and should not do. Treat it like a detailed job description.
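As a rough sketch of how a metadata block could be flattened into a system prompt (the exact template Ninetrix generates may differ; `build_system_prompt` is a hypothetical name):

```python
def build_system_prompt(metadata: dict) -> str:
    """Flatten a metadata block into a single system-prompt string."""
    parts = [
        f"Role: {metadata['role']}",
        f"Goal: {metadata['goal']}",
        "Instructions:",
        metadata["instructions"].strip(),
    ]
    constraints = metadata.get("constraints", [])
    if constraints:
        parts.append("Constraints:")
        # Render each constraint as a bullet line
        parts.extend(f"- {c}" for c in constraints)
    return "\n".join(parts)

prompt = build_system_prompt({
    "role": "Senior software engineer",
    "goal": "Write production-quality code and tests",
    "instructions": "Always add type hints. Always write tests.",
    "constraints": ["Never modify files outside the project directory"],
})
```

The ordering above (role, goal, instructions, constraints) mirrors the YAML fields; whatever the real template looks like, the `instructions` text lands in the prompt verbatim, which is why it carries so much weight.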

Runtime configuration

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `provider` | string | — | LLM provider: `anthropic`, `openai`, `google`, `deepseek`, `mistral`, `groq`, `together_ai`, `openrouter`, `cerebras`, `fireworks_ai`, `bedrock`, `azure`, `minimax` |
| `model` | string | — | Model ID, e.g. `claude-sonnet-4-6`, `gpt-4o`, `gemini-2.0-flash` |
| `temperature` | float | `0.2` | Sampling temperature, 0.0–2.0 |
| `resources.cpu` | string | — | Docker `--cpus` limit, e.g. `"1.0"` |
| `resources.memory` | string | — | Docker memory limit, e.g. `"2Gi"`, `"512Mi"` |
| `resources.base_image` | string | `python:3.12-slim` | Override the Dockerfile `FROM` image |
| `resources.warm_pool` | bool | `false` | Keep the container alive after a run completes (used by `ninetrix up`) |
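Put together, a runtime block using these fields might look like the following (the `agents:`/agent-name nesting is assumed from the multi-agent `agentfile.yaml` described below):

```yaml
agents:
  coder:
    provider: anthropic
    model: claude-sonnet-4-6
    temperature: 0.2
    resources:
      cpu: "1.0"
      memory: "2Gi"
      base_image: python:3.12-slim
      warm_pool: false
```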

Execution loop

The default execution loop is a standard agentic tool-use pattern:

  1. User message arrives (stdin, webhook, or schedule trigger)
  2. LLM generates a response — either plain text or a tool call
  3. If tool call: execute the tool, append the result to the message history
  4. Loop back to the LLM with updated history
  5. Stop when the LLM returns plain text with no tool calls, or max_steps is reached
| Constant | Default | Description |
| --- | --- | --- |
| `max_steps` | 10 | Maximum tool-call iterations before stopping |
| `TOOL_TIMEOUT` | 30 s | Timeout per tool call |
| `MAX_TOKENS` | 8192 | Max output tokens per LLM call |
| `HISTORY_WINDOW_TOKENS` | 90,000 tokens | Sliding-window budget; the oldest messages are trimmed when the budget is exceeded, measured with `litellm.token_counter()` |
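The five steps above can be sketched as a plain Python loop. This is an illustration of the pattern, not the generated runtime: `call_llm`, the `tools` registry, and the message shapes are placeholders.

```python
MAX_STEPS = 10  # mirrors the max_steps default above

def run_agent(call_llm, tools: dict, history: list) -> str:
    """Run the standard agentic tool-use loop over a message history."""
    for _ in range(MAX_STEPS):
        reply = call_llm(history)                 # step 2: LLM responds
        history.append({"role": "assistant", **reply})
        if "tool_call" not in reply:              # step 5: plain text, no tool call
            return reply["text"]
        name, args = reply["tool_call"]           # step 3: execute the tool
        result = tools[name](**args)
        history.append({"role": "tool", "name": name, "content": str(result)})
        # step 4: loop back to the LLM with the updated history
    return "max_steps reached"

# Usage with a stub LLM: one tool call, then a plain-text answer.
def fake_llm(history):
    if any(m["role"] == "tool" for m in history):
        return {"text": "done"}
    return {"text": "", "tool_call": ("add", {"a": 1, "b": 2})}

answer = run_agent(fake_llm, {"add": lambda a, b: a + b},
                   [{"role": "user", "content": "add 1 and 2"}])
```

Note the two exit conditions match the table: a plain-text reply ends the loop early, and `max_steps` bounds the worst case.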

Entry agent

In a multi-agent agentfile.yaml, the first key in the agents: map is the entry agent — the one that receives the initial message. Subsequent agents are started by handoffs.
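For example (agent names here are hypothetical):

```yaml
agents:
  planner:   # first key → entry agent; receives the initial message
    metadata:
      role: Project planner
  coder:     # started only when another agent hands off to it
    metadata:
      role: Senior software engineer
```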

Pre-run thinking

Enable a reasoning step before the agent starts executing. The thinking output is stored in the checkpoint history:

```yaml
execution:
  thinking:
    enabled: true
    provider: anthropic
    model: claude-opus-4-6
    max_tokens: 2048
    prompt: "Think carefully about the best strategy before acting."
```