Tracing Hermes Agent

MLflow Tracing can capture end-to-end Hermes Agent execution traces, including LLM calls, tool invocations, and long-running agent sessions, by ingesting OpenTelemetry spans emitted by the hermes-otel plugin.

[Figure: Hermes Agent tool-calling trace in the MLflow UI]

Once configured, Hermes traces appear in MLflow with:

An AGENT root span for each Hermes turn
LLM spans for model calls
TOOL spans for terminal commands and other tool calls
Token usage, latency, and session metadata

Setup

Prerequisites

An MLflow tracking environment. See the Connect Your Environment guide if you need to set one up.
An MLflow experiment to receive Hermes traces. Take note of its experiment ID.
A working Hermes Agent installation.
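
If you do not have an experiment yet, one way to create it and read back its ID is the MLflow REST API. The sketch below uses only the Python standard library. The tracking URI `http://localhost:5000` and the experiment name `hermes-traces` are assumptions for illustration, the helper names are hypothetical, and a server with authentication enabled would also need credentials that are not shown here.

```python
import json
import urllib.request


def build_create_request(tracking_uri: str, name: str) -> urllib.request.Request:
    """Build (but do not send) a request to MLflow's experiments/create REST endpoint."""
    url = tracking_uri.rstrip("/") + "/api/2.0/mlflow/experiments/create"
    body = json.dumps({"name": name}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_experiment(tracking_uri: str, name: str) -> str:
    """Send the request and return the new experiment's ID.

    Raises urllib.error.HTTPError if the server rejects the call,
    for example when an experiment with this name already exists.
    """
    request = build_create_request(tracking_uri, name)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["experiment_id"]
```

With a tracking server running, `create_experiment("http://localhost:5000", "hermes-traces")` returns the ID to use as the `x-mlflow-experiment-id` header value in Step 2.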

Step 1: Install the Hermes OTLP Plugin

bash
hermes plugins install briancaffey/hermes-otel

Step 2: Point the Plugin at MLflow

Create ~/.hermes/plugins/hermes_otel/config.yaml and point it at the MLflow OTLP endpoint:

yaml
enabled: true
force_flush_on_session_end: true
capture_sender_id: true
capture_conversation_history: true

backends:
  - type: otlp
    name: mlflow
    endpoint: http://localhost:5000/v1/traces
    metrics: false
    logs: false
    headers:
      x-mlflow-experiment-id: "<your-experiment-id>"

Tip: x-mlflow-experiment-id is required for OTLP ingestion. MLflow uses it to route incoming Hermes spans to the right experiment.
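
The two values in this config that must match your environment are the endpoint and the experiment header. As a rough sketch, they can be derived from the tracking server URL and the experiment ID as shown below. The helper name `otlp_settings` is hypothetical and is not part of MLflow or hermes-otel. The `/v1/traces` path and the header name are taken from the config above.

```python
from urllib.parse import urlsplit, urlunsplit


def otlp_settings(tracking_uri: str, experiment_id: str) -> dict:
    """Return the endpoint and headers for the mlflow backend entry.

    Hypothetical helper: it only assembles strings and does not contact the server.
    """
    parts = urlsplit(tracking_uri)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"expected an http(s) URL, got {tracking_uri!r}")
    experiment_id = str(experiment_id).strip()
    if not experiment_id or experiment_id.startswith("<"):
        # Catches an empty value or a placeholder such as "<your-experiment-id>".
        raise ValueError("experiment_id must be a real experiment ID")
    # Append the OTLP traces path, avoiding a double slash.
    path = parts.path.rstrip("/") + "/v1/traces"
    endpoint = urlunsplit((parts.scheme, parts.netloc, path, "", ""))
    return {
        "endpoint": endpoint,
        "headers": {"x-mlflow-experiment-id": experiment_id},
    }
```

Rejecting the unedited placeholder early is a deliberate choice here, since a leftover `<your-experiment-id>` is an easy mistake to make when copying the config.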

Step 3: (Optional) Point Hermes LLM calls at the MLflow AI Gateway

Update ~/.hermes/config.yaml so that Hermes uses the MLflow AI Gateway as its default model provider. This small configuration change routes model calls from the same Hermes runs through a central gateway, which adds:

Governance — one place to manage which models Hermes can call
Budgets — spending limits and alerts for long-running Hermes sessions
Guardrails — PII detection and safety filters applied at the gateway
Usage logs — per-request inputs, outputs, tokens, and latency

See the Hermes Agent + MLflow AI Gateway guide for the full gateway setup. On the Hermes side, the model configuration looks like this:

yaml
model:
  default: "gpt-5.5"
  provider: "custom"
  base_url: "http://localhost:5000/gateway/mlflow/v1"

Step 4: Run Hermes Agent

Start an interactive Hermes session and run a tool-calling prompt:

bash
hermes

For example:

text
> What is the weather in Tokyo?

What Gets Traced

Hermes traces show the full agent execution path in MLflow:

text
AGENT (agent)
├── LLM (llm.<model>)
├── LLM (api.<model>)
├── TOOL (tool.terminal)
├── TOOL (tool.<name>)
└── ...

This makes it possible to see not just the final answer, but also which tools Hermes called, what commands ran, and how long each step took.
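
To make the hierarchy concrete, the sketch below rebuilds an indented tree from a flat list of spans linked by parent ID, and totals the time per span type. The record shape, the field names, and the durations are all hypothetical and chosen only for illustration. Real MLflow spans carry more fields and are not structured exactly like this.

```python
from collections import defaultdict

# Hypothetical, simplified span records with illustrative durations.
SAMPLE_SPANS = [
    {"id": "1", "parent": None, "type": "AGENT", "name": "agent", "ms": 4200},
    {"id": "2", "parent": "1", "type": "LLM", "name": "llm.<model>", "ms": 1500},
    {"id": "3", "parent": "1", "type": "TOOL", "name": "tool.terminal", "ms": 900},
    {"id": "4", "parent": "1", "type": "LLM", "name": "llm.<model>", "ms": 1300},
]


def render_tree(spans):
    """Return one indented line per span, with children listed under their parent."""
    children = defaultdict(list)
    for span in spans:
        children[span["parent"]].append(span)

    lines = []

    def walk(parent_id, depth):
        for span in children[parent_id]:
            indent = "  " * depth
            lines.append(f"{indent}{span['type']} ({span['name']}) {span['ms']} ms")
            walk(span["id"], depth + 1)

    walk(None, 0)  # root spans have no parent
    return lines


def time_by_type(spans):
    """Sum durations per span type. The AGENT total already includes its children."""
    totals = defaultdict(int)
    for span in spans:
        totals[span["type"]] += span["ms"]
    return dict(totals)


if __name__ == "__main__":
    print("\n".join(render_tree(SAMPLE_SPANS)))
    print(time_by_type(SAMPLE_SPANS))
```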

In addition to per-trace detail, MLflow aggregates Hermes traffic into per-experiment dashboards covering token usage, cost, and latency:

[Figure: MLflow token usage and cost dashboard for Hermes Agent traces]

Troubleshooting

No traces appear in MLflow

Verify the MLflow server is reachable at http://localhost:5000/v1/traces
Verify x-mlflow-experiment-id points to an existing experiment
Restart Hermes after editing ~/.hermes/config.yaml or the plugin config
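
A quick way to rule out the first cause is to test whether anything is listening on the endpoint's host and port. The following is a minimal sketch with a hypothetical helper name, `can_connect`. It checks TCP connectivity only, so a result of True does not confirm that the server accepts OTLP traces or that the experiment exists.

```python
import socket
from urllib.parse import urlsplit


def can_connect(endpoint: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the endpoint's host and port succeeds."""
    parts = urlsplit(endpoint)
    if not parts.hostname:
        raise ValueError(f"no host found in {endpoint!r}")
    # Fall back to the scheme's default port when the URL does not name one.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((parts.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For the configuration in this guide, the call would be `can_connect("http://localhost:5000/v1/traces")`. If it returns False, start the MLflow server or correct the host and port before looking at the plugin.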

Hermes runs but tool spans are missing

Confirm the task actually invoked a tool
Check that hermes_otel is enabled under plugins.enabled
Make sure the plugin package is installed into the same environment Hermes runs from

You only see gateway logs, not full agent traces

Gateway logs capture only the model requests that pass through the MLflow AI Gateway
Full Hermes agent traces require the hermes-otel plugin to export spans directly to MLflow, as configured in Steps 1 and 2

Acknowledgments

The Hermes Agent tracing integration is made possible by the hermes-otel plugin created by Brian Caffey.
