Tracing Arize / Phoenix

Arize Phoenix is an open-source AI observability platform. Its OpenInference instrumentation libraries produce OpenTelemetry spans with rich AI-specific attributes for model providers like OpenAI, Anthropic, and others.

Because OpenInference is built on OpenTelemetry, you can use mlflow.otel.autolog() to mirror every instrumented call to the MLflow backend. The MLflow server automatically translates OpenInference attributes to MLflow span types, inputs, outputs, token usage, and model name. Phoenix tracing is completely unaffected; this is purely additive.

```python
import mlflow

mlflow.otel.autolog()
```

MLflow automatically captures the following information from OpenInference spans:

  • Span inputs and outputs
  • Latencies
  • Span name
  • Span type mapped from OpenInference span kind (e.g. LLM, RETRIEVER, TOOL)
  • Token usage (prompt, completion, total)
  • Model name
  • Parent-child span nesting
  • Exceptions, if raised

Getting Started

1. Install Dependencies

Install MLflow and the OpenInference instrumentor for your model provider. This example uses OpenAI:

```bash
pip install mlflow openai openinference-instrumentation-openai opentelemetry-sdk
```

Other instrumentors are available for LangChain, LlamaIndex, Anthropic, and more. See the OpenInference repository for the full list.
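For example, the LangChain and Anthropic instrumentors are published under the same naming scheme (package names as of writing; see the OpenInference repository for the authoritative list):

```bash
pip install openinference-instrumentation-langchain openinference-instrumentation-anthropic
```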

2. Start MLflow Server

If you have a local Python environment (3.10 or later), you can start the MLflow server locally using the mlflow CLI:

```bash
mlflow server
```
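The server accepts the usual mlflow server flags if you need a non-default bind address or port, for example:

```bash
mlflow server --host 127.0.0.1 --port 5000
```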
3. Enable Tracing and Run Your Application

```python
import mlflow
from openai import OpenAI
from openinference.instrumentation.openai import OpenAIInstrumentor

# Register the MLflow span processor on the global OTEL TracerProvider
mlflow.otel.autolog()

# Auto-instrument all OpenAI SDK calls
OpenAIInstrumentor().instrument()

mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("Phoenix")

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is MLflow?"}],
    max_tokens=256,
)
print(response.choices[0].message.content)
```

Note: mlflow.otel.autolog() can be called before or after the OpenInference instrumentor. Both perform the same ProxyTracerProvider-to-SdkTracerProvider replacement and reuse an existing provider if one is already set, so initialization order does not matter.

4. View Traces in MLflow UI

Browse to the MLflow UI at http://localhost:5000 (or your MLflow server URL) to see the traces from your OpenInference-instrumented application. Each LLM call appears with its inputs, outputs, span type, token usage, and model name.

Dual Tracing with Phoenix

You can send spans to both MLflow and Phoenix simultaneously by adding an OTLP exporter to the shared TracerProvider:

```python
import mlflow
from openai import OpenAI
from openinference.instrumentation.openai import OpenAIInstrumentor
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace.export import SimpleSpanProcessor

# Register MLflow span processor
mlflow.otel.autolog()

# Also send spans to Phoenix (running on port 6006)
provider = trace.get_tracer_provider()
provider.add_span_processor(
    SimpleSpanProcessor(OTLPSpanExporter(endpoint="http://localhost:6006/v1/traces"))
)

# Auto-instrument OpenAI
OpenAIInstrumentor().instrument()

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)
```

This works because both MLflow and Phoenix register as span processors on the same OpenTelemetry TracerProvider. Every span is dispatched to all registered processors.
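If you don't already have Phoenix running locally, one way to start it (assuming the arize-phoenix package, which at the time of writing ships a phoenix serve command) is:

```bash
pip install arize-phoenix
phoenix serve
```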

Span Type Mapping

All OpenInference span kinds (LLM, EMBEDDING, TOOL, RETRIEVER, AGENT, CHAIN, RERANKER, GUARDRAIL, EVALUATOR) map 1-to-1 to the corresponding MLflow span type of the same name.
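As a rough illustration of that mapping, a hypothetical helper could look like the sketch below. The kind names come from the list above; MLflow performs the actual translation server-side, and the "UNKNOWN" fallback is an assumption for illustration only.

```python
# Known OpenInference span kinds, as listed above; each maps to the
# MLflow span type of the same name.
KNOWN_KINDS = {
    "LLM", "EMBEDDING", "TOOL", "RETRIEVER", "AGENT",
    "CHAIN", "RERANKER", "GUARDRAIL", "EVALUATOR",
}

def to_mlflow_span_type(openinference_kind: str) -> str:
    """Map an OpenInference span kind to an MLflow span type name.

    The mapping is the identity for the known kinds; anything else
    falls back to "UNKNOWN" (a hypothetical default).
    """
    kind = openinference_kind.upper()
    return kind if kind in KNOWN_KINDS else "UNKNOWN"

print(to_mlflow_span_type("retriever"))  # RETRIEVER
print(to_mlflow_span_type("custom"))     # UNKNOWN
```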

Export Mode

By default, mlflow.otel.autolog() uses batched export (BatchSpanProcessor), which buffers spans and flushes them in the background. This follows the MLFLOW_ENABLE_ASYNC_TRACE_LOGGING environment variable (default True).

For debugging or low-throughput use cases, you can switch to synchronous export so that each trace is sent immediately:

```python
mlflow.otel.autolog(batch=False)
```

Or explicitly enable batched export:

```python
mlflow.otel.autolog(batch=True)
```
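The same toggle can also be driven by the environment variable mentioned above rather than the batch argument; set it before the process starts (POSIX shell shown, and your_app.py is a placeholder for your entry point):

```bash
export MLFLOW_ENABLE_ASYNC_TRACE_LOGGING=false
python your_app.py
```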

Disable auto-tracing

Auto-tracing can be disabled globally by calling mlflow.otel.autolog(disable=True).

```python
mlflow.otel.autolog(disable=True)
```

After disabling, new OpenInference spans will no longer be forwarded to MLflow.