Tracing Arize Phoenix
Arize Phoenix is an open-source AI observability platform. Its OpenInference instrumentation libraries produce OpenTelemetry spans with rich AI-specific attributes for model providers like OpenAI, Anthropic, and others.
Because OpenInference is built on OpenTelemetry, you can use mlflow.otel.autolog() to mirror every instrumented call to the MLflow backend. The MLflow server automatically translates OpenInference attributes to MLflow span types, inputs, outputs, token usage, and model name. Phoenix tracing is completely unaffected; this is purely additive.
```python
import mlflow

mlflow.otel.autolog()
```
MLflow automatically captures the following information from OpenInference spans:
- Span inputs and outputs
- Latencies
- Span name
- Span type mapped from OpenInference span kind (e.g., LLM, RETRIEVER, TOOL)
- Token usage (prompt, completion, total)
- Model name
- Parent-child span nesting
- Exceptions, if raised
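To make the token-usage and model-name items above concrete, the sketch below reads those fields out of a plain attribute dictionary. The attribute keys follow the OpenInference semantic conventions (`openinference.span.kind`, `llm.model_name`, `llm.token_count.*`); the actual translation logic lives inside the MLflow server, so treat this purely as an illustration of which attributes carry the data:

```python
# Illustrative sketch: reading the AI-specific fields from the raw
# attributes of an OpenInference span. The real mapping is performed
# by the MLflow server, not by user code.

def summarize_openinference_span(attributes: dict) -> dict:
    """Pull span kind, model name, and token counts out of an attribute mapping."""
    return {
        "span_kind": attributes.get("openinference.span.kind"),
        "model": attributes.get("llm.model_name"),
        "prompt_tokens": attributes.get("llm.token_count.prompt"),
        "completion_tokens": attributes.get("llm.token_count.completion"),
        "total_tokens": attributes.get("llm.token_count.total"),
    }

# Example attributes, shaped like what an OpenInference instrumentor records
attrs = {
    "openinference.span.kind": "LLM",
    "llm.model_name": "gpt-4o-mini",
    "llm.token_count.prompt": 12,
    "llm.token_count.completion": 34,
    "llm.token_count.total": 46,
}
print(summarize_openinference_span(attrs))
```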
Getting Started
Install Dependencies
Install MLflow and the OpenInference instrumentor for your model provider. This example uses OpenAI:
```shell
pip install mlflow openai openinference-instrumentation-openai opentelemetry-sdk
```
Other instrumentors are available for LangChain, LlamaIndex, Anthropic, and more. See the OpenInference repository for the full list.
Start MLflow Server
- Local (pip)
- Local (docker)
If you have a local Python environment >= 3.10, you can start the MLflow server locally using the mlflow CLI command.
```shell
mlflow server
```
MLflow also provides a Docker Compose file to start a local MLflow server with a PostgreSQL database and a MinIO server.
```shell
git clone --depth 1 --filter=blob:none --sparse https://github.com/mlflow/mlflow.git
cd mlflow
git sparse-checkout set docker-compose
cd docker-compose
cp .env.dev.example .env
docker compose up -d
```
Refer to the instructions in the repository for more details, such as how to override the default environment variables.
Enable Tracing and Run Your Application
```python
import mlflow
from openai import OpenAI
from openinference.instrumentation.openai import OpenAIInstrumentor

# Register the MLflow span processor on the global OTel TracerProvider
mlflow.otel.autolog()

# Auto-instrument all OpenAI SDK calls
OpenAIInstrumentor().instrument()

mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("Phoenix")

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is MLflow?"}],
    max_tokens=256,
)
print(response.choices[0].message.content)
```
mlflow.otel.autolog() can be called before or after the OpenInference instrumentor. Both perform the same ProxyTracerProvider to SdkTracerProvider replacement and reuse an existing provider if one is already set, so initialization order does not matter.
View Traces in MLflow UI
Browse to the MLflow UI at http://localhost:5000 (or your MLflow server URL) and you should see the traces for your OpenInference-instrumented application. Each LLM call will appear with proper inputs, outputs, span types, token usage, and model name.
Dual Tracing with Phoenix
You can send spans to both MLflow and Phoenix simultaneously by adding an OTLP exporter to the shared TracerProvider:
```python
import mlflow
from openai import OpenAI
from openinference.instrumentation.openai import OpenAIInstrumentor
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace.export import SimpleSpanProcessor

# Register MLflow span processor
mlflow.otel.autolog()

# Also send spans to Phoenix (running on port 6006)
provider = trace.get_tracer_provider()
provider.add_span_processor(
    SimpleSpanProcessor(OTLPSpanExporter("http://localhost:6006/v1/traces"))
)

# Auto-instrument OpenAI
OpenAIInstrumentor().instrument()

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)
```
This works because both MLflow and Phoenix register as span processors on the same OpenTelemetry TracerProvider. Every span is dispatched to all registered processors.
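The fan-out described above can be sketched in plain Python. The classes here are toy stand-ins for the OpenTelemetry SDK's provider and span processors, not the real API; they only illustrate that every finished span is dispatched to every registered processor:

```python
# Toy sketch of span fan-out: one provider, many processors.
# Both MLflow and Phoenix receive the same spans because each ended
# span is handed to every registered processor.

class RecordingProcessor:
    def __init__(self, name):
        self.name = name
        self.spans = []

    def on_end(self, span):
        self.spans.append(span)

class ToyTracerProvider:
    def __init__(self):
        self._processors = []

    def add_span_processor(self, processor):
        self._processors.append(processor)

    def end_span(self, span):
        # Dispatch the finished span to every registered processor
        for p in self._processors:
            p.on_end(span)

provider = ToyTracerProvider()
mlflow_proc = RecordingProcessor("mlflow")
phoenix_proc = RecordingProcessor("phoenix")
provider.add_span_processor(mlflow_proc)
provider.add_span_processor(phoenix_proc)

provider.end_span({"name": "chat.completions"})
print(mlflow_proc.spans == phoenix_proc.spans)  # both saw the span
```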
Span Type Mapping
All OpenInference span kinds (LLM, EMBEDDING, TOOL, RETRIEVER, AGENT, CHAIN, RERANKER, GUARDRAIL, EVALUATOR) map 1-to-1 to the corresponding MLflow span type of the same name.
Export Mode
By default, mlflow.otel.autolog() uses batched export (BatchSpanProcessor), which buffers spans and flushes them in the background. This follows the MLFLOW_ENABLE_ASYNC_TRACE_LOGGING environment variable (default True).
For debugging or low-throughput use cases, you can switch to synchronous export so that each trace is sent immediately:
```python
mlflow.otel.autolog(batch=False)
```
Or explicitly enable batched export:
```python
mlflow.otel.autolog(batch=True)
```
Disable Auto-Tracing
Auto tracing can be disabled globally by calling mlflow.otel.autolog(disable=True).

```python
mlflow.otel.autolog(disable=True)
```
After disabling, new OpenInference spans will no longer be forwarded to MLflow.