# Tracing FAQ

## Getting Started and Basic Usage

### Q: How do I start using MLflow Tracing?

The easiest way to start is with automatic tracing for supported libraries:
```python
import mlflow
import openai

# Enable automatic tracing for OpenAI
mlflow.openai.autolog()

# Your existing code now generates traces automatically
client = openai.OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini", messages=[{"role": "user", "content": "Hello!"}]
)
```
For custom code, use the `@mlflow.trace` decorator:

```python
@mlflow.trace
def my_function(input_data):
    # Your logic here
    return "processed result"
```
### Q: Which libraries does MLflow Tracing support automatically?
MLflow provides automatic tracing (autolog) for 20+ popular libraries. See the complete list at Automatic Tracing Integrations.
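Each integration follows the same `autolog()` pattern as the OpenAI example above. For instance (LangChain and Anthropic are shown here as two of the supported integrations; see the linked page for the authoritative list):

```python
import mlflow

# Each supported library exposes its own autolog() entry point.
mlflow.langchain.autolog()
mlflow.anthropic.autolog()
```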
## User Interface and Jupyter Integration

### Q: Can I view traces directly in Jupyter notebooks?

Yes! Jupyter integration is available in MLflow 2.20 and above. The trace UI automatically displays within notebooks when:

- Cell code generates a trace
- You call `mlflow.search_traces()`
- You display a trace object
```python
import mlflow

# Set tracking URI to your MLflow server
mlflow.set_tracking_uri("http://localhost:5000")


@mlflow.trace
def my_function():
    return "Hello World"


# Trace UI will appear automatically in the notebook
my_function()
```
To control the display:

```python
# Disable notebook display
mlflow.tracing.disable_notebook_display()

# Enable notebook display
mlflow.tracing.enable_notebook_display()
```
### Q: How can I customize the request and response previews in the UI?

You can customize what appears in the Request and Response columns of the trace list using `mlflow.update_current_trace()`:
```python
@mlflow.trace
def predict(messages: list[dict]) -> str:
    # Customize the request preview for long message histories
    custom_preview = f'{messages[0]["content"][:10]} ... {messages[-1]["content"][:10]}'
    mlflow.update_current_trace(request_preview=custom_preview)

    # Your model logic here
    result = process_messages(messages)

    # Customize response preview
    mlflow.update_current_trace(response_preview=f"Result: {result[:50]}...")
    return result
```
## Production and Performance

### Q: Can I use MLflow Tracing for production applications?

Yes, MLflow Tracing is stable and designed for use in production environments.

In production, we recommend using the MLflow Tracing SDK (`mlflow-tracing`) to instrument your code, models, and agents. It ships with a minimal set of dependencies and a smaller installation footprint, making it well suited for production environments that need an efficient, lightweight tracing solution. Please refer to the Production Monitoring section for more details.
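As a minimal sketch of a production setup with the lightweight SDK (the tracking URI and experiment name below are placeholders), configuration looks the same as with the full `mlflow` package:

```python
# Installed via the lightweight SDK: pip install mlflow-tracing
import mlflow

# Point the SDK at your tracking server (placeholder URL and experiment name).
mlflow.set_tracking_uri("https://your-mlflow-server:5000")
mlflow.set_experiment("production-agent")

# Instrument your application as usual.
mlflow.openai.autolog()
```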
### Q: How do I enable asynchronous trace logging?

Asynchronous logging can significantly reduce performance overhead (about 80% for typical workloads):
```python
import mlflow

# Enable async logging
mlflow.config.enable_async_logging()

# Traces will be logged asynchronously
with mlflow.start_span(name="foo") as span:
    span.set_inputs({"a": 1})
    span.set_outputs({"b": 2})

# Manually flush if needed
mlflow.flush_trace_async_logging()
```
Configuration options:

You can configure the detailed behavior of asynchronous logging with the following environment variables:

| Environment Variable | Description | Default |
|---|---|---|
| `MLFLOW_ASYNC_TRACE_LOGGING_MAX_WORKERS` | Maximum worker threads | 10 |
| `MLFLOW_ASYNC_TRACE_LOGGING_MAX_QUEUE_SIZE` | Maximum queued traces | 1000 |
| `MLFLOW_ASYNC_TRACE_LOGGING_RETRY_TIMEOUT` | Retry timeout in seconds | 500 |
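These are read from the process environment, so one way to tune them (the values below are illustrative) is to set them before MLflow starts exporting traces:

```python
import os

# Set before any traces are logged; values here are illustrative.
os.environ["MLFLOW_ASYNC_TRACE_LOGGING_MAX_WORKERS"] = "20"
os.environ["MLFLOW_ASYNC_TRACE_LOGGING_MAX_QUEUE_SIZE"] = "2000"

import mlflow

mlflow.config.enable_async_logging()
```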
### Q: How do I optimize trace size in production?

MLflow's automatic tracing integrations capture rich information that is helpful for debugging and evaluating the model or agent. However, this comes at the cost of trace size. For example, you may not want to log all of the retrieved document texts from your RAG application.

MLflow supports plugging in custom post-processing hooks that are applied to trace data before it is exported to the backend. These hooks let you reduce trace size by removing unnecessary data, or apply security guardrails such as masking sensitive data.

To register a custom hook, use the `mlflow.tracing.configure` API. For example, the following code filters document contents out of the retriever span output to reduce the trace size:
```python
import mlflow
from mlflow.entities.span import Span, SpanType


# Define a custom hook that takes a span as input and mutates it in-place.
def filter_retrieval_output(span: Span):
    """Filter out the document contents from the retriever span output and only keep the document ids."""
    if span.span_type == SpanType.RETRIEVAL:
        documents = span.outputs.get("documents")
        document_ids = [doc.id for doc in documents]
        span.set_outputs({"document_ids": document_ids})


# Register the hook
mlflow.tracing.configure(span_processors=[filter_retrieval_output])

# Any traces created after the configuration will be filtered by the hook.
...
```
Refer to the Redacting Sensitive Data guide for more details about the hook API and examples.
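For the masking use case mentioned above, a hook can rewrite span inputs the same way the example above rewrites outputs. The sketch below assumes string-valued inputs and that `span.set_inputs` mirrors the `span.set_outputs` call shown earlier:

```python
import re

import mlflow
from mlflow.entities.span import Span

EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")


def mask_emails(span: Span):
    """Replace email addresses in string-valued span inputs with a placeholder."""
    inputs = span.inputs or {}
    masked = {
        k: EMAIL_PATTERN.sub("<redacted>", v) if isinstance(v, str) else v
        for k, v in inputs.items()
    }
    span.set_inputs(masked)


mlflow.tracing.configure(span_processors=[mask_emails])
```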
### Q: Can I log traces to different experiments from a single application?

Yes. By default, MLflow logs traces to the current active experiment, set via either the `mlflow.set_experiment` API or the `MLFLOW_EXPERIMENT_ID` environment variable.

However, you may sometimes want to dynamically route traces to different experiments from a single application. For example, your application server may expose two endpoints, each serving a different model. Switching the active experiment does not work here, because the active experiment is defined globally and is not isolated per thread or async context.

Therefore, MLflow provides two ways to switch the target experiment for traces:
#### Option 1: Set the `trace_destination` parameter when starting a manual trace

The `trace_destination` parameter was introduced to the `@mlflow.trace` decorator and the `mlflow.start_span` API in MLflow 3.3, allowing you to specify the target experiment for each trace explicitly.
```python
import mlflow
from mlflow.tracing.destination import MlflowExperiment

# `Request` is assumed to come from the same FastAPI-style app as in Option 2.
from fastapi import Request


@mlflow.trace(trace_destination=MlflowExperiment(experiment_id="1234"))
def math_agent(request: Request):
    # Your model logic here
    ...
```
Note that the `trace_destination` parameter is only effective when set on the root span of the trace. If it is set on a child span, MLflow will ignore it and print a warning.
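To illustrate (a hypothetical sketch reusing `MlflowExperiment` from above): when `child` is invoked inside `root`, its own `trace_destination` is ignored with a warning, and the whole trace goes to experiment `"1234"`:

```python
@mlflow.trace(trace_destination=MlflowExperiment(experiment_id="1234"))
def root(query):
    return child(query)


# This destination only takes effect if `child` starts its own trace;
# inside `root`'s trace it is a child span, so the parameter is ignored.
@mlflow.trace(trace_destination=MlflowExperiment(experiment_id="5678"))
def child(query):
    ...
```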
#### Option 2: Use `mlflow.tracing.set_destination` with `context_local=True`

The `mlflow.tracing.set_destination()` API is purpose-built for setting the destination of traces while bypassing the overhead of `mlflow.set_experiment`. The `context_local` parameter lets you set the destination per async task or thread, providing isolation in concurrent applications. This option is useful when you use automatic tracing and are not using the manual tracing APIs.
```python
import mlflow
from mlflow.tracing.destination import MlflowExperiment

# Assumes a FastAPI app, e.g. `app = FastAPI()`, with `Request` imported from fastapi.


@app.get("/math-agent")
def math_agent(request: Request):
    # The API is super low-overhead, so you can call it inside the request handler.
    # `context_local=True` isolates the destination per thread / async context.
    mlflow.tracing.set_destination(
        MlflowExperiment(experiment_id="1234"), context_local=True
    )
    # Your model logic here
    with mlflow.start_span(name="math-agent") as span:
        ...


@app.get("/chat-agent")
def chat_agent(request: Request):
    mlflow.tracing.set_destination(
        MlflowExperiment(experiment_id="5678"), context_local=True
    )
    # Your model logic here
    with mlflow.start_span(name="chat-agent") as span:
        ...
```