
Tracing Qwen (DashScope)

MLflow Tracing provides automatic tracing capability for Qwen (DashScope) models through the OpenAI SDK integration. Since Qwen (DashScope) offers an OpenAI-compatible API format, you can use mlflow.openai.autolog() to trace interactions with Qwen models.

Tracing via autolog

MLflow tracing automatically captures the following information about Qwen calls:

  • Prompts and completion responses
  • Latencies
  • Token usage
  • Model name
  • Additional metadata such as temperature and max_completion_tokens, if specified
  • Function calling if returned in the response
  • Built-in tools such as web search, file search, computer use, etc.
  • Any exception if raised

Getting Started

1. Install dependencies

bash
pip install mlflow openai
2. Start MLflow server

If you have a local environment with Python 3.10 or later, you can start the MLflow server locally using the mlflow CLI command.

bash
mlflow server
3. Enable tracing and call Qwen

python
import openai
import mlflow

# Enable auto-tracing for OpenAI (works with Qwen)
mlflow.openai.autolog()

# Optional: Set a tracking URI and an experiment
mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("Qwen")

# Initialize the OpenAI client with Qwen API endpoint
client = openai.OpenAI(
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
    api_key="<DASHSCOPE_API_KEY>",
)

response = client.chat.completions.create(
    model="qwen-plus",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
)
4. View traces in MLflow UI

Browse to your MLflow UI (for example, http://localhost:5000) and open the Qwen experiment to see traces for the calls above.

Qwen Tracing

View Next Steps to learn about more MLflow features like user feedback tracking, prompt management, and evaluation.

Streaming and Async Support

MLflow supports tracing for streaming and async Qwen APIs. Visit the OpenAI Tracing documentation for example code snippets for tracing streaming and async calls through OpenAI SDK.

Combine with frameworks or manual tracing

The automatic tracing capability in MLflow is designed to work seamlessly with the Manual Tracing SDK and with multi-framework integrations. The example below wraps a Qwen call in a manually created parent span and attaches session/user metadata to the trace.

python
import json
from openai import OpenAI
import mlflow
from mlflow.entities import SpanType

# Initialize the OpenAI client with Qwen API endpoint
client = OpenAI(
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
    api_key="<DASHSCOPE_API_KEY>",
)


# Create a parent span for the Qwen call
@mlflow.trace(span_type=SpanType.CHAIN)
def answer_question(question: str):
    messages = [{"role": "user", "content": question}]
    response = client.chat.completions.create(
        model="qwen-plus",
        messages=messages,
    )

    # Attach session/user metadata to the trace
    mlflow.update_current_trace(
        metadata={
            "mlflow.trace.session": "session-12345",
            "mlflow.trace.user": "user-a",
        }
    )
    return response.choices[0].message.content


answer = answer_question("What is the capital of France?")

Running this example produces a trace in which the Qwen LLM span is nested under the parent span created automatically by the traced function.

Qwen Tracing with Manual Tracing

Next steps