Tracing Vercel AI SDK
MLflow Tracing provides automatic tracing for applications built with the Vercel AI SDK (the ai package) via OpenTelemetry, unlocking powerful observability capabilities for TypeScript and JavaScript application developers.
When the integration is enabled, MLflow allows you to record the following information for Vercel AI SDK calls:
- Prompts or messages and generated responses
- Latencies
- Call hierarchy
- Token usage when the provider returns it
- Exceptions, if raised
Quickstart (Next.js)
It is fairly straightforward to enable MLflow tracing for the Vercel AI SDK if you are using Next.js.
If you don't have an app handy to test with, you can use the demo chatbot app provided by Vercel.
1. Start MLflow Tracking Server
Start an MLflow Tracking Server if you don't have one running already:
mlflow server --backend-store-uri sqlite:///mlruns.db --port 5000
Alternatively, you can use Docker Compose to start the server without setting up a Python environment. See the Self-Hosting Guide for more details.
2. Configure Environment Variables
Set the following environment variables in your .env.local file:
OTEL_EXPORTER_OTLP_ENDPOINT=<your-mlflow-tracking-server-endpoint>
OTEL_EXPORTER_OTLP_TRACES_HEADERS=x-mlflow-experiment-id=<your-experiment-id>
OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=http/protobuf
For example, OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:5000.
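Putting it together, a complete .env.local for a local setup might look like the following. The experiment ID here is only an example; copy yours from the experiment page in the MLflow UI (the Default experiment has ID 0).
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:5000
OTEL_EXPORTER_OTLP_TRACES_HEADERS=x-mlflow-experiment-id=0
OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=http/protobuf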
3. Enable OpenTelemetry
Install the following packages to use the Vercel OpenTelemetry integration:
pnpm i @opentelemetry/api @vercel/otel
Create an instrumentation.ts file in your Next.js project root and add the following code:
import { registerOTel } from '@vercel/otel';
export async function register() {
registerOTel({ serviceName: 'next-app' })
}
4. Enable Telemetry in AI SDK Calls
Specify experimental_telemetry: { isEnabled: true } wherever you call the Vercel AI SDK in your app.
import { openai } from '@ai-sdk/openai';
import { generateText } from 'ai';
export async function POST(req: Request) {
const { prompt } = await req.json();
const { text } = await generateText({
model: openai('gpt-4o-mini'),
maxOutputTokens: 100,
prompt,
experimental_telemetry: {isEnabled: true},
});
return new Response(JSON.stringify({ text }), {
headers: { 'Content-Type': 'application/json' },
});
}
See Vercel OpenTelemetry documentation for advanced usage such as context propagation.
5. Run the Application and View Traces
Run the application and view traces in the MLflow UI. The UI is available at the tracking server endpoint you specified in the environment variables, e.g., http://localhost:5000.
Other Node.js Applications
If you are using another Node.js framework, set up the OpenTelemetry Node SDK and OTLP exporter manually to export traces to MLflow.
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { NodeSDK } from '@opentelemetry/sdk-node';
import { SimpleSpanProcessor } from '@opentelemetry/sdk-trace-node';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-proto';

// Configure the OpenTelemetry Node SDK to export spans to the MLflow Tracking Server
const sdk = new NodeSDK({
  spanProcessors: [
    new SimpleSpanProcessor(
      new OTLPTraceExporter({
        url: '<your-mlflow-tracking-server-endpoint>/v1/traces',
        headers: { 'x-mlflow-experiment-id': '<your-experiment-id>' },
      }),
    ),
  ],
});
sdk.start();
// Make an AI SDK call with telemetry enabled
const result = await generateText({
model: openai('gpt-4o-mini'),
prompt: 'What is MLflow?',
// IMPORTANT: enabling telemetry is required for tracing
experimental_telemetry: { isEnabled: true }
});
console.log(result.text);
// Flush any pending spans before the process exits
await sdk.shutdown();
npx tsx main.ts
Streaming
Streaming is supported as well. As with generateText, set the experimental_telemetry.isEnabled option to true to enable tracing.
import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';
const stream = await streamText({
model: openai('gpt-4o-mini'),
prompt: 'Explain vector databases in one paragraph.',
experimental_telemetry: { isEnabled: true }
});
for await (const part of stream.textStream) {
process.stdout.write(part);
}
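The experimental_telemetry setting also accepts optional functionId and metadata fields (see the AI SDK telemetry documentation). The AI SDK records them on the exported spans, which can make traces easier to identify in the MLflow UI. A minimal sketch; the functionId and metadata values below are arbitrary examples:
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';

const { text } = await generateText({
  model: openai('gpt-4o-mini'),
  prompt: 'Explain vector databases in one paragraph.',
  experimental_telemetry: {
    isEnabled: true,
    // Example identifier and metadata; the AI SDK attaches these to the telemetry spans
    functionId: 'explain-vector-databases',
    metadata: { environment: 'dev' },
  },
});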
Token usage
When the underlying provider supplies token usage (e.g., input and output tokens), MLflow aggregates it on the trace. You can retrieve it from the trace info using the TypeScript SDK:
import * as mlflow from 'mlflow-tracing';

// Flush any pending spans, then fetch the most recent trace
await mlflow.flushTraces();
const lastTraceId = mlflow.getLastActiveTraceId();
if (lastTraceId) {
const client = new mlflow.MlflowClient({ trackingUri: 'http://localhost:5000' });
const trace = await client.getTrace(lastTraceId);
console.log('Token usage:', trace.info.tokenUsage); // { input_tokens, output_tokens, total_tokens }
}
Disable auto-tracing
To disable tracing for the Vercel AI SDK, set experimental_telemetry: { isEnabled: false } on the AI SDK call.
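For example, to turn tracing off for a single call while leaving the rest of the app instrumented (a minimal sketch reusing the quickstart setup):
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';

const { text } = await generateText({
  model: openai('gpt-4o-mini'),
  prompt: 'What is MLflow?',
  // Tracing is disabled for this call only
  experimental_telemetry: { isEnabled: false },
});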