Version Tracking for GenAI Applications
MLflow's LoggedModel provides systematic version control for your entire GenAI application—code, configurations, evaluations, and traces. Stop losing track of what works and start building with confidence through complete application lifecycle management.

Why Version Control Matters for GenAI
GenAI applications are complex systems with interdependent components. Without systematic versioning, development becomes chaotic and deployments risky.
Eliminate 'It Worked Yesterday' Syndrome
Know exactly which combination of code, prompts, and configurations produced any result. Reproduce successes and debug failures with complete context.
Deploy with Data-Driven Confidence
Compare application versions objectively using metrics like quality scores, cost, and latency. Choose the best performer based on evidence, not intuition.
Track Every Change's Impact
Link code commits, configuration changes, and evaluation results. When quality drops, pinpoint exactly what changed and when.
Maintain Production Auditability
Know exactly what version was deployed when. Essential for compliance, incident response, and regulatory requirements.
How LoggedModel Powers GenAI Version Control
MLflow's LoggedModel adapts traditional ML model versioning for GenAI applications. Instead of just tracking model weights, it becomes a comprehensive metadata hub that coordinates all the moving parts of your AI system.
Application State Snapshots
Each LoggedModel version captures a complete application state—code references, configurations, dependencies, and performance data in one versioned entity.
Flexible Code Management
Link to external git commits for lightweight versioning, or bundle code directly for deployment. Choose the approach that fits your workflow.
Automatic Trace Association
When you set an active model context, all subsequent traces automatically link to that version. No manual bookkeeping required.
Start Version Tracking in 5 Minutes
Transform chaotic GenAI development into systematic version control with just a few lines of code.
Automatic Version Tracking with Git Integration
Link your application versions to git commits for complete traceability:
import mlflow
import openai
import os
os.environ["OPENAI_API_KEY"] = "your-api-key-here"
# Configure MLflow experiment
mlflow.set_experiment("customer-support-agent")
# Get current git commit using MLflow's built-in utilities
from mlflow.utils.git_utils import get_git_commit
git_commit = get_git_commit(".")
if git_commit:
    git_commit = git_commit[:8]  # Use short hash
else:
    git_commit = "local-dev"  # Fallback if not in git repo
# Create version identifier
app_name = "customer_support_agent"
version_name = f"{app_name}-{git_commit}"
# Set active model context - all traces will link to this version
mlflow.set_active_model(name=version_name)
# Enable automatic tracing
mlflow.openai.autolog()
# Your application code - now automatically versioned and traced
client = openai.OpenAI()
test_questions = [
    "How do I reset my password?",
    "What are your business hours?",
    "Can I get a refund for my order?",
]
for question in test_questions:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": question}],
        temperature=0.7,
        max_tokens=1000,
    )
# ✅ Automatically: traced, versioned, and linked to git commit
What happens automatically:
- Every LLM call generates a detailed trace
- All traces link to your specific application version
- Git commit provides exact code reproducibility
- Version performance can be compared objectively
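The version-naming step can also be factored into a small helper. The following is a minimal sketch that uses only the standard library and calls the git CLI directly instead of MLflow's get_git_commit utility. The function names current_git_commit and build_version_name are hypothetical and not part of MLflow. The 8-character short hash and the "local-dev" fallback mirror the example above.

```python
import subprocess
from typing import Optional


def current_git_commit(repo_path: str = ".") -> Optional[str]:
    """Return the full hash of HEAD, or None if it cannot be determined."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_path,
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        # git is not installed, or repo_path does not exist
        return None
    commit = result.stdout.strip()
    return commit if result.returncode == 0 and commit else None


def build_version_name(app_name: str, commit: Optional[str], length: int = 8) -> str:
    """Combine the app name with a short commit hash, or a fallback label."""
    suffix = commit[:length] if commit else "local-dev"
    return f"{app_name}-{suffix}"


if __name__ == "__main__":
    # The printed name depends on the repository this runs in.
    print(build_version_name("customer_support_agent", current_git_commit()))
```

The resulting string can then be passed as the name argument to mlflow.set_active_model, as in the example above.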
Version Management Made Simple
# Create a new version for experimentation.
# new_commit is a placeholder for the short git hash of the revised code.
question = test_questions[0]  # Reuse a question from the earlier test set
with mlflow.set_active_model(name=f"agent-v2-{new_commit}"):
    # Test new prompt engineering approach
    improved_response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": "You are a helpful customer support agent. Be concise and actionable.",
            },
            {"role": "user", "content": question},
        ],
        temperature=0.3,  # Lower temperature for consistency
        max_tokens=500,  # More focused responses
    )
# ✅ New version automatically tracked with different configurations
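The context manager handles version switching for you: traces created inside the with block link to the new version, which keeps the switch explicit and less error-prone than manual bookkeeping.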
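To make the scoping behavior concrete, here is a conceptual sketch of the context-manager pattern using only the standard library. It is not MLflow's implementation. The names active_version, get_active_version, and record_call are hypothetical, and the sketch assumes that leaving the block restores whichever version was active before it.

```python
from contextlib import contextmanager
from contextvars import ContextVar
from typing import Iterator, Optional

# Holds the name of the currently active version (None when nothing is set).
_active_version: ContextVar[Optional[str]] = ContextVar("active_version", default=None)


def get_active_version() -> Optional[str]:
    return _active_version.get()


@contextmanager
def active_version(name: str) -> Iterator[str]:
    """Make `name` the active version inside the with block only."""
    token = _active_version.set(name)
    try:
        yield name
    finally:
        # Restore the previous value, even if the block raised an exception.
        _active_version.reset(token)


def record_call(prompt: str) -> dict:
    """Stand-in for a traced LLM call: tags the record with the active version."""
    return {"prompt": prompt, "version": get_active_version()}


with active_version("agent-v1"):
    first = record_call("How do I reset my password?")
    with active_version("agent-v2"):
        second = record_call("How do I reset my password?")
    third = record_call("How do I reset my password?")
```

In this sketch, first and third are tagged with agent-v1 and second with agent-v2, because each block restores the previous value on exit.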
Compare Versions Systematically
import pandas as pd
# Evaluate multiple versions against the same test set
eval_data = pd.DataFrame(
    {
        "inputs": test_questions,
        "expected_categories": ["account", "business_info", "billing"],
    }
)
# commit_v1 and commit_v2 are placeholders for the short git hashes of the two versions.
# Version A: Original configuration
results_v1 = mlflow.evaluate(
    model=f"models:/{app_name}-{commit_v1}",  # mlflow.evaluate takes the URI as `model`
    data=eval_data,
    extra_metrics=[
        mlflow.metrics.toxicity(),
        mlflow.metrics.latency(),
        mlflow.metrics.flesch_kincaid_grade_level(),
    ],
)
# Version B: Improved prompts
results_v2 = mlflow.evaluate(
    model=f"models:/{app_name}-{commit_v2}",
    data=eval_data,
    extra_metrics=[
        mlflow.metrics.toxicity(),
        mlflow.metrics.latency(),
        mlflow.metrics.flesch_kincaid_grade_level(),
    ],
)
# ✅ Side-by-side comparison shows which version performs better
Objective metrics remove guesswork from version selection.
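One way to turn two sets of evaluation metrics into a side-by-side comparison is sketched below. It works on plain dictionaries of metric name to value, which is the shape of the metrics attribute on the result returned by mlflow.evaluate. The helper compare_metrics is hypothetical, and the metric names and numbers are made-up placeholders for illustration, not measured results.

```python
from typing import Dict, List, Set, Tuple


def compare_metrics(
    version_a: Dict[str, float],
    version_b: Dict[str, float],
    lower_is_better: Set[str],
) -> List[Tuple[str, float, float, str]]:
    """Return (metric, value_a, value_b, winner) for metrics present in both versions."""
    rows = []
    for name in sorted(set(version_a) & set(version_b)):
        value_a, value_b = version_a[name], version_b[name]
        if value_a == value_b:
            winner = "tie"
        else:
            a_wins = value_a < value_b if name in lower_is_better else value_a > value_b
            winner = "A" if a_wins else "B"
        rows.append((name, value_a, value_b, winner))
    return rows


# Placeholder values only; in practice these would come from
# results_v1.metrics and results_v2.metrics.
metrics_a = {"toxicity_mean": 0.02, "latency_mean": 1.8, "helpfulness_mean": 3.9}
metrics_b = {"toxicity_mean": 0.01, "latency_mean": 2.1, "helpfulness_mean": 4.2}

rows = compare_metrics(metrics_a, metrics_b, lower_is_better={"toxicity_mean", "latency_mean"})
for name, value_a, value_b, winner in rows:
    print(f"{name:<20} A={value_a:<6} B={value_b:<6} winner={winner}")
```

Declaring which metrics are lower-is-better up front keeps the comparison explicit. A metric such as latency would otherwise be scored in the wrong direction.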
Prerequisites
Ready to implement systematic version tracking? You'll need:
- MLflow 3.0+ (pip install --upgrade "mlflow>=3.1")
- Git repository for your application code
- Python 3.10+
- LLM API access (OpenAI, Anthropic, etc.)
For Databricks-hosted MLflow Tracking: pip install --upgrade "mlflow[databricks]>=3.1"
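A quick way to confirm the Python and MLflow prerequisites before running the examples is sketched below, using only the standard library. The helper names are hypothetical. The MLflow minimum is taken from the pip constraint above ("mlflow>=3.1"), and the version parsing is deliberately simple: it reads only the leading numeric components.

```python
import re
import sys
from importlib import metadata
from typing import List, Optional, Tuple

MIN_PYTHON = (3, 10)
MIN_MLFLOW = (3, 1)  # Assumption: follows the pip constraint "mlflow>=3.1"


def parse_version(text: str) -> Tuple[int, ...]:
    """Extract the leading numeric components, e.g. '3.1.0rc1' -> (3, 1, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no numeric version found in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed: str, minimum: Tuple[int, ...]) -> bool:
    return parse_version(installed) >= minimum


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_prerequisites() -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required")
    mlflow_version = installed_version("mlflow")
    if mlflow_version is None:
        problems.append("mlflow is not installed")
    elif not meets_minimum(mlflow_version, MIN_MLFLOW):
        problems.append(f"mlflow {mlflow_version} is older than {MIN_MLFLOW[0]}.{MIN_MLFLOW[1]}")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

This check does not cover git or LLM API access, which depend on your environment and provider.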
Advanced Version Tracking Capabilities
Once you've mastered basic version tracking, explore these advanced patterns for production GenAI applications.
Track Application Versions
Learn comprehensive patterns for versioning complex GenAI applications with external code management
Deployment Strategies
Deploy versioned applications with confidence using MLflow's serving capabilities
Start with the code examples above, then explore the advanced capabilities as your application grows in complexity.