# MLflow

## Docs

- [Community](/docs/latest/community.md): Welcome to the MLflow community! Connect with thousands of data scientists, ML engineers, software engineers, and practitioners who are building the future of machine learning, Agent, and GenAI applications together.
- [Usage Tracking](/docs/latest/community/usage-tracking.md): Starting with version 3.2.0, MLflow collects anonymized usage data by default. This data contains no sensitive or personally identifiable information.
- [MLflow GenAI: Ship High-quality GenAI, Fast](/docs/latest/genai.md): MLflow GenAI is an open-source, all-in-one integrated platform that helps enhance your Agent & GenAI applications with end-to-end observability, evaluations, AI gateway, prompt management & optimization, and tracking.
- [Ground Truth Expectations](/docs/latest/genai/assessments/expectations.md): MLflow Expectations provide a systematic way to capture ground truth - the correct or desired outputs that your AI should produce. By establishing these reference points, you create the foundation for meaningful evaluation and continuous improvement of your GenAI applications.
- [Feedback Collection](/docs/latest/genai/assessments/feedback.md): MLflow Feedback provides a comprehensive system for capturing quality evaluations from multiple sources - whether automated AI judges, programmatic rules, or human reviewers. This systematic approach to feedback collection enables you to understand and improve your GenAI application's performance at scale.
- [Evaluation Dataset Concepts](/docs/latest/genai/concepts/evaluation-datasets.md): Evaluation Datasets require an MLflow Tracking Server with a SQL backend (PostgreSQL, MySQL, SQLite, or MSSQL).
- [Expectation Concepts](/docs/latest/genai/concepts/expectations.md): What are Expectations?
- [Feedback Concepts](/docs/latest/genai/concepts/feedback.md): What is Feedback?
- [Scorer Concepts](/docs/latest/genai/concepts/scorers.md): What are Scorers?
- [Spans](/docs/latest/genai/concepts/span.md): What is a Span?
- [Trace Concepts](/docs/latest/genai/concepts/trace.md): What is Tracing?
- [Feedback Concepts](/docs/latest/genai/concepts/trace/feedback.md): This guide introduces the core concepts of feedback and assessment in MLflow's GenAI evaluation framework. Understanding these concepts is essential for effectively measuring and improving the quality of your GenAI applications.
- [Building MLflow evaluation datasets](/docs/latest/genai/datasets.md): To systematically test and improve a GenAI application, you use an evaluation dataset. An evaluation dataset is a selected set of example inputs — either labeled (with known expected outputs, i.e. ground-truth expectations) or unlabeled (without ground truth). Evaluation datasets help you improve your app's performance in the following ways:
- [Conversation Simulation Datasets](/docs/latest/genai/datasets/conversation-simulation.md): Store and manage test cases for conversation simulation using MLflow Evaluation Datasets. This enables reproducible multi-turn testing across agent versions.
- [End-to-End Workflow: Evaluation-Driven Development](/docs/latest/genai/datasets/end-to-end-workflow.md): This guide demonstrates the complete workflow for building and evaluating GenAI applications using MLflow's evaluation-driven development approach.
- [Evaluation Datasets SDK Reference](/docs/latest/genai/datasets/sdk-guide.md): Complete API reference for creating, managing, and querying evaluation datasets programmatically.
- [Evaluating LLMs/Agents with MLflow](/docs/latest/genai/eval-monitor.md): MLflow's evaluation and monitoring capabilities help you systematically measure, improve, and maintain the quality of your GenAI applications throughout their lifecycle, from development through production.
- [AI Issue Discovery](/docs/latest/genai/eval-monitor/ai-insights/ai-issue-discovery.md): Automatically analyze traces in your MLflow experiments to find operational issues, quality problems, and performance patterns. The Analyze Experiment tool uses hypothesis-driven analysis to systematically examine your GenAI application's behavior, identify the most important problems, and create a plan for addressing them, delivered as a comprehensive markdown report.
- [Automatic Evaluation](/docs/latest/genai/eval-monitor/automatic-evaluations.md): Automatically evaluate traces and multi-turn conversations as they're logged - no code required.
- [Evaluate & Monitor FAQ](/docs/latest/genai/eval-monitor/faq.md): This page addresses frequently asked questions about MLflow's GenAI evaluation.
- [Migrating from Legacy LLM Evaluation](/docs/latest/genai/eval-monitor/legacy-llm-evaluation.md): LLM evaluation involves assessing how well a model performs on a task. MLflow provides a simple API to evaluate your LLMs with popular metrics.
- [GenAI Evaluation Quickstart](/docs/latest/genai/eval-monitor/notebooks/quickstart-eval.md): Download this notebook
- [GenAI Evaluation Quickstart](/docs/latest/genai/eval-monitor/quickstart.md): Need help setting up evaluation? Try MLflow Assistant - a powerful AI assistant that can help you set up evaluation for your project.
- [Evaluating Agents](/docs/latest/genai/eval-monitor/running-evaluation/agents.md): AI Agents are an emerging pattern of GenAI applications that can use tools, make decisions, and execute multi-step workflows. However, evaluating the performance of such complex agents is challenging. MLflow provides a powerful toolkit to systematically and precisely evaluate agent behavior using traces and scorers.
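The data/predict_fn/scorer pattern these evaluation pages describe can be sketched in plain Python. This is a toy harness under stated assumptions (dict-shaped examples, keyword-argument predict functions), not MLflow's actual `evaluate` implementation:

```python
def run_evaluation(data, predict_fn, scorers):
    """Toy evaluation loop: call the app on each example, then apply every
    scorer to the (inputs, outputs) pair. MLflow's real harness additionally
    logs traces and aggregates results in the tracking server."""
    results = []
    for example in data:
        outputs = predict_fn(**example["inputs"])
        scores = {s.__name__: s(example["inputs"], outputs) for s in scorers}
        results.append(
            {"inputs": example["inputs"], "outputs": outputs, "scores": scores}
        )
    return results
```

A scorer here is any callable taking `(inputs, outputs)`; the real API offers richer signatures (expectations, traces).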
- [Conversation Simulation](/docs/latest/genai/eval-monitor/running-evaluation/conversation-simulation.md): Conversation simulation enables you to generate synthetic multi-turn conversations for testing your conversational AI agents. Instead of manually creating test conversations or waiting for production data, you can define test scenarios and let MLflow automatically simulate realistic user interactions.
- [MLflow evaluation examples for GenAI](/docs/latest/genai/eval-monitor/running-evaluation/eval-examples.md): This page presents some common usage patterns for the evaluation harness, including data patterns and predict_fn patterns.
- [Evaluate Conversations](/docs/latest/genai/eval-monitor/running-evaluation/multi-turn.md): Conversation evaluation enables you to assess entire conversation sessions rather than individual turns. This is essential for evaluating conversational AI systems where quality emerges over multiple interactions, such as user frustration patterns, conversation completeness, or overall dialogue coherence.
- [Evaluating Prompts](/docs/latest/genai/eval-monitor/running-evaluation/prompts.md): Prompts are the core components of GenAI applications. However, iterating on prompts can be challenging because it is hard to know whether a new prompt is better than the old one. MLflow provides a framework to systematically evaluate prompt templates and track performance over time.
- [Evaluating (Production) Traces](/docs/latest/genai/eval-monitor/running-evaluation/traces.md): Traces are the core data of MLflow. They capture the complete execution flow of your LLM applications. Evaluating traces is a powerful way to understand the performance of your LLM applications and gain insights for quality improvement.
- [LLM Judges and Scorers](/docs/latest/genai/eval-monitor/scorers.md): Judges are a key component of the MLflow GenAI evaluation framework. They provide a unified interface to define evaluation criteria for your models, agents, and applications. As their name suggests, judges judge how well your application did against the evaluation criteria. The result can be pass/fail, true/false, a numerical value, or a categorical value.
- [Create custom code-based scorers](/docs/latest/genai/eval-monitor/scorers/custom.md): Custom code-based scorers offer the ultimate flexibility to define precisely how your GenAI application's quality is measured. You can define evaluation metrics tailored to your specific business use case, whether based on simple heuristics, advanced logic, or programmatic evaluations.
- [Code-based scorer examples](/docs/latest/genai/eval-monitor/scorers/custom/code-examples.md): In MLflow Evaluation for GenAI, custom code-based scorers allow you to define flexible evaluation metrics for your AI agent or application. This set of examples illustrates many patterns for using code-based scorers with different options for inputs, outputs, implementation, and error handling.
- [Develop code-based scorers](/docs/latest/genai/eval-monitor/scorers/custom/tutorial.md): In MLflow Evaluation for GenAI, custom code-based scorers allow you to define flexible evaluation metrics for your AI agent or application.
- [Judge Alignment: Teaching AI to Match Human Preferences](/docs/latest/genai/eval-monitor/scorers/llm-judge/alignment.md): Transform generic judges into domain experts.
- [Custom Judges](/docs/latest/genai/eval-monitor/scorers/llm-judge/custom-judges.md): Custom LLM judges let you define complex and nuanced judging guidelines for GenAI applications using natural language.
- [Create a custom judge using make_judge()](/docs/latest/genai/eval-monitor/scorers/llm-judge/custom-judges/create-custom-judge.md): Custom judges are LLM-based judges that evaluate your GenAI agents against specific quality criteria. This tutorial shows you how to create custom judges and use them to evaluate a customer support agent using make_judge().
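As a concrete illustration of the heuristics a code-based scorer can implement, here is a minimal keyword-coverage check in plain Python. The function name and signature are illustrative; in MLflow you would typically register such a function as a scorer, a wrapper omitted here to keep the sketch dependency-free:

```python
# Hypothetical heuristic scorer logic: the fraction of expected keywords
# that appear in the application's output. In MLflow, a function like this
# would be wrapped/registered as a scorer; that wiring is omitted.

def keyword_coverage(outputs: str, expected_keywords: list[str]) -> float:
    """Return the fraction of expected keywords present in the output."""
    if not expected_keywords:
        return 1.0  # nothing to check counts as a pass
    text = outputs.lower()
    hits = sum(1 for kw in expected_keywords if kw.lower() in text)
    return hits / len(expected_keywords)
```

A numerical return value like this maps directly onto the pass/fail, true/false, numeric, or categorical result types that judges and scorers produce.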
- [Supported Models](/docs/latest/genai/eval-monitor/scorers/llm-judge/custom-judges/supported-models.md): When no model is specified, MLflow uses a default based on your environment:
- [Custom Alignment Optimizers](/docs/latest/genai/eval-monitor/scorers/llm-judge/custom-optimizers.md): MLflow's alignment system is designed as a plugin architecture, allowing you to create custom optimizers for different alignment strategies. This extensibility enables you to implement domain-specific optimization approaches while leveraging MLflow's judge infrastructure.
- [GEPA Alignment Optimizer](/docs/latest/genai/eval-monitor/scorers/llm-judge/gepa.md): MLflow provides the GEPA alignment optimizer using DSPy's implementation of GEPA (Genetic-Pareto). GEPA uses LLM-driven reflection to analyze execution traces and iteratively propose improved judge instructions based on human feedback.
- [Create a guidelines LLM Judge](/docs/latest/genai/eval-monitor/scorers/llm-judge/guidelines.md): Guidelines LLM judges use pass/fail natural language criteria to evaluate GenAI outputs. They excel at evaluating:
- [MemAlign Optimizer (Experimental)](/docs/latest/genai/eval-monitor/scorers/llm-judge/memalign.md): MemAlign is an experimental optimizer. The API may change in future releases.
- [Built-in LLM Judges](/docs/latest/genai/eval-monitor/scorers/llm-judge/predefined.md): MLflow provides several pre-configured LLM judges optimized for common evaluation scenarios.
- [Bring Your Own Prompts](/docs/latest/genai/eval-monitor/scorers/llm-judge/prompt.md): The custom_prompt_judge API is being phased out. We strongly recommend using the make_judge API instead, which provides:
- [RAG Evaluation with Built-in Judges](/docs/latest/genai/eval-monitor/scorers/llm-judge/rag.md): Retrieval-Augmented Generation (RAG) systems combine retrieval and generation to provide contextually relevant responses. Evaluating RAG applications requires assessing both the retrieval quality (are the right documents retrieved?) and the generation quality (is the response grounded in those documents?).
- [RetrievalSufficiency judge](/docs/latest/genai/eval-monitor/scorers/llm-judge/rag/context-sufficiency.md): The RetrievalSufficiency judge evaluates whether the retrieved context (from RAG applications, agents, or any system that retrieves documents) contains enough information to adequately answer the user's request, based on the ground truth label provided as expected_facts or an expected_response.
- [RetrievalGroundedness judge](/docs/latest/genai/eval-monitor/scorers/llm-judge/rag/groundedness.md): The RetrievalGroundedness judge assesses whether your application's response is factually supported by the provided context (either from a RAG system or generated by a tool call), helping detect hallucinations or statements not backed by that context.
- [Answer and Context Relevance Judges](/docs/latest/genai/eval-monitor/scorers/llm-judge/rag/relevance.md): MLflow provides two built-in LLM judges to assess relevance in your GenAI applications. These judges help diagnose quality issues - if context isn't relevant, the generation step cannot produce a helpful response.
- [Correctness Judge](/docs/latest/genai/eval-monitor/scorers/llm-judge/response-quality/correctness.md): The Correctness judge assesses whether your GenAI application's response is factually correct by comparing it against provided ground truth information (expected_facts or expected_response).
- [Safety Judge](/docs/latest/genai/eval-monitor/scorers/llm-judge/response-quality/safety.md): The Safety judge assesses the safety of given content (whether generated by the application or provided by a user), checking for harmful, unethical, or inappropriate material.
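To make the groundedness idea concrete, here is a deliberately crude token-overlap proxy in plain Python. Real groundedness judges use an LLM to reason about factual support; this toy only illustrates the shape of the check (response vs. retrieved context), and the threshold value is an arbitrary assumption:

```python
def toy_groundedness(response: str, context: str, threshold: float = 0.5) -> bool:
    """Crude proxy for groundedness: is most of the response's vocabulary
    present in the retrieved context? Built-in judges use an LLM instead
    of token overlap, so treat this purely as an illustration."""
    resp_tokens = set(response.lower().split())
    ctx_tokens = set(context.lower().split())
    if not resp_tokens:
        return True  # an empty response asserts nothing ungrounded
    overlap = len(resp_tokens & ctx_tokens) / len(resp_tokens)
    return overlap >= threshold
```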
- [SIMBA Alignment Optimizer](/docs/latest/genai/eval-monitor/scorers/llm-judge/simba.md): MLflow provides the default alignment optimizer using DSPy's implementation of SIMBA (Simplified Multi-Bootstrap Aggregation). When you call align() without specifying an optimizer, the SIMBA optimizer is used automatically.
- [Tool Call Evaluation with Built-in Judges](/docs/latest/genai/eval-monitor/scorers/llm-judge/tool-call.md): AI agents often use tools (functions) to complete tasks - from fetching data to performing calculations. Evaluating tool-calling applications requires assessing whether agents select appropriate tools and provide correct arguments to fulfill user requests.
- [ToolCallCorrectness Judge](/docs/latest/genai/eval-monitor/scorers/llm-judge/tool-call/correctness.md): The ToolCallCorrectness judge evaluates whether the tools called by an agent, and the arguments they are called with, are correct given the user request.
- [ToolCallEfficiency Judge](/docs/latest/genai/eval-monitor/scorers/llm-judge/tool-call/efficiency.md): The ToolCallEfficiency judge evaluates the agent's trajectory for redundancy in tool usage, such as tool calls with the same or similar arguments.
- [End-to-End Judge Workflow](/docs/latest/genai/eval-monitor/scorers/llm-judge/workflow.md): Complete workflow for developing, testing, and deploying custom LLM judges.
- [Third-party Scorers](/docs/latest/genai/eval-monitor/scorers/third-party.md): MLflow integrates with popular third-party evaluation frameworks, allowing you to leverage their specialized metrics within MLflow's evaluation workflow. This provides access to each third-party library's evaluation metrics while maintaining a consistent MLflow interface.
- [DeepEval](/docs/latest/genai/eval-monitor/scorers/third-party/deepeval.md): DeepEval is a comprehensive evaluation framework for LLM applications that provides metrics for RAG systems, agents, conversational AI, and safety evaluation. MLflow's DeepEval integration allows you to use most DeepEval metrics as MLflow scorers.
- [Guardrails AI](/docs/latest/genai/eval-monitor/scorers/third-party/guardrails.md): Guardrails AI is a framework for validating LLM outputs using a community-driven hub of validators for safety, PII detection, content quality, and more. MLflow's Guardrails AI integration allows you to use Guardrails validators as MLflow scorers, providing rule-based evaluation without requiring LLM calls.
- [Arize Phoenix](/docs/latest/genai/eval-monitor/scorers/third-party/phoenix.md): Arize Phoenix is an open-source LLM observability and evaluation framework from Arize AI. MLflow's Phoenix integration allows you to use Phoenix evaluators as MLflow scorers for detecting hallucinations, evaluating relevance, identifying toxicity, and more.
- [RAGAS](/docs/latest/genai/eval-monitor/scorers/third-party/ragas.md): RAGAS (Retrieval Augmented Generation Assessment) is an evaluation framework designed for LLM applications. MLflow's RAGAS integration allows you to use RAGAS metrics as MLflow judges for evaluating retrieval quality, answer generation, and other aspects of LLM applications.
- [TruLens](/docs/latest/genai/eval-monitor/scorers/third-party/trulens.md): TruLens is an evaluation and observability framework for LLM applications that provides feedback functions for RAG systems and agent trace analysis. MLflow's TruLens integration allows you to use TruLens feedback functions as MLflow scorers, including benchmarked goal-plan-action alignment evaluations for agent traces.
- [Registering and Versioning Scorers](/docs/latest/genai/eval-monitor/scorers/versioning.md): Scorers can be registered to MLflow experiments for version control and team collaboration.
- [MLflow GenAI Packaging Integrations](/docs/latest/genai/flavors.md): MLflow 3 delivers built-in support for packaging and deploying applications written with the GenAI frameworks you depend on. Whether you're orchestrating chains with LangChain or LangGraph, indexing documents in LlamaIndex, wiring up agent patterns via ChatModel and ResponsesAgent, or rolling your own with a PythonModel, MLflow provides native packaging and deployment APIs ("flavors") to streamline your path to production.
- [Tutorial: Custom GenAI Models using ChatModel](/docs/latest/genai/flavors/chat-model-guide.md): Starting in MLflow 3.0.0, we recommend ResponsesAgent instead of ChatModel. See more details in the ResponsesAgent Introduction.
- [Build a tool-calling model with mlflow.pyfunc.ChatModel](/docs/latest/genai/flavors/chat-model-guide/chat-model-tool-calling.md): Download this notebook
- [Tutorial: Getting Started with ChatModel](/docs/latest/genai/flavors/chat-model-intro.md): Starting in MLflow 3.0.0, we recommend ResponsesAgent instead of ChatModel. See more details in the ResponsesAgent Introduction.
- [Deploying Advanced LLMs with Custom PyFuncs in MLflow](/docs/latest/genai/flavors/custom-pyfunc-for-llms.md): Starting in MLflow 3.0.0, we recommend ResponsesAgent instead of ChatModel. See more details in the ResponsesAgent Introduction.
- [Serving LLMs with MLflow: Leveraging Custom PyFunc](/docs/latest/genai/flavors/custom-pyfunc-for-llms/notebooks/custom-pyfunc-advanced-llm.md): Download this notebook
- [MLflow DSPy Flavor](/docs/latest/genai/flavors/dspy.md): The dspy flavor is under active development and is marked as Experimental. Public APIs are evolving, and new features are being added to enhance its functionality.
- [DSPy Quickstart](/docs/latest/genai/flavors/dspy/notebooks/dspy_quickstart.md): Download this notebook
- [DSPy Optimizer Autologging](/docs/latest/genai/flavors/dspy/optimizer.md): A DSPy optimizer is an algorithm that tunes the parameters of a DSPy program (i.e., the prompts and/or the LM weights) to maximize the metrics you specify.
- [MLflow LangChain Flavor](/docs/latest/genai/flavors/langchain.md): The langchain flavor is under active development and is marked as Experimental. Public APIs are evolving, and new features are being added to enhance its functionality.
- [MLflow Langchain Autologging](/docs/latest/genai/flavors/langchain/autologging.md): The MLflow LangChain flavor supports autologging, a powerful feature that allows you to log crucial details about the LangChain model and execution without the need for explicit logging statements. MLflow LangChain autologging covers various aspects of the model, including traces, models, signatures, and more.
- [LangChain within MLflow (Experimental)](/docs/latest/genai/flavors/langchain/guide.md): The langchain flavor is currently under active development and is marked as Experimental. Public APIs are evolving, and new features are being added to enhance its functionality.
- [Introduction to Using LangChain with MLflow](/docs/latest/genai/flavors/langchain/notebooks/langchain-quickstart.md): Download this notebook
- [Introduction to RAG with MLflow and LangChain](/docs/latest/genai/flavors/langchain/notebooks/langchain-retriever.md): Download this notebook
- [MLflow LlamaIndex Flavor](/docs/latest/genai/flavors/llama-index.md): Introduction
- [Introduction to Using LlamaIndex with MLflow](/docs/latest/genai/flavors/llama-index/notebooks/llama_index_quickstart.md): Download this notebook
- [Building a Tool-calling Agent with LlamaIndex Workflow and MLflow](/docs/latest/genai/flavors/llama-index/notebooks/llama_index_workflow_tutorial.md): Download this notebook
- [ResponsesAgent Introduction](/docs/latest/genai/flavors/responses-agent-intro.md): What is a ResponsesAgent?
- [Set Up MLflow Server](/docs/latest/genai/getting-started/connect-environment.md): Learn how to set up the MLflow server for GenAI application development.
- [Try Managed MLflow](/docs/latest/genai/getting-started/databricks-trial.md): The Databricks Free Trial offers an opportunity to experience the Databricks platform without prior cloud provider access.
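The comparison a tool-call correctness check performs can be sketched in plain Python. The built-in judge reasons with an LLM rather than exact matching, and the tool-call shape below ({"name": ..., "args": ...}) is an illustrative assumption, not MLflow's trace schema:

```python
def tool_calls_match(actual: list[dict], expected: list[dict]) -> bool:
    """Toy order-insensitive comparison of tool calls (name + arguments).
    Only illustrates the shape of the check; the ToolCallCorrectness judge
    evaluates semantic correctness with an LLM, not exact equality."""
    def canonical(calls):
        # Normalize each call to a sortable (name, sorted-args) pair.
        return sorted(
            (call["name"], tuple(sorted(call.get("args", {}).items())))
            for call in calls
        )
    return canonical(actual) == canonical(expected)
```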
- [MLflow Assistant](/docs/latest/genai/getting-started/try-assistant.md): Supercharge your MLflow workflow with AI-powered coding assistants that understand your codebase and can set up MLflow automatically.
- [MLflow AI Gateway](/docs/latest/genai/governance/ai-gateway.md): MLflow AI Gateway provides a unified interface for deploying and managing multiple LLM providers within your organization. It simplifies interactions with services like OpenAI, Anthropic, and others through a single, secure endpoint.
- [Create and Manage API Keys](/docs/latest/genai/governance/ai-gateway/api-keys/create-and-manage.md): API keys serve as reusable credentials that can be shared across multiple endpoints. When you have several endpoints using the same provider, this approach simplifies both initial setup and ongoing credential management.
- [Encryption & Rotation](/docs/latest/genai/governance/ai-gateway/api-keys/key-rotation.md): This page covers API key security, including encryption configuration and credential rotation best practices.
- [Create and Manage Endpoints](/docs/latest/genai/governance/ai-gateway/endpoints/create-and-manage.md): Endpoints define how requests are routed to AI models. Each endpoint can use a single model or leverage advanced routing features like traffic splitting and fallbacks.
- [Model Providers](/docs/latest/genai/governance/ai-gateway/endpoints/model-providers.md): MLflow AI Gateway supports 100+ model providers through the LiteLLM integration. This page covers the major providers, their capabilities, and how to use their passthrough APIs.
- [Query Endpoints](/docs/latest/genai/governance/ai-gateway/endpoints/query-endpoints.md): Once you've created an endpoint, you can call it through several different API styles depending on your needs.
- [Gateway Server (Legacy)](/docs/latest/genai/governance/ai-gateway/legacy.md): The Gateway Server provides a YAML-based configuration approach for deploying and managing LLM endpoints. This legacy method offers flexibility for users who prefer file-based configuration and command-line server management.
- [AI Gateway Server Configuration](/docs/latest/genai/governance/ai-gateway/legacy/configuration.md): Configure providers, endpoints, and advanced settings for your MLflow AI Gateway.
- [AI Gateway Server Setup](/docs/latest/genai/governance/ai-gateway/legacy/setup.md): Get your MLflow AI Gateway up and running quickly with this step-by-step setup guide.
- [AI Gateway Server Usage](/docs/latest/genai/governance/ai-gateway/legacy/usage.md): Learn how to query your AI Gateway endpoints, integrate with applications, and leverage different APIs and tools.
- [AI Gateway Quickstart](/docs/latest/genai/governance/ai-gateway/quickstart.md): Get your AI Gateway running in minutes with this simple walkthrough.
- [Traffic Routing & Fallbacks](/docs/latest/genai/governance/ai-gateway/traffic-routing-fallbacks.md): Beyond basic endpoint configuration, MLflow AI Gateway supports advanced routing features that enable traffic splitting for A/B testing and automatic fallbacks for high availability.
- [Usage Tracking](/docs/latest/genai/governance/ai-gateway/usage-tracking.md): AI Gateway usage tracking logs all requests to an endpoint as traces, allowing you to monitor request volume, latency, errors, token consumption, and costs.
- [MLflow MCP Server](/docs/latest/genai/mcp.md): This feature is experimental and may change in future releases.
- [Prompt Registry](/docs/latest/genai/prompt-registry.md): MLflow Prompt Registry
- [Create and Edit Prompts](/docs/latest/genai/prompt-registry/create-and-edit-prompts.md): Learn how to create new prompts and edit existing ones in the MLflow Prompt Registry using both the UI and Python APIs.
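The fallback behavior described for the gateway can be illustrated with a client-side sketch in plain Python: try each endpoint in order until one succeeds. The gateway implements this (plus weighted traffic splitting) server-side, and the endpoint names and callables here are purely illustrative:

```python
def call_with_fallbacks(endpoints, request):
    """Toy fallback routing: try (name, callable) pairs in order until one
    succeeds. Illustration only; the real gateway handles this server-side."""
    errors = []
    for name, call in endpoints:
        try:
            return name, call(request)
        except Exception as exc:  # any provider error triggers the next fallback
            errors.append((name, exc))
    raise RuntimeError(f"all endpoints failed: {errors}")
```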
- [Evaluating Prompts](/docs/latest/genai/prompt-registry/evaluate-prompts.md): Combining MLflow Prompt Registry with MLflow LLM Evaluation enables you to evaluate prompt performance across different models and datasets, and track the evaluation results in a centralized registry. You can also inspect model outputs from the traces logged during evaluation to understand how the model responds to different prompts.
- [Log Prompts with Models](/docs/latest/genai/prompt-registry/log-with-model.md): Prompts are often used as a part of GenAI applications. Managing the association between prompts and models is crucial for tracking the evolution of models and ensuring consistency across different environments. MLflow Prompt Registry is integrated with MLflow's model tracking capability, allowing you to track which prompts (and versions) are used by your models and applications.
- [Manage Prompt Lifecycles](/docs/latest/genai/prompt-registry/manage-prompt-lifecycles-with-aliases.md): Discover how to use aliases in the MLflow Prompt Registry to manage the lifecycle of your prompts, from development to production, and for implementing governance.
- [Optimize Prompts](/docs/latest/genai/prompt-registry/optimize-prompts.md): The simple way to continuously improve your AI agents and prompts.
- [Optimizing Prompts for LangChain](/docs/latest/genai/prompt-registry/optimize-prompts/langchain-optimization.md): This guide demonstrates how to leverage MLflow's prompt optimization API alongside LangChain to enhance your chain's prompts automatically. The API is framework-agnostic, enabling you to perform end-to-end prompt optimization of your chains from any framework using state-of-the-art techniques. For more information about the API, please visit Optimize Prompts.
- [Optimizing Prompts for LangGraph](/docs/latest/genai/prompt-registry/optimize-prompts/langgraph-optimization.md): This guide demonstrates how to leverage MLflow's prompt optimization API alongside LangGraph to enhance your agent's prompts automatically. The API is framework-agnostic, enabling you to perform end-to-end prompt optimization of your graphs from any framework using state-of-the-art techniques. For more information about the API, please visit Optimize Prompts.
- [Optimizing Prompts for OpenAI Agents](/docs/latest/genai/prompt-registry/optimize-prompts/openai-agent-optimization.md): This guide demonstrates how to leverage MLflow's prompt optimization API alongside the OpenAI Agent framework to enhance your agent's prompts automatically. The API is framework-agnostic, enabling you to perform end-to-end prompt optimization of your agents from any framework using state-of-the-art techniques. For more information about the API, please visit Optimize Prompts.
- [Optimizing Prompts for Pydantic AI](/docs/latest/genai/prompt-registry/optimize-prompts/pydantic-ai-optimization.md): This guide demonstrates how to leverage MLflow's prompt optimization API alongside Pydantic AI to enhance your agent's prompts automatically. The API is framework-agnostic, enabling you to perform end-to-end prompt optimization of your agents from any framework using state-of-the-art techniques. For more information about the API, please visit Optimize Prompts.
- [Prompt Engineering UI (Experimental)](/docs/latest/genai/prompt-registry/prompt-engineering.md): Starting in MLflow 2.7, the MLflow Tracking UI provides a best-in-class experience for prompt engineering.
- [Auto-rewrite Prompts for New Models (Experimental)](/docs/latest/genai/prompt-registry/rewrite-prompts.md): When migrating to a new language model, you often discover that your carefully crafted prompts don't work as well with the new model. MLflow's API helps you automatically rewrite prompts to maintain output quality when switching models, using your existing application's outputs as training data.
- [Structured Output](/docs/latest/genai/prompt-registry/structured-output.md): Learn how to define structured output schemas for your prompts to ensure consistent and validated responses from language models.
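Prompt templates of the kind a registry stores commonly use double-brace variables. The sketch below re-implements that substitution locally for illustration; registered MLflow prompts expose their own formatting API, so treat this helper and its regex as assumptions:

```python
import re

def render_prompt(template: str, variables: dict) -> str:
    """Fill a double-brace template like "Hello, {{ name }}!".
    Illustration only; a registered prompt object would provide
    its own format/render method."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda m: str(variables[m.group(1)]),
        template,
    )
```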
- [Use Prompts in Apps](/docs/latest/genai/prompt-registry/use-prompts-in-apps.md): Learn how to integrate prompts from the MLflow Prompt Registry into your applications and link them to MLflow Models for end-to-end lineage.
- [Request Features](/docs/latest/genai/references/request-features.md): Your feedback drives our roadmap! Vote on the most requested features (👍) and share your ideas to help us build what matters most to you.
- [MLflow Model Serving](/docs/latest/genai/serving.md): Transform your trained models into production-ready inference servers with MLflow's comprehensive serving capabilities. Deploy locally, in the cloud, or through managed endpoints with standardized REST APIs.
- [MLflow Agent Server](/docs/latest/genai/serving/agent-server.md): Agent Server Features
- [Custom Serving Applications](/docs/latest/genai/serving/custom-apps.md): MLflow's custom serving applications allow you to build sophisticated model serving solutions that go beyond simple prediction endpoints. Using the PyFunc framework, you can create custom applications with complex preprocessing, postprocessing, multi-model inference, and business logic integration.
- [ResponsesAgent for Model Serving](/docs/latest/genai/serving/responses-agent.md): The ResponsesAgent class in MLflow provides a specialized interface for serving generative AI models that handle structured responses with tool calling capabilities. This agent is designed to work seamlessly with MLflow's serving infrastructure while providing compatibility with OpenAI-style APIs.
- [MLflow Tracing for LLM Observability](/docs/latest/genai/tracing.md): MLflow Tracing is a fully OpenTelemetry-compatible LLM observability solution for your applications. It captures the inputs, outputs, and metadata associated with each intermediate step of a request, enabling you to easily pinpoint the source of bugs and unexpected behaviors.
- [Automatic Tracing](/docs/latest/genai/tracing/app-instrumentation/automatic.md): MLflow Tracing is integrated with various GenAI libraries and provides a one-line automatic tracing experience for each library (and combinations of them!). This page shows detailed examples of integrating MLflow with popular GenAI libraries.
- [Distributed Tracing](/docs/latest/genai/tracing/app-instrumentation/distributed-tracing.md): When your application spans multiple services, you may want to connect spans from these services into a single trace for tracking the end-to-end execution in one place. MLflow supports this via Distributed Tracing, propagating the active trace context over HTTP so that spans recorded in different services are stitched together.
- [Manual Tracing](/docs/latest/genai/tracing/app-instrumentation/manual-tracing.md): In addition to the Auto Tracing integrations, you can instrument your GenAI application using MLflow's manual tracing APIs.
- [Tracing with OpenTelemetry](/docs/latest/genai/tracing/app-instrumentation/opentelemetry.md): OpenTelemetry is a CNCF-backed project that provides vendor-neutral observability APIs and SDKs to collect telemetry data from your applications. MLflow Tracing is fully compatible with OpenTelemetry, making it free from vendor lock-in.
- [Setting Trace Tags](/docs/latest/genai/tracing/attach-tags.md): Tags are mutable key-value pairs that you can attach to traces to add valuable labels and context for grouping and filtering traces. For example, you can tag traces based on the topic of the user's input or the type of request being processed, and group them together for analysis and quality evaluation.
- [Collect User Feedback](/docs/latest/genai/tracing/collect-user-feedback.md): Capturing user feedback is critical for understanding the real-world quality of your GenAI application. MLflow's Feedback API provides a structured, standardized approach to collecting, storing, and analyzing user feedback directly within your traces.
- [Tracing FAQ](/docs/latest/genai/tracing/faq.md): Getting Started and Basic Usage
- [Auto Tracing Integrations](/docs/latest/genai/tracing/integrations.md): MLflow Tracing is integrated with 40+ popular Generative AI libraries and frameworks, offering a one-line automatic tracing experience. This allows you to gain immediate observability into your GenAI applications with minimal setup.
- [Contributing to MLflow Tracing](/docs/latest/genai/tracing/integrations/contribute.md): Welcome to the MLflow Tracing contribution guide! This step-by-step resource will assist you in implementing additional GenAI library integrations for tracing in MLflow.
- [Tracing AG2🤖](/docs/latest/genai/tracing/integrations/listing/ag2.md): AG2 Tracing via autolog
- [Tracing Agno](/docs/latest/genai/tracing/integrations/listing/agno.md): Agno is a flexible agent framework for orchestrating LLMs, reasoning steps, tools, and memory into a unified pipeline.
- [Tracing Anthropic](/docs/latest/genai/tracing/integrations/listing/anthropic.md): MLflow Tracing provides automatic tracing capability for Anthropic LLMs. By enabling auto tracing, MLflow captures your Anthropic API calls as traces.
- [Tracing AutoGen](/docs/latest/genai/tracing/integrations/listing/autogen.md): AutoGen tracing via autolog
- [Tracing Amazon Bedrock with MLflow](/docs/latest/genai/tracing/integrations/listing/bedrock.md): MLflow supports automatic tracing for Amazon Bedrock, a fully managed service on AWS that provides high-performing foundation models.
- [Tracing Amazon Bedrock AgentCore](/docs/latest/genai/tracing/integrations/listing/bedrock-agentcore.md): Enable OpenTelemetry in Amazon Bedrock AgentCore
- [Tracing BytePlus](/docs/latest/genai/tracing/integrations/listing/byteplus.md)
- [Tracing Claude Code](/docs/latest/genai/tracing/integrations/listing/claude_code.md): Claude Code Tracing via CLI autolog
- [Tracing Cohere](/docs/latest/genai/tracing/integrations/listing/cohere.md)
- [Tracing CrewAI](/docs/latest/genai/tracing/integrations/listing/crewai.md): CrewAI is an open-source framework for orchestrating role-playing, autonomous AI agents.
- [Tracing Databricks](/docs/latest/genai/tracing/integrations/listing/databricks.md): Databricks offers a unified platform for data, analytics, and AI. Databricks Foundation Model APIs provide an OpenAI-compatible API format for accessing state-of-the-art models such as OpenAI GPT, Anthropic Claude, Google Gemini, and more, through a single platform. Since Databricks Foundation Model APIs are OpenAI-compatible, you can use MLflow Tracing to trace your interactions with them.
- [Tracing Databricks AI Gateway](/docs/latest/genai/tracing/integrations/listing/databricks-ai-gateway.md): Databricks AI Gateway (formerly Mosaic AI Gateway) is the Databricks solution for governing and monitoring access to generative AI models and their associated model serving endpoints. It is a centralized service that brings governance, monitoring, and production readiness to model serving endpoints.
- [Tracing LangChain Deep Agent](/docs/latest/genai/tracing/integrations/listing/deepagent.md): LangChain Deep Agent is an open-source library for building autonomous agents that can plan, research, and execute complex tasks. Deep Agent is built on top of LangGraph, providing a high-level abstraction for creating sophisticated agents with built-in capabilities like to-do management, file operations, and spawning specialized subagents.
- [Tracing DeepSeek](/docs/latest/genai/tracing/integrations/listing/deepseek.md)
- [Tracing DSPy🧩](/docs/latest/genai/tracing/integrations/listing/dspy.md): DSPy is an open-source framework for building modular AI systems and offers algorithms for optimizing their prompts and weights.
- [Tracing FireworksAI](/docs/latest/genai/tracing/integrations/listing/fireworksai.md): FireworksAI is an inference and customization engine for open-source AI. It provides day-zero access to the latest SOTA OSS models and allows developers to build lightning-fast AI applications.
- [Tracing Gemini](/docs/latest/genai/tracing/integrations/listing/gemini.md): MLflow Tracing provides automatic tracing capability for Google Gemini. By enabling auto tracing, MLflow captures your Gemini API calls as traces.
- [Tracing Google Agent Development Kit (ADK)](/docs/latest/genai/tracing/integrations/listing/google-adk.md): MLflow Tracing provides automatic tracing capability for Google ADK, a flexible and modular AI agent framework developed by Google. MLflow supports tracing for Google ADK through the OpenTelemetry integration.
- [Tracing Groq](/docs/latest/genai/tracing/integrations/listing/groq.md): MLflow Tracing provides automatic tracing capability when using Groq.
- [Tracing Haystack](/docs/latest/genai/tracing/integrations/listing/haystack.md): Haystack is an open-source AI orchestration framework developed by deepset, designed to help Python developers build production-ready LLM-powered applications.
- [Tracing Helicone](/docs/latest/genai/tracing/integrations/listing/helicone.md)
- [Tracing Instructor](/docs/latest/genai/tracing/integrations/listing/instructor.md): Instructor Tracing via autolog
- [Tracing Kong AI Gateway](/docs/latest/genai/tracing/integrations/listing/kong.md)
- [Tracing Koog](/docs/latest/genai/tracing/integrations/listing/koog.md): Enable OpenTelemetry in Koog
- [Tracing LangChain🦜⛓️](/docs/latest/genai/tracing/integrations/listing/langchain.md): LangChain is an open-source framework for building LLM-powered applications.
- [Tracing Langflow](/docs/latest/genai/tracing/integrations/listing/langflow.md): Enable OpenTelemetry in Langflow via Traceloop
- [Tracing LangGraph🦜🕸️](/docs/latest/genai/tracing/integrations/listing/langgraph.md): LangGraph is an open-source library for building stateful, multi-actor applications with LLMs, used to create agent and multi-agent workflows.
- [Tracing LiteLLM🚄](/docs/latest/genai/tracing/integrations/listing/litellm.md): LiteLLM is an open-source LLM gateway that allows access to 100+ LLMs through a unified interface.
- [Tracing LiteLLM Proxy](/docs/latest/genai/tracing/integrations/listing/litellm-proxy.md): LiteLLM Proxy is a self-hosted LLM gateway that provides a unified OpenAI-compatible API to access 100+ LLM providers. It offers features like load balancing, spend tracking, and rate limiting across multiple providers.
- [Tracing LiveKit Agents](/docs/latest/genai/tracing/integrations/listing/livekit.md): MLflow Tracing provides automatic tracing capability for LiveKit Agents, an open-source framework for building real-time multimodal AI applications. MLflow supports tracing for LiveKit Agents through the OpenTelemetry integration.
- [Tracing LlamaIndex🦙](/docs/latest/genai/tracing/integrations/listing/llama_index.md): LlamaIndex is an open-source framework for building agentic generative AI applications that allow large language models to work with your data in any format.
- [Tracing Mastra](/docs/latest/genai/tracing/integrations/listing/mastra.md): MLflow Tracing provides automatic tracing capability for Mastra, a flexible and modular AI agent framework. MLflow supports tracing for Mastra through the OpenTelemetry integration.
- [Tracing Microsoft Agent Framework](/docs/latest/genai/tracing/integrations/listing/microsoft-agent-framework.md): MLflow Tracing provides automatic tracing capability for Microsoft Agent Framework, a flexible and modular AI agent framework developed by Microsoft. MLflow supports tracing for Microsoft Agent Framework through the OpenTelemetry integration.
- [Tracing Mistral](/docs/latest/genai/tracing/integrations/listing/mistral.md): MLflow Tracing ensures observability for your interactions with Mistral AI models.
- [Tracing MLflow AI Gateway](/docs/latest/genai/tracing/integrations/listing/mlflow-ai-gateway.md): MLflow AI Gateway is a unified, centralized interface for accessing multiple LLM providers. It simplifies API key management, provides a consistent API across providers, and enables seamless switching between models from OpenAI, Anthropic, Google, and other providers.
- [Tracing Kimi (Moonshot AI)](/docs/latest/genai/tracing/integrations/listing/moonshot.md)
- [Tracing Novita AI](/docs/latest/genai/tracing/integrations/listing/novitaai.md)
- [Tracing Ollama](/docs/latest/genai/tracing/integrations/listing/ollama.md): MLflow Tracing provides automatic tracing capability for Ollama models through the OpenAI SDK integration. Because Ollama exposes an OpenAI-compatible API, you can simply use mlflow.openai.autolog() to trace Ollama calls.
- [Tracing OpenAI](/docs/latest/genai/tracing/integrations/listing/openai.md): MLflow Tracing provides automatic tracing capability for OpenAI. By enabling auto tracing, MLflow captures your OpenAI API calls as traces.
- [Tracing OpenAI Agent🤖](/docs/latest/genai/tracing/integrations/listing/openai-agent.md): OpenAI Tracing via autolog
- [Tracing OpenRouter](/docs/latest/genai/tracing/integrations/listing/openrouter.md)
- [Tracing Pipecat](/docs/latest/genai/tracing/integrations/listing/pipecat.md): Enable OpenTelemetry in Pipecat
- [Tracing Portkey](/docs/latest/genai/tracing/integrations/listing/portkey.md)
- [Tracing PydanticAI](/docs/latest/genai/tracing/integrations/listing/pydantic_ai.md): PydanticAI Tracing via autolog
- [Tracing Pydantic AI Gateway](/docs/latest/genai/tracing/integrations/listing/pydantic-ai-gateway.md): Pydantic AI Gateway is a unified interface for accessing multiple AI providers with a single key. It supports models from OpenAI, Anthropic, Google Vertex, Groq, AWS Bedrock, and more. Key features include spending limits, failover management, and zero translation: requests flow through directly in each provider's native format, giving you immediate access to new model features as soon as they are released.
- [Tracing Quarkus LangChain4j](/docs/latest/genai/tracing/integrations/listing/quarkus-langchain4j.md): Enable OpenTelemetry in Quarkus LangChain4j
- [Tracing Qwen (DashScope)](/docs/latest/genai/tracing/integrations/listing/qwen.md)
- [Tracing Semantic Kernel](/docs/latest/genai/tracing/integrations/listing/semantic_kernel.md): Semantic Kernel is a lightweight, open-source SDK that functions as AI middleware, enabling you to integrate AI models into your C#, Python, or Java codebase via a uniform API layer. By abstracting model interactions, it lets you swap in new models without rewriting your application logic.
- [Tracing Smolagents](/docs/latest/genai/tracing/integrations/listing/smolagents.md): Smolagents tracing via autolog
- [Tracing Spring AI](/docs/latest/genai/tracing/integrations/listing/spring-ai.md): Spring AI Tracing
- [Tracing Strands Agents SDK](/docs/latest/genai/tracing/integrations/listing/strands.md): Strands Agents SDK is an open-source, model-driven SDK developed by AWS that enables developers to create autonomous AI agents.
- [Tracing Together AI](/docs/latest/genai/tracing/integrations/listing/togetherai.md)
- [Tracing TrueFoundry](/docs/latest/genai/tracing/integrations/listing/truefoundry.md)
- [Tracing txtai](/docs/latest/genai/tracing/integrations/listing/txtai.md): txtai Tracing via autolog
- [Tracing Vercel AI Gateway](/docs/latest/genai/tracing/integrations/listing/vercel-ai-gateway.md)
- [Tracing Vercel AI SDK](/docs/latest/genai/tracing/integrations/listing/vercelai.md): MLflow Tracing provides automatic tracing for applications built with the Vercel AI SDK (the ai package) via OpenTelemetry, unlocking powerful observability capabilities for TypeScript and JavaScript application developers.
- [Tracing VoltAgent](/docs/latest/genai/tracing/integrations/listing/voltagent.md): MLflow Tracing provides automatic tracing capability for VoltAgent, an open-source TypeScript framework for building AI agents. MLflow supports tracing for VoltAgent through the OpenTelemetry integration.
- [Tracing Watsonx Orchestrate](/docs/latest/genai/tracing/integrations/listing/watsonx-orchestrate.md): Reference
- [Tracing xAI / Grok](/docs/latest/genai/tracing/integrations/listing/xai-grok.md)
- [Production Tracing SDK](/docs/latest/genai/tracing/lightweight-sdk.md): MLflow offers a production tracing SDK package called mlflow-tracing that includes only the essential functionality for tracing and monitoring your GenAI applications. This package is designed for production environments where minimizing dependencies and deployment size is critical.
- [Delete Traces](/docs/latest/genai/tracing/observe-with-traces/delete-traces.md): You can delete traces based on specific criteria using the MlflowClient.delete_traces() method, which lets you delete traces by timestamp or by trace IDs.
- [Redacting Sensitive Data from Traces](/docs/latest/genai/tracing/observe-with-traces/masking.md): Traces capture powerful insights for debugging and monitoring your application; however, they may contain sensitive data, such as personally identifiable information (PII), that you don't want to share with others. MLflow provides a fully configurable way to mask sensitive data from traces before they are saved to the backend.
- [MLflow Tracing UI](/docs/latest/genai/tracing/observe-with-traces/ui.md): GenAI Experiment Overview
- [OpenTelemetry Integration](/docs/latest/genai/tracing/opentelemetry.md): OpenTelemetry is a CNCF-backed project that provides vendor-neutral observability APIs and SDKs to instrument your applications and collect telemetry data in a consistent way. MLflow Tracing is fully compatible with OpenTelemetry, making it free from vendor lock-in.
- [Export MLflow Traces/Metrics via OTLP](/docs/latest/genai/tracing/opentelemetry/export.md): Set Up OTLP Exporter
- [Collect OpenTelemetry Traces into MLflow](/docs/latest/genai/tracing/opentelemetry/ingest.md): Basic Example
- [ingest-shared](/docs/latest/genai/tracing/opentelemetry/ingest-shared.md): OpenTelemetry trace ingestion is supported in MLflow 3.6.0 and above.
- [Production Tracing and Monitoring](/docs/latest/genai/tracing/prod-tracing.md): When you deploy an agent or LLM application to production, real users behave differently than test data: they find edge cases, ask unexpected questions, and expose issues you didn't anticipate. This guide covers how to configure MLflow Tracing for production environments, including automatic (online) quality evaluation, to catch these issues early and continuously improve your application.
- [Tracing Quickstart](/docs/latest/genai/tracing/quickstart.md): Need help setting up tracing? Try MLflow Assistant, a powerful AI assistant that can add MLflow tracing to your project automatically.
- [Search Traces](/docs/latest/genai/tracing/search-traces.md): This guide walks you through how to search for traces in MLflow using both the MLflow UI and Python API. This resource will be valuable if you're interested in querying specific traces based on their metadata, tags, execution time, status, or other trace attributes.
- [Token Usage and Cost Tracking](/docs/latest/genai/tracing/token-usage-cost.md): MLflow automatically tracks token usage and cost for LLM calls within your traces. This enables you to monitor resource consumption and optimize costs across your GenAI applications.
- [Track Versions & Environments](/docs/latest/genai/tracing/track-environments-context.md): Tracking the execution environment and application version of your GenAI application allows you to debug performance and quality issues relative to the code. This metadata enables:
- [Track Users & Sessions](/docs/latest/genai/tracing/track-users-sessions.md): Many real-world AI applications use sessions to maintain multi-turn user interactions. MLflow Tracing provides built-in support for associating traces with users and grouping them into sessions. Tracking users and sessions in your GenAI application provides essential context for understanding user behavior, analyzing conversation flows, and improving personalization.
- [Version Tracking for GenAI Applications](/docs/latest/genai/version-tracking.md): Understand how MLflow enables version tracking for your complete GenAI applications using LoggedModels, linking code, configurations, evaluations, and traces.
- [Compare Application Versions with Traces](/docs/latest/genai/version-tracking/compare-app-versions.md): Compare different application versions using traces to track improvements and identify the best performing iteration.
- [Version Tracking Quickstart](/docs/latest/genai/version-tracking/quickstart.md): Build and track a LangChain-based chatbot with MLflow's version management capabilities. This quickstart demonstrates prompt versioning, application tracking, trace generation, and performance evaluation using MLflow's GenAI features.
- [Track versions of Git-based applications with MLflow](/docs/latest/genai/version-tracking/track-application-versions-with-mlflow.md): Learn how to track versions of your GenAI application when your app's code resides in Git, using MLflow's automatic Git versioning capabilities.
- [MLflow: A Tool for Managing the Machine Learning Lifecycle](/docs/latest/ml.md): MLflow is an open-source platform, purpose-built to assist machine learning practitioners and teams in handling the complexities of the machine learning process.
- [Community Model Flavors](/docs/latest/ml/community-model-flavors.md): MLflow's vibrant community has developed flavors for specialized ML frameworks and use cases, extending MLflow's capabilities beyond the built-in flavors. These community-maintained packages enable seamless integration with domain-specific tools for time series forecasting, anomaly detection, visualization, and more.
- [MLflow Dataset Tracking](/docs/latest/ml/dataset.md): The mlflow.data module is a comprehensive solution for dataset management throughout the machine learning lifecycle. It enables you to track, version, and manage datasets used in training, validation, and evaluation, providing complete lineage from raw data to model predictions.
- [MLflow for Deep Learning](/docs/latest/ml/deep-learning.md): MLflow provides comprehensive experiment tracking, model management, and deployment capabilities for deep learning workflows. From PyTorch training loops to TensorFlow models, MLflow streamlines your path from experimentation to production.
- [MLflow Keras 3.0 Integration](/docs/latest/ml/deep-learning/keras.md): Introduction
- [MLflow PyTorch Integration](/docs/latest/ml/deep-learning/pytorch.md): Introduction
- [MLflow Sentence Transformers Flavor](/docs/latest/ml/deep-learning/sentence-transformers.md): The MLflow Sentence Transformers flavor provides integration with the Sentence Transformers library for generating semantic embeddings from text.
- [Advanced Paraphrase Mining with Sentence Transformers and MLflow](/docs/latest/ml/deep-learning/sentence-transformers/tutorials/paraphrase-mining/paraphrase-mining-sentence-transformers.md): Download this notebook
- [Introduction to Sentence Transformers and MLflow](/docs/latest/ml/deep-learning/sentence-transformers/tutorials/quickstart/sentence-transformers-quickstart.md): Download this notebook
- [Advanced Semantic Search with Sentence Transformers and MLflow](/docs/latest/ml/deep-learning/sentence-transformers/tutorials/semantic-search/semantic-search-sentence-transformers.md): Download this notebook
- [Introduction to Advanced Semantic Similarity Analysis with Sentence Transformers and MLflow](/docs/latest/ml/deep-learning/sentence-transformers/tutorials/semantic-similarity/semantic-similarity-sentence-transformers.md): Download this notebook
- [MLflow spaCy Integration](/docs/latest/ml/deep-learning/spacy.md): Introduction
- [MLflow TensorFlow Integration](/docs/latest/ml/deep-learning/tensorflow.md): Introduction
- [MLflow Transformers Flavor](/docs/latest/ml/deep-learning/transformers.md): The MLflow Transformers flavor provides native integration with the Hugging Face Transformers library, supporting model logging, loading, and inference for NLP, audio, vision, and multimodal tasks.
- [🤗 Transformers within MLflow](/docs/latest/ml/deep-learning/transformers/guide.md): The transformers model flavor enables logging of transformers models, components, and pipelines.
- [Working with Large Models in MLflow Transformers flavor](/docs/latest/ml/deep-learning/transformers/large-models.md): The features described in this guide are intended for advanced users familiar with Transformers and MLflow. Please understand the limitations and potential risks associated with these features before use.
- [Tasks in MLflow Transformers Flavor](/docs/latest/ml/deep-learning/transformers/task.md): This page provides an overview of how to use the task parameter in the MLflow Transformers flavor to control the model's inference behavior.
- [MLflow Transformers Flavor - Tutorials and Guides](/docs/latest/ml/deep-learning/transformers/tutorials.md): Below, you will find a number of guides that focus on different transformers use cases leveraging MLflow's core features.
- [Introduction to MLflow and OpenAI's Whisper](/docs/latest/ml/deep-learning/transformers/tutorials/audio-transcription/whisper.md): Download this notebook
- [Introduction to Conversational AI with MLflow and DialoGPT](/docs/latest/ml/deep-learning/transformers/tutorials/conversational/conversational-model.md): Download this notebook
- [Deploying a Transformer model as an OpenAI-compatible Chatbot](/docs/latest/ml/deep-learning/transformers/tutorials/conversational/pyfunc-chat-model.md): Download this notebook
- [Fine-Tuning Transformers with MLflow for Enhanced Model Management](/docs/latest/ml/deep-learning/transformers/tutorials/fine-tuning/transformers-fine-tuning.md): Download this notebook
- [Fine-Tuning Open-Source LLM using QLoRA with MLflow and PEFT](/docs/latest/ml/deep-learning/transformers/tutorials/fine-tuning/transformers-peft.md): Download this notebook
- [Prompt Templating with MLflow and Transformers](/docs/latest/ml/deep-learning/transformers/tutorials/prompt-templating/prompt-templating.md): Download this notebook
- [Introduction to MLflow and Transformers](/docs/latest/ml/deep-learning/transformers/tutorials/text-generation/text-generation.md): Download this notebook
- [Introduction to Translation with Transformers and MLflow](/docs/latest/ml/deep-learning/transformers/tutorials/translation/component-translation.md): Download this notebook
- [MLflow Serving](/docs/latest/ml/deployment.md): After training your machine learning model and ensuring its performance, the next step is deploying it to a production environment.
- [Deploy MLflow Model as a Local Inference Server](/docs/latest/ml/deployment/deploy-model-locally.md): MLflow allows you to deploy your model locally using just a single command.
- [Deploy MLflow Model to Kubernetes](/docs/latest/ml/deployment/deploy-model-to-kubernetes.md): Using MLServer as the Inference Server
- [Develop ML model with MLflow and deploy to Kubernetes](/docs/latest/ml/deployment/deploy-model-to-kubernetes/tutorial.md): This tutorial assumes that you have access to a Kubernetes cluster. However, you can also complete this tutorial on your local machine.
- [Deploy MLflow Model to Modal](/docs/latest/ml/deployment/deploy-model-to-modal.md): Modal is a serverless cloud platform optimized for AI/ML workloads, offering on-demand GPU access.
- [Deploy MLflow Model to Amazon SageMaker](/docs/latest/ml/deployment/deploy-model-to-sagemaker.md): Amazon SageMaker is a fully managed service designed for scaling ML inference containers.
- [Official MLflow Docker image](/docs/latest/ml/docker.md): The official MLflow Docker image is available on GitHub Container Registry at https://ghcr.io/mlflow/mlflow.
- [Model Evaluation](/docs/latest/ml/evaluation.md): This documentation covers MLflow's classic evaluation system (mlflow.models.evaluate), which uses EvaluationMetric and make_metric for custom metrics.
- [Getting Started with the MLflow AI Engineering Platform](/docs/latest/ml/getting-started.md): If you're new to MLflow or seeking a refresher on its core functionalities, these guides will help you get up to speed.
- [Deep Learning Quickstart](/docs/latest/ml/getting-started/deep-learning.md): Need help setting up tracking? Try MLflow Assistant, a powerful AI assistant that can help you set up MLflow tracking for your project.
- [Tracking Hyperparameter Tuning with MLflow](/docs/latest/ml/getting-started/hyperparameter-tuning.md): Need help setting up tracking? Try MLflow Assistant, a powerful AI assistant that can help you set up MLflow tracking for your project.
- [MLflow Tracking Quickstart](/docs/latest/ml/getting-started/quickstart.md): Looking to use MLflow for LLM/Agent development? Check out the MLflow for GenAI documentation instead. This guide is intended for data scientists who train traditional machine learning models, such as decision trees.
- [Connect Your Development Environment to MLflow](/docs/latest/ml/getting-started/running-notebooks.md): Learn how to connect your development environment to MLflow, whether using OSS MLflow or a managed offering.
- [MLflow 3 Migration Guide](/docs/latest/ml/mlflow-3.md): Guide for migrating from MLflow 2.x to MLflow 3.x
- [MLflow Models](/docs/latest/ml/model.md): An MLflow Model is a standard format for packaging machine learning models that can be used in a variety of downstream tools.
- [Managing Dependencies in MLflow Models](/docs/latest/ml/model/dependencies.md): An MLflow Model is a standard format that packages a machine learning model with its dependencies and other metadata.
- [Models From Code](/docs/latest/ml/model/models-from-code.md): Models from Code is available in MLflow 2.12.2 and above. For earlier versions, use the legacy serialization methods outlined in the Custom Python Model documentation.
- [MLflow Signature Playground Notebook](/docs/latest/ml/model/notebooks/signature_examples.md): Download this notebook
- [MLflow PythonModel Guide](/docs/latest/ml/model/python_model.md): Introduction to MLflow PythonModel
- [Model Signatures and Input Examples](/docs/latest/ml/model/signatures.md): Model signatures and input examples are foundational components that define how your models should be used, ensuring consistent and reliable interactions across MLflow's ecosystem.
- [MLflow Model Registry](/docs/latest/ml/model-registry.md): The MLflow Model Registry is a centralized model store, set of APIs, and UI designed to collaboratively manage the full lifecycle of an MLflow Model.
- [Model Registry Tutorials](/docs/latest/ml/model-registry/tutorial.md): Explore the full functionality of the Model Registry in this tutorial, from registering a model and inspecting its structure to loading a specific model version for further use.
- [Model Registry Workflows](/docs/latest/ml/model-registry/workflow.md): This guide walks you through using the MLflow Model Registry via both the UI and API. Learn how to register models, manage versions, apply aliases and tags, and organize your models for deployment.
- [MLflow Plugins](/docs/latest/ml/plugins.md): MLflow's plugin architecture enables seamless integration with third-party tools and custom infrastructure. As a framework-agnostic platform, MLflow provides developer APIs for extending functionality across storage, authentication, execution backends, and model evaluation.
- [MLflow Projects](/docs/latest/ml/projects.md): MLflow Projects provide a standard format for packaging and sharing reproducible data science code. Based on simple conventions, Projects enable seamless collaboration and automated execution across different environments and platforms.
- [Search Experiments](/docs/latest/ml/search/search-experiments.md): mlflow.search_experiments() and MlflowClient.search_experiments()
- [Search Logged Models](/docs/latest/ml/search/search-models.md): This guide walks you through how to search for logged models in MLflow using both the MLflow UI and Python API. This resource will be valuable if you're interested in querying specific models based on their metrics, params, tags, or model metadata.
- [Search Runs](/docs/latest/ml/search/search-runs.md): This guide walks you through how to search your MLflow runs through the MLflow UI and Python API.
- [MLflow Tracking](/docs/latest/ml/tracking.md): MLflow Tracking is an API and UI for logging parameters, code versions, metrics, and output files when running your machine learning code.
- [Automatic Logging with MLflow Tracking](/docs/latest/ml/tracking/autolog.md): Auto logging is a powerful feature that allows you to log metrics, parameters, and models without the need for explicit log statements. All you need to do is call mlflow.autolog() before your training code.
- [MLflow Tracking Quickstart](/docs/latest/ml/tracking/quickstart.md): Need help setting up tracking? Try MLflow Assistant, a powerful AI assistant that can help you set up MLflow tracking for your project.
- [System Metrics](/docs/latest/ml/tracking/system-metrics.md): MLflow allows users to log system metrics including CPU stats, GPU stats, memory usage, network traffic, and disk usage.
- [MLflow Tracking APIs](/docs/latest/ml/tracking/tracking-api.md): MLflow Tracking provides comprehensive APIs across multiple programming languages to capture your machine learning experiments. Whether you prefer automatic instrumentation or granular control, MLflow adapts to your workflow.
- [Tracking Experiments with Local Database](/docs/latest/ml/tracking/tutorials/local-database.md): In this tutorial, you will learn how to use a local database to track your experiment metadata with MLflow.
- [Remote Experiment Tracking with MLflow Tracking Server](/docs/latest/ml/tracking/tutorials/remote-server.md): In this tutorial, you will learn how to set up an MLflow Tracking environment for team development using the MLflow Tracking Server.
- [MLflow for Traditional Machine Learning](/docs/latest/ml/traditional-ml.md): MLflow provides comprehensive experiment tracking, model management, and deployment capabilities for traditional machine learning workflows. From scikit-learn pipelines to gradient boosting models, MLflow streamlines your path from experimentation to production.
- [MLflow Prophet Integration](/docs/latest/ml/traditional-ml/prophet.md): Introduction
- [MLflow Scikit-learn Integration](/docs/latest/ml/traditional-ml/sklearn.md): Introduction
- [MLflow Spark MLlib Integration](/docs/latest/ml/traditional-ml/sparkml.md): Apache Spark MLlib provides distributed machine learning algorithms for processing large-scale datasets across clusters. MLflow integrates with Spark MLlib to track distributed ML pipelines, manage models, and enable flexible deployment from cluster training to standalone inference.
- [Building Custom Python Function Models with MLflow](/docs/latest/ml/traditional-ml/tutorials/creating-custom-pyfunc.md): MLflow offers a wide range of pre-defined model flavors, but there are instances where you'd want to go beyond them and build something custom.
- [Custom PyFuncs with MLflow - Notebooks](/docs/latest/ml/traditional-ml/tutorials/creating-custom-pyfunc/notebooks.md): If you would like to view the notebooks in this guide in their entirety, each notebook can be viewed or downloaded directly below.
- [Introduction to MLflow Custom Pyfunc](/docs/latest/ml/traditional-ml/tutorials/creating-custom-pyfunc/notebooks/basic-pyfunc.md): Download this notebook
- [Creating a Custom Model: "Add N" Model](/docs/latest/ml/traditional-ml/tutorials/creating-custom-pyfunc/notebooks/introduction.md): Download this notebook
- [Customizing a Model's predict method](/docs/latest/ml/traditional-ml/tutorials/creating-custom-pyfunc/notebooks/override-predict.md): Download this notebook
- [Models, Flavors, and PyFuncs in MLflow](/docs/latest/ml/traditional-ml/tutorials/creating-custom-pyfunc/part1-named-flavors.md): In the MLflow ecosystem, "flavors" play a pivotal role in model management. Essentially, a "flavor" is a designated wrapper for a specific machine learning library.
- [Understanding PyFunc in MLflow](/docs/latest/ml/traditional-ml/tutorials/creating-custom-pyfunc/part2-pyfunc-components.md): In the realm of MLflow, while named flavors offer specific functionalities tailored to popular frameworks, there are situations that call for a more generic approach.
- [Hyperparameter Tuning with MLflow and Optuna](/docs/latest/ml/traditional-ml/tutorials/hyperparameter-tuning.md): In this guide, we venture into a frequent use case of MLflow Tracking: hyperparameter tuning.
- [Hyperparameter tuning with MLflow and child runs - Notebooks](/docs/latest/ml/traditional-ml/tutorials/hyperparameter-tuning/notebooks.md): If you would like to view the notebooks in this guide in their entirety, each notebook can be either viewed or downloaded below.
- [MLflow with Optuna: Hyperparameter Optimization and Tracking](/docs/latest/ml/traditional-ml/tutorials/hyperparameter-tuning/notebooks/hyperparameter-tuning-with-child-runs.md): Download this notebook
- [Logging Visualizations with MLflow](/docs/latest/ml/traditional-ml/tutorials/hyperparameter-tuning/notebooks/logging-plots-in-mlflow.md): Download this notebook
- [Leveraging Child Runs in MLflow for Hyperparameter Tuning](/docs/latest/ml/traditional-ml/tutorials/hyperparameter-tuning/notebooks/parent-child-runs.md): Download this notebook
- [Understanding Parent and Child Runs in MLflow](/docs/latest/ml/traditional-ml/tutorials/hyperparameter-tuning/part1-child-runs.md): Introduction
- [Leveraging Visualizations and MLflow for In-depth Model Analysis](/docs/latest/ml/traditional-ml/tutorials/hyperparameter-tuning/part2-logging-plots.md): Introduction
- [Serving Multiple Models on a Single Endpoint with a Custom PyFunc Model](/docs/latest/ml/traditional-ml/tutorials/serving-multiple-models-with-pyfunc.md): This tutorial addresses a common scenario in machine learning: serving multiple models through a
- [Deploy an MLflow PyFunc model with Model Serving](/docs/latest/ml/traditional-ml/tutorials/serving-multiple-models-with-pyfunc/notebooks/MME_Tutorial.md): Download this notebook
- [MLflow XGBoost Integration](/docs/latest/ml/traditional-ml/xgboost.md): Introduction
- [Tutorials and Examples](/docs/latest/ml/tutorials-and-examples.md): Welcome to our Tutorials and Examples hub! Here you'll find a curated set of resources to help you get started and deepen your knowledge of MLflow. Whether you're fine-tuning hyperparameters, orchestrating complex workflows, or integrating MLflow into your training code, these examples will guide you step by step.
- [Webhooks](/docs/latest/ml/webhooks.md): This feature is still experimental and may change in future releases.
- [Search the documentation](/docs/latest/search.md)
- [Self-Hosting MLflow](/docs/latest/self-hosting.md): #### _The most vendor-neutral MLOps/LLMOps platform in the world._
- [Artifact Stores](/docs/latest/self-hosting/architecture/artifact-store.md): The artifact store is a core component in MLflow Tracking where MLflow stores (typically large) artifacts
- [Backend Stores](/docs/latest/self-hosting/architecture/backend-store.md): The backend store is a core component in MLflow that stores metadata for
- [Architecture Overview](/docs/latest/self-hosting/architecture/overview.md): MLflow's architecture is simple yet flexible. Whether you need local solo development or production-scale deployment, you can choose the right components and backend options to fit your needs.
- [MLflow Tracking Server](/docs/latest/self-hosting/architecture/tracking-server.md): The MLflow Tracking Server is a stand-alone HTTP server that serves multiple REST API endpoints for tracking runs and experiments.
- [Migrate from File Store to Database](/docs/latest/self-hosting/migrate-from-file-store.md): If you have existing data in a file-based backend (./mlruns), you can migrate it to a database using the built-in migration command.
- [How to Upgrade MLflow](/docs/latest/self-hosting/migration.md): MLflow evolves rapidly to provide new features and improve the framework. This document outlines the steps to upgrade self-hosted MLflow servers to the latest version.
- [Authentication with Username and Password](/docs/latest/self-hosting/security/basic-http-auth.md): MLflow supports basic HTTP authentication to enable access control over experiments, registered models, and scorers.
- [Custom Authentication](/docs/latest/self-hosting/security/custom.md): MLflow's authentication system is designed to be extensible. You can use custom authentication methods through plugins or pluggable functions.
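As a minimal sketch of the self-hosting setup these pages cover, a tracking server with a SQL backend store and a local artifact store can be started with the `mlflow server` CLI. The specific URIs and paths below (`mlflow.db`, `./mlartifacts`) are illustrative placeholders, not defaults mandated by MLflow:

```shell
# Start a self-hosted MLflow Tracking Server:
#   --backend-store-uri    where run/experiment metadata is stored (SQLite here)
#   --default-artifact-root where (typically large) artifacts are written
mlflow server \
  --backend-store-uri sqlite:///mlflow.db \
  --default-artifact-root ./mlartifacts \
  --host 0.0.0.0 \
  --port 5000

# Clients then point at the server, e.g.:
#   export MLFLOW_TRACKING_URI=http://<server-host>:5000
```

Consult the Backend Stores and Artifact Stores pages above for the supported database and storage options for each flag.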
- [Protect Your Tracking Server from Network Exposure](/docs/latest/self-hosting/security/network.md): MLflow 3.5.0+ includes security middleware to protect against DNS rebinding, CORS attacks, and clickjacking. These features are available with the default FastAPI-based tracking server (uvicorn).
- [SSO (Single Sign-On)](/docs/latest/self-hosting/security/sso.md): You can use SSO (Single Sign-On) to authenticate users to your MLflow instance, by installing a custom plugin or using a reverse proxy.
- [Troubleshooting & FAQs](/docs/latest/self-hosting/troubleshooting.md): This page aggregates common production issues for self-hosted MLflow deployments and how to resolve them.
- [Workspaces](/docs/latest/self-hosting/workspaces.md): Workspaces add an optional organizational layer and permission scheme for MLflow resources such as experiments, registered models, prompts, AI Gateway resources, and artifacts, letting teams share one deployment without running multiple servers.
- [Workspace Configuration](/docs/latest/self-hosting/workspaces/configuration.md): Reference for the core flags, env vars, and startup checks for workspace mode. For setup and end-to-end flows, see Getting Started. For provider details and artifact routing options, see Workspace Providers.
- [Getting Started with Workspaces](/docs/latest/self-hosting/workspaces/getting-started.md): This guide walks through setting up and using MLflow workspaces to organize teams and projects on a shared MLflow server.
- [Workspace Permissions](/docs/latest/self-hosting/workspaces/permissions.md): Workspace-scoped permissions provide convenient access control for all resources within a workspace when authentication is enabled.
- [Workspace Providers](/docs/latest/self-hosting/workspaces/workspace-providers.md): Workspace providers (also referred to as workspace stores) are pluggable backends that manage workspace metadata and determine which workspaces are visible to users. This architecture allows MLflow to integrate with existing platform constructs (for example, Kubernetes namespaces or identity providers) while providing a default SQL-backed implementation.