An AI Gateway is a centralized proxy layer that routes requests to LLM providers through a single, unified API. It manages credentials, tracks usage, enforces governance policies, and provides complete observability across all LLM calls. As agents increasingly connect to external tools and data sources through MCP (Model Context Protocol) servers, AI Gateways also provide a centralized layer for securely managing access to those MCP servers and related tools.
AI Gateways give engineering teams centralized control over how their applications access LLMs. They route requests, manage credentials securely, track token costs, enforce governance policies, and maintain complete audit trails. As AI systems move from prototypes to production, gateways become essential for security, compliance, and cost control.
Unlike direct LLM API calls, which scatter credentials across your infrastructure and provide no visibility into usage patterns, an AI Gateway centralizes everything. It provides a single authentication point, automatic usage tracking, cost monitoring dashboards, traffic splitting for A/B testing, automatic fallback chains for reliability, and complete tracing integration so you can analyze every request in context.
AI systems, such as agents, LLM applications, and RAG systems, introduce unique operational challenges that direct API calls can't address:
Problem: API keys scattered across notebooks, CI environments, and developer machines create security risks and compliance headaches.
Solution: Centralize all credentials in the gateway. Applications authenticate to the gateway, never directly to LLM providers.
Problem: Token costs spiral out of control when teams have no visibility into who's using what models or how much they're spending.
Solution: Track usage and costs per endpoint, team, or project. Identify expensive queries and optimize spending.
Problem: Switching LLM providers requires code changes across every application that calls them.
Solution: Change provider configurations in the gateway without touching application code. A/B test models or set up automatic fallbacks.
Problem: Sensitive data and PII can leak to third-party APIs without centralized controls or audit trails.
Solution: Enforce PII redaction, content policies, and access controls at the gateway level. Maintain complete audit logs.
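One way to picture gateway-level redaction is as a filter applied to every request before it leaves your network. The following is a minimal sketch, not MLflow's implementation: the two regex patterns, the `[LABEL]` placeholder format, and the function names `redact` and `redact_messages` are assumptions chosen for illustration.

```python
import re

# Illustrative patterns only; production systems need broader, locale-aware detection.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text: str) -> str:
    """Replace each pattern match with a placeholder such as [EMAIL]."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


def redact_messages(messages: list[dict]) -> list[dict]:
    """Return a copy of chat messages with PII removed from string content."""
    return [
        {**m, "content": redact(m["content"])} if isinstance(m.get("content"), str) else m
        for m in messages
    ]
```

Because the filter sits in one place, every application that routes through the gateway gets the same policy without code changes of its own.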
An LLM Gateway routes requests to large language model providers like OpenAI, Anthropic, and Bedrock through a single, unified API. Instead of integrating with each provider's SDK separately, your application points to the gateway's OpenAI-compatible endpoint and specifies which model to use by name.
For LLM applications (chatbots, content generators, summarization tools), an LLM Gateway centralizes credential management so API keys never touch application code, tracks token usage and costs across all providers in one dashboard, enables traffic splitting for A/B testing different models, and provides automatic fallback chains when providers have outages.
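To make traffic splitting and fallback chains concrete, here is a small sketch of the two routing decisions a gateway might make. It is illustrative only and does not reflect how any particular gateway implements them: `pick_weighted`, `call_with_fallback`, and the idea of a target as a plain callable are hypothetical names introduced for this example.

```python
import random
from typing import Callable, Sequence

# A "target" is any callable that takes a prompt and returns a completion.
Target = Callable[[str], str]


def pick_weighted(targets: Sequence[tuple[str, float]], rng: random.Random) -> str:
    """Choose a target name with probability proportional to its weight."""
    names = [name for name, _ in targets]
    weights = [weight for _, weight in targets]
    return rng.choices(names, weights=weights, k=1)[0]


def call_with_fallback(prompt: str, chain: Sequence[tuple[str, Target]]) -> tuple[str, str]:
    """Try each target in order; return (name, completion) from the first success."""
    errors = []
    for name, target in chain:
        try:
            return name, target(prompt)
        except Exception as exc:  # a real gateway would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all targets failed: " + "; ".join(errors))
```

A 90/10 split for an A/B test would pass `[("model-a", 0.9), ("model-b", 0.1)]` to `pick_weighted`, while a fallback chain lists the primary provider first and backups after it.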
MLflow AI Gateway runs as part of your MLflow Tracking Server and exposes an OpenAI-compatible endpoint for any LLM provider. Configure endpoints in the MLflow UI, and your application code stays unchanged when switching providers or models.
As AI agents grow more capable, they increasingly connect to external tools and data sources through MCP (Model Context Protocol) servers. AI Gateway provides a centralized layer for securely managing that access — governing which MCP servers your agents can reach, tracking tool usage across sessions, and enforcing policies without modifying your agent code.
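Governing which MCP servers an agent can reach is, at its simplest, a deny-by-default allowlist check. The sketch below shows that idea in isolation; the `McpAccessPolicy` class, its methods, and the agent and server names are hypothetical and are not part of MLflow or the MCP specification.

```python
from dataclasses import dataclass, field


@dataclass
class McpAccessPolicy:
    """Maps an agent identity to the set of MCP server names it may call."""

    allowed: dict[str, set[str]] = field(default_factory=dict)

    def grant(self, agent: str, server: str) -> None:
        """Allow one agent to reach one MCP server."""
        self.allowed.setdefault(agent, set()).add(server)

    def is_allowed(self, agent: str, server: str) -> bool:
        # Deny by default: unknown agents and unlisted servers are rejected.
        return server in self.allowed.get(agent, set())
```

Keeping this check in the gateway rather than in each agent means a policy change takes effect everywhere at once.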
MLflow AI Gateway integrates natively with MLflow Tracing, so every request through the gateway — whether to an LLM provider or an MCP server — automatically becomes an MLflow trace. This gives you complete visibility into agent behavior, token costs, and tool usage without additional instrumentation.
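Once per-request token counts are available, attributing spend to a team is a simple aggregation. The sketch below assumes usage records have already been exported as plain dictionaries; the field names (`team`, `model`, `input_tokens`, `output_tokens`), the model names, and the prices are all made up for illustration and are not real provider rates or an MLflow schema.

```python
from collections import defaultdict

# Hypothetical prices in USD per 1,000 tokens: (input, output).
PRICE_PER_1K = {
    "model-a": (0.50, 1.50),
    "model-b": (0.10, 0.40),
}


def cost_by_team(records: list[dict]) -> dict[str, float]:
    """Sum estimated spend per team from per-request token counts."""
    totals: dict[str, float] = defaultdict(float)
    for r in records:
        price_in, price_out = PRICE_PER_1K[r["model"]]
        totals[r["team"]] += (
            r["input_tokens"] / 1000 * price_in + r["output_tokens"] / 1000 * price_out
        )
    return dict(totals)
```

The same grouping works for endpoints or projects by swapping the key used in `totals`.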
AI Gateway solves real-world problems across production AI systems:
A comprehensive AI Gateway platform combines seven capabilities:
Modern open source AI platforms like MLflow make it easy to deploy a production-grade AI Gateway with minimal setup. MLflow AI Gateway runs as part of the MLflow Tracking Server, so there's no separate infrastructure to deploy or maintain.
For a comprehensive setup guide, visit the MLflow AI Gateway quickstart documentation. Here's a quick overview to get started:
1. Install MLflow with GenAI support:
pip install 'mlflow[genai]'
2. Start the MLflow server:
mlflow server
3. Configure your first gateway endpoint in the MLflow UI:
Navigate to the AI Gateway tab in the MLflow UI, create a new endpoint, select your LLM provider (OpenAI, Anthropic, Bedrock, etc.), configure your API credentials, and save. The gateway is now ready to route requests.
Check out the MLflow AI Gateway documentation for detailed configuration options and advanced features like traffic splitting and fallback chains.
Once your gateway is configured, point your application to the gateway's base URL using the OpenAI SDK (or any OpenAI-compatible client). The gateway handles authentication, routes requests to the correct provider, and automatically captures traces for every request.
Example: Querying with OpenAI SDK
from openai import OpenAI

client = OpenAI(
    base_url="https://your-mlflow-server/gateway/mlflow/v1",
    api_key="",  # authentication handled by gateway
)

response = client.chat.completions.create(
    model="prod-gpt5",  # name of your gateway endpoint
    messages=[{"role": "user", "content": "Summarize this support ticket..."}],
)
Example: Querying with Anthropic Claude SDK
import anthropic

client = anthropic.Anthropic(
    base_url="https://your-mlflow-server/gateway/anthropic",
    api_key="dummy",  # authentication handled by gateway
)

response = client.messages.create(
    model="my-claude-endpoint",  # name of your gateway endpoint
    max_tokens=1024,
    messages=[{"role": "user", "content": "Summarize this support ticket..."}],
)
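Both SDK examples differ only in the path appended to the server address, so it can help to derive `base_url` from a single setting rather than hard-coding it in each application. This is a small sketch under stated assumptions: the two path suffixes are copied from the examples above, while the `gateway_base_url` helper and the `MLFLOW_GATEWAY_SERVER` environment variable are hypothetical conventions, not MLflow settings.

```python
import os
from typing import Optional

# Path suffixes as used in the SDK examples above.
GATEWAY_PATHS = {
    "openai": "/gateway/mlflow/v1",
    "anthropic": "/gateway/anthropic",
}


def gateway_base_url(sdk: str, server: Optional[str] = None) -> str:
    """Build the base_url for the given SDK flavor from one server address."""
    # MLFLOW_GATEWAY_SERVER is a made-up variable name used only in this sketch.
    server = server or os.environ.get("MLFLOW_GATEWAY_SERVER", "https://your-mlflow-server")
    return server.rstrip("/") + GATEWAY_PATHS[sdk]
```

The result can be passed directly as `base_url` when constructing either client, so moving to a different server only requires changing one value.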
MLflow is the largest open-source AI engineering platform, with over 30 million monthly downloads. Thousands of organizations use MLflow to debug, evaluate, monitor, and optimize production-quality AI agents and LLM applications while controlling costs and managing access to models and data. Backed by the Linux Foundation and licensed under Apache 2.0, MLflow provides a complete AI Gateway solution with no vendor lock-in.
When evaluating AI Gateway solutions, the most important decision is whether to use a standalone gateway or one integrated into an end-to-end AI platform. This choice has significant implications for your team's productivity, infrastructure complexity, and ability to debug and improve AI applications.
Standalone Gateways (LiteLLM, etc.): A standalone AI gateway solves one piece of the puzzle: it proxies your LLM calls and centralizes credentials. But in practice, routing requests is just the beginning. You still need to trace what happened inside your application after the LLM responded, evaluate whether the output was actually good, and tie cost and latency data back to specific features, prompts, or model versions. With a standalone gateway, that means integrating a separate observability tool, a separate evaluation framework, and building the glue code to connect them all to the same data. Every new tool in the stack is another thing to deploy, monitor, and keep in sync.
End-to-End Platform (MLflow): MLflow eliminates the integration tax. Because the AI Gateway, tracing, and evaluation all live in the same platform, you get automatic benefits that standalone gateways can't provide:
The alternative (stitching together a gateway, an observability platform, and an evaluation framework) creates data silos, duplicated configuration, and a fragile integration surface. MLflow's approach is to make the gateway a natural extension of the platform teams are already using for GenAI development, so that governance and observability come for free rather than as an afterthought.
When choosing an AI Gateway platform, the decision between open source and proprietary SaaS tools has significant long-term implications for your infrastructure, security posture, and costs.
Open Source (MLflow): With MLflow AI Gateway, you maintain complete control over your gateway infrastructure and routing policies. Deploy on your own infrastructure or use managed versions on Databricks or AWS. There are no per-request fees, no usage limits, and no vendor lock-in. Your API keys and request data stay under your control, and you can customize the gateway to your exact security and compliance requirements. MLflow integrates with any LLM provider through OpenTelemetry-compatible tracing.
Proprietary SaaS Gateways: Commercial AI Gateway platforms offer convenience but at the cost of flexibility and control. They typically charge per request or per seat, which can become expensive at scale. Your API keys and request data are sent to their servers, raising privacy and compliance concerns. You're locked into their ecosystem, making it difficult to switch providers or add custom functionality. Most proprietary gateways only support a subset of LLM providers.
Why Teams Choose Open Source: Organizations building production AI applications increasingly choose MLflow AI Gateway because it offers enterprise-grade routing and governance without compromising on data sovereignty, cost predictability, or flexibility. The Apache 2.0 license and Linux Foundation backing ensure MLflow remains truly open and community-driven, not controlled by a single vendor.