Budget Alerts & Limits

Budget policies let you control AI Gateway spending by setting a threshold (USD) over a recurring time window (daily, weekly, or monthly). When the threshold is exceeded, the policy takes one of two actions:

Alert — fires a webhook notification. Requests continue normally.
Reject — blocks subsequent requests with HTTP 429. The request that causes the spend to exceed the budget is allowed to complete; rejection applies to requests that arrive after the threshold has been exceeded.

Spend resets automatically at the start of each new time window. When workspaces are enabled, budget policies can be scoped to a specific workspace so that spend is tracked per workspace.
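
The Reject semantics described above can be sketched as a small in-memory model. This is only an illustration, not the gateway's implementation: the `BudgetWindow` class and `simulate` function are hypothetical names, and treating spend exactly equal to the limit as still within budget is an assumption.

```python
from dataclasses import dataclass


@dataclass
class BudgetWindow:
    """Hypothetical in-memory model of one REJECT policy window."""

    limit_usd: float
    spend_usd: float = 0.0

    def admit(self) -> bool:
        # Admit while accumulated spend has not exceeded the limit
        # (assumption: spend equal to the limit is still within budget).
        return self.spend_usd <= self.limit_usd

    def record(self, cost_usd: float) -> None:
        # Cost is added after the request completes, so the request that
        # crosses the threshold is still allowed to finish.
        self.spend_usd += cost_usd

    def reset(self) -> None:
        # Called at the start of each new time window.
        self.spend_usd = 0.0


def simulate(window: BudgetWindow, costs: list[float]) -> list[int]:
    """Return the HTTP status each request would receive (200 or 429)."""
    statuses = []
    for cost in costs:
        if window.admit():
            window.record(cost)
            statuses.append(200)
        else:
            statuses.append(429)
    return statuses
```

With a 10 USD limit and four requests costing 4 USD each, the sketch admits the first three (spend reaches 12 USD) and rejects the fourth; after `reset()` requests are admitted again.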

Managing Budget Policies

Navigate to AI Gateway > Budgets in the MLflow UI to view and manage your budget policies.

Creating a Budget Policy

Click Create budget policy to open the creation dialog. Specify the budget amount (USD), reset period, and the action to take when the threshold is exceeded.

Alert Webhooks

When an ALERT policy's threshold is exceeded, the gateway fires a webhook with details about the breach. The alert fires once per window; subsequent requests within the same window do not trigger additional webhooks.

Webhook endpoints can be configured directly from the Budgets page under the Budget alert webhooks section.

The webhook payload includes:

json
{
  "budget_policy_id": "bp-abc123",
  "budget_unit": "USD",
  "budget_amount": 500.0,
  "current_spend": 523.40,
  "duration_unit": "MONTHS",
  "duration_value": 1,
  "target_scope": "GLOBAL",
  "workspace": "default",
  "window_start": 1704067200000
}
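
A receiver might turn this payload into a short human-readable message. The sketch below is one possible approach: `summarize_alert` is a hypothetical helper, and reading `window_start` as a Unix timestamp in milliseconds is an assumption based on the sample value.

```python
import json
from datetime import datetime, timezone

# The sample payload from above, as the raw request body a receiver would see.
SAMPLE_BODY = """
{
  "budget_policy_id": "bp-abc123",
  "budget_unit": "USD",
  "budget_amount": 500.0,
  "current_spend": 523.40,
  "duration_unit": "MONTHS",
  "duration_value": 1,
  "target_scope": "GLOBAL",
  "workspace": "default",
  "window_start": 1704067200000
}
"""


def summarize_alert(raw_body: str) -> str:
    """Build a one-line summary from a budget alert webhook body."""
    payload = json.loads(raw_body)
    overage = payload["current_spend"] - payload["budget_amount"]
    # Assumption: window_start is a Unix timestamp in milliseconds.
    started = datetime.fromtimestamp(payload["window_start"] / 1000, tz=timezone.utc)
    return (
        f"Policy {payload['budget_policy_id']} is over budget by "
        f"{overage:.2f} {payload['budget_unit']} "
        f"(window started {started:%Y-%m-%d %H:%M} UTC)"
    )


if __name__ == "__main__":
    print(summarize_alert(SAMPLE_BODY))
```

Because the alert fires only once per window, a receiver like this would normally forward the message to a durable channel (chat, email, ticketing) rather than rely on a repeat notification.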

Reject Behavior

When a REJECT policy's threshold is exceeded, the gateway blocks all subsequent requests to AI Gateway endpoints with an HTTP 429 response:

text
HTTP/1.1 429 Too Many Requests

{
  "detail": "Budget limit exceeded for policy 'bp-abc123'. Limit: $500.00 USD per 1 month. Request rejected."
}
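
Client code may want to tell a budget rejection apart from other 429 responses, since retrying immediately is unlikely to help before the window resets. The following is a minimal sketch: `BudgetExceededError` and `check_gateway_response` are hypothetical names, and matching on the `detail` message prefix is an assumption based on the sample response above.

```python
import json


class BudgetExceededError(RuntimeError):
    """Hypothetical error for a request rejected by a REJECT budget policy."""


def check_gateway_response(status_code: int, body: str) -> None:
    """Raise BudgetExceededError for a budget-related 429; otherwise do nothing."""
    if status_code != 429:
        return
    try:
        detail = json.loads(body).get("detail", "")
    except (json.JSONDecodeError, AttributeError):
        # Body was not JSON, or was JSON but not an object.
        detail = ""
    # Assumption: budget rejections start with this prefix, as in the sample.
    # Any other 429 is left for the caller to handle.
    if isinstance(detail, str) and detail.startswith("Budget limit exceeded"):
        raise BudgetExceededError(detail)
```

A caller would pass the status code and raw body from whichever HTTP client it uses, and surface `BudgetExceededError` to the user instead of entering a retry loop.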

Time Windows

Budget windows are fixed intervals:

Daily — resets every day at midnight UTC
Weekly — resets every 7 days on Sundays
Monthly — resets on the 1st of each month

Accumulated spend resets to zero at the start of each new window.
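
To make the fixed-window behavior concrete, the sketch below computes the start of the window that contains a given instant. The `window_start` function is hypothetical, and it assumes that weekly and monthly windows also begin at midnight UTC; the text above states this only for daily windows.

```python
from datetime import datetime, timedelta, timezone


def window_start(now: datetime, period: str) -> datetime:
    """Return the start of the fixed window containing `now`.

    `now` must be timezone-aware. Weekly and monthly boundaries at
    midnight UTC are an assumption of this sketch.
    """
    now = now.astimezone(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if period == "daily":
        return midnight
    if period == "weekly":
        # weekday() is Monday=0 ... Sunday=6, so this is days since Sunday.
        return midnight - timedelta(days=(midnight.weekday() + 1) % 7)
    if period == "monthly":
        return midnight.replace(day=1)
    raise ValueError(f"unknown period: {period}")
```

For example, for Wednesday 2024-01-03 15:30 UTC the daily window starts on 2024-01-03, the weekly window on Sunday 2023-12-31, and the monthly window on 2024-01-01.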

Authorization

When authentication is enabled for the tracking server, only admin users can create, update, or delete budget policies.

Budget Tracker Strategies

The gateway uses a budget tracker to accumulate spend and evaluate budget policies on every request. Two strategies are available: local and redis.

Local

Pros:

No external dependencies — runs entirely in-process.
Lowest latency; no network round-trips per request.
Accumulated spend survives restarts via trace backfill on startup.

Cons:

Budget state is not shared across workers or replicas. Each process tracks spend independently, so the total across all workers can exceed the configured limit.
If trace backfill is disabled or unavailable, spend resets to zero on restart.

bash
# No extra environment variables required
mlflow server --host 0.0.0.0 --port 5000
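
The effect of unshared state can be illustrated with a small simulation. This is a rough sketch under stated assumptions, not a model of the gateway itself: the function name is hypothetical, requests are spread round-robin across workers, every request costs the same, and trace backfill is ignored.

```python
def total_spend_before_all_reject(
    limit_usd: float, workers: int, cost_per_request: float
) -> float:
    """Total spend admitted before every worker starts rejecting requests."""
    if workers < 1 or cost_per_request <= 0:
        raise ValueError("workers must be >= 1 and cost_per_request > 0")
    spend = [0.0] * workers  # each worker sees only its own accumulated spend
    total = 0.0
    i = 0
    while any(s <= limit_usd for s in spend):
        w = i % workers  # round-robin distribution (assumption)
        if spend[w] <= limit_usd:
            spend[w] += cost_per_request
            total += cost_per_request
        i += 1
    return total
```

With a 10 USD limit and 4 USD requests, a single worker admits 12 USD in total, while three independent workers admit 36 USD before all of them reject.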

Redis

Pros:

Budget state is shared across all gateway workers and replicas — the limit is enforced globally.
Atomic operations guarantee race-free window initialization and cost accumulation.

Cons:

Requires a running Redis instance reachable from every gateway process.
Adds a small per-request latency for the Redis round-trip.
Requires an additional dependency: pip install redis.

| Environment Variable | Default | Description |
| --- | --- | --- |
| `MLFLOW_GATEWAY_BUDGET_REDIS_URL` | `None` | Redis connection URL. Setting this variable activates the `redis` strategy. Examples: `redis://localhost:6379/0` (plain), `rediss://host:6380/0` (TLS), `redis://:password@host:6379/0` (auth). |

bash
export MLFLOW_GATEWAY_BUDGET_REDIS_URL=redis://localhost:6379/0
mlflow server --host 0.0.0.0 --port 5000
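
Before starting the server, a deployment script might log which strategy the environment selects without printing the password. The helper below is a hypothetical pre-flight check, not part of MLflow; treating an empty value the same as an unset variable is an assumption.

```python
import os
from urllib.parse import urlsplit


def describe_tracker(env=os.environ) -> dict:
    """Summarize the budget tracker strategy implied by the environment."""
    url = env.get("MLFLOW_GATEWAY_BUDGET_REDIS_URL")
    if not url:
        # Variable unset (or empty, by assumption): the local strategy applies.
        return {"strategy": "local"}
    parts = urlsplit(url)
    return {
        "strategy": "redis",
        "tls": parts.scheme == "rediss",  # rediss:// is the TLS form
        "host": parts.hostname,
        "port": parts.port,
        # Report only whether a password is present, never its value.
        "has_password": parts.password is not None,
    }


if __name__ == "__main__":
    print(describe_tracker())
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to exercise in tests.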

Policy Refresh Interval

Both strategies periodically re-fetch budget policies from the database. The interval is controlled by MLFLOW_GATEWAY_BUDGET_REFRESH_INTERVAL (default: 600 seconds). Decrease the value to pick up policy changes faster; increase it to reduce database load.

bash
export MLFLOW_GATEWAY_BUDGET_REFRESH_INTERVAL=30
mlflow server --host 0.0.0.0 --port 5000
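
The trade-off behind the refresh interval can be seen in a minimal time-based cache. This sketch is illustrative only: `PolicyCache` is a hypothetical class that refreshes lazily on access, whereas the gateway's actual refresh mechanism is not described here and may work differently.

```python
import time
from typing import Callable, List


class PolicyCache:
    """Hypothetical cache that re-fetches policies once the interval elapses."""

    def __init__(
        self,
        fetch: Callable[[], List[dict]],
        interval_seconds: float = 600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._interval = interval_seconds
        self._clock = clock
        self._policies: List[dict] = []
        self._fetched_at = None  # no fetch has happened yet

    def get(self) -> List[dict]:
        now = self._clock()
        if self._fetched_at is None or now - self._fetched_at >= self._interval:
            # One database read per interval; changes made in between
            # stay invisible until the next refresh.
            self._policies = self._fetch()
            self._fetched_at = now
        return self._policies
```

In this model a policy edit can take up to one full interval to become visible, which is why a shorter interval picks up changes faster at the cost of more frequent database reads.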
