Architecture Overview

MLflow's architecture is simple yet flexible. Whether you are developing solo on a local machine or deploying at production scale, you can choose the components and backend options that fit your needs.

Core Components

MLflow SDK

MLflow provides client SDKs in multiple languages (Python, TypeScript, Java, R) with which users interact with the backend.

Backend Store

A database (or a file system that emulates one) that stores the metadata of experiments, runs, traces, etc. MLflow supports different databases through SQLAlchemy, including PostgreSQL, MySQL, SQLite, and Microsoft SQL Server (the postgresql, mysql, sqlite, and mssql dialects). See Backend Stores for more configurations of the backend store.
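As an illustration of the database URIs involved, here is a minimal Python sketch that assembles SQLAlchemy-style URIs of the form dialect://user:password@host:port/database. The helper names, host, and credentials are hypothetical placeholders and not part of MLflow; only the general URI shape is assumed from SQLAlchemy conventions.

```python
from urllib.parse import quote_plus


def server_db_uri(dialect, user, password, host, port, database):
    """Build a SQLAlchemy-style URI for a server database (hypothetical helper)."""
    # Percent-encode the credentials so characters such as "@" or "/"
    # cannot be mistaken for URI delimiters.
    return (
        f"{dialect}://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


def sqlite_db_uri(path):
    """Build a SQLite URI; three slashes precede a relative file path."""
    return f"sqlite:///{path}"


# Placeholder values for illustration only.
print(sqlite_db_uri("mlflow.db"))
print(server_db_uri("postgresql", "mlflow", "p@ss", "db.example.com", 5432, "mlflow"))
```

A string built this way can be supplied wherever MLflow expects a database URI for the backend store.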

Artifact Store

The artifact store persists (typically large) artifacts for each run, such as model weights (e.g., a pickled scikit-learn model), images (e.g., PNGs), and model and data files (e.g., a Parquet file). These files are too large to store in the tracking backend and are often accessed less frequently than metadata, which makes them better suited to cheap object storage such as Amazon S3. See Artifact Stores for supported storage options and low-level configurations.

Tracking Server

MLflow Tracking Server is a FastAPI server that serves REST APIs for accessing the backend and the artifact store, and hosts the MLflow UI. The server is optional for local development, since the MLflow SDK can interact directly with a local database and file system. However, it is essential for team development and provides features such as access control. Read Tracking Server for how to configure the server.

Common Setups

By configuring these components properly, you can create an MLflow Tracking environment suitable for your team's development workflow. The following diagram and table show a few common setups for the MLflow Tracking environment.

1. Localhost (default)

Scenario: Solo development

Use Case: By default, MLflow records metadata and artifacts for each run to a local directory, mlruns. This is the simplest way to get started with MLflow Tracking, without setting up any external server, database, or storage.

Setup: No additional setup (default).

2. Local Tracking with Local Database

Scenario: Solo development

Use Case: A database backend provides better performance and reliability than the default file backend. The MLflow client SDK interfaces with a SQLAlchemy-compatible database (e.g., SQLite, PostgreSQL, MySQL) to manage metadata, and stores artifacts on the local file system.

Setup: Set the tracking URI to the database URI (e.g., sqlite:///mlflow.db) with mlflow.set_tracking_uri or the MLFLOW_TRACKING_URI environment variable.

3. Remote Tracking with MLflow Tracking Server

Scenario: Team development

Use Case: MLflow Tracking Server serves as a proxy for remote access to the metadata and artifacts. This is particularly useful for team development scenarios where you want to store artifacts and experiment metadata in a shared location with proper access control.

Setup: Refer to the Docker Compose setup. Alternatively, you can use the managed MLflow service from popular cloud providers to avoid the maintenance overhead.
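From the client's point of view, the three setups above differ largely in the tracking URI. The following Python sketch shows one way to switch between them through the MLFLOW_TRACKING_URI environment variable. The helper name, the setup keys, and the server address http://localhost:5000 are assumptions made for illustration; the sketch uses only the standard library and does not call MLflow itself.

```python
import os

# Tracking URI for each setup. The database file name follows the example in
# the text; the server address is a hypothetical placeholder.
TRACKING_URIS = {
    "localhost": None,  # setup 1: leave unset so MLflow uses its default
    "local_database": "sqlite:///mlflow.db",  # setup 2
    "remote_server": "http://localhost:5000",  # setup 3 (assumed address)
}


def configure_tracking(setup, env=os.environ):
    """Set or clear MLFLOW_TRACKING_URI for the chosen setup (hypothetical helper)."""
    uri = TRACKING_URIS[setup]
    if uri is None:
        # Remove any earlier value so the default behavior applies again.
        env.pop("MLFLOW_TRACKING_URI", None)
    else:
        env["MLFLOW_TRACKING_URI"] = uri
    return uri


configure_tracking("local_database")
print(os.environ.get("MLFLOW_TRACKING_URI"))
```

Passing the returned URI to mlflow.set_tracking_uri in code is the alternative to the environment variable; in either case the URI should be configured before any runs are logged.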