mlflow.models
The mlflow.models module provides an API for saving machine learning models in
“flavors” that can be understood by different downstream tools.
The built-in flavors are:
For details, see MLflow Models.
- 
class mlflow.models.EvaluationArtifact(uri, content=None)[source]
- Bases: object - A model evaluation artifact containing an artifact URI and content. 
- 
class mlflow.models.EvaluationMetric(eval_fn, name, greater_is_better, long_name=None, version=None, metric_details=None, metric_metadata=None, genai_metric_args=None)[source]
- Bases: object - An evaluation metric. - Parameters
- eval_fn – A function that computes the metric with the following signature:

      def eval_fn(
          predictions: pandas.Series,
          targets: pandas.Series,
          metrics: Dict[str, MetricValue],
          **kwargs,
      ) -> Union[float, MetricValue]:
          """
          Args:
              predictions: A pandas Series containing the predictions made by the model.
              targets: (Optional) A pandas Series containing the corresponding labels
                  for the predictions made on that input.
              metrics: (Optional) A dictionary containing the metrics calculated by the
                  default evaluator. The keys are the names of the metrics and the values
                  are the metric values. To access the MetricValue for the metrics
                  calculated by the system, make sure to specify the type hint for this
                  parameter as Dict[str, MetricValue]. Refer to the DefaultEvaluator
                  behavior section for what metrics will be returned based on the type of
                  model (i.e. classifier or regressor).
              kwargs: Includes a list of args that are used to compute the metric. These
                  args could be information coming from input data, model outputs, other
                  metrics, or parameters specified in the `evaluator_config` argument of
                  the `mlflow.evaluate` API.

          Returns:
              MetricValue with per-row scores, per-row justifications, and aggregate
              results.
          """
          ...

- name – The name of the metric. 
- greater_is_better – Whether a higher value of the metric is better. 
- long_name – (Optional) The long name of the metric. For example, "root_mean_squared_error" for "mse".
- version – (Optional) The metric version. For example, v1.
- metric_details – (Optional) A description of the metric and how it is calculated. 
- metric_metadata – (Optional) A dictionary containing metadata for the metric. 
- genai_metric_args – (Optional) A dictionary containing arguments specified by users when calling make_genai_metric or make_genai_metric_from_prompt. Those args are persisted so that we can deserialize the same metric object later. 
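A minimal sketch of a function that matches the eval_fn signature described above. The metric itself (the fraction of predictions within a tolerance of the target) and the names within_tolerance and tolerance are illustrative assumptions, not part of MLflow. The function returns a plain float, which the signature permits, and it accepts any equal-length sequences, including pandas Series.

```python
def within_tolerance(predictions, targets, metrics=None, **kwargs):
    """Fraction of predictions whose absolute error is at most `tolerance`."""
    # `tolerance` is a hypothetical keyword argument used only in this sketch.
    tolerance = kwargs.get("tolerance", 0.5)
    pairs = list(zip(predictions, targets))
    if not pairs:
        # No rows to score: report 0.0 rather than dividing by zero.
        return 0.0
    hits = sum(1 for p, t in pairs if abs(p - t) <= tolerance)
    return hits / len(pairs)


score = within_tolerance([1.0, 2.0, 3.0, 4.0], [1.2, 2.9, 3.0, 4.4])
print(score)
```

Because a higher fraction is better here, such a function would be paired with greater_is_better=True.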
 
 
- 
class mlflow.models.EvaluationResult(metrics, artifacts, run_id=None)[source]
- Bases: object - Represents the model evaluation outputs of an mlflow.evaluate() API call, containing both scalar metrics and output artifacts such as performance plots. - 
property artifacts
- A dictionary mapping standardized artifact names (e.g. “roc_data”) to artifact content and location information 
 - 
classmethod load(path)[source]
- Load the evaluation results from the specified local filesystem path 
 - 
save(path)[source]
- Write the evaluation results to the specified local filesystem path 
 
- 
class mlflow.models.FlavorBackend(config, **kwargs)[source]
- Bases: object - Abstract class for Flavor Backend. This class defines the API interface for local model deployment of MLflow model flavors. - 
abstract build_image(model_uri, image_name, install_mlflow, mlflow_home, enable_mlserver, base_image=None)[source]
 - 
can_build_image()[source]
- Returns
- True if this flavor has a build_image method defined for building a docker container capable of serving the model, False otherwise. 
 
 - 
abstract can_score_model()[source]
- Check whether this flavor backend can be deployed in the current environment. - Returns
- True if this flavor backend can be applied in the current environment. 
 
 - 
abstract generate_dockerfile(model_uri, output_path, install_mlflow, mlflow_home, enable_mlserver, base_image=None)[source]
 - 
abstract predict(model_uri, input_path, output_path, content_type)[source]
- Generate predictions using a saved MLflow model referenced by the given URI. Input and output are read from and written to a file or stdin / stdout. - Parameters
- model_uri – URI pointing to the MLflow model to be used for scoring. 
- input_path – Path to the file with input data. If not specified, data is read from stdin. 
- output_path – Path to the file with output predictions. If not specified, data is written to stdout. 
- content_type – Specifies the input format. Can be one of {json, csv}.
 
 
 - 
prepare_env(model_uri, capture_output=False)[source]
- Performs any preparation necessary to predict or serve the model, for example downloading dependencies or initializing a conda environment. After preparation, calling predict or serve should be fast. 
 - 
abstract serve(model_uri, port, host, timeout, enable_mlserver, synchronous=True, stdout=None, stderr=None)[source]
- Serve the specified MLflow model locally. - Parameters
- model_uri – URI pointing to the MLflow model to be used for scoring. 
- port – Port to use for the model deployment. 
- host – Host to use for the model deployment. Defaults to localhost.
- timeout – Timeout in seconds to serve a request. Defaults to 60. 
- enable_mlserver – Whether to use MLServer or the local scoring server. 
- synchronous – If True, wait until the server process exits and return 0; if the process exits with a non-zero return code, raise an exception. If False, return the server process Popen instance immediately. 
- stdout – Redirect server stdout. 
- stderr – Redirect server stderr. 
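The synchronous behavior described above can be pictured with a plain subprocess sketch. This is not MLflow's implementation: run_server is a hypothetical helper, the exception type is an assumption, and a short-lived Python process stands in for a model server.

```python
import subprocess
import sys


def run_server(command, synchronous=True, stdout=None, stderr=None):
    """Start `command`; block and check the exit code only when synchronous."""
    proc = subprocess.Popen(command, stdout=stdout, stderr=stderr)
    if not synchronous:
        # The caller owns the process and must wait on or terminate it.
        return proc
    return_code = proc.wait()
    if return_code != 0:
        raise RuntimeError(f"Server exited with return code {return_code}")
    return 0


# Synchronous: returns 0 once the process has exited cleanly.
ok = run_server([sys.executable, "-c", "pass"])

# Asynchronous: returns the Popen handle immediately.
handle = run_server([sys.executable, "-c", "pass"], synchronous=False)
handle_code = handle.wait()
```

Returning the Popen handle in the non-blocking case lets the caller decide when to stop the server, which is useful in tests that start a server, send requests, and then terminate it.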
 
 
 
- 
class mlflow.models.MetricThreshold(threshold=None, min_absolute_change=None, min_relative_change=None, greater_is_better=None, higher_is_better=None)[source]
- Bases: object - This class allows you to define metric thresholds for model validation. Allowed thresholds are: threshold, min_absolute_change, min_relative_change. - Parameters
- threshold – - (Optional) A number representing the value threshold for the metric. - If higher is better for the metric, the metric value has to be >= threshold to pass validation. 
- Otherwise, the metric value has to be <= threshold to pass the validation. 
 
- min_absolute_change – - (Optional) A positive number representing the minimum absolute change required for candidate model to pass validation with the baseline model. - If higher is better for the metric, metric value has to be >= baseline model metric value + min_absolute_change to pass the validation. 
- Otherwise, metric value has to be <= baseline model metric value - min_absolute_change to pass the validation. 
 
- min_relative_change – - (Optional) A floating point number between 0 and 1 representing the minimum relative change (in percentage of baseline model metric value) for candidate model to pass the comparison with the baseline model. - If higher is better for the metric, metric value has to be >= baseline model metric value * (1 + min_relative_change) 
- Otherwise, metric value has to be <= baseline model metric value * (1 - min_relative_change) 
- Note that if the baseline model metric value is equal to 0, the threshold falls back to a simple check that the candidate metric value is better than the baseline metric value, i.e. metric value >= baseline model metric value + 1e-10 if higher is better; metric value <= baseline model metric value - 1e-10 if lower is better. 
 
- greater_is_better – A required boolean representing whether higher value is better for the metric. 
- higher_is_better – Deprecated since version 2.3.0: Use greater_is_better instead. - A required boolean representing whether a higher value is better for the metric. 
 
 - 
property greater_is_better
- Boolean value representing whether higher value is better for the metric. 
 - 
property higher_is_better
- Warning - mlflow.models.evaluation.validation.MetricThreshold.higher_is_better is deprecated and will be removed in a future release. Use greater_is_better instead. - Boolean value representing whether a higher value is better for the metric. 
 - 
property min_absolute_change
- Value of the minimum absolute change required to pass model comparison with baseline model. 
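The threshold rules described for MetricThreshold can be sketched as a single plain-Python check. This is an illustration of the documented inequalities, not MLflow's validation code; passes_validation is a hypothetical helper, and it assumes a baseline value is supplied whenever a change-based threshold is set.

```python
def passes_validation(
    candidate,
    baseline=None,
    *,
    greater_is_better,
    threshold=None,
    min_absolute_change=None,
    min_relative_change=None,
):
    """Return True if `candidate` satisfies every threshold that is set."""
    sign = 1 if greater_is_better else -1

    def at_least_as_good(value, bound):
        # ">=" when higher is better, "<=" when lower is better.
        return value >= bound if greater_is_better else value <= bound

    if threshold is not None and not at_least_as_good(candidate, threshold):
        return False
    if min_absolute_change is not None:
        bound = baseline + sign * min_absolute_change
        if not at_least_as_good(candidate, bound):
            return False
    if min_relative_change is not None:
        if baseline == 0:
            # Zero baseline: only require beating it by a tiny margin.
            bound = baseline + sign * 1e-10
        else:
            bound = baseline * (1 + sign * min_relative_change)
        if not at_least_as_good(candidate, bound):
            return False
    return True


# Accuracy-style metric: 0.92 clears an absolute threshold of 0.9.
print(passes_validation(0.92, greater_is_better=True, threshold=0.9))
# Loss-style metric: 0.45 is at least 5% below the baseline of 0.5.
print(passes_validation(0.45, 0.5, greater_is_better=False, min_relative_change=0.05))
```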
 
- 
class mlflow.models.Model(artifact_path=None, run_id=None, utc_time_created=None, flavors=None, signature=None, saved_input_example_info: Optional[dict] = None, model_uuid: Optional[Union[str, Callable]] = <function Model.<lambda>>, mlflow_version: Optional[str] = '3.0.0rc0', metadata: Optional[dict] = None, model_size_bytes: Optional[int] = None, resources: Optional[Union[str, list]] = None, env_vars: Optional[list] = None, auth_policy: Optional[mlflow.models.auth_policy.AuthPolicy] = None, model_id: Optional[str] = None, prompts: Optional[list] = None, **kwargs)[source]
- Bases: object - An MLflow Model that can support multiple model flavors. Provides APIs for implementing new Model flavors. - 
add_flavor(name, **params) → mlflow.models.model.Model[source]
- Add an entry for how to serve the model in a given format. 
 - 
property auth_policy
- An optional dictionary that contains the auth policy required to serve the model. - Getter
- Retrieves the auth_policy required to serve the model 
- Setter
- Sets the auth_policy required to serve the model 
- Type
- Dict[str, dict] 
 - Note - Experimental: This property may change or be removed in a future release without warning. 
 - 
classmethod from_dict(model_dict) → mlflow.models.model.Model[source]
- Load a model from a dictionary, such as the parsed contents of its YAML representation. 
 - 
get_input_schema()[source]
- Retrieves the input schema of the Model iff the model was saved with a schema definition. 
 - 
get_model_info(logged_model: Optional[LoggedModel] = None) → mlflow.models.model.ModelInfo[source]
- Create a ModelInfo instance that contains the model metadata.
 - 
get_output_schema()[source]
- Retrieves the output schema of the Model iff the model was saved with a schema definition. 
 - 
get_params_schema()[source]
- Retrieves the parameters schema of the Model iff the model was saved with a schema definition. 
 - 
get_serving_input(path: str) → Optional[str][source]
- Load serving input example from a model directory. Returns None if there is no serving input example. - Parameters
- path – Path to the model directory. 
- Returns
- Serving input example or None if the model has no serving input example. 
 
 - 
classmethod load(path) → mlflow.models.model.Model[source]
- Load a model from its YAML representation. - Parameters
- path – A local filesystem path or URI referring to the MLmodel YAML file representation of the Model object or to the directory containing the MLmodel YAML file representation. 
- Returns
- An instance of Model. 
 
 - 
load_input_example(path: Optional[str] = None) → Optional[str][source]
- Load the input example saved along with a model. Returns None if there is no example metadata (i.e. the model was saved without an example). Raises FileNotFoundError if there is model metadata but the example file is missing. - Parameters
- path – Model or run URI, or path to the model directory, e.g. models:/<model_name>/<model_version>, runs:/<run_id>/<artifact_path> or /path/to/model 
- Returns
- Input example (NumPy ndarray, SciPy csc_matrix, SciPy csr_matrix, pandas DataFrame, dict) or None if the model has no example. 
 
 - 
load_input_example_params(path: str)[source]
- Load the params of the input example saved along with a model. Returns None if there are no params in the input_example. - Parameters
- path – Path to the model directory. 
- Returns
- params (dict) or None if the model has no params. 
 
 - 
classmethod log(artifact_path, flavor, registered_model_name=None, await_registration_for=300, metadata=None, run_id=None, resources=None, auth_policy=None, prompts=None, name: Optional[str] = None, model_type: Optional[str] = None, params: Optional[dict] = None, tags: Optional[dict] = None, step: int = 0, model_id: Optional[str] = None, **kwargs) → mlflow.models.model.ModelInfo[source]
- Log model using supplied flavor module. If no run is active, this method will create a new active run. - Parameters
- artifact_path – Deprecated. Use name instead. 
- flavor – Flavor module to save the model with. The module must have the save_model function that will persist the model as a valid MLflow model.
- registered_model_name – If given, create a model version under registered_model_name, also creating a registered model if one with the given name does not exist.
- await_registration_for – Number of seconds to wait for the model version to finish being created and reach READY status. By default, the function waits for five minutes. Specify 0 or None to skip waiting.
- metadata – {{ metadata }} 
- run_id – The run ID to associate with this model. 
- resources – {{ resources }} 
- auth_policy – {{ auth_policy }} 
- prompts – {{ prompts }} 
- name – The name of the model. 
- model_type – {{ model_type }} 
- params – {{ params }} 
- tags – {{ tags }} 
- step – {{ step }} 
- model_id – {{ model_id }} 
- kwargs – Extra args passed to the model flavor. 
 
- Returns
- A ModelInfo instance that contains the metadata of the logged model.
 
 - 
property metadata
- Custom metadata dictionary passed to the model and stored in the MLmodel file. - Getter
- Retrieves custom metadata that have been applied to a model instance. 
- Setter
- Sets a dictionary of custom keys and values to be included with the model instance 
- Type
- Optional[Dict[str, Any]] 
- Returns
- A Dictionary of user-defined metadata iff defined. 
 - 
      # Create and log a model with metadata to the Model Registry
      from sklearn import datasets
      from sklearn.ensemble import RandomForestClassifier
      import mlflow
      from mlflow.models import infer_signature

      with mlflow.start_run():
          iris = datasets.load_iris()
          clf = RandomForestClassifier()
          clf.fit(iris.data, iris.target)
          signature = infer_signature(iris.data, iris.target)
          mlflow.sklearn.log_model(
              clf,
              "iris_rf",
              signature=signature,
              registered_model_name="model-with-metadata",
              metadata={"metadata_key": "metadata_value"},
          )

      # model uri for the above model
      model_uri = "models:/model-with-metadata/1"

      # Load the model and access the custom metadata
      model = mlflow.pyfunc.load_model(model_uri=model_uri)
      assert model.metadata.metadata["metadata_key"] == "metadata_value"

 - Note - Experimental: This property may change or be removed in a future release without warning. 
 - 
property model_size_bytes
- An optional integer that represents the model size in bytes - Getter
- Retrieves the model size if it’s calculated when the model is saved 
- Setter
- Sets the model size to a model instance 
- Type
- Optional[int] 
 
 - 
property resources
- An optional dictionary that contains the resources required to serve the model. - Getter
- Retrieves the resources required to serve the model 
- Setter
- Sets the resources required to serve the model 
- Type
- Dict[str, Dict[ResourceType, List[Dict]]] 
 - Note - Experimental: This property may change or be removed in a future release without warning. 
 - 
save(path) → None[source]
- Write the model as a local YAML file. 
 - 
property saved_input_example_info
- A dictionary that contains the metadata of the saved input example, e.g., {"artifact_path": "input_example.json", "type": "dataframe", "pandas_orient": "split"}.
 - 
property signature
- An optional definition of the expected inputs to and outputs from a model object, defined with both field names and data types. Signatures support both column-based and tensor-based inputs and outputs. - Getter
- Retrieves the signature of a model instance iff the model was saved with a signature definition. 
- Setter
- Sets a signature to a model instance. 
- Type
- Optional[ModelSignature] 
 
 - 
to_dict() → dict[source]
- Serialize the model to a dictionary. 
 - 
to_json() → str[source]
- Serialize the model to a JSON string. 
 - 
to_yaml(stream=None) → str[source]
- Write the model as a YAML string. 
 
- 
class mlflow.models.ModelConfig(*, development_config: Optional[Union[str, dict]] = None)[source]
- Bases: object - ModelConfig is used in code to read a YAML configuration file or a dictionary. - Parameters
- development_config – Path to the YAML configuration file or a dictionary containing the configuration. If the configuration is not provided, an error is raised. 

      from mlflow.models import ModelConfig

      # Load the configuration from a dictionary
      config = ModelConfig(development_config={"key1": "value1"})
      print(config.get("key1"))

      from mlflow.models import ModelConfig

      # Load the configuration from a file
      config = ModelConfig(development_config="config.yaml")
      print(config.get("key1"))

 - When ModelConfig is invoked locally in a model file, development_config can be passed in and is used as the configuration for the model.

      import mlflow
      from mlflow.models import ModelConfig

      config = ModelConfig(development_config={"key1": "value1"})


      class TestModel(mlflow.pyfunc.PythonModel):
          def predict(self, context, model_input, params=None):
              return config.get("key1")


      mlflow.models.set_model(TestModel())

 - This development_config is overridden when the model is logged: the model_config passed at logging time takes its place. If no model_config is passed when logging the model, an error is raised when the model is loaded and ModelConfig is used. Note: development_config is not used when logging the model.

      model_config = {"key1": "value2"}
      with mlflow.start_run():
          model_info = mlflow.pyfunc.log_model(
              artifact_path="model", python_model="agent.py", model_config=model_config
          )

      loaded_model = mlflow.pyfunc.load_model(model_info.model_uri)

      # This will print "value2" as the model_config passed in while logging the model
      print(loaded_model.predict(None))

 - 
get(key)[source]
- Gets the value of a top-level parameter in the configuration. 
 - 
to_dict()[source]
- Returns the configuration as a dictionary. 
 
- 
class mlflow.models.ModelSignature(inputs: Optional[Union[mlflow.types.schema.Schema, dataclasses.dataclass]] = None, outputs: Optional[Union[mlflow.types.schema.Schema, dataclasses.dataclass]] = None, params: Optional[mlflow.types.schema.ParamSchema] = None)[source]
- Bases: object - ModelSignature specifies the schema of a model’s inputs, outputs and params. - ModelSignature can be inferred from a training dataset, model predictions, and params for inference, or constructed by hand by passing an input and output Schema and a params ParamSchema. - 
classmethod from_dict(signature_dict: dict)[source]
- Deserialize from dictionary representation. - Parameters
- signature_dict – Dictionary representation of model signature. Expected dictionary format: {‘inputs’: <json string>, ‘outputs’: <json string>, ‘params’: <json string>} 
- Returns
- ModelSignature populated with the data from the dictionary. 
 
 - 
to_dict() → dict[source]
- Serialize into a ‘jsonable’ dictionary. - Input and output schema are represented as json strings. This is so that the representation is compact when embedded in an MLmodel yaml file. - Returns
- dictionary representation with input and output schema represented as json strings. 
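The "schemas as JSON strings" layout described above can be illustrated without MLflow. The column specs below, including the "name" and "type" fields, are assumptions made for the sketch; the exact fields MLflow writes may differ by version.

```python
import json

# Hypothetical column specs; the "name"/"type" layout is an assumption.
input_columns = [
    {"name": "sepal_length", "type": "double"},
    {"name": "sepal_width", "type": "double"},
]
output_columns = [{"type": "long"}]

# Each schema is stored as a JSON *string*, not as a nested structure,
# which keeps the entry compact when embedded in a YAML file.
signature_dict = {
    "inputs": json.dumps(input_columns),
    "outputs": json.dumps(output_columns),
    "params": None,
}

# Reading a schema back requires decoding its string individually.
decoded_inputs = json.loads(signature_dict["inputs"])
print([column["name"] for column in decoded_inputs])
```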
 
 
- 
class mlflow.models.Resource[source]
- Bases: abc.ABC - Base class for defining the resources needed to serve a model. - Parameters
- type (ResourceType) – The resource type. 
- target_uri (str) – The target URI where these resources are hosted. 
 
 - 
abstract classmethod from_dict(data: dict)[source]
- Convert the dictionary to a Resource. Subclasses must implement this method. 
 - 
abstract property target_uri
- The target URI where the resource is hosted (must be defined by subclasses). 
 - 
abstract to_dict()[source]
- Convert the resource to a dictionary. Subclasses must implement this method. 
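The subclassing contract above (an abstract target_uri property plus to_dict and from_dict) can be sketched with a stand-in base class. ResourceSketch and HttpEndpoint are hypothetical names, and the dictionary layout is an assumption; neither is taken from MLflow.

```python
from abc import ABC, abstractmethod


class ResourceSketch(ABC):
    """Stand-in with the same abstract members as the documented base class."""

    @property
    @abstractmethod
    def target_uri(self) -> str:
        ...

    @abstractmethod
    def to_dict(self) -> dict:
        ...

    @classmethod
    @abstractmethod
    def from_dict(cls, data: dict):
        ...


class HttpEndpoint(ResourceSketch):
    """Hypothetical resource identified by a name and hosted at a URI."""

    def __init__(self, name: str, uri: str):
        self.name = name
        self._uri = uri

    @property
    def target_uri(self) -> str:
        return self._uri

    def to_dict(self) -> dict:
        return {"name": self.name, "target_uri": self._uri}

    @classmethod
    def from_dict(cls, data: dict):
        return cls(name=data["name"], uri=data["target_uri"])


endpoint = HttpEndpoint("scorer", "https://example.invalid/score")
restored = HttpEndpoint.from_dict(endpoint.to_dict())
print(restored.target_uri)
```

Keeping to_dict and from_dict symmetric means a resource can be written to a model's metadata and rebuilt later without losing information.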
 
- 
class mlflow.models.ResourceType(value)[source]
- Bases: enum.Enum - Enum to define the different types of resources needed to serve a model. 
- 
mlflow.models.add_libraries_to_model(model_uri, run_id=None, registered_model_name=None)[source]
- Note - Experimental: This function may change or be removed in a future release without warning. - Given a registered model_uri (e.g. models:/<model_name>/<model_version>), this utility re-logs the model along with all the required model libraries back to the Model Registry. The required model libraries are stored along with the model as model artifacts. In addition, supporting files to the model (e.g. conda.yaml, requirements.txt) are modified to use the added libraries. - By default, this utility creates a new model version under the same registered model specified by model_uri. This behavior can be overridden by specifying the registered_model_name argument. - Parameters
- model_uri – A registered model uri in the Model Registry of the form models:/<model_name>/<model_version/stage/latest> 
- run_id – The ID of the run to which the model with libraries is logged. If None, the model with libraries is logged to the source run corresponding to the model version specified by model_uri; if the model version does not have a source run, a new run is created.
- registered_model_name – The new model version (model with its libraries) is registered under the given registered_model_name. If None, a new version is logged to the existing model in the Model Registry. 
 
 - Note - This utility only operates on a model that has been registered to the Model Registry.
 - Note - The libraries are only compatible with the platform on which they are added. Cross platform libraries are not supported.

      # Create and log a model to the Model Registry
      import pandas as pd
      from sklearn import datasets
      from sklearn.ensemble import RandomForestClassifier
      import mlflow
      import mlflow.sklearn
      from mlflow.models import infer_signature

      with mlflow.start_run():
          iris = datasets.load_iris()
          iris_train = pd.DataFrame(iris.data, columns=iris.feature_names)
          clf = RandomForestClassifier(max_depth=7, random_state=0)
          clf.fit(iris_train, iris.target)
          signature = infer_signature(iris_train, clf.predict(iris_train))
          mlflow.sklearn.log_model(
              clf, "iris_rf", signature=signature, registered_model_name="model-with-libs"
          )

      # model uri for the above model
      model_uri = "models:/model-with-libs/1"

      # Import utility
      from mlflow.models.utils import add_libraries_to_model

      # Log libraries to the original run of the model
      add_libraries_to_model(model_uri)

      # Log libraries to some run_id
      existing_run_id = "21df94e6bdef4631a9d9cb56f211767f"
      add_libraries_to_model(model_uri, run_id=existing_run_id)

      # Log libraries to a new run
      with mlflow.start_run():
          add_libraries_to_model(model_uri)

      # Log libraries to a new registered model named 'new-model'
      with mlflow.start_run():
          add_libraries_to_model(model_uri, registered_model_name="new-model")

- 
mlflow.models.build_docker(model_uri=None, name='mlflow-pyfunc', env_manager='virtualenv', mlflow_home=None, install_java=False, install_mlflow=False, enable_mlserver=False, base_image=None)[source]
- Builds a Docker image whose default entrypoint serves an MLflow model at port 8080, using the python_function flavor. The container serves the model referenced by model_uri, if specified. If model_uri is not specified, an MLflow Model directory must be mounted as a volume into the /opt/ml/model directory in the container.
- Important - Since MLflow 2.10.1, the Docker image built with --model-uri does not install Java for improved performance, unless the model flavor is one of ["johnsnowlabs", "h2o", "mleap", "spark"]. If you need to install Java for other flavors, e.g. a custom Python model that uses SparkML, specify install_java=True to enforce Java installation. For earlier versions, Java is always installed to the image.
- Warning - If model_uri is unspecified, the resulting image doesn’t support serving models with the RFunc or Java MLeap model servers.
- NB: by default, the container will start nginx and gunicorn processes. If you don’t need the nginx process to be started (for instance if you deploy your container to Google Cloud Run), you can disable it via the DISABLE_NGINX environment variable:

      docker run -p 5001:8080 -e DISABLE_NGINX=true "my-image-name"

- See https://www.mlflow.org/docs/latest/python_api/mlflow.pyfunc.html for more information on the ‘python_function’ flavor. - Parameters
- model_uri – URI to the model. A local path, a ‘runs:/’ URI, or a remote storage URI (e.g., an ‘s3://’ URI). For more information about supported remote URIs for model artifacts, see https://mlflow.org/docs/latest/tracking.html#artifact-stores 
- name – Name of the Docker image to build. Defaults to ‘mlflow-pyfunc’. 
- env_manager – If specified, create an environment for MLmodel using the specified environment manager. The following values are supported: (1) virtualenv (default): use virtualenv and pyenv for Python version management (2) conda: use conda (3) local: use the local environment without creating a new one. 
- mlflow_home – Path to local clone of MLflow project. Use for development only. 
- install_java – If specified, install Java in the image. Default is False in order to reduce both the image size and the build time. Model flavors requiring Java will enable this setting automatically, such as the Spark flavor. (This argument is only available in MLflow 2.10.1 and later. In earlier versions, Java is always installed to the image.) 
- install_mlflow – If specified and there is a conda or virtualenv environment to be activated mlflow will be installed into the environment after it has been activated. The version of installed mlflow will be the same as the one used to invoke this command. 
- enable_mlserver – If specified, the image will be built with the Seldon MLServer as backend. 
- base_image – Base image for the Docker image. If not specified, the default image is either UBUNTU_BASE_IMAGE = “ubuntu:20.04” or PYTHON_SLIM_BASE_IMAGE = “python:{version}-slim”. Note: If a custom image is used, there is no guarantee that the image will work. You may find greater compatibility by building your image on top of the Ubuntu images. In addition, you must install Java and virtualenv for the image to work properly. 
 
 
- 
mlflow.models.convert_input_example_to_serving_input(input_example) → Optional[str][source]
- Note - Experimental: This function may change or be removed in a future release without warning. - Helper function to convert a model’s input example to a serving input example that can be used for model inference in the scoring server. - Parameters
- input_example – model input example. Supported types are pandas.DataFrame, numpy.ndarray, dictionary of (name -> numpy.ndarray), list, scalars and dicts with json serializable values. 
- Returns
- serving input example as a json string 
 
- 
mlflow.models.evaluate(model=None, data=None, *, model_type=None, targets=None, predictions=None, dataset_path=None, feature_names=None, evaluators=None, evaluator_config=None, custom_metrics=None, extra_metrics=None, custom_artifacts=None, validation_thresholds=None, baseline_model=None, env_manager='local', model_config=None, baseline_config=None, inference_params=None)[source]
- Evaluate the model performance on given data and selected metrics. - This function evaluates a PyFunc model or custom callable on the specified dataset using the specified evaluators, and logs the resulting metrics and artifacts to the MLflow tracking server. Users can also skip setting model and put the model outputs in data directly for evaluation. For detailed information, please read the Model Evaluation documentation. - Default Evaluator behavior:
- The default evaluator, which can be invoked with evaluators="default" or evaluators=None, supports the model types listed below. For each pre-defined model type, the default evaluator evaluates your model on a selected set of metrics and generates artifacts like plots. Please find more details below.
- For both the "regressor" and "classifier" model types, the default evaluator generates model summary plots and feature importance plots using SHAP.
- For regressor models, the default evaluator additionally logs:
- metrics: example_count, mean_absolute_error, mean_squared_error, root_mean_squared_error, sum_on_target, mean_on_target, r2_score, max_error, mean_absolute_percentage_error. 
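Several of the regressor metrics named above follow directly from the per-row errors. The sketch below computes a subset of them in plain Python using their standard definitions; regression_metrics is a hypothetical helper, and the default evaluator's own implementation is not shown in this reference.

```python
import math


def regression_metrics(targets, predictions):
    """Compute a few of the listed regressor metrics from paired values."""
    errors = [p - t for p, t in zip(predictions, targets)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    return {
        "example_count": n,
        "mean_absolute_error": sum(abs(e) for e in errors) / n,
        "mean_squared_error": mse,
        "root_mean_squared_error": math.sqrt(mse),
        "max_error": max(abs(e) for e in errors),
    }


result = regression_metrics([3.0, 5.0, 2.0, 7.0], [2.0, 5.0, 4.0, 7.0])
print(result)
```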
 
 
- For binary classifiers, the default evaluator additionally logs:
- metrics: true_negatives, false_positives, false_negatives, true_positives, recall, precision, f1_score, accuracy_score, example_count, log_loss, roc_auc, precision_recall_auc. 
- artifacts: lift curve plot, precision-recall plot, ROC plot. 
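The count-based binary metrics listed above are all derived from the four confusion-matrix cells. A minimal sketch using the standard definitions follows; binary_metrics is a hypothetical helper, and returning 0.0 when a denominator is zero is an assumption of this sketch.

```python
def binary_metrics(targets, predictions, pos_label=1):
    """Derive confusion counts and the usual ratios for binary labels."""
    tp = fp = tn = fn = 0
    for target, predicted in zip(targets, predictions):
        if predicted == pos_label:
            if target == pos_label:
                tp += 1
            else:
                fp += 1
        elif target == pos_label:
            fn += 1
        else:
            tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {
        "true_positives": tp,
        "false_positives": fp,
        "true_negatives": tn,
        "false_negatives": fn,
        "precision": precision,
        "recall": recall,
        "f1_score": f1,
        "accuracy_score": (tp + tn) / (tp + fp + tn + fn),
        "example_count": tp + fp + tn + fn,
    }


summary = binary_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
print(summary)
```

Metrics such as log_loss, roc_auc and precision_recall_auc are omitted here because they need predicted probabilities rather than hard labels.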
 
 
- For multiclass classifiers, the default evaluator additionally logs:
- metrics: accuracy_score, example_count, f1_score_micro, f1_score_macro, log_loss 
- artifacts: A CSV file for “per_class_metrics” (per-class metrics includes true_negatives/false_positives/false_negatives/true_positives/recall/precision/roc_auc, precision_recall_auc), precision-recall merged curves plot, ROC merged curves plot. 
 
 
- For question-answering models, the default evaluator logs:
- metrics: exact_match, token_count, toxicity (requires evaluate, torch, transformers), flesch_kincaid_grade_level (requires textstat) and ari_grade_level (requires textstat).
- artifacts: A JSON file containing the inputs, outputs, targets (if the targets argument is supplied), and per-row metrics of the model in tabular format.
 
 
- For text-summarization models, the default evaluator logs:
- metrics: token_count, ROUGE (requires evaluate, nltk, and rouge_score to be installed), toxicity (requires evaluate, torch, transformers), ari_grade_level (requires textstat), flesch_kincaid_grade_level (requires textstat).
- artifacts: A JSON file containing the inputs, outputs, targets (if the targets argument is supplied), and per-row metrics of the model in tabular format.
 
 
- For text models, the default evaluator logs:
- metrics: token_count, toxicity (requires evaluate, torch, transformers), ari_grade_level (requires textstat), flesch_kincaid_grade_level (requires textstat).
- artifacts: A JSON file containing the inputs, outputs, targets (if the targets argument is supplied), and per-row metrics of the model in tabular format.
 
 
- For retriever models, the default evaluator logs:
- metrics: precision_at_k(k), recall_at_k(k) and ndcg_at_k(k), all of which use a default value of retriever_k = 3.
- artifacts: A JSON file containing the inputs, outputs, targets, and per-row metrics of the model in tabular format. 
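For a single query, precision and recall at k can be sketched as below. These are common textbook definitions and are offered as an assumption about what the metric names mean; how the default evaluator treats duplicates or empty inputs is not specified in this reference.

```python
def precision_at_k(retrieved, relevant, k=3):
    """Share of the top-k retrieved documents that are relevant."""
    top_k = retrieved[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for doc in top_k if doc in relevant)
    return hits / len(top_k)


def recall_at_k(retrieved, relevant, k=3):
    """Share of the relevant documents that appear in the top k."""
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    found = set(retrieved[:k]) & relevant_set
    return len(found) / len(relevant_set)


# Hypothetical document IDs, ranked best-first by a retriever.
retrieved_ids = ["d1", "d7", "d3", "d9"]
relevant_ids = {"d1", "d3", "d4"}

p3 = precision_at_k(retrieved_ids, relevant_ids, k=3)
r3 = recall_at_k(retrieved_ids, relevant_ids, k=3)
print(p3, r3)
```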
 
 
- For sklearn models, the default evaluator additionally logs the model’s evaluation criterion (e.g. mean accuracy for a classifier) computed by model.score method. 
- The metrics/artifacts listed above are logged to the active MLflow run. If no active run exists, a new MLflow run is created for logging these metrics and artifacts. 
- Additionally, information about the specified dataset - hash, name (if specified), path (if specified), and the UUID of the model that evaluated it - is logged to the mlflow.datasets tag.
- The available evaluator_config options for the default evaluator include:
- log_model_explainability: A boolean value specifying whether or not to log model explainability insights, default value is True. 
- explainability_algorithm: A string to specify the SHAP Explainer algorithm for model explainability. Supported algorithms include: ‘exact’, ‘permutation’, ‘partition’, ‘kernel’. If not set, shap.Explainer is used with the “auto” algorithm, which chooses the best Explainer based on the model.
- explainability_nsamples: The number of sample rows to use for computing model explainability insights. Default value is 2000. 
- explainability_kernel_link: The kernel link function used by shap kernel explainer. Available values are “identity” and “logit”. Default value is “identity”. 
- max_classes_for_multiclass_roc_pr: For multiclass classification tasks, the maximum number of classes for which to log the per-class ROC curve and Precision-Recall curve. If the number of classes is larger than the configured maximum, these curves are not logged. 
- metric_prefix: An optional prefix to prepend to the name of each metric and artifact produced during evaluation. 
- log_metrics_with_dataset_info: A boolean value specifying whether or not to include information about the evaluation dataset in the name of each metric logged to MLflow Tracking during evaluation, default value is True. 
- pos_label: If specified, the positive label to use when computing classification metrics such as precision, recall, f1, etc. for binary classification models. For multiclass classification and regression models, this parameter will be ignored. 
- average: The averaging method to use when computing classification metrics such as precision, recall, f1, etc. for multiclass classification models (default: 'weighted'). For binary classification and regression models, this parameter will be ignored.
- sample_weights: Weights for each sample to apply when computing model performance metrics. 
- col_mapping: A dictionary mapping column names in the input dataset or output predictions to column names used when invoking the evaluation functions. 
- retriever_k: When model_type="retriever", the number of top-ranked retrieved documents to use when computing the built-in metrics precision_at_k(k), recall_at_k(k), and ndcg_at_k(k). Default value is 3. For all other model types, this parameter is ignored.
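A minimal sketch of an evaluator_config dictionary for a binary classifier, using only option names listed above. The specific values (the "val_" prefix, 500 sample rows) are illustrative assumptions, not recommendations.

```python
# Options for the default evaluator; every key below is one of the
# documented evaluator_config options, the values are made up.
binary_classifier_config = {
    "log_model_explainability": True,
    "explainability_algorithm": "permutation",
    "explainability_nsamples": 500,
    "metric_prefix": "val_",
    "pos_label": 1,
}

# When several evaluators are requested, each configuration is nested
# under the evaluator's name ("default" refers to the default evaluator).
multi_evaluator_config = {"default": binary_classifier_config}
```

Either dictionary would be passed through the evaluator_config argument, for example `mlflow.evaluate(..., evaluator_config=binary_classifier_config)`.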
 
 
- Limitations of evaluation dataset:
- For classification tasks, dataset labels are used to infer the total number of classes. 
- For binary classification tasks, the negative label value must be 0 or -1 or False, and the positive label value must be 1 or True. 
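The label constraint above can be checked before calling the evaluator. The helper below is a hypothetical pre-flight check, not part of MLflow; it relies on the fact that in Python `False == 0` and `True == 1`, so boolean labels are covered by the integer set.

```python
def find_invalid_binary_labels(labels):
    """Return the distinct label values that are not 0, -1, False, 1, or True."""
    allowed = {0, -1, 1}  # False and True compare (and hash) equal to 0 and 1
    invalid = {label for label in labels if label not in allowed}
    # Sort by repr so mixed types can be ordered deterministically.
    return sorted(invalid, key=repr)


ok_numeric = find_invalid_binary_labels([0, 1, 1, 0])
ok_boolean = find_invalid_binary_labels([True, False, True])
bad_strings = find_invalid_binary_labels(["no", "yes", "no"])
```

A non-empty result, as for the string labels here, signals that the labels need to be re-encoded before binary classification metrics are computed.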
 
 
- Limitations of metrics/artifacts computation:
- For classification tasks, some metric and artifact computations require the model to output class probabilities. Currently, for scikit-learn models, the default evaluator calls the predict_proba method on the underlying model to obtain probabilities. For other model types, the default evaluator does not compute metrics/artifacts that require probability outputs.
 
 
- Limitations of default evaluator logging model explainability insights:
- The shap.Explainer auto algorithm uses the Linear explainer for linear models and the Tree explainer for tree models. Because SHAP’s Linear and Tree explainers do not support multi-class classification, the default evaluator falls back to using the Exact or Permutation explainers for multi-class classification tasks.
- Logging model explainability insights is not currently supported for PySpark models. 
- The evaluation dataset label values must be numeric or boolean, all feature values must be numeric, and each feature column must only contain scalar values. 
 
 
- Limitations when environment restoration is enabled:
- When environment restoration is enabled for the evaluated model (i.e. a non-local env_manager is specified), the model is loaded as a client that invokes an MLflow Model Scoring Server process in an independent Python environment with the model’s training time dependencies installed. As such, methods like predict_proba (for probability outputs) or score (which computes the evaluation criterion for sklearn models) of the model become inaccessible and the default evaluator does not compute metrics or artifacts that require those methods.
- Because the model is an MLflow Model Server process, SHAP explanations are slower to compute. As such, model explainability is disabled when a non-local env_manager is specified, unless the evaluator_config option log_model_explainability is explicitly set to True.
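The interaction between env_manager and log_model_explainability described above can be summarized as a small decision function. This is a sketch of the documented behavior only; the function name is hypothetical and it does not reflect how MLflow implements the check internally.

```python
def should_log_explainability(env_manager="local", evaluator_config=None):
    """Decide whether explainability insights would be logged.

    An explicit log_model_explainability setting always wins. Otherwise the
    default is True for the local environment and False when the model runs
    in a restored (non-local) environment.
    """
    config = evaluator_config or {}
    if "log_model_explainability" in config:
        return bool(config["log_model_explainability"])
    return env_manager == "local"


default_local = should_log_explainability()
default_virtualenv = should_log_explainability(env_manager="virtualenv")
forced_virtualenv = should_log_explainability(
    env_manager="virtualenv",
    evaluator_config={"log_model_explainability": True},
)
```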
 
 
 
 - Parameters
- model – - Optional. If specified, it should be one of the following: - A pyfunc model instance 
- A URI referring to a pyfunc model 
- A URI referring to an MLflow Deployments endpoint e.g. - "endpoints:/my-chat"
- A callable function: This function should be able to take in model input and return predictions. It should follow the signature of the predict method. Here’s an example of a valid function: - model = mlflow.pyfunc.load_model(model_uri) def fn(model_input): return model.predict(model_input) 
 - If omitted, it indicates a static dataset will be used for evaluation instead of a model. In this case, the data argument must be a Pandas DataFrame or an mlflow PandasDataset that contains model outputs, and the predictions argument must be the name of the column in data that contains model outputs.
- data – - One of the following: - A numpy array or list of evaluation features, excluding labels. 
- A Pandas DataFrame containing evaluation features, labels, and optionally model outputs. Model outputs are required when model is unspecified. If the feature_names argument is not specified, all columns except the label column and the model output column are regarded as feature columns. Otherwise, only column names present in feature_names are regarded as feature columns.
 
- A Spark DataFrame containing evaluation features and labels. If the feature_names argument is not specified, all columns except the label column are regarded as feature columns. Otherwise, only column names present in feature_names are regarded as feature columns. Only the first 10000 rows in the Spark DataFrame will be used as evaluation data.
 
- A mlflow.data.dataset.Dataset instance containing evaluation features, labels, and optionally model outputs. Model outputs are only supported with a PandasDataset. Model outputs are required when model is unspecified, and should be specified via the predictions property of the PandasDataset.
 
 
- model_type – - (Optional) A string describing the model type. The default evaluator supports the following model types: - 'classifier'
- 'regressor'
- 'question-answering'
- 'text-summarization'
- 'text'
- 'retriever'
 - If no model_type is specified, then you must provide a list of metrics to compute via the extra_metrics param. - Note - 'question-answering', 'text-summarization', 'text', and 'retriever' are experimental and may be changed or removed in a future release.
- targets – If data is a numpy array or list, a numpy array or list of evaluation labels. If data is a DataFrame, the string name of a column from data that contains evaluation labels. Required for classifier and regressor models, but optional for question-answering, text-summarization, and text models. If data is a mlflow.data.dataset.Dataset that defines targets, then targets is optional.
- predictions – - Optional. The name of the column that contains model outputs. - When model is specified and outputs multiple columns, predictions can be used to specify the name of the column that will be used to store model outputs for evaluation.
- When model is not specified and data is a Pandas DataFrame, predictions can be used to specify the name of the column in data that contains model outputs.
 - # Evaluate a model that outputs multiple columns data = pd.DataFrame({"question": ["foo"]}) def model(inputs): return pd.DataFrame({"answer": ["bar"], "source": ["baz"]}) results = evaluate( model=model, data=data, predictions="answer", # other arguments if needed ) # Evaluate a static dataset data = pd.DataFrame({"question": ["foo"], "answer": ["bar"], "source": ["baz"]}) results = evaluate( data=data, predictions="answer", # other arguments if needed ) 
- dataset_path – (Optional) The path where the data is stored. Must not contain double quotes ("). If specified, the path is logged to the mlflow.datasets tag for lineage tracking purposes.
- feature_names – (Optional) A list. If the data argument is a numpy array or list, feature_names is a list of the feature names for each feature. If feature_names=None, then the feature_names are generated using the format feature_{feature_index}. If the data argument is a Pandas DataFrame or a Spark DataFrame, feature_names is a list of the names of the feature columns in the DataFrame. If feature_names=None, then all columns except the label column and the predictions column are regarded as feature columns.
- evaluators – The name of the evaluator to use for model evaluation, or a list of evaluator names. If unspecified, all evaluators capable of evaluating the specified model on the specified dataset are used. The default evaluator can be referred to by the name "default". To see all available evaluators, call mlflow.models.list_evaluators().
- evaluator_config – A dictionary of additional configurations to supply to the evaluator. If multiple evaluators are specified, each configuration should be supplied as a nested dictionary whose key is the evaluator name. 
- custom_metrics – Deprecated. Use extra_metrics instead.
- extra_metrics – - (Optional) A list of EvaluationMetric objects. These metrics are computed in addition to the default metrics associated with the pre-defined model_type, and setting model_type=None will only compute the metrics specified in extra_metrics. See the mlflow.metrics module for more information about the builtin metrics and how to define extra metrics. - import mlflow import numpy as np def root_mean_squared_error(eval_df, _builtin_metrics): return np.sqrt((np.abs(eval_df["prediction"] - eval_df["target"]) ** 2).mean()) rmse_metric = mlflow.models.make_metric( eval_fn=root_mean_squared_error, greater_is_better=False, ) mlflow.evaluate(..., extra_metrics=[rmse_metric]) 
- custom_artifacts – - (Optional) A list of custom artifact functions with the following signature: - def custom_artifact( eval_df: Union[pandas.DataFrame, pyspark.sql.DataFrame], builtin_metrics: Dict[str, float], artifacts_dir: str, ) -> Dict[str, Any]: """ Args: eval_df: A Pandas or Spark DataFrame containing ``prediction`` and ``target`` columns. The ``prediction`` column contains the predictions made by the model. The ``target`` column contains the corresponding labels to the predictions made on that row. builtin_metrics: A dictionary containing the metrics calculated by the default evaluator. The keys are the names of the metrics and the values are the scalar values of the metrics. Refer to the DefaultEvaluator behavior section for what metrics will be returned based on the type of model (i.e. classifier or regressor). artifacts_dir: A temporary directory path that can be used by the custom artifacts function to temporarily store produced artifacts. The directory will be deleted after the artifacts are logged. Returns: A dictionary that maps artifact names to artifact objects (e.g. a Matplotlib Figure) or to artifact paths within ``artifacts_dir``. """ ... - Object types that artifacts can be represented as: - A string uri representing the file path to the artifact. MLflow will infer the type of the artifact based on the file extension. 
- A string representation of a JSON object. This will be saved as a .json artifact. 
- Pandas DataFrame. This will be resolved as a CSV artifact. 
- Numpy array. This will be saved as a .npy artifact. 
- Matplotlib Figure. This will be saved as an image artifact. Note that matplotlib.pyplot.savefig is called behind the scenes with default configurations. To customize, either save the figure with the desired configurations and return its file path or define customizations through matplotlib.rcParams.
- MLflow attempts to pickle any other object with the default protocol. 
 - import os import mlflow import matplotlib.pyplot as plt def scatter_plot(eval_df, builtin_metrics, artifacts_dir): plt.scatter(eval_df["target"], eval_df["prediction"]) plt.xlabel("Targets") plt.ylabel("Predictions") plt.title("Targets vs. Predictions") plot_path = os.path.join(artifacts_dir, "example.png") plt.savefig(plot_path) plt.close() return {"pred_target_scatter": plot_path} def pred_sample(eval_df, _builtin_metrics, _artifacts_dir): return {"pred_sample": eval_df.head(10)} mlflow.evaluate(..., custom_artifacts=[scatter_plot, pred_sample]) 
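A custom artifact function can also be exercised on its own before it is handed to the evaluator. The sketch below follows the documented (eval_df, builtin_metrics, artifacts_dir) signature and returns a file path, so the artifact type would be inferred from the .json extension. The function name and file name are hypothetical, and a plain dict of lists stands in for the DataFrame so the example runs without pandas.

```python
import json
import os
import tempfile


def error_summary(eval_df, builtin_metrics, artifacts_dir):
    """Write a small JSON summary of absolute errors and return its path."""
    errors = [abs(p - t) for p, t in zip(eval_df["prediction"], eval_df["target"])]
    summary = {"max_abs_error": max(errors), "num_rows": len(errors)}
    path = os.path.join(artifacts_dir, "error_summary.json")
    with open(path, "w") as f:
        json.dump(summary, f)
    # Map the artifact name to a path inside artifacts_dir.
    return {"error_summary": path}


# Stand-in for eval_df: only the "prediction" and "target" columns are used.
fake_eval_df = {"prediction": [2.5, 0.0, 2.0], "target": [3.0, -0.5, 2.0]}

with tempfile.TemporaryDirectory() as tmp_dir:
    artifacts = error_summary(fake_eval_df, {}, tmp_dir)
    with open(artifacts["error_summary"]) as f:
        loaded = json.load(f)
```

Because the function only indexes the two columns and iterates over them, the same body should also work when eval_df is a Pandas DataFrame.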
- validation_thresholds – DEPRECATED. Please use the mlflow.validate_evaluation_results() API instead to run model validation against a baseline.
- baseline_model – DEPRECATED. Please use the mlflow.validate_evaluation_results() API instead to run model validation against a baseline.
- env_manager – - Specify an environment manager to load the candidate model in isolated Python environments and restore their dependencies. Default value is local, and the following values are supported: - virtualenv: (Recommended) Use virtualenv to restore the python environment that was used to train the model.
- conda: Use Conda to restore the software environment that was used to train the model.
- local: Use the current Python environment for model inference, which may differ from the environment used to train the model and may lead to errors or invalid predictions.
 
- model_config – the model configuration to use for loading the model with pyfunc. Inspect the model’s pyfunc flavor to know which keys are supported for your specific model. If not indicated, the default model configuration from the model is used (if any). 
- baseline_config – DEPRECATED. Please use the mlflow.validate_evaluation_results() API instead to run model validation against a baseline.
- inference_params – (Optional) A dictionary of inference parameters to be passed to the model when making predictions, such as {"max_tokens": 100}. This is only used when the model is an MLflow Deployments endpoint URI, e.g. "endpoints:/my-chat".
 
- Returns
- An mlflow.models.EvaluationResult instance containing metrics of evaluating the model with the given dataset.
 
- 
mlflow.models.get_model_info(model_uri: str) → mlflow.models.model.ModelInfo[source]
- Get metadata for the specified model, such as its input/output signature. - Parameters
- model_uri – - The location, in URI format, of the MLflow model. For example: - /Users/me/path/to/local/model
- relative/path/to/local/model
- s3://my_bucket/path/to/model
- runs:/<mlflow_run_id>/run-relative/path/to/model
- models:/<model_name>/<model_version>
- models:/<model_name>/<stage>
- mlflow-artifacts:/path/to/model
 - For more information about supported URI schemes, see Referencing Artifacts. 
- Returns
- A ModelInfo instance that contains the metadata of the logged model.
 - import mlflow.models import mlflow.sklearn from sklearn.ensemble import RandomForestRegressor with mlflow.start_run() as run: params = {"n_estimators": 3, "random_state": 42} X, y = [[0, 1]], [1] signature = mlflow.models.infer_signature(X, y) rfr = RandomForestRegressor(**params).fit(X, y) mlflow.log_params(params) mlflow.sklearn.log_model(rfr, artifact_path="sklearn-model", signature=signature) model_uri = f"runs:/{run.info.run_id}/sklearn-model" # Get model info with model_uri model_info = mlflow.models.get_model_info(model_uri) # Get model signature directly model_signature = model_info.signature assert model_signature == signature 
- 
mlflow.models.infer_pip_requirements(model_uri, flavor, fallback=None, timeout=None, extra_env_vars=None)[source]
- Infers the pip requirements of the specified model by creating a subprocess and loading the model in it to determine which packages are imported. - Parameters
- model_uri – The URI of the model. 
- flavor – The flavor name of the model. 
- fallback – If provided, an unexpected error during the inference procedure is swallowed and the value of fallback is returned. Otherwise, the error is raised.
- timeout – If specified, the inference operation is bound by the timeout (in seconds). 
- extra_env_vars – A dictionary of extra environment variables to pass to the subprocess. Defaults to None. 
 
- Returns
- A list of inferred pip requirements (e.g. ["scikit-learn==0.24.2", ...]).
 
- 
mlflow.models.infer_signature(model_input: Any = None, model_output: MlflowInferableDataset = None, params: Optional[dict] = None) → mlflow.models.signature.ModelSignature[source]
- Infer an MLflow model signature from the training data (input), model predictions (output) and parameters (for inference). - The signature represents model input and output as data frames with (optionally) named columns and data type specified as one of the types defined in mlflow.types.DataType. It also includes a parameters schema for inference. This method raises an exception if the user data contains incompatible types or is not passed in one of the supported formats listed below. - The input should be one of these:
- pandas.DataFrame 
- pandas.Series 
- dictionary of { name -> numpy.ndarray} 
- numpy.ndarray 
- pyspark.sql.DataFrame 
- scipy.sparse.csr_matrix 
- scipy.sparse.csc_matrix 
- dictionary / list of dictionaries of JSON-convertible types 
 
 - The element types should be mappable to one of mlflow.types.DataType. - For pyspark.sql.DataFrame inputs, columns of type DateType and TimestampType are both inferred as type datetime, which is coerced to TimestampType at inference. - Parameters
- model_input – Valid input to the model. E.g. (a subset of) the training dataset. 
- model_output – Valid model output. E.g. Model predictions for the (subset of) training dataset. 
- params – - Valid parameters for inference. It should be a dictionary of parameters that can be set on the model during inference by passing params to pyfunc predict method. - An example of valid parameters: - from mlflow.models import infer_signature from mlflow.transformers import generate_signature_output # Define parameters for inference params = { "num_beams": 5, "max_length": 30, "do_sample": True, "remove_invalid_values": True, } # Infer the signature including parameters signature = infer_signature( data, generate_signature_output(model, data), params=params, ) # Saving model with model signature mlflow.transformers.save_model( model, path=model_path, signature=signature, ) pyfunc_loaded = mlflow.pyfunc.load_model(model_path) # Passing params to `predict` function directly result = pyfunc_loaded.predict(data, params=params) 
 
- Returns
- ModelSignature 
 
- 
mlflow.models.list_evaluators()[source]
- Return a list of the names of all available evaluators. 
- 
mlflow.models.make_metric(*, eval_fn, greater_is_better, name=None, long_name=None, version=None, metric_details=None, metric_metadata=None, genai_metric_args=None)[source]
- A factory function to create an EvaluationMetric object. - Parameters
- eval_fn – - A function that computes the metric with the following signature: - def eval_fn( predictions: pandas.Series, targets: pandas.Series, metrics: Dict[str, MetricValue], **kwargs, ) -> Union[float, MetricValue]: """ Args: predictions: A pandas Series containing the predictions made by the model. targets: (Optional) A pandas Series containing the corresponding labels for the predictions made on that input. metrics: (Optional) A dictionary containing the metrics calculated by the default evaluator. The keys are the names of the metrics and the values are the metric values. To access the MetricValue for the metrics calculated by the system, make sure to specify the type hint for this parameter as Dict[str, MetricValue]. Refer to the DefaultEvaluator behavior section for what metrics will be returned based on the type of model (i.e. classifier or regressor). kwargs: Includes a list of args that are used to compute the metric. These args could be information coming from input data, model outputs, other metrics, or parameters specified in the `evaluator_config` argument of the `mlflow.evaluate` API. Returns: MetricValue with per-row scores, per-row justifications, and aggregate results. """ ... 
- greater_is_better – Whether a higher value of the metric is better. 
- name – The name of the metric. This argument must be specified if eval_fn is a lambda function or the eval_fn.__name__ attribute is not available.
- long_name – (Optional) The long name of the metric. For example, "mean_squared_error" for "mse".
- version – (Optional) The metric version. For example, v1.
- metric_details – (Optional) A description of the metric and how it is calculated. 
- metric_metadata – (Optional) A dictionary containing metadata for the metric. 
- genai_metric_args – (Optional) A dictionary containing arguments specified by users when calling make_genai_metric or make_genai_metric_from_prompt. Those args are persisted so that we can deserialize the same metric object later. 
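A minimal sketch of an eval_fn that follows the (predictions, targets, metrics, **kwargs) signature and returns a plain float. The metric itself (mean absolute error) is chosen only for illustration, and the function is exercised with lists here rather than pandas Series so it runs without extra dependencies.

```python
def mean_absolute_error(predictions, targets, metrics, **kwargs):
    """Average absolute difference between predictions and targets.

    `metrics` and `kwargs` are accepted to match the documented signature
    but are not needed for this particular metric.
    """
    pairs = list(zip(predictions, targets))
    return sum(abs(p - t) for p, t in pairs) / len(pairs)


# Stand-in inputs; during evaluation these would be pandas Series.
mae = mean_absolute_error([2.5, 0.0, 2.0], [3.0, -0.5, 2.0], metrics={})
```

The function could then be wrapped with `mlflow.models.make_metric(eval_fn=mean_absolute_error, greater_is_better=False)`. Because it is a named function rather than a lambda, the name argument can be left out.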
 
 
- 
mlflow.models.predict(model_uri, input_data=None, input_path=None, content_type='json', output_path=None, env_manager='virtualenv', install_mlflow=False, pip_requirements_override=None, extra_envs=None)[source]
- Note - Experimental: This function may change or be removed in a future release without warning. - Generate predictions in json format using a saved MLflow model. For information about the input data formats accepted by this function, see the following documentation: https://www.mlflow.org/docs/latest/models.html#built-in-deployment-tools. - Parameters
- model_uri – URI to the model. A local path, a local or remote URI e.g. runs:/, s3://. 
- input_data – - Input data for prediction. Must be valid input for the PyFunc model. Refer to mlflow.pyfunc.PyFuncModel.predict() for the supported input types. - Note - If this API fails due to errors in input_data, use mlflow.models.convert_input_example_to_serving_input to manually validate your input data. 
- input_path – Path to a file containing input data. If provided, ‘input_data’ must be None. 
- content_type – Content type of the input data. Can be one of {‘json’, ‘csv’}. 
- output_path – File to output results to as json. If not provided, output to stdout. 
- env_manager – - Specify a way to create an environment for MLmodel inference: - ”virtualenv” (default): use virtualenv (and pyenv for Python version management) 
- ”uv”: use uv 
- ”local”: use the local environment 
- ”conda”: use conda 
 
- install_mlflow – If specified and there is a conda or virtualenv environment to be activated, mlflow will be installed into the environment after it has been activated. The installed mlflow version will be the same as the one used to invoke this command. 
- pip_requirements_override – - If specified, install the specified python dependencies to the model inference environment. This is particularly useful when you want to add extra dependencies or try different versions of the dependencies defined in the logged model. - Tip - After validating the pip requirements override works as expected, you can update the logged model’s dependency using mlflow.models.update_model_requirements API without re-logging it. Note that a registered model is immutable, so you need to register a new model version with the updated model. 
- extra_envs – - If specified, a dictionary of extra environment variables will be passed to the model inference environment. This is useful for testing what environment variables are needed for the model to run correctly. By default, environment variables existing in the current os.environ are passed, and this parameter can be used to override them. - Note - This parameter is only supported when env_manager is set to “virtualenv”, “conda” or “uv”. 
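The override behavior described for extra_envs amounts to a dictionary merge in which the extra values take precedence. The helper below is a sketch of that idea only; its name and the base_env parameter are assumptions for illustration and say nothing about how MLflow builds the environment internally.

```python
import os


def build_inference_env(extra_envs=None, base_env=None):
    """Merge extra variables over a base environment (os.environ by default)."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(extra_envs or {})  # extra_envs wins on conflicting keys
    return env


env = build_inference_env(
    {"OPENAI_API_KEY": "some_value"},
    base_env={"PATH": "/usr/bin", "OPENAI_API_KEY": "old_value"},
)
```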
 
 - Code example: - import mlflow run_id = "..." mlflow.models.predict( model_uri=f"runs:/{run_id}/model", input_data={"x": 1, "y": 2}, content_type="json", ) # Run prediction with "uv" as the environment manager mlflow.models.predict( model_uri=f"runs:/{run_id}/model", input_data={"x": 1, "y": 2}, env_manager="uv", ) # Run prediction with additional pip dependencies and extra environment variables mlflow.models.predict( model_uri=f"runs:/{run_id}/model", input_data={"x": 1, "y": 2}, content_type="json", pip_requirements_override=["scikit-learn==0.23.2"], extra_envs={"OPENAI_API_KEY": "some_value"}, ) 
- 
mlflow.models.set_model(model) → None[source]
- Note - Experimental: This function may change or be removed in a future release without warning. - When logging model as code, this function can be used to set the model object to be logged. - Parameters
- model – - The model object to be logged. Supported model types are: - A Python function or callable object. 
- A Langchain model or path to a Langchain model. 
- A Llama Index model or path to a Llama Index model. 
 
 
- 
mlflow.models.set_retriever_schema(*, primary_key: str, text_column: str, doc_uri: Optional[str] = None, other_columns: Optional[list] = None, name: Optional[str] = 'retriever')[source]
- Note - Experimental: This function may change or be removed in a future release without warning. - Specify the return schema of a retriever span within your agent or generative AI app code. - Note: MLflow recommends that your retriever return the default MLflow retriever output schema described in https://mlflow.org/docs/latest/tracing/tracing-schema#retriever-spans, in which case you do not need to call set_retriever_schema. APIs that read MLflow traces and look for retriever spans, such as MLflow evaluation, will automatically detect retriever spans that match MLflow’s default retriever schema. - If your retriever does not return the default MLflow retriever output schema, call this API to specify which fields in each retrieved document correspond to the page content, document URI, document ID, etc. This enables downstream features like MLflow evaluation to properly identify these fields. Note that set_retriever_schema assumes that your retriever span returns a list of objects. - Parameters
- primary_key – The primary key of the retriever or vector index. 
- text_column – The name of the text column to use for the embeddings. 
- doc_uri – The name of the column that contains the document URI. 
- other_columns – A list of other columns that are part of the vector index that need to be retrieved during trace logging. 
- name – The name of the retriever tool or vector store index. 
 
 - from mlflow.models import set_retriever_schema # The following call sets the schema for a custom retriever that retrieves content from # MLflow documentation, with an output schema like: # [ # { # 'document_id': '9a8292da3a9d4005a988bf0bfdd0024c', # 'chunk_text': 'MLflow is an open-source platform, purpose-built to assist...', # 'doc_uri': 'https://mlflow.org/docs/latest/index.html', # 'title': 'MLflow: A Tool for Managing the Machine Learning Lifecycle' # }, # { # 'document_id': '7537fe93c97f4fdb9867412e9c1f9e5b', # 'chunk_text': 'A great way to get started with MLflow is...', # 'doc_uri': 'https://mlflow.org/docs/latest/getting-started/', # 'title': 'Getting Started with MLflow' # }, # ... # ] set_retriever_schema( primary_key="document_id", text_column="chunk_text", doc_uri="doc_uri", other_columns=["title"], name="my_custom_retriever", ) 
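To make the role of each schema field concrete, the sketch below shows how a consumer of retriever spans might use such a mapping to pull fields out of one retrieved document. The function and the output keys ("id", "page_content", "metadata") are hypothetical and are not MLflow's internal representation.

```python
def normalize_document(doc, *, primary_key, text_column, doc_uri=None, other_columns=None):
    """Pick out the fields named by a retriever schema from one document dict."""
    normalized = {
        "id": doc[primary_key],
        "page_content": doc[text_column],
        "metadata": {col: doc.get(col) for col in (other_columns or [])},
    }
    if doc_uri is not None:
        normalized["doc_uri"] = doc.get(doc_uri)
    return normalized


# One retrieved document shaped like the output schema in the example above.
retrieved_doc = {
    "document_id": "7537fe93c97f4fdb9867412e9c1f9e5b",
    "chunk_text": "A great way to get started with MLflow is...",
    "doc_uri": "https://mlflow.org/docs/latest/getting-started/",
    "title": "Getting Started with MLflow",
}

normalized = normalize_document(
    retrieved_doc,
    primary_key="document_id",
    text_column="chunk_text",
    doc_uri="doc_uri",
    other_columns=["title"],
)
```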
- 
mlflow.models.set_signature(model_uri: str, signature: mlflow.models.signature.ModelSignature)[source]
- Sets the model signature for specified model artifacts. - The process involves downloading the MLmodel file in the model artifacts (if it’s non-local), updating its model signature, and then overwriting the existing MLmodel file. Should the artifact repository associated with the model artifacts disallow overwriting, this function will fail. - Furthermore, as model registry artifacts are read-only, model artifacts located in the model registry and represented by models:/ URI schemes are not compatible with this API. To set a signature on a model version, first set the signature on the source model artifacts. Following this, generate a new model version using the updated model artifacts. For more information about setting signatures on model versions, see this doc section. - Parameters
- model_uri – - The location, in URI format, of the MLflow model. For example: - /Users/me/path/to/local/model
- relative/path/to/local/model
- s3://my_bucket/path/to/model
- runs:/<mlflow_run_id>/run-relative/path/to/model
- mlflow-artifacts:/path/to/model
- models:/<model_id>
 - For more information about supported URI schemes, see Referencing Artifacts. - Please note that model URIs with the models:/<name>/<version> scheme are not supported.
- signature – ModelSignature to set on the model. 
 
 - import mlflow from mlflow.models import set_signature, infer_signature # load model from run artifacts run_id = "96771d893a5e46159d9f3b49bf9013e2" artifact_path = "models" model_uri = f"runs:/{run_id}/{artifact_path}" model = mlflow.pyfunc.load_model(model_uri) # determine model signature test_df = ... predictions = model.predict(test_df) signature = infer_signature(test_df, predictions) # set the signature for the logged model set_signature(model_uri, signature) 
- 
mlflow.models.update_model_requirements(model_uri: str, operation: Literal[add, remove], requirement_list: list) → None[source]
- Add or remove requirements from a model’s conda.yaml and requirements.txt files. - The process involves downloading these two files from the model artifacts (if they’re non-local), updating them with the specified requirements, and then overwriting the existing files. Should the artifact repository associated with the model artifacts disallow overwriting, this function will fail. - Note that model registry URIs (i.e. URIs in the form models:/) are not supported, as artifacts in the model registry are intended to be read-only. - If adding requirements, the function will overwrite any existing requirements that overlap, or else append the new requirements to the existing list. - If removing requirements, the function will ignore any version specifiers, and remove all the specified package names. Any requirements that are not found in the existing files will be ignored. - Parameters
- model_uri – - The location, in URI format, of the MLflow model. For example: - /Users/me/path/to/local/model
- relative/path/to/local/model
- s3://my_bucket/path/to/model
- runs:/<mlflow_run_id>/run-relative/path/to/model
- mlflow-artifacts:/path/to/model
 - For more information about supported URI schemes, see Referencing Artifacts. 
- operation – The operation to perform. Must be one of “add” or “remove”. 
- requirement_list – A list of requirements to add or remove from the model. For example: [“numpy==1.20.3”, “pandas>=1.3.3”] 
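The add and remove semantics described above can be sketched on plain lists of requirement strings. This is a simplified illustration, not MLflow's implementation: the helper names are hypothetical, package names are extracted with a basic regular expression rather than a full requirement parser, and keeping an overwritten requirement in its original position is an assumption.

```python
import re


def _package_name(requirement):
    """Return the normalized package name, ignoring any version specifier."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    if match is None:
        raise ValueError(f"Cannot parse requirement: {requirement!r}")
    return re.sub(r"[-_.]+", "-", match.group(1)).lower()


def update_requirements(existing, operation, requirement_list):
    """Apply an "add" or "remove" operation to a list of requirement strings."""
    if operation not in ("add", "remove"):
        raise ValueError('operation must be "add" or "remove"')
    changes = {_package_name(req): req for req in requirement_list}
    if operation == "remove":
        # Version specifiers are ignored; unknown packages are skipped silently.
        return [req for req in existing if _package_name(req) not in changes]
    updated, replaced = [], set()
    for req in existing:
        name = _package_name(req)
        if name in changes:
            updated.append(changes[name])  # overwrite the overlapping entry
            replaced.add(name)
        else:
            updated.append(req)
    # Append requirements that were not already present.
    updated.extend(req for name, req in changes.items() if name not in replaced)
    return updated


current = ["numpy==1.20.3", "pandas>=1.3.3", "scikit-learn==1.0"]
after_add = update_requirements(current, "add", ["numpy==1.26.0", "scipy"])
after_remove = update_requirements(current, "remove", ["pandas==9.9", "requests"])
```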
 
 
- 
mlflow.models.validate_schema(data: Union[pandas.core.frame.DataFrame, pandas.core.series.Series, numpy.ndarray, scipy.sparse._csc.csc_matrix, scipy.sparse._csr.csr_matrix, List[Any], Dict[str, Any], datetime.datetime, bool, bytes, float, int, str, pyspark.sql.dataframe.DataFrame], expected_schema: mlflow.types.schema.Schema) → None[source]
- Validate that the input data has the expected schema. - Parameters
- data – - Input data to be validated. Supported types are: - pandas.DataFrame 
- pandas.Series 
- numpy.ndarray 
- scipy.sparse.csc_matrix 
- scipy.sparse.csr_matrix 
- List[Any] 
- Dict[str, Any] 
- str 
 
- expected_schema – Expected Schema of the input data. 
 
- Raises
- mlflow.exceptions.MlflowException – when the input data does not match the schema. 
 - import mlflow.models # Suppose you've already got a model_uri model_info = mlflow.models.get_model_info(model_uri) # Get model signature directly model_signature = model_info.signature # validate schema mlflow.models.validate_schema(input_data, model_signature.inputs) 
- 
mlflow.models.validate_serving_input(model_uri: str, serving_input: Union[str, dict])[source]
- Note - Experimental: This function may change or be removed in a future release without warning. - Helper function to validate the model can be served and provided input is valid prior to serving the model. - Parameters
- model_uri – URI of the model to be served. 
- serving_input – Input data to be validated. Should be a dictionary or a JSON string. 
 
- Returns
- The prediction result from the model. 
 
- 
class mlflow.models.model.ModelInfo(artifact_path: str, flavors: dict, model_uri: str, model_uuid: str, run_id: str, saved_input_example_info: Optional[dict], signature, utc_time_created: str, mlflow_version: str, signature_dict: Optional[dict] = None, metadata: Optional[dict] = None, registered_model_version: Optional[int] = None, env_vars: Optional[list] = None, prompts: Optional[list] = None, logged_model: Optional[LoggedModel] = None)[source]
- The metadata of a logged MLflow Model. - 
property artifact_path
- Run relative path identifying the logged model. - Getter
- Retrieves the relative path of the logged model. 
- Type
- str 
 
 - 
property creation_timestamp
- Returns the creation timestamp of the logged model. - Getter
- Retrieves the creation timestamp of the logged model. 
 
 - 
property env_vars
- Environment variables used during the model logging process. - Getter
- Gets the environment variables used during the model logging process. 
- Type
- Optional[List[str]] 
 
 - 
property flavors
- A dictionary mapping the flavor name to how to serve the model as that flavor. - Getter
- Gets the mapping from each flavor name to the parameters used to serve the logged model as that flavor. 
- Type
- Dict[str, str] 
    {
        "python_function": {
            "model_path": "model.pkl",
            "loader_module": "mlflow.sklearn",
            "python_version": "3.8.10",
            "env": "conda.yaml",
        },
        "sklearn": {
            "pickled_model": "model.pkl",
            "sklearn_version": "0.24.1",
            "serialization_format": "cloudpickle",
        },
    }
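To make the shape of this mapping concrete, here is a minimal sketch that inspects a flavors dictionary like the one in the example. It works on a plain dictionary rather than a real ModelInfo object, and `describe_flavors` is a hypothetical helper written for illustration, not part of MLflow.

```python
# A flavors mapping shaped like the documented example (values are illustrative).
flavors = {
    "python_function": {
        "model_path": "model.pkl",
        "loader_module": "mlflow.sklearn",
        "python_version": "3.8.10",
        "env": "conda.yaml",
    },
    "sklearn": {
        "pickled_model": "model.pkl",
        "sklearn_version": "0.24.1",
        "serialization_format": "cloudpickle",
    },
}


def describe_flavors(flavors):
    """Return the sorted flavor names and the pyfunc loader module, if any.

    Hypothetical helper for illustration; not an MLflow API.
    """
    names = sorted(flavors)
    loader = flavors.get("python_function", {}).get("loader_module")
    return names, loader


names, loader = describe_flavors(flavors)
print(names, loader)
```

With a real model, the same helper could presumably be applied to the `flavors` property of the object returned by `mlflow.models.get_model_info`.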
 - 
property metadata
- User defined metadata added to the model. - Getter
- Gets the user-defined metadata about a model 
- Type
- Optional[Dict[str, Any]] 
    # Create and log a model with metadata to the Model Registry
    from sklearn import datasets
    from sklearn.ensemble import RandomForestClassifier
    import mlflow
    from mlflow.models import infer_signature

    with mlflow.start_run():
        iris = datasets.load_iris()
        clf = RandomForestClassifier()
        clf.fit(iris.data, iris.target)
        signature = infer_signature(iris.data, iris.target)
        mlflow.sklearn.log_model(
            clf,
            "iris_rf",
            signature=signature,
            registered_model_name="model-with-metadata",
            metadata={"metadata_key": "metadata_value"},
        )

    # model uri for the above model
    model_uri = "models:/model-with-metadata/1"

    # Load the model and access the custom metadata from its ModelInfo object
    model = mlflow.pyfunc.load_model(model_uri=model_uri)
    assert model.metadata.get_model_info().metadata["metadata_key"] == "metadata_value"

    # Load the ModelInfo and access the custom metadata
    model_info = mlflow.models.get_model_info(model_uri=model_uri)
    assert model_info.metadata["metadata_key"] == "metadata_value"

 - Note - Experimental: This property may change or be removed in a future release without warning. 
 - 
property metrics
- Returns the metrics of the logged model. - Getter
- Retrieves the metrics of the logged model 
 
 - 
property mlflow_version
- Version of MLflow used to log the model. - Getter
- Gets the version of MLflow that was installed when a model was logged 
- Type
- str 
 
 - 
property model_uri
- The model_uri of the logged model in the format 'runs:/<run_id>/<artifact_path>'. - Getter
- Gets the URI of the logged model, in the form runs:/<run_id>/<artifact_path>. 
- Type
- str 
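The URI format can be illustrated with a small sketch that splits a model_uri into its run ID and artifact path. It assumes the URI follows the 'runs:/<run_id>/<artifact_path>' format stated above; `split_runs_uri` is a hypothetical helper, not an MLflow function, and the sample URI is made up.

```python
def split_runs_uri(model_uri):
    """Split 'runs:/<run_id>/<artifact_path>' into (run_id, artifact_path).

    Hypothetical helper for illustration; not part of MLflow.
    """
    prefix = "runs:/"
    if not model_uri.startswith(prefix):
        raise ValueError(f"Not a runs:/ URI: {model_uri!r}")
    # Everything up to the first "/" after the prefix is the run ID;
    # the remainder (which may itself contain "/") is the artifact path.
    run_id, sep, artifact_path = model_uri[len(prefix):].partition("/")
    if not run_id or not sep or not artifact_path:
        raise ValueError(
            f"Expected runs:/<run_id>/<artifact_path>, got {model_uri!r}"
        )
    return run_id, artifact_path


# Made-up URI for illustration.
run_id, artifact_path = split_runs_uri(
    "runs:/8ede7df408dd42ed9fc39019ef7df309/iris_rf"
)
print(run_id, artifact_path)
```

The two pieces correspond to the run_id and artifact_path properties documented on this class, so in practice reading those properties directly is simpler than parsing the URI.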
 
 - 
property model_uuid
- The model_uuid of the logged model, e.g., '39ca11813cfc46b09ab83972740b80ca'. - Getter
- [Legacy] Gets the model_uuid (run_id) of a logged model 
- Type
- str 
 
 - 
property params
- Returns the parameters of the logged model. - Getter
- Retrieves the parameters of the logged model 
 
 - 
property registered_model_version
- The registered model version, if the model is registered. - Getter
- Gets the registered model version, if the model is registered in Model Registry. 
- Setter
- Sets the registered model version. 
- Type
- Optional[int] 
 
 - 
property run_id
- The run_id associated with the logged model, e.g., '8ede7df408dd42ed9fc39019ef7df309'. - Getter
- Gets the run_id identifier for the logged model 
- Type
- str 
 
 - 
property saved_input_example_info
- A dictionary that contains the metadata of the saved input example, e.g., {"artifact_path": "input_example.json", "type": "dataframe", "pandas_orient": "split"}. - Getter
- Gets the metadata of the saved input example, if one was specified during model logging. 
- Type
- Optional[Dict[str, str]] 
 
 - 
property signature
- A ModelSignature that describes the model input and output. - Getter
- Gets the model signature if it is defined 
- Type
- Optional[ModelSignature] 
 
 - 
property signature_dict
- A dictionary that describes the model input and output, generated by ModelSignature.to_dict(). - Getter
- Gets the model signature as a dictionary 
- Type
- Optional[Dict[str, Any]] 
 
 - 
property tags
- Returns the tags of the logged model. - Getter
- Retrieves the tags of the logged model 
 
 
- 
property