mlflow.pyfunc
The python_function model flavor serves as a default model interface for MLflow Python models.
Any MLflow Python model is expected to be loadable as a python_function model.
In addition, the mlflow.pyfunc module defines a generic filesystem format for Python models and provides utilities for saving to and loading from
this format. The format is self-contained in the sense that it includes all necessary information
for anyone to load it and use it. Dependencies are either stored directly with the model or
referenced via a Conda environment.
The mlflow.pyfunc module also defines utilities for creating custom pyfunc models
using frameworks and inference logic that may not be natively included in MLflow. See
Models From Code for Custom Models.
Inference API
Python function models are loaded as an instance of PyFuncModel, which is an MLflow wrapper around the model implementation and model
metadata (MLmodel file). You can score the model by calling the predict() method, which has the following signature:
predict(
  model_input: [pandas.DataFrame, numpy.ndarray, scipy.sparse.(csc_matrix | csr_matrix),
  List[Any], Dict[str, Any], pyspark.sql.DataFrame]
) -> [numpy.ndarray | pandas.(Series | DataFrame) | List | Dict | pyspark.sql.DataFrame]
All PyFunc models will support pandas.DataFrame as input and PyFunc deep learning models will also support tensor inputs in the form of Dict[str, numpy.ndarray] (named tensors) and numpy.ndarrays (unnamed tensors).
Here are some examples of supported inference types, assuming we have the correct model object
loaded.
| Input Type | Example |
|---|---|
| pandas.DataFrame | import pandas as pd<br>x_new = pd.DataFrame(dict(x1=[1, 2, 3], x2=[4, 5, 6]))<br>model.predict(x_new) |
| numpy.ndarray | import numpy as np<br>x_new = np.array([[1, 4], [2, 5], [3, 6]])<br>model.predict(x_new) |
| scipy.sparse.csc_matrix / csr_matrix | import scipy.sparse<br>x_new = scipy.sparse.csc_matrix([[1, 2, 3], [4, 5, 6]])<br>model.predict(x_new)<br>x_new = scipy.sparse.csr_matrix([[1, 2, 3], [4, 5, 6]])<br>model.predict(x_new) |
| python List[Any] | x_new = [[1, 4], [2, 5], [3, 6]]<br>model.predict(x_new) |
| python Dict[str, Any] | x_new = dict(x1=[1, 2, 3], x2=[4, 5, 6])<br>model.predict(x_new) |
| pyspark.sql.DataFrame | from pyspark.sql import SparkSession<br>spark = SparkSession.builder.getOrCreate()<br>data = [(1, 4), (2, 5), (3, 6)]  # list of (x1, x2) rows<br>x_new = spark.createDataFrame(data, ["x1", "x2"])  # specify column names<br>model.predict(x_new) |
Filesystem format
The Pyfunc format is defined as a directory structure containing all required data, code, and configuration:
./dst-path/
    ./MLmodel: configuration
    <code>: code packaged with the model (specified in the MLmodel file)
    <data>: data packaged with the model (specified in the MLmodel file)
    <env>: Conda environment definition (specified in the MLmodel file)
The directory structure may contain additional contents that can be referenced by the MLmodel
configuration.
MLModel configuration
A Python model contains an MLmodel file in python_function format in its root with the
following parameters:
- loader_module [required]:
- Python module that can load the model. Expected as a module identifier, e.g. mlflow.sklearn; it is imported via importlib.import_module. The imported module must contain a function with the following signature:

  _load_pyfunc(path: string) -> <pyfunc model implementation>

  The path argument is specified by the data parameter and may refer to a file or directory. The model implementation is expected to be an object with a predict method with the following signature:

  predict(
    model_input: [pandas.DataFrame, numpy.ndarray, scipy.sparse.(csc_matrix | csr_matrix),
    List[Any], Dict[str, Any], pyspark.sql.DataFrame]
  ) -> [numpy.ndarray | pandas.(Series | DataFrame) | List | Dict | pyspark.sql.DataFrame]

  A minimal loader module sketch follows this parameter list.
 
- code [optional]:
- Relative path to a directory containing the code packaged with this model. All files and directories inside this directory are added to the Python path prior to importing the model loader. 
 
- data [optional]:
- Relative path to a file or directory containing model data. The path is passed to the model loader. 
 
- env [optional]:
- Relative path to an exported Conda environment. If present this environment should be activated prior to running the model. 
 
- Optionally, any additional parameters necessary for interpreting the serialized model in pyfunc format.
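To make the loader_module contract concrete, here is a minimal, hypothetical loader module. The file name (my_loader.py), the pickled model.pkl data file, and the wrapper class are illustrative assumptions rather than part of the format specification.

# my_loader.py -- hypothetical loader module, shown only as a sketch
import os
import pickle


class _ModelWrapper:
    def __init__(self, model):
        self._model = model

    def predict(self, model_input):
        # Delegate to the deserialized model; the output must follow the Inference API.
        return self._model.predict(model_input)


def _load_pyfunc(path):
    # `path` is whatever the MLmodel `data` entry points to (a file or a directory).
    if os.path.isdir(path):
        path = os.path.join(path, "model.pkl")
    with open(path, "rb") as f:
        return _ModelWrapper(pickle.load(f))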
Example
tree example/sklearn_iris/mlruns/run1/outputs/linear-lr
├── MLmodel
├── code
│   └── sklearn_iris.py
├── data
│   └── model.pkl
└── mlflow_env.yml
cat example/sklearn_iris/mlruns/run1/outputs/linear-lr/MLmodel
python_function:
  code: code
  data: data/model.pkl
  loader_module: mlflow.sklearn
  env: mlflow_env.yml
  main: sklearn_iris
Models From Code for Custom Models
Tip
MLflow 2.12.2 introduced the feature “models from code”, which greatly simplifies the process of serializing and deploying custom models through the use of script serialization. It is strongly recommended to migrate custom model implementations to this new paradigm to avoid the limitations and complexity of serializing with cloudpickle. You can learn more about models from code within the Models From Code Guide.
The section below illustrates the process of using the legacy serializer for custom Pyfunc models. Models from code provides a far simpler experience for logging your models.
Creating custom Pyfunc models
MLflow’s persistence modules provide convenience functions for creating models with the
pyfunc flavor in a variety of machine learning frameworks (scikit-learn, Keras, PyTorch, and
more); however, they do not cover every use case. For example, you may want to create an MLflow
model with the pyfunc flavor using a framework that MLflow does not natively support.
Alternatively, you may want to build an MLflow model that executes custom logic when evaluating
queries, such as preprocessing and postprocessing routines. Therefore, mlflow.pyfunc
provides utilities for creating pyfunc models from arbitrary code and model data.
The save_model() and log_model() methods are designed to support multiple workflows
for creating custom pyfunc models that incorporate custom inference logic and artifacts
that the logic may require.
An artifact is a file or directory, such as a serialized model or a CSV. For example, a serialized TensorFlow graph is an artifact. An MLflow model directory is also an artifact.
Workflows
save_model() and log_model() support the following workflows:
- Programmatically defining a new MLflow model, including its attributes and artifacts.

  Given a set of artifact URIs, save_model() and log_model() can automatically download artifacts from their URIs and create an MLflow model directory.

  In this case, you must define a Python class which inherits from PythonModel, defining predict() and, optionally, load_context(). An instance of this class is specified via the python_model parameter; it is automatically serialized and deserialized as a Python class, including all of its attributes.

- Interpreting pre-existing data as an MLflow model.

  If you already have a directory containing model data, save_model() and log_model() can import the data as an MLflow model. The data_path parameter specifies the local filesystem path to the directory containing model data.

  In this case, you must provide a Python module, called a loader module. The loader module defines a _load_pyfunc() method that performs the following tasks:

  - Load data from the specified data_path. For example, this process may include deserializing pickled Python objects or models or parsing CSV files.
  - Construct and return a pyfunc-compatible model wrapper. As in the first use case, this wrapper must define a predict() method that is used to evaluate queries. predict() must adhere to the Inference API.

  The loader_module parameter specifies the name of your loader module. For an example loader module implementation, refer to the loader module implementation in mlflow.sklearn. A brief sketch of this workflow appears below.
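As a hedged sketch of the second workflow: the directory ./trained_model_data, the module name my_loader, and its _load_pyfunc implementation (for example, the loader sketch shown earlier) are assumptions for illustration only.

import mlflow

# "./trained_model_data" is assumed to already contain the serialized model files.
mlflow.pyfunc.save_model(
    path="my_pyfunc_model",
    data_path="./trained_model_data",
    loader_module="my_loader",
    code_paths=["my_loader.py"],  # ship the loader module alongside the model
)

# The resulting directory loads like any other pyfunc model.
model = mlflow.pyfunc.load_model("my_pyfunc_model")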
Which workflow is right for my use case?
We consider the first workflow to be more user-friendly and generally recommend it for the following reasons:
- It automatically resolves and collects specified model artifacts. 
- It automatically serializes and deserializes the python_model instance and all of its attributes, reducing the amount of user logic that is required to load the model.
- You can create models using logic that is defined in the __main__ scope. This allows custom models to be constructed in interactive environments, such as notebooks and the Python REPL.
You may prefer the second, lower-level workflow for the following reasons:
- Inference logic is always persisted as code, rather than a Python object. This makes logic easier to inspect and modify later. 
- If you have already collected all of your model data in a single location, the second workflow allows it to be saved in MLflow format directly, without enumerating constituent artifacts. 
Function-based Model vs Class-based Model
When creating custom PyFunc models, you can choose between two different interfaces:
a function-based model and a class-based model. In short, a function-based model is simply a
python function that does not take additional params. The class-based model, on the other hand,
is a subclass of PythonModel that supports several required and optional
methods. If your use case is simple and fits within a single predict function, a function-based
approach is recommended. If you need more power, such as custom serialization, custom data
processing, or to override additional methods, you should use the class-based implementation.
Before looking at code examples, it’s important to note that both methods are serialized via cloudpickle. cloudpickle can serialize Python functions, lambda functions, and locally defined classes and functions inside other functions. This makes cloudpickle especially useful for parallel and distributed computing where code objects need to be sent over network to execute on remote workers, which is a common deployment paradigm for MLflow.
That said, cloudpickle has some limitations.
- Environment Dependency: cloudpickle does not capture the full execution environment, so in MLflow we must pass pip_requirements, extra_pip_requirements, or an input_example, the latter of which is used to infer environment dependencies. For more, refer to the model dependency docs.
- Object Support: cloudpickle does not serialize objects outside of the Python data model. Some relevant examples include raw files and database connections. If your program depends on these, be sure to log ways to reference these objects along with your model. A short dependency-declaration sketch follows below.
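Because of the environment-dependency limitation above, it is good practice to declare requirements explicitly, or to supply an input example so MLflow can infer them. A minimal sketch, assuming a pandas-based predict function:

import mlflow
import pandas as pd


def predict(model_input):
    return model_input * 2


with mlflow.start_run():
    mlflow.pyfunc.log_model(
        name="model",
        python_model=predict,
        # Declare dependencies explicitly; alternatively, the input_example
        # below lets MLflow infer the environment.
        pip_requirements=["pandas"],
        input_example=pd.DataFrame({"x": [1.0, 2.0]}),
    )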
Function-based Model
If you’re looking to serialize a simple python function without additional dependent methods, you
can simply log a predict method via the keyword argument python_model.
Note
A function-based model only supports a function with a single input argument. If you would like to pass more arguments or additional inference parameters, please use the class-based model below.
import mlflow
import pandas as pd
# Define a simple function to log
def predict(model_input):
    return model_input.apply(lambda x: x * 2)
# Save the function as a model
with mlflow.start_run():
    mlflow.pyfunc.log_model(
        name="model", python_model=predict, pip_requirements=["pandas"]
    )
    run_id = mlflow.active_run().info.run_id
# Load the model from the tracking server and perform inference
model = mlflow.pyfunc.load_model(f"runs:/{run_id}/model")
x_new = pd.Series([1, 2, 3])
prediction = model.predict(x_new)
print(prediction)
Class-based Model
If you’re looking to serialize a more complex object, for instance a class that handles
preprocessing, complex prediction logic, or custom serialization, you should subclass the
PythonModel class. MLflow has tutorials on building custom PyFunc models, as shown
here,
so instead of duplicating that information, in this example we’ll recreate the above functionality
to highlight the differences. Note that this PythonModel implementation is overly complex, and
we would recommend using the function-based model instead for this simple case.
import mlflow
import pandas as pd
class MyModel(mlflow.pyfunc.PythonModel):
    def predict(self, context, model_input, params=None):
        return [x * 2 for x in model_input]
# Save the function as a model
with mlflow.start_run():
    mlflow.pyfunc.log_model(
        name="model", python_model=MyModel(), pip_requirements=["pandas"]
    )
    run_id = mlflow.active_run().info.run_id
# Load the model from the tracking server and perform inference
model = mlflow.pyfunc.load_model(f"runs:/{run_id}/model")
x_new = pd.Series([1, 2, 3])
print(f"Prediction:
    {model.predict(x_new)}")
The primary difference between this implementation and the function-based implementation above
is that the predict method is wrapped with a class, has the self parameter,
and has the params parameter that defaults to None. Note that function-based models don’t
support additional params.
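To illustrate the params difference, here is a hedged sketch in which a class-based model reads an optional scale value from params at inference time. The ScalingModel class and the scale parameter are illustrative; note that params are only forwarded when the model signature declares a params schema, hence the infer_signature call.

import mlflow
import pandas as pd
from mlflow.models import infer_signature


class ScalingModel(mlflow.pyfunc.PythonModel):
    def predict(self, context, model_input, params=None):
        scale = (params or {}).get("scale", 2)
        return model_input * scale


x = pd.DataFrame({"x": [1, 2, 3]})
# Declaring params in the signature allows them to be passed at inference time.
signature = infer_signature(model_input=x, params={"scale": 2})

with mlflow.start_run():
    mlflow.pyfunc.log_model(
        name="model",
        python_model=ScalingModel(),
        signature=signature,
        pip_requirements=["pandas"],
    )
    run_id = mlflow.active_run().info.run_id

model = mlflow.pyfunc.load_model(f"runs:/{run_id}/model")
print(model.predict(x, params={"scale": 10}))  # values multiplied by 10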
In summary, use the function-based model when you have a simple function to serialize. If you need more power, use the class-based model.
- class mlflow.pyfunc.EnvType[source]
- Bases: object
- class mlflow.pyfunc.PyFuncModel(model_meta: mlflow.models.model.Model, model_impl: Any, predict_fn: str = 'predict', predict_stream_fn: Optional[str] = None, model_id: Optional[str] = None)[source]
- Bases: object

  MLflow "python function" model.

  Wrapper around the model implementation and metadata. This class is not meant to be constructed directly. Instead, instances of this class are constructed and returned from load_model(). model_impl can be any Python object that implements the Pyfunc interface, and is returned by invoking the model's loader_module. model_meta contains model metadata loaded from the MLmodel file.

  - get_raw_model()[source]
- Get the underlying raw model if the model wrapper implements the get_raw_model function.
 - property metadata: mlflow.models.model.Model
- Model metadata. 
- predict(data: Union[pandas.core.frame.DataFrame, pandas.core.series.Series, numpy.ndarray, csc_matrix, csr_matrix, List[Any], Dict[str, Any], datetime.datetime, bool, bytes, float, int, str, pyspark.sql.dataframe.DataFrame], params: Optional[dict[str, typing.Any]] = None) -> Union[pandas.core.frame.DataFrame, pandas.core.series.Series, numpy.ndarray, list, str, pyspark.sql.dataframe.DataFrame][source]
 - predict_stream(data: Union[dict[str, typing.Any], bool, bytes, float, int, str], params: Optional[dict[str, typing.Any]] = None) -> Iterator[Union[dict[str, typing.Any], str]][source]
 - unwrap_python_model()[source]
- Unwrap the underlying Python model object. - This method is useful for accessing custom model functions, while still being able to leverage the MLflow designed workflow through the predict() method. - Returns
- The underlying wrapped model object 
  import mlflow


  # define a custom model
  class MyModel(mlflow.pyfunc.PythonModel):
      def predict(self, context, model_input, params=None):
          return self.my_custom_function(model_input, params)

      def my_custom_function(self, model_input, params=None):
          # do something with the model input
          return 0


  some_input = 1

  # save the model
  with mlflow.start_run():
      model_info = mlflow.pyfunc.log_model(name="model", python_model=MyModel())

  # load the model
  loaded_model = mlflow.pyfunc.load_model(model_uri=model_info.model_uri)
  print(type(loaded_model))  # <class 'mlflow.pyfunc.model.PyFuncModel'>

  unwrapped_model = loaded_model.unwrap_python_model()
  print(type(unwrapped_model))  # <class '__main__.MyModel'>

  # does not work, only predict() is exposed
  # print(loaded_model.my_custom_function(some_input))
  print(unwrapped_model.my_custom_function(some_input))  # works
  print(loaded_model.predict(some_input))  # works
  # works, but None is needed for the context arg
  print(unwrapped_model.predict(None, some_input))
 
- mlflow.pyfunc.add_to_model(model, loader_module, data=None, code=None, conda_env=None, python_env=None, model_config=None, model_code_path=None, **kwargs)[source]
- Add a pyfunc spec to the model configuration.

  Defines the pyfunc configuration schema. Callers can use this to create a valid pyfunc model flavor out of an existing directory structure. For example, other model flavors can use this to specify how to use their output as a pyfunc.

  Note: All paths are relative to the exported model root directory.

  - Parameters
- model – Existing model. 
- loader_module – The module to be used to load the model. 
- data – Path to the model data. 
- code – Path to the code dependencies. 
- conda_env – Conda environment. 
- python_env – Python environment. 
- model_config – - The model configuration to apply to the model. This configuration is available during model loading. - Note - Experimental: This parameter may change or be removed in a future release without warning. 
- model_code_path – Path to the model code. 
- kwargs – Additional key-value pairs to include in the pyfunc flavor specification. Values must be YAML-serializable.
 
- Returns
- Updated model configuration. 
 
- mlflow.pyfunc.build_model_env(model_uri, save_path, env_manager='virtualenv')[source]
- Prebuild model python environment and generate an archive file saved to provided save_path. - Typical usages:
- Pre-build a model’s environment in Databricks Runtime and then download the prebuilt python environment archive file. This pre-built environment archive can then be used in mlflow.pyfunc.spark_udf for remote inference execution when using Databricks Connect to remotely connect to a Databricks environment for code execution. 
 
- Note

  The build_model_env API is intended to work only when executed within the Databricks runtime, serving the purpose of capturing the execution environment that is required for remote code execution when using DBConnect. The environment archive is designed to be used when performing remote execution with mlflow.pyfunc.spark_udf in the Databricks runtime or the Databricks Connect client, and it has no other purpose. The prebuilt env archive file cannot be used across different Databricks runtime versions or different platform machines. As such, if you connect to a different cluster that is running a different runtime version on Databricks, you will need to execute this API in a notebook and retrieve the generated archive to your local machine. Each environment snapshot is unique to the model, the runtime version of your remote Databricks cluster, and the specification of the UDF execution environment. When using the prebuilt env in mlflow.pyfunc.spark_udf, MLflow will verify whether the Spark UDF sandbox environment matches the prebuilt env requirements and will raise exceptions if there are compatibility issues. If these occur, simply re-run this API in the cluster that you are attempting to attach to.

  from mlflow.pyfunc import build_model_env

  # Create a python environment archive file at the path `prebuilt_env_uri`
  prebuilt_env_uri = build_model_env(f"runs:/{run_id}/model", "/path/to/save_directory")

  A combined build_model_env / spark_udf sketch follows the parameter list below.

  - Parameters
- model_uri – URI to the model that is used to build the python environment. 
- save_path – The directory path that is used to save the prebuilt model environment archive file path. The path can be either local directory path or mounted DBFS path such as ‘/dbfs/…’ or mounted UC volume path such as ‘/Volumes/…’. 
- env_manager – The environment manager to use in order to create the python environment for model inference. The value can be either 'virtualenv' or 'uv'; the default is 'virtualenv'.
 
- Returns
- Return the path of an archive file containing the python environment data. 
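As referenced in the note above, here is a rough end-to-end sketch of the intended flow, assuming execution in a Databricks environment with a hypothetical run_id, Spark session, and DataFrame:

import mlflow
from pyspark.sql.functions import struct

# Build the environment archive once, inside the Databricks runtime.
prebuilt_env_uri = mlflow.pyfunc.build_model_env(
    f"runs:/{run_id}/model", "/dbfs/tmp/prebuilt_envs"
)

# Later, reuse the archive for remote inference (e.g. via Databricks Connect).
predict_udf = mlflow.pyfunc.spark_udf(
    spark, f"runs:/{run_id}/model", prebuilt_env_uri=prebuilt_env_uri
)
df.withColumn("prediction", predict_udf(struct("x1", "x2"))).show()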
 
- mlflow.pyfunc.get_model_dependencies(model_uri, format='pip')[source]
- Downloads the model dependencies and returns the path to requirements.txt or conda.yaml file. - Warning - This API downloads all the model artifacts to the local filesystem. This may take a long time for large models. To avoid this overhead, use - mlflow.artifacts.download_artifacts("<model_uri>/requirements.txt")or- mlflow.artifacts.download_artifacts("<model_uri>/conda.yaml")instead.- Parameters
- model_uri – The uri of the model to get dependencies from. 
- format – The format of the returned dependency file. If the - "pip"format is specified, the path to a pip- requirements.txtfile is returned. If the- "conda"format is specified, the path to a- "conda.yaml"file is returned . If the- "pip"format is specified but the model was not saved with a- requirements.txtfile, the- pipsection of the model’s- conda.yamlfile is extracted instead, and any additional conda dependencies are ignored. Default value is- "pip".
 
- Returns
- The local filesystem path to either a pip - requirements.txtfile (if- format="pip") or a- conda.yamlfile (if- format="conda") specifying the model’s dependencies.
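For example, the returned requirements file can be passed directly to pip (the model URI below is a placeholder):

import subprocess
import sys

import mlflow

req_file = mlflow.pyfunc.get_model_dependencies("runs:/<run_id>/model", format="pip")
subprocess.check_call([sys.executable, "-m", "pip", "install", "-r", req_file])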
 
- mlflow.pyfunc.load_model(model_uri: str, suppress_warnings: bool = False, dst_path: Optional[str] = None, model_config: Optional[Union[str, pathlib.Path, dict[str, typing.Any]]] = None) mlflow.pyfunc.PyFuncModel[source]
- Load a model stored in Python function format. - Parameters
- model_uri – - The location, in URI format, of the MLflow model. For example: - /Users/me/path/to/local/model
- relative/path/to/local/model
- s3://my_bucket/path/to/model
- runs:/<mlflow_run_id>/run-relative/path/to/model
- models:/<model_name>/<model_version>
- models:/<model_name>/<stage>
- mlflow-artifacts:/path/to/model
 - For more information about supported URI schemes, see Referencing Artifacts. 
- suppress_warnings – If - True, non-fatal warning messages associated with the model loading process will be suppressed. If- False, these warning messages will be emitted.
- dst_path – The local filesystem path to which to download the model artifact. This directory must already exist. If unspecified, a local output path will be created. 
- model_config – - The model configuration to apply to the model. The configuration will be available as the - model_configproperty of the- contextparameter in- PythonModel.load_context()and- PythonModel.predict(). The configuration can be passed as a file path, or a dict with string keys.- Note - Experimental: This parameter may change or be removed in a future release without warning. 
 
 
- mlflow.pyfunc.load_pyfunc(model_uri, suppress_warnings=False)[source]
- Warning - mlflow.pyfunc.load_pyfuncis deprecated since 1.0. This method will be removed in a future release. Use- mlflow.pyfunc.load_modelinstead.- Load a model stored in Python function format. - Parameters
- model_uri – - The location, in URI format, of the MLflow model. For example: - /Users/me/path/to/local/model
- relative/path/to/local/model
- s3://my_bucket/path/to/model
- runs:/<mlflow_run_id>/run-relative/path/to/model
- models:/<model_name>/<model_version>
- models:/<model_name>/<stage>
- mlflow-artifacts:/path/to/model
 - For more information about supported URI schemes, see Referencing Artifacts. 
- suppress_warnings – If - True, non-fatal warning messages associated with the model loading process will be suppressed. If- False, these warning messages will be emitted.
 
 
- mlflow.pyfunc.log_model(artifact_path=None, loader_module=None, data_path=None, code_paths=None, infer_code_paths=False, conda_env=None, python_model=None, artifacts=None, registered_model_name=None, signature: mlflow.models.signature.ModelSignature = None, input_example: Union[pandas.core.frame.DataFrame, numpy.ndarray, dict, list, csr_matrix, csc_matrix, str, bytes, tuple] = None, await_registration_for=300, pip_requirements=None, extra_pip_requirements=None, metadata=None, model_config=None, streamable=None, resources: Optional[Union[str, list[mlflow.models.resources.Resource]]] = None, auth_policy: Optional[mlflow.models.auth_policy.AuthPolicy] = None, prompts: Optional[list[typing.Union[str, Prompt]]] = None, name=None, params: Optional[dict[str, typing.Any]] = None, tags: Optional[dict[str, typing.Any]] = None, model_type: Optional[str] = None, step: int = 0, model_id: Optional[str] = None)[source]
- Log a Pyfunc model with custom inference logic and optional data dependencies as an MLflow artifact for the current run.

  For information about the workflows that this method supports, see Workflows for creating custom pyfunc models and Which workflow is right for my use case?. You cannot specify the parameters for the second workflow (loader_module, data_path) together with the parameters for the first workflow (python_model, artifacts).

  - Parameters
- artifact_path – Deprecated. Use name instead. 
- loader_module – - The name of the Python module that is used to load the model from - data_path. This module must define a method with the prototype- _load_pyfunc(data_path). If not- None, this module and its dependencies must be included in one of the following locations:- The MLflow library. 
- Package(s) listed in the model’s Conda environment, specified by the - conda_envparameter.
- One or more of the files specified by the - code_pathsparameter.
 
- data_path – Path to a file or directory containing model data. 
- code_paths – - A list of local filesystem paths to Python file dependencies (or directories containing file dependencies). These files are prepended to the system path when the model is loaded. Files declared as dependencies for a given model should have relative imports declared from a common root path if multiple files are defined with import dependencies between them to avoid import errors when loading the model. - You can leave - code_pathsargument unset but set- infer_code_pathsto- Trueto let MLflow infer the model code paths. See- infer_code_pathsargument doc for details.- For a detailed explanation of - code_pathsfunctionality, recommended usage patterns and limitations, see the code_paths usage guide.
- infer_code_paths – If set to True, MLflow automatically infers model code paths. The inferred code path files only include necessary Python module files. Only Python code files under the current working directory are automatically inferable. Default value is False.
 - Warning - Please ensure that the custom python module code does not contain sensitive data such as credential token strings, otherwise they might be included in the automatic inferred code path files and be logged to MLflow artifact repository. - If your custom python module depends on non-python files (e.g. a JSON file) with a relative path to the module code file path, the non-python files can’t be automatically inferred as the code path file. To address this issue, you should put all used non-python files outside your custom code directory. - If a python code file is loaded as the python - __main__module, then this code file can’t be inferred as the code path file. If your model depends on classes / functions defined in- __main__module, you should use cloudpickle to dump your model instance in order to pickle classes / functions in- __main__.- Note - Experimental: This parameter may change or be removed in a future release without warning. 
- conda_env – - Either a dictionary representation of a Conda environment or the path to a conda environment yaml file. If provided, this describes the environment this model should be run in. At a minimum, it should specify the dependencies contained in get_default_conda_env(). If - None, a conda environment with pip requirements inferred by- mlflow.models.infer_pip_requirements()is added to the model. If the requirement inference fails, it falls back to using get_default_pip_requirements. pip requirements from- conda_envare written to a pip- requirements.txtfile and the full conda environment is written to- conda.yaml. The following is an example dictionary representation of a conda environment:- { "name": "mlflow-env", "channels": ["conda-forge"], "dependencies": [ "python=3.8.15", { "pip": [ "scikit-learn==x.y.z" ], }, ], } 
- python_model – - An instance of a subclass of - PythonModelor a callable object with a single argument (see the examples below). The passed-in object is serialized using the CloudPickle library. The python_model can also be a file path to the PythonModel which defines the model from code artifact rather than serializing the model object. Any dependencies of the class should be included in one of the following locations:- The MLflow library. 
- Package(s) listed in the model’s Conda environment, specified by the - conda_envparameter.
- One or more of the files specified by the - code_pathsparameter.
 - Note: If the class is imported from another module, as opposed to being defined in the - __main__scope, the defining module should also be included in one of the listed locations.- Examples - Class model - from typing import List import mlflow class MyModel(mlflow.pyfunc.PythonModel): def predict(self, context, model_input: List[str], params=None) -> List[str]: return [i.upper() for i in model_input] with mlflow.start_run(): model_info = mlflow.pyfunc.log_model( name="model", python_model=MyModel(), ) loaded_model = mlflow.pyfunc.load_model(model_uri=model_info.model_uri) print(loaded_model.predict(["a", "b", "c"])) # -> ["A", "B", "C"] - Functional model - Note - Experimental: Functional model support is experimental and may change or be removed in a future release without warning. - from typing import List import mlflow def predict(model_input: List[str]) -> List[str]: return [i.upper() for i in model_input] with mlflow.start_run(): model_info = mlflow.pyfunc.log_model( name="model", python_model=predict, input_example=["a"] ) loaded_model = mlflow.pyfunc.load_model(model_uri=model_info.model_uri) print(loaded_model.predict(["a", "b", "c"])) # -> ["A", "B", "C"] - Model from code - Note - Experimental: Model from code model support is experimental and may change or be removed in a future release without warning. - # code.py from typing import List import mlflow class MyModel(mlflow.pyfunc.PythonModel): def predict(self, context, model_input: List[str], params=None) -> List[str]: return [i.upper() for i in model_input] mlflow.models.set_model(MyModel()) # log_model.py import mlflow with mlflow.start_run(): model_info = mlflow.pyfunc.log_model( name="model", python_model="code.py", ) - If the predict method or function has type annotations, MLflow automatically constructs a model signature based on the type annotations (unless the - signatureargument is explicitly specified), and converts the input value to the specified type before passing it to the function. Currently, the following type annotations are supported:- List[str]
- List[Dict[str, str]]
 
- artifacts – - A dictionary containing - <name, artifact_uri>entries. Remote artifact URIs are resolved to absolute filesystem paths, producing a dictionary of- <name, absolute_path>entries.- python_modelcan reference these resolved entries as the- artifactsproperty of the- contextparameter in- PythonModel.load_context()and- PythonModel.predict(). For example, consider the following- artifactsdictionary:- {"my_file": "s3://my-bucket/path/to/my/file"} - In this case, the - "my_file"artifact is downloaded from S3. The- python_modelcan then refer to- "my_file"as an absolute filesystem path via- context.artifacts["my_file"].- If - None, no artifacts are added to the model.
- registered_model_name – If given, create a model version under - registered_model_name, also creating a registered model if one with the given name does not exist.
- signature – - ModelSignaturedescribes model input and output- Schema. The model signature can be- inferredfrom datasets with valid model input (e.g. the training dataset with target column omitted) and valid model output (e.g. model predictions generated on the training dataset), for example:- from mlflow.models import infer_signature train = df.drop_column("target_label") predictions = ... # compute model predictions signature = infer_signature(train, predictions) 
- input_example – one or several instances of valid model input. The input example is used as a hint of what data to feed the model. It will be converted to a Pandas DataFrame and then serialized to json using the Pandas split-oriented format, or a numpy array where the example will be serialized to json by converting it to a list. Bytes are base64-encoded. When the - signatureparameter is- None, the input example is used to infer a model signature.
- await_registration_for – Number of seconds to wait for the model version to finish being created and is in - READYstatus. By default, the function waits for five minutes. Specify 0 or None to skip waiting.
- pip_requirements – Either an iterable of pip requirement strings (e.g. - ["scikit-learn", "-r requirements.txt", "-c constraints.txt"]) or the string path to a pip requirements file on the local filesystem (e.g.- "requirements.txt"). If provided, this describes the environment this model should be run in. If- None, a default list of requirements is inferred by- mlflow.models.infer_pip_requirements()from the current software environment. If the requirement inference fails, it falls back to using get_default_pip_requirements. Both requirements and constraints are automatically parsed and written to- requirements.txtand- constraints.txtfiles, respectively, and stored as part of the model. Requirements are also written to the- pipsection of the model’s conda environment (- conda.yaml) file.
- extra_pip_requirements – - Either an iterable of pip requirement strings (e.g. - ["pandas", "-r requirements.txt", "-c constraints.txt"]) or the string path to a pip requirements file on the local filesystem (e.g.- "requirements.txt"). If provided, this describes additional pip requirements that are appended to a default set of pip requirements generated automatically based on the user’s current software environment. Both requirements and constraints are automatically parsed and written to- requirements.txtand- constraints.txtfiles, respectively, and stored as part of the model. Requirements are also written to the- pipsection of the model’s conda environment (- conda.yaml) file.- Warning - The following arguments can’t be specified at the same time: - conda_env
- pip_requirements
- extra_pip_requirements
 - This example demonstrates how to specify pip requirements using - pip_requirementsand- extra_pip_requirements.
- metadata – Custom metadata dictionary passed to the model and stored in the MLmodel file. 
- model_config – - The model configuration to apply to the model. The configuration will be available as the - model_configproperty of the- contextparameter in- PythonModel.load_context()and- PythonModel.predict(). The configuration can be passed as a file path, or a dict with string keys.- Note - Experimental: This parameter may change or be removed in a future release without warning. 
- streamable – A boolean value indicating whether the model supports streaming prediction. If None, MLflow will try to inspect whether the model supports streaming by checking if a predict_stream method exists. Default None.
- resources – A list of model resources or a resources.yaml file containing a list of resources required to serve the model.
 - Note - Experimental: This parameter may change or be removed in a future release without warning. 
- auth_policy – Specifies the authentication policy for the model, which includes two key components. Note that only one of auth_policy or resources should be defined.
  - System Auth Policy: A list of resources required to serve this model.
  - User Auth Policy: A minimal list of scopes that the user should have access to in order to invoke this model.
 
 
 - Note - Experimental: This parameter may change or be removed in a future release without warning. 
- prompts – A list of prompt URIs registered in the MLflow Prompt Registry, to be associated with the model. Each prompt URI should be in the form prompt:/<name>/<version>. The prompts should be registered in the MLflow Prompt Registry before being associated with the model.

  This will create a mutual link between the model and the prompt. The associated prompts can be seen in the model's metadata stored in the MLmodel file. From the Prompt Registry UI, you can navigate to the model as well.

  import mlflow

  prompt_template = "Hi, {name}! How are you doing today?"

  # Register a prompt in the MLflow Prompt Registry
  mlflow.prompts.register_prompt("my_prompt", prompt_template, description="A simple prompt")

  # Log a model with the registered prompt
  with mlflow.start_run():
      model_info = mlflow.pyfunc.log_model(
          python_model=MyModel(), name="model", prompts=["prompt:/my_prompt/1"]
      )
  print(model_info.prompts)  # Output: ['prompt:/my_prompt/1']

  # Load the prompt
  prompt = mlflow.genai.load_prompt(model_info.prompts[0])
- name – Model name. 
- params – A dictionary of parameters to log with the model. 
- tags – A dictionary of tags to log with the model. 
- model_type – The type of the model. 
- step – The step at which to log the model outputs and metrics 
- model_id – The ID of the model. 
 
- Returns
- A - ModelInfoinstance that contains the metadata of the logged model.
 
- mlflow.pyfunc.save_model(path, loader_module=None, data_path=None, code_paths=None, infer_code_paths=False, conda_env=None, mlflow_model=None, python_model=None, artifacts=None, signature: mlflow.models.signature.ModelSignature = None, input_example: Union[pandas.core.frame.DataFrame, numpy.ndarray, dict, list, csr_matrix, csc_matrix, str, bytes, tuple] = None, pip_requirements=None, extra_pip_requirements=None, metadata=None, model_config=None, streamable=None, resources: Optional[Union[str, list[mlflow.models.resources.Resource]]] = None, auth_policy: Optional[mlflow.models.auth_policy.AuthPolicy] = None, **kwargs)[source]
- Save a Pyfunc model with custom inference logic and optional data dependencies to a path on the local filesystem. - For information about the workflows that this method supports, please see “workflows for creating custom pyfunc models” and “which workflow is right for my use case?”. Note that the parameters for the second workflow: - loader_module,- data_pathand the parameters for the first workflow:- python_model,- artifacts, cannot be specified together.- Parameters
- path – The path to which to save the Python model. 
- loader_module – - The name of the Python module that is used to load the model from - data_path. This module must define a method with the prototype- _load_pyfunc(data_path). If not- None, this module and its dependencies must be included in one of the following locations:- The MLflow library. 
- Package(s) listed in the model’s Conda environment, specified by the - conda_envparameter.
- One or more of the files specified by the - code_pathsparameter.
 
- data_path – Path to a file or directory containing model data. 
- code_paths – - A list of local filesystem paths to Python file dependencies (or directories containing file dependencies). These files are prepended to the system path when the model is loaded. Files declared as dependencies for a given model should have relative imports declared from a common root path if multiple files are defined with import dependencies between them to avoid import errors when loading the model. - You can leave - code_pathsargument unset but set- infer_code_pathsto- Trueto let MLflow infer the model code paths. See- infer_code_pathsargument doc for details.- For a detailed explanation of - code_pathsfunctionality, recommended usage patterns and limitations, see the code_paths usage guide.
- infer_code_paths – If set to True, MLflow automatically infers model code paths. The inferred code path files only include necessary Python module files. Only Python code files under the current working directory are automatically inferable. Default value is False.
 - Warning - Please ensure that the custom python module code does not contain sensitive data such as credential token strings, otherwise they might be included in the automatic inferred code path files and be logged to MLflow artifact repository. - If your custom python module depends on non-python files (e.g. a JSON file) with a relative path to the module code file path, the non-python files can’t be automatically inferred as the code path file. To address this issue, you should put all used non-python files outside your custom code directory. - If a python code file is loaded as the python - __main__module, then this code file can’t be inferred as the code path file. If your model depends on classes / functions defined in- __main__module, you should use cloudpickle to dump your model instance in order to pickle classes / functions in- __main__.- Note - Experimental: This parameter may change or be removed in a future release without warning. 
- conda_env – - Either a dictionary representation of a Conda environment or the path to a conda environment yaml file. If provided, this describes the environment this model should be run in. At a minimum, it should specify the dependencies contained in get_default_conda_env(). If - None, a conda environment with pip requirements inferred by- mlflow.models.infer_pip_requirements()is added to the model. If the requirement inference fails, it falls back to using get_default_pip_requirements. pip requirements from- conda_envare written to a pip- requirements.txtfile and the full conda environment is written to- conda.yaml. The following is an example dictionary representation of a conda environment:- { "name": "mlflow-env", "channels": ["conda-forge"], "dependencies": [ "python=3.8.15", { "pip": [ "scikit-learn==x.y.z" ], }, ], } 
- mlflow_model – - mlflow.models.Modelconfiguration to which to add the python_function flavor.
- python_model – - An instance of a subclass of - PythonModelor a callable object with a single argument (see the examples below). The passed-in object is serialized using the CloudPickle library. The python_model can also be a file path to the PythonModel which defines the model from code artifact rather than serializing the model object. Any dependencies of the class should be included in one of the following locations:- The MLflow library. 
- Package(s) listed in the model’s Conda environment, specified by the - conda_envparameter.
- One or more of the files specified by the - code_pathsparameter.
 - Note: If the class is imported from another module, as opposed to being defined in the - __main__scope, the defining module should also be included in one of the listed locations.- Examples - Class model - from typing import List, Dict import mlflow class MyModel(mlflow.pyfunc.PythonModel): def predict(self, context, model_input: List[str], params=None) -> List[str]: return [i.upper() for i in model_input] mlflow.pyfunc.save_model("model", python_model=MyModel(), input_example=["a"]) model = mlflow.pyfunc.load_model("model") print(model.predict(["a", "b", "c"])) # -> ["A", "B", "C"] - Functional model - Note - Experimental: Functional model support is experimental and may change or be removed in a future release without warning. - from typing import List import mlflow def predict(model_input: List[str]) -> List[str]: return [i.upper() for i in model_input] mlflow.pyfunc.save_model("model", python_model=predict, input_example=["a"]) model = mlflow.pyfunc.load_model("model") print(model.predict(["a", "b", "c"])) # -> ["A", "B", "C"] - Model from code - Note - Experimental: Model from code model support is experimental and may change or be removed in a future release without warning. - # code.py from typing import List import mlflow class MyModel(mlflow.pyfunc.PythonModel): def predict(self, context, model_input: List[str], params=None) -> List[str]: return [i.upper() for i in model_input] mlflow.models.set_model(MyModel()) # log_model.py import mlflow with mlflow.start_run(): model_info = mlflow.pyfunc.log_model( name="model", python_model="code.py", ) - If the predict method or function has type annotations, MLflow automatically constructs a model signature based on the type annotations (unless the - signatureargument is explicitly specified), and converts the input value to the specified type before passing it to the function. Currently, the following type annotations are supported:- List[str]
- List[Dict[str, str]]
 
- artifacts – - A dictionary containing - <name, artifact_uri>entries. Remote artifact URIs are resolved to absolute filesystem paths, producing a dictionary of- <name, absolute_path>entries.- python_modelcan reference these resolved entries as the- artifactsproperty of the- contextparameter in- PythonModel.load_context()and- PythonModel.predict(). For example, consider the following- artifactsdictionary:- {"my_file": "s3://my-bucket/path/to/my/file"} - In this case, the - "my_file"artifact is downloaded from S3. The- python_modelcan then refer to- "my_file"as an absolute filesystem path via- context.artifacts["my_file"].- If - None, no artifacts are added to the model.
- signature – - ModelSignaturedescribes model input and output- Schema. The model signature can be- inferredfrom datasets with valid model input (e.g. the training dataset with target column omitted) and valid model output (e.g. model predictions generated on the training dataset), for example:- from mlflow.models import infer_signature train = df.drop_column("target_label") predictions = ... # compute model predictions signature = infer_signature(train, predictions) 
- input_example – one or several instances of valid model input. The input example is used as a hint of what data to feed the model. It will be converted to a Pandas DataFrame and then serialized to json using the Pandas split-oriented format, or a numpy array where the example will be serialized to json by converting it to a list. Bytes are base64-encoded. When the - signatureparameter is- None, the input example is used to infer a model signature.
- pip_requirements – Either an iterable of pip requirement strings (e.g. - ["scikit-learn", "-r requirements.txt", "-c constraints.txt"]) or the string path to a pip requirements file on the local filesystem (e.g.- "requirements.txt"). If provided, this describes the environment this model should be run in. If- None, a default list of requirements is inferred by- mlflow.models.infer_pip_requirements()from the current software environment. If the requirement inference fails, it falls back to using get_default_pip_requirements. Both requirements and constraints are automatically parsed and written to- requirements.txtand- constraints.txtfiles, respectively, and stored as part of the model. Requirements are also written to the- pipsection of the model’s conda environment (- conda.yaml) file.
- extra_pip_requirements – - Either an iterable of pip requirement strings (e.g. - ["pandas", "-r requirements.txt", "-c constraints.txt"]) or the string path to a pip requirements file on the local filesystem (e.g.- "requirements.txt"). If provided, this describes additional pip requirements that are appended to a default set of pip requirements generated automatically based on the user’s current software environment. Both requirements and constraints are automatically parsed and written to- requirements.txtand- constraints.txtfiles, respectively, and stored as part of the model. Requirements are also written to the- pipsection of the model’s conda environment (- conda.yaml) file.- Warning - The following arguments can’t be specified at the same time: - conda_env
- pip_requirements
- extra_pip_requirements
 - This example demonstrates how to specify pip requirements using - pip_requirementsand- extra_pip_requirements.
- metadata – Custom metadata dictionary passed to the model and stored in the MLmodel file. 
- model_config – - The model configuration to apply to the model. The configuration will be available as the - model_configproperty of the- contextparameter in- PythonModel.load_context()and- PythonModel.predict(). The configuration can be passed as a file path, or a dict with string keys.- Note - Experimental: This parameter may change or be removed in a future release without warning. 
- streamable – A boolean value indicating whether the model supports streaming prediction. If None, MLflow will try to inspect whether the model supports streaming by checking if a predict_stream method exists. Default None.
- resources – A list of model resources or a resources.yaml file containing a list of resources required to serve the model.
 - Note - Experimental: This parameter may change or be removed in a future release without warning. 
- auth_policy – Specifies the authentication policy for the model, which includes two key components. Note that only one of auth_policy or resources should be defined.
  - System Auth Policy: A list of resources required to serve this model.
  - User Auth Policy: A minimal list of scopes that the user should have access to in order to invoke this model.
 
 
 - Note - Experimental: This parameter may change or be removed in a future release without warning. 
- kwargs – Extra keyword arguments. 
 
 
- mlflow.pyfunc.spark_udf(spark, model_uri, result_type=None, env_manager=None, params: Optional[dict[str, typing.Any]] = None, extra_env: Optional[dict[str, str]] = None, prebuilt_env_uri: Optional[str] = None, model_config: Optional[Union[str, pathlib.Path, dict[str, typing.Any]]] = None)[source]
- A Spark UDF that can be used to invoke the Python function formatted model.

  Parameters passed to the UDF are forwarded to the model as a DataFrame where the column names are ordinals (0, 1, …). On some versions of Spark (3.0 and above), it is also possible to wrap the input in a struct. In that case, the data will be passed as a DataFrame with column names given by the struct definition (e.g. when invoked as my_udf(struct('x', 'y')), the model will get the data as a pandas DataFrame with two columns 'x' and 'y').

  If a model contains a signature with tensor spec inputs, you will need to pass a column of array type as the corresponding UDF argument; the column values must be one-dimensional arrays. The UDF will reshape the column values to the required shape with 'C' order (i.e. read/write the elements using C-like index order) and cast the values to the required tensor spec type.

  If a model contains a signature, the UDF can be called without specifying column name arguments. In this case, the UDF will be called with column names from the signature, so the evaluation dataframe's column names must match the model signature's column names.

  The predictions are filtered to contain only the columns that can be represented as the result_type. If the result_type is string or array of strings, all predictions are converted to string. If the result type is not an array type, the leftmost column with matching type is returned.

  Note

  Inputs of type pyspark.sql.types.DateType are not supported on earlier versions of Spark (2.4 and below).

  Note

  When using Databricks Connect to connect to a remote Databricks cluster, the Databricks cluster must use runtime version >= 15.4. If the 'prebuilt_env_uri' parameter is set, the 'env_manager' parameter should not be set. If the runtime version is 15.4 and the cluster uses standard access mode, the cluster needs to set "spark.databricks.safespark.archive.artifact.unpack.disabled" to "false".

  Note

  Please be aware that when operating in Databricks Serverless, Spark tasks run within the confines of the Databricks Serverless UDF sandbox. This environment has a total capacity limit of 1GB, combining both available memory and local disk capacity. Furthermore, there are no GPU devices available in this setup. Therefore, any deep-learning models that contain large weights or require a GPU are not suitable for deployment on Databricks Serverless.

  from pyspark.sql.functions import struct

  predict = mlflow.pyfunc.spark_udf(spark, "/my/local/model")
  df.withColumn("prediction", predict(struct("name", "age"))).show()

  - Parameters
- spark – A SparkSession object. 
- model_uri – - The location, in URI format, of the MLflow model with the - mlflow.pyfuncflavor. For example:- /Users/me/path/to/local/model
- relative/path/to/local/model
- s3://my_bucket/path/to/model
- runs:/<mlflow_run_id>/run-relative/path/to/model
- models:/<model_name>/<model_version>
- models:/<model_name>/<stage>
- mlflow-artifacts:/path/to/model
 - For more information about supported URI schemes, see Referencing Artifacts. 
- result_type – The return type of the user-defined function. The value can be either a pyspark.sql.types.DataType object or a DDL-formatted type string. Only a primitive type, an array of primitive type (pyspark.sql.types.ArrayType), or a struct type containing fields of the above two kinds of types is allowed. If unspecified, MLflow tries to infer the result type from the model signature's output schema; if the output schema is not available, it falls back to the double type.

  The following classes of result type are supported:

  - "int" or pyspark.sql.types.IntegerType: The leftmost integer that can fit in an int32, or an exception if there is none.
  - "long" or pyspark.sql.types.LongType: The leftmost long integer that can fit in an int64, or an exception if there is none.
  - ArrayType(IntegerType|LongType): All integer columns that can fit into the requested size.
  - "float" or pyspark.sql.types.FloatType: The leftmost numeric result cast to float32, or an exception if there is none.
  - "double" or pyspark.sql.types.DoubleType: The leftmost numeric result cast to double, or an exception if there is none.
  - ArrayType(FloatType|DoubleType): All numeric columns cast to the requested type, or an exception if there are no numeric columns.
  - "string" or pyspark.sql.types.StringType: The leftmost column converted to string.
  - "boolean" or "bool" or pyspark.sql.types.BooleanType: The leftmost column converted to bool, or an exception if there is none.
  - ArrayType(StringType): All columns converted to string.
  - "field1 FIELD1_TYPE, field2 FIELD2_TYPE, …": A struct type containing multiple fields separated by commas; each field type must be one of the types listed above.
 
- env_manager – The environment manager to use in order to create the Python environment for model inference. Note that the environment is only restored in the context of the PySpark UDF; the software environment outside of the UDF is unaffected. If the prebuilt_env_uri parameter is not set, the default value is local, and the following values are supported:
  - virtualenv: Use virtualenv to restore the Python environment that was used to train the model.
  - uv: Use uv to restore the Python environment that was used to train the model.
  - conda: Use Conda to restore the software environment that was used to train the model.
  - local: Use the current Python environment for model inference, which may differ from the environment used to train the model and may lead to errors or invalid predictions.
 - If the prebuilt_env_uri parameter is set, env_manager parameter should not be set. 
- params – Additional parameters to pass to the model for inference. 
- extra_env – Extra environment variables to pass to the UDF executors, for overrides that need to propagate to the Spark workers (e.g., overriding the scoring server timeout via MLFLOW_SCORING_SERVER_REQUEST_TIMEOUT).
- prebuilt_env_uri – The path of the prebuilt env archive file created by the mlflow.pyfunc.build_model_env API. This parameter can only be used in a Databricks Serverless notebook REPL, a Databricks Shared cluster notebook REPL, or a Databricks Connect client environment. The path can be either a local file path or a DBFS path such as 'dbfs:/Volumes/…'; in the latter case, MLflow automatically downloads it to a local temporary directory, and the "MLFLOW_MODEL_ENV_DOWNLOADING_TEMP_DIR" environment variable can be set to specify the temporary directory to use.

  If this parameter is set, the env_manager parameter must not be set.
- model_config – The model configuration to set when loading the model. See ‘model_config’ argument in mlflow.pyfunc.load_model API for details. 
 
- Returns
- Spark UDF that applies the model’s - predictmethod to the data and returns a type specified by- result_type, which by default is a double.
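A further hedged sketch, showing an explicit result type and environment restoration with virtualenv; the model URI, Spark session, DataFrame, and column names are placeholders:

from pyspark.sql.functions import struct

import mlflow

predict = mlflow.pyfunc.spark_udf(
    spark,
    "models:/my_model/1",
    result_type="double",
    env_manager="virtualenv",
)
df.withColumn("prediction", predict(struct("feature1", "feature2"))).show()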
 
- mlflow.pyfunc.update_signature_for_type_hint_from_example(input_example: Any, signature: mlflow.models.signature.ModelSignature)[source]
- mlflow.pyfunc.get_default_pip_requirements()[source]
- Returns
- A list of default pip requirements for MLflow Models produced by this flavor. Calls to - save_model()and- log_model()produce a pip environment that, at minimum, contains these requirements.
 
- mlflow.pyfunc.get_default_conda_env()[source]
- Returns
- The default Conda environment for MLflow Models produced by calls to - save_model()and- log_model()when a user-defined subclass of- PythonModelis provided.
 
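- These defaults can be inspected directly, which can be useful when auditing the environment that save_model() or log_model() will record. The printed values are indicative only and depend on the installed MLflow version.

    import mlflow.pyfunc

    # Indicative only; exact pins depend on the installed MLflow/cloudpickle versions.
    print(mlflow.pyfunc.get_default_pip_requirements())
    print(mlflow.pyfunc.get_default_conda_env())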
- class mlflow.pyfunc.PythonModelContext[source]
- A collection of artifacts that a PythonModel can use when performing inference. PythonModelContext objects are created implicitly by the save_model() and log_model() persistence methods, using the contents specified by the artifacts parameter of these methods.
- class mlflow.pyfunc.PythonModel[source]
- Represents a generic Python model that evaluates inputs and produces API-compatible outputs. By subclassing PythonModel, users can create customized MLflow models with the "python_function" ("pyfunc") flavor, leveraging custom inference logic and artifact dependencies.
- load_context(context)[source]
- Loads artifacts from the specified PythonModelContext that can be used by predict() when evaluating inputs. When loading an MLflow model with load_model(), this method is called as soon as the PythonModel is constructed. The same PythonModelContext will also be available during calls to predict(), but it may be more efficient to override this method and load artifacts from the context at model load time. - Parameters
- context – A PythonModelContext instance containing artifacts that the model can use to perform inference.
 
 - abstract predict(context, model_input, params: Optional[dict[str, typing.Any]] = None)[source]
- Evaluates a pyfunc-compatible input and produces a pyfunc-compatible output. For more information about the pyfunc input/output API, see the Inference API. - Parameters
- context – A PythonModelContext instance containing artifacts that the model can use to perform inference.
- model_input – A pyfunc-compatible input for the model to evaluate. 
- params – Additional parameters to pass to the model for inference. 
 
- Tip - Since MLflow 2.20.0, the context parameter can be omitted from the predict function signature if it is not used; def predict(self, model_input, params=None) is valid.
 - predict_stream(context, model_input, params: Optional[dict[str, typing.Any]] = None)[source]
- Evaluates a pyfunc-compatible input and produces an iterator of outputs. For more information about the pyfunc input API, see the Inference API. - Parameters
- context – A PythonModelContext instance containing artifacts that the model can use to perform inference.
- model_input – A pyfunc-compatible input for the model to evaluate. 
- params – Additional parameters to pass to the model for inference. 
 
- Tip - Since MLflow 2.20.0, the context parameter can be omitted from the predict_stream function signature if it is not used; def predict_stream(self, model_input, params=None) is valid.
 
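- To tie load_context and predict together, here is a minimal sketch of a custom PythonModel that loads a pickled estimator from its artifacts; the artifact key "model_file" and the path "my_model.pkl" are hypothetical.

    import pickle

    import mlflow
    import mlflow.pyfunc


    class WrappedModel(mlflow.pyfunc.PythonModel):
        def load_context(self, context):
            # context.artifacts maps the keys passed via the `artifacts` parameter
            # of save_model()/log_model() to local file paths.
            with open(context.artifacts["model_file"], "rb") as f:
                self._model = pickle.load(f)

        def predict(self, context, model_input, params=None):
            return self._model.predict(model_input)


    # Hypothetical logging call; "my_model.pkl" must exist locally.
    with mlflow.start_run():
        mlflow.pyfunc.log_model(
            name="wrapped_model",
            python_model=WrappedModel(),
            artifacts={"model_file": "my_model.pkl"},
        )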
- class mlflow.pyfunc.ChatModel(*args, **kwargs)[source]
- Warning - mlflow.pyfunc.model.ChatModel is deprecated since 3.0.0. This class will be removed in a future release. Use ResponsesAgent instead.
- Tip - Since MLflow 3.0.0, we recommend using ResponsesAgent instead of ChatModel unless you need strict compatibility with the OpenAI ChatCompletion API.
- A subclass of PythonModel that makes it more convenient to implement models that are compatible with popular LLM chat APIs. By subclassing ChatModel, users can create MLflow models with a predict() method that is more convenient for chat tasks than the generic PythonModel API. ChatModels automatically define input/output signatures and an input example, so manually specifying these values when calling mlflow.pyfunc.save_model() is not necessary.
- See the documentation of the predict() method below for details on the parameters and outputs that are expected by the ChatModel API.

|  | ChatModel | PythonModel |
|---|---|---|
| When to use | Use when you want to develop and deploy a conversational model with a standard chat schema compatible with the OpenAI spec. | Use when you want full control over the model's interface or to customize every aspect of your model's behavior. |
| Interface | Fixed to OpenAI's chat schema. | Full control over the model's input and output schema. |
| Setup | Quick. Works out of the box for conversational applications, with a pre-defined model signature and input example. | Custom. You need to define the model signature or input example yourself. |
| Complexity | Low. The standardized interface simplifies model deployment and integration. | High. Deploying and integrating the custom PythonModel may not be straightforward. E.g., the model needs to handle Pandas DataFrames, as MLflow converts input data to DataFrames before passing it to PythonModel. |
- abstract predict(context, messages: list[mlflow.types.llm.ChatMessage], params: mlflow.types.llm.ChatParams) → mlflow.types.llm.ChatCompletionResponse[source]
- Evaluates a chat input and produces a chat output. - Parameters
- context – A PythonModelContext instance containing artifacts that the model can use to perform inference.
- messages (List[ChatMessage]) – A list of ChatMessage objects representing the chat history.
- params (ChatParams) – A ChatParams object containing various parameters used to modify model behavior during inference.

- Tip - Since MLflow 2.20.0, the context parameter can be omitted from the predict function signature if it is not used; def predict(self, messages: list[ChatMessage], params: ChatParams) is valid. - Returns
- A ChatCompletionResponse object containing the model’s response(s), as well as other metadata.
 
- predict_stream(context, messages: list[mlflow.types.llm.ChatMessage], params: mlflow.types.llm.ChatParams) → Generator[mlflow.types.llm.ChatCompletionChunk, None, None][source]
- Evaluates a chat input and produces a chat output. Override this function to implement true streaming prediction. - Parameters
- context – A PythonModelContext instance containing artifacts that the model can use to perform inference.
- messages (List[ChatMessage]) – A list of ChatMessage objects representing the chat history.
- params (ChatParams) – A ChatParams object containing various parameters used to modify model behavior during inference.

- Tip - Since MLflow 2.20.0, the context parameter can be omitted from the predict_stream function signature if it is not used; def predict_stream(self, messages: list[ChatMessage], params: ChatParams) is valid. - Returns
- A generator over ChatCompletionChunk objects containing the model’s response(s), as well as other metadata.
 
 
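- Although ChatModel is deprecated in favor of ResponsesAgent, a minimal sketch can still clarify the predict signature documented above. This toy model simply echoes the last user message; it follows the construction style of the Ollama example later in this section and omits the unused context parameter (valid since MLflow 2.20.0).

    from mlflow.pyfunc import ChatModel
    from mlflow.types.llm import ChatCompletionResponse, ChatMessage, ChatParams


    class EchoChatModel(ChatModel):
        """Toy ChatModel that echoes the last user message back (illustrative only)."""

        def predict(
            self, messages: list[ChatMessage], params: ChatParams = None
        ) -> ChatCompletionResponse:
            last = messages[-1].content if messages else ""
            return ChatCompletionResponse(
                choices=[
                    {"index": 0, "message": {"role": "assistant", "content": f"You said: {last}"}}
                ],
                model="echo-model",
            )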
- class mlflow.pyfunc.ChatAgent[source]
- Tip - Since MLflow 3.0.0, we recommend using ResponsesAgent instead of ChatAgent.
- What is the ChatAgent Interface?
- The ChatAgent interface is a chat schema specification designed for authoring conversational agents. ChatAgent allows your agent to do the following:
- Return multiple messages
- Return intermediate steps for tool-calling agents
- Confirm tool calls
- Support multi-agent scenarios
- ChatAgent should always be used when authoring an agent. We also recommend using ChatAgent instead of ChatModel even for use cases like simple chat models (e.g. prompt-engineered LLMs), to give you the flexibility to support more agentic functionality in the future.
- The ChatAgentRequest schema is similar to, but not strictly compatible with, the OpenAI ChatCompletion schema. ChatAgent adds additional functionality and diverges from the OpenAI ChatCompletionRequest in the following ways:
- Adds an optional attachments attribute to every input/output message for tools and internal agent calls, so they can return additional outputs such as visualizations and progress indicators
- Adds a context attribute with conversation_id and user_id attributes, to enable modifying the behavior of the agent depending on the user querying the agent
- Adds a custom_inputs attribute, an arbitrary dict[str, Any], to pass in any additional information to modify the agent’s behavior
- The ChatAgentResponse schema diverges from the ChatCompletionResponse schema in the following ways:
- Adds a custom_outputs key, an arbitrary dict[str, Any], to return any additional information
- Allows multiple messages in the output, to improve the display and evaluation of internal tool calls and inter-agent communication that led to the final answer.
- Here’s an example of a ChatAgentResponse detailing a tool call:

    {
        "messages": [
            {
                "role": "assistant",
                "content": "",
                "id": "run-04b46401-c569-4a4a-933e-62e38d8f9647-0",
                "tool_calls": [
                    {
                        "id": "call_15ca4fcc-ffa1-419a-8748-3bea34b9c043",
                        "type": "function",
                        "function": {
                            "name": "generate_random_ints",
                            "arguments": '{"min": 1, "max": 100, "size": 5}',
                        },
                    }
                ],
            },
            {
                "role": "tool",
                "content": '{"content": "Generated array of 5 random ints in [1, 100]."}',
                "name": "generate_random_ints",
                "id": "call_15ca4fcc-ffa1-419a-8748-3bea34b9c043",
                "tool_call_id": "call_15ca4fcc-ffa1-419a-8748-3bea34b9c043",
            },
            {
                "role": "assistant",
                "content": "The new set of generated random numbers are: 93, 51, 12, 7, and 25",
                "name": "llm",
                "id": "run-70c7c738-739f-4ecd-ad18-0ae232df24e8-0",
            },
        ],
        "custom_outputs": {"random_nums": [93, 51, 12, 7, 25]},
    }

- Streaming Agent Output with ChatAgent
- Please read the docstring of ChatAgent.predict_stream for more details on how to stream the output of your agent.
- Authoring a ChatAgent
- Authoring an agent using the ChatAgent interface is a framework-agnostic way to create a model with a standardized interface that is loggable with the MLflow pyfunc flavor, can be reused across clients, and is ready for serving workloads.
- To write your own agent, subclass ChatAgent, implementing the predict and optionally predict_stream methods to define the non-streaming and streaming behavior of your agent. You can use any agent authoring framework - the only hard requirement is to implement the predict interface.

    def predict(
        self,
        messages: list[ChatAgentMessage],
        context: Optional[ChatContext] = None,
        custom_inputs: Optional[dict[str, Any]] = None,
    ) -> ChatAgentResponse: ...

- In addition to calling the predict and predict_stream methods with an input matching their type hints, you can also pass a single input dict that matches the ChatAgentRequest schema for ease of testing.

    chat_agent = MyChatAgent()
    chat_agent.predict(
        {
            "messages": [{"role": "user", "content": "What is 10 + 10?"}],
            "context": {"conversation_id": "123", "user_id": "456"},
        }
    )

- See an example implementation of predict and predict_stream for a LangGraph agent in the ChatAgentState docstring.
- Logging the ChatAgent
- Since the landscape of LLM frameworks is constantly evolving and not every flavor can be natively supported by MLflow, we recommend the Models-from-Code logging approach.

    with mlflow.start_run():
        logged_agent_info = mlflow.pyfunc.log_model(
            name="agent",
            python_model=os.path.join(os.getcwd(), "agent"),
            # Add serving endpoints, tools, and vector search indexes here
            resources=[],
        )

- After logging the model, you can query it with a single dictionary matching the ChatAgentRequest schema. Under the hood, it will be converted into the Python objects expected by your predict and predict_stream methods.

    loaded_model = mlflow.pyfunc.load_model(logged_agent_info.model_uri)
    loaded_model.predict(
        {
            "messages": [{"role": "user", "content": "What is 10 + 10?"}],
            "context": {"conversation_id": "123", "user_id": "456"},
        }
    )

- To make logging ChatAgent models as easy as possible, MLflow has built in the following features:
- Automatic Model Signature Inference
- You do not need to set a signature when logging a ChatAgent 
- An input and output signature will be automatically set that adheres to the ChatAgentRequest and ChatAgentResponse schemas
 
 
- Metadata
- {"task": "agent/v2/chat"}will be automatically appended to any metadata that you may pass in when logging the model
 
 
- Input Example
- Providing an input example is optional; mlflow.types.agent.CHAT_AGENT_INPUT_EXAMPLE will be provided by default
- If you do provide an input example, ensure it’s a dict with the ChatAgentRequest schema:

    input_example = {
        "messages": [{"role": "user", "content": "What is MLflow?"}],
        "context": {"conversation_id": "123", "user_id": "456"},
    }
 
 
- Migrating from ChatModel to ChatAgent
- To convert an existing ChatModel that takes in List[ChatMessage] and ChatParams and outputs a ChatCompletionResponse, do the following:
- Subclass ChatAgent instead of ChatModel
- Move any functionality from your ChatModel’s load_context implementation into the __init__ method of your new ChatAgent.
- Use .model_dump_compat() instead of .to_dict() when converting your model’s inputs to dictionaries, e.g. [msg.model_dump_compat() for msg in messages] instead of [msg.to_dict() for msg in messages]
- Return a ChatAgentResponse instead of a ChatCompletionResponse

- For example, we can convert the ChatModel from the Chat Model Intro to a ChatAgent:

    class SimpleOllamaModel(ChatModel):
        def __init__(self):
            self.model_name = "llama3.2:1b"
            self.client = None

        def load_context(self, context):
            self.client = ollama.Client()

        def predict(
            self, context, messages: list[ChatMessage], params: ChatParams = None
        ) -> ChatCompletionResponse:
            ollama_messages = [msg.to_dict() for msg in messages]
            response = self.client.chat(model=self.model_name, messages=ollama_messages)
            return ChatCompletionResponse(
                choices=[{"index": 0, "message": response["message"]}],
                model=self.model_name,
            )

    class SimpleOllamaModel(ChatAgent):
        def __init__(self):
            self.model_name = "llama3.2:1b"
            self.client = ollama.Client()

        def predict(
            self,
            messages: list[ChatAgentMessage],
            context: Optional[ChatContext] = None,
            custom_inputs: Optional[dict[str, Any]] = None,
        ) -> ChatAgentResponse:
            ollama_messages = self._convert_messages_to_dict(messages)
            response = self.client.chat(model=self.model_name, messages=ollama_messages)
            return ChatAgentResponse(**{"messages": [response["message"]]})

- ChatAgent Connectors
- MLflow provides convenience APIs for wrapping agents written in popular authoring frameworks with ChatAgent. See examples for:
- LangGraph in the ChatAgentState docstring
- abstract predict(messages: list[mlflow.types.agent.ChatAgentMessage], context: Optional[mlflow.types.agent.ChatContext] = None, custom_inputs: Optional[dict[str, typing.Any]] = None) → mlflow.types.agent.ChatAgentResponse[source]
- Given a ChatAgent input, returns a ChatAgent output. In addition to calling predict with an input matching the type hints, you can also pass a single input dict that matches the ChatAgentRequest schema for ease of testing.

    chat_agent = MyChatAgent()
    chat_agent.predict(
        {
            "messages": [{"role": "user", "content": "What is 10 + 10?"}],
            "context": {"conversation_id": "123", "user_id": "456"},
        }
    )

- Parameters
- messages (List[ChatAgentMessage]) – A list of ChatAgentMessage objects representing the chat history.
- context (ChatContext) – A ChatContext object containing conversation_id and user_id. Optional. Defaults to None.
- custom_inputs (Dict[str, Any]) – An optional param to provide arbitrary additional inputs to the model. The dictionary values must be JSON-serializable. Optional. Defaults to None.
 
- Returns
- A ChatAgentResponse object containing the model’s response, as well as other metadata.
 
- predict_stream(messages: list[mlflow.types.agent.ChatAgentMessage], context: Optional[mlflow.types.agent.ChatContext] = None, custom_inputs: Optional[dict[str, typing.Any]] = None) → Generator[mlflow.types.agent.ChatAgentChunk, None, None][source]
- Given a ChatAgent input, returns a generator containing streaming ChatAgent output chunks. In addition to calling predict_stream with an input matching the type hints, you can also pass a single input dict that matches the ChatAgentRequest schema for ease of testing.

    chat_agent = MyChatAgent()
    for event in chat_agent.predict_stream(
        {
            "messages": [{"role": "user", "content": "What is 10 + 10?"}],
            "context": {"conversation_id": "123", "user_id": "456"},
        }
    ):
        print(event)

- To support streaming the output of your agent, override this method in your subclass of ChatAgent. When implementing predict_stream, keep in mind the following requirements:
- Ensure your implementation adheres to the predict_stream type signature. For example, streamed messages must be of the type ChatAgentChunk, where each chunk contains partial output from a single response message.
- At most one chunk in a particular response can contain the custom_outputs key.
- Chunks containing partial content of a single response message must have the same id. The content field of the message and the usage stats of the ChatAgentChunk should be aggregated by the consuming client. See the example below.

    {"delta": {"role": "assistant", "content": "Born", "id": "123"}}
    {"delta": {"role": "assistant", "content": " in", "id": "123"}}
    {"delta": {"role": "assistant", "content": " data", "id": "123"}}

- Parameters
- messages (List[ChatAgentMessage]) – A list of ChatAgentMessage objects representing the chat history.
- context (ChatContext) – A ChatContext object containing conversation_id and user_id. Optional. Defaults to None.
- custom_inputs (Dict[str, Any]) – An optional param to provide arbitrary additional inputs to the model. The dictionary values must be JSON-serializable. Optional. Defaults to None.
 
- Returns
- A generator over ChatAgentChunk objects containing the model’s response(s), as well as other metadata.
 
 
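- As a client-side illustration of the aggregation rule above, the following sketch concatenates streamed delta contents by message id, assuming each chunk has been serialized to a dict shaped like the deltas shown in the example; the function name is illustrative, not part of the MLflow API.

    from collections import defaultdict


    def aggregate_deltas(chunks: list[dict]) -> dict[str, str]:
        """Concatenate streamed delta contents, grouping by message id."""
        parts: dict[str, list[str]] = defaultdict(list)
        for chunk in chunks:
            delta = chunk["delta"]
            parts[delta["id"]].append(delta.get("content") or "")
        return {msg_id: "".join(pieces) for msg_id, pieces in parts.items()}


    chunks = [
        {"delta": {"role": "assistant", "content": "Born", "id": "123"}},
        {"delta": {"role": "assistant", "content": " in", "id": "123"}},
        {"delta": {"role": "assistant", "content": " data", "id": "123"}},
    ]
    print(aggregate_deltas(chunks))  # {'123': 'Born in data'}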
- class mlflow.pyfunc.ResponsesAgent[source]
- Note - Experimental: This class may change or be removed in a future release without warning.
- A base class for creating ResponsesAgent models. It can be used as a wrapper around any agent framework to create an agent model that can be deployed to MLflow. It has a few helper methods for creating output items that can be part of a ResponsesAgentResponse or ResponsesAgentStreamEvent.
- See https://www.mlflow.org/docs/latest/llms/responses-agent-intro/ for more details.
- create_function_call_item(id: str, call_id: str, name: str, arguments: str) → dict[str, typing.Any][source]
- Helper method to create a dictionary conforming to the function call item schema. - Read more at https://www.mlflow.org/docs/latest/llms/responses-agent-intro/#creating-agent-output. - Parameters
- id (str) – The id of the output item. 
- call_id (str) – The id of the function call. 
- name (str) – The name of the function to be called. 
- arguments (str) – The arguments to be passed to the function. 
 
 
- create_function_call_output_item(call_id: str, output: str) → dict[str, typing.Any][source]
- Helper method to create a dictionary conforming to the function call output item schema. - Read more at https://www.mlflow.org/docs/latest/llms/responses-agent-intro/#creating-agent-output. - Parameters
- call_id (str) – The id of the function call. 
- output (str) – The output of the function call. 
 
 
- create_text_delta(delta: str, item_id: str) → dict[str, typing.Any][source]
- Helper method to create a dictionary conforming to the text delta schema for streaming. - Read more at https://www.mlflow.org/docs/latest/llms/responses-agent-intro/#streaming-agent-output. 
- create_text_output_item(text: str, id: str) → dict[str, typing.Any][source]
- Helper method to create a dictionary conforming to the text output item schema. - Read more at https://www.mlflow.org/docs/latest/llms/responses-agent-intro/#creating-agent-output. - Parameters
- text (str) – The text to be outputted. 
- id (str) – The id of the output item. 
 
 
- abstract predict(request: mlflow.types.responses.ResponsesAgentRequest) → mlflow.types.responses.ResponsesAgentResponse[source]
- Given a ResponsesAgentRequest, returns a ResponsesAgentResponse. - You can see example implementations at https://www.mlflow.org/docs/latest/llms/responses-agent-intro#simple-chat-example and https://www.mlflow.org/docs/latest/llms/responses-agent-intro#tool-calling-example. 
- predict_stream(request: mlflow.types.responses.ResponsesAgentRequest) → Generator[mlflow.types.responses.ResponsesAgentStreamEvent, None, None][source]
- Given a ResponsesAgentRequest, returns a generator of ResponsesAgentStreamEvent objects. - See more details at https://www.mlflow.org/docs/latest/llms/responses-agent-intro#streaming-agent-output. - You can see example implementations at https://www.mlflow.org/docs/latest/llms/responses-agent-intro#simple-chat-example and https://www.mlflow.org/docs/latest/llms/responses-agent-intro#tool-calling-example. 
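- Putting the helper methods together, here is a hedged sketch of a ResponsesAgent that returns a canned answer and streams it token by token. It assumes that ResponsesAgentResponse accepts its output items via an output field and that a text delta dict can be unpacked into a ResponsesAgentStreamEvent, as in the linked documentation; the text and token split are made up for illustration.

    import uuid
    from typing import Generator

    from mlflow.pyfunc import ResponsesAgent
    from mlflow.types.responses import (
        ResponsesAgentRequest,
        ResponsesAgentResponse,
        ResponsesAgentStreamEvent,
    )


    class CannedResponsesAgent(ResponsesAgent):
        """Toy agent returning a fixed answer (illustrative only)."""

        def predict(self, request: ResponsesAgentRequest) -> ResponsesAgentResponse:
            item_id = str(uuid.uuid4())
            # create_text_output_item builds a dict conforming to the text output item schema.
            return ResponsesAgentResponse(
                output=[self.create_text_output_item(text="Hello from pyfunc", id=item_id)]
            )

        def predict_stream(
            self, request: ResponsesAgentRequest
        ) -> Generator[ResponsesAgentStreamEvent, None, None]:
            item_id = str(uuid.uuid4())
            for token in ["Hello", " from", " pyfunc"]:
                # Assumption: the text delta dict unpacks into a ResponsesAgentStreamEvent.
                yield ResponsesAgentStreamEvent(
                    **self.create_text_delta(delta=token, item_id=item_id)
                )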
 - static responses_agent_output_reducer(chunks: list[typing.Union[mlflow.types.responses.ResponsesAgentStreamEvent, dict[str, typing.Any]]])[source]