MLflow 2.3.2 is a patch release containing the following features, bug fixes and changes:
Features:
- [Models] Add GPU support for transformers models pyfunc inference and serving (#8375, @ankit-db)
- [Models] Disable autologging functionality for non-relevant models when training a transformers model (#8405, @BenWilson2)
- [Models] Add support for preserving and overriding torch_dtype values in transformers pipelines (#8421, @BenWilson2)
- [Models] Add support for Feature Extraction pipelines in the transformers flavor (#8423, @BenWilson2)
- [Tracking] Add basic HTTP auth support for users, registered models, and experiments permissions (#8286, @gabrielfu)
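As a rough illustration of how a client might present credentials to a tracking server protected by HTTP Basic Auth, the sketch below builds the standard Authorization header. The MLFLOW_TRACKING_USERNAME and MLFLOW_TRACKING_PASSWORD environment variables are the ones the MLflow client reads. The helper name basic_auth_header and the credentials are placeholders for this example, not part of MLflow.

```python
import base64
import os


def basic_auth_header(username: str, password: str) -> dict:
    """Build an HTTP Basic Authorization header (hypothetical helper)."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials; in practice these come from a secret store.
os.environ["MLFLOW_TRACKING_USERNAME"] = "alice"
os.environ["MLFLOW_TRACKING_PASSWORD"] = "s3cret"

# The same credentials expressed as a raw header, e.g. for a manual REST call.
headers = basic_auth_header(
    os.environ["MLFLOW_TRACKING_USERNAME"],
    os.environ["MLFLOW_TRACKING_PASSWORD"],
)
```

Setting the two environment variables is normally enough for the Python client. The explicit header is only needed when calling the REST API directly.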
Bug fixes:
- [Models] Fix inferred schema issue with Text2TextGeneration pipelines in the transformers flavor (#8391, @BenWilson2)
- [Models] Change MLflow dependency pinning in logged models from a range value to an exact major and minor version (#8422, @harupy)
For a comprehensive list of changes, see the release change log, and check out the latest documentation on mlflow.org.
MLflow 2.3.1 is a patch release containing bug fixes and a security patch for GHSA-83fm-w79m-64r5. If you are using mlflow server or mlflow ui, we recommend upgrading to MLflow 2.3.1 as soon as possible.
Security patches:
- [Security] Fix critical LFI attack vulnerability by disabling the ability to provide relative paths in registered model sources (#8281, @BenWilson2)
Bug fixes:
- [Tracking] Fix an issue causing file and model uploads to hang on Databricks (#8348, @harupy)
- [Tracking / Model Registry] Fix an issue causing file and model downloads to hang on Databricks (#8350, @dbczumar)
- [Scoring] Fix regression in schema enforcement for model serving when using the inputs format for inference (#8326, @BenWilson2)
- [Model Registry] Fix regression in model naming parsing where special characters were not accepted in model names (#8322, @arpitjasa-db)
- [Recipes] Fix card rendering with the pandas profiler to handle columns containing all null values (#8263, @sunishsheth2009)
We are happy to announce the availability of MLflow 2.3.0!
MLflow 2.3.0 includes several major features and improvements.
Features:
- [Models] Introduce a new transformers named flavor (#8236, #8181, #8086, @BenWilson2)
- [Models] Introduce a new openai named flavor (#8191, #8155, @harupy)
- [Models] Introduce a new langchain named flavor (#8251, #8197, @liangz1, @sunishsheth2009)
- [Models] Add support for PyTorch and Lightning 2.0 (#8072, @shrinath-suresh)
- [Tracking] Add support for logging LLM input, output, and prompt artifacts (#8234, #8204, @sunishsheth2009)
- [Tracking] Add support for HTTP Basic Auth in the MLflow tracking server (#8130, @gabrielfu)
- [Tracking] Add search_model_versions to the fluent API (#8223, @mariusschlegel)
- [Artifacts] Add support for parallelized artifact downloads (#8116, @apurva-koti)
- [Artifacts] Add support for parallelized artifact uploads for AWS (#8003, @harupy)
- [Artifacts] Add content type headers to artifact upload requests for the HttpArtifactRepository (#8048, @WillEngler)
- [Model Registry] Add alias support for logged models within Model Registry (#8164, #8094, #8055, @arpitjasa-db)
- [UI] Add support for custom domain git providers (#7933, @gusghrlrl101)
- [Scoring] Add plugin support for customization of MLflow serving endpoints (#7757, @jmahlik)
- [Scoring] Add support to MLflow serving that allows configuration of multiple inference workers (#8035, @M4nouel)
- [Sagemaker] Add support for asynchronous inference configuration on SageMaker (#8009, @thomasbell1985)
- [Build] Remove shap as a core dependency of MLflow (#8199, @jmahlik)
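The Model Registry alias support listed above can be sketched as follows. This is a minimal example, assuming a client object that exposes set_registered_model_alias(name, alias, version), as MlflowClient does, and the models:/<name>@<alias> URI form. The helper names and the model name are hypothetical.

```python
def alias_uri(model_name: str, alias: str) -> str:
    """Return the model URI that resolves through a registry alias."""
    return f"models:/{model_name}@{alias}"


def point_alias_at_version(client, model_name: str, alias: str, version) -> str:
    """Assign an alias to a model version and return the URI to load it by.

    `client` is assumed to behave like mlflow.MlflowClient.
    """
    client.set_registered_model_alias(model_name, alias, str(version))
    return alias_uri(model_name, alias)


# With a real tracking server this would look roughly like:
#   from mlflow import MlflowClient
#   uri = point_alias_at_version(MlflowClient(), "churn-model", "champion", 3)
#   model = mlflow.pyfunc.load_model(uri)
```

Loading by alias rather than by version number lets a deployment follow whichever version currently holds the alias.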
Bug fixes:
- [Models] Fix a bug with tensorflow autologging for models with multiple inputs (#8097, @jaume-ferrarons)
- [Recipes] Fix a bug with Pandas 2.0 updates for profiler rendering of datetime types (#7925, @sunishsheth2009)
- [Tracking] Prevent exceptions from being raised if a parameter is logged with an existing key whose value is identical to the logged parameter (#8038, @AdamStelmaszczyk)
- [Tracking] Fix an issue with deleting experiments in the FileStore backend (#8178, @mariusschlegel)
- [Tracking] Fix a UI bug where the "Source Run" field in the Model Version page points to an incorrect set of artifacts (#8156, @WeichenXu123)
- [Tracking] Fix a bug wherein renaming a run reverts its current lifecycle status to UNFINISHED (#8154, @WeichenXu123)
- [Tracking] Fix a bug where a file URI could be used as a model version source (#8126, @harupy)
- [Projects] Fix an issue with MLflow projects that have submodules contained within a project (#8050, @kota-iizuka)
- [Examples] Fix lightning hyperparameter tuning examples (#8039, @BenWilson2)
- [Server-infra] Fix bug with Cache-Control headers for static server files (#8016, @jmahlik)
Documentation updates:
- [Examples] Add a new and thorough example for the creation of custom model flavors (#7867, @benjaminbluhm)
For a comprehensive list of changes, see the release change log, and check out the latest documentation on mlflow.org.
We are happy to announce the availability of MLflow 2.2.2!
MLflow 2.2.2 is a patch release containing the following bug fixes:
- [Model Registry] Allow source to be a local path within a run's artifact directory if a run_id is specified (#7993, @harupy)
- [Model Registry] Fix a bug where a Windows UNC path is considered a local path (#7988, @WeichenXu123)
- [Model Registry] Disallow name to be a file path in FileStore.get_registered_model (#7965, @harupy)
For a comprehensive list of changes, see the release change log, and check out the latest documentation on mlflow.org.
We are happy to announce the availability of MLflow 2.2.1!
MLflow 2.2.1 is a patch release containing the following bug fixes:
- [Model Registry] Fix a bug that caused too many results to be requested by default when calling MlflowClient.search_model_versions() (#7935, @dbczumar)
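One way to keep each request small regardless of the default is to page through results explicitly. The sketch below assumes a client whose search_model_versions() accepts filter_string, max_results and page_token (the arguments listed in the 2.2.0 notes) and returns a list-like page carrying a token attribute for the next page. The helper name iter_model_versions is hypothetical.

```python
from typing import Any, Iterator, Optional


def iter_model_versions(client, filter_string: str, page_size: int = 100) -> Iterator[Any]:
    """Yield model versions one page at a time instead of in one large request."""
    page_token: Optional[str] = None
    while True:
        page = client.search_model_versions(
            filter_string=filter_string,
            max_results=page_size,
            page_token=page_token,
        )
        yield from page
        # An empty or missing token is taken to mean there are no more pages.
        page_token = getattr(page, "token", None)
        if not page_token:
            break


# With a real client this might be used as:
#   from mlflow import MlflowClient
#   for mv in iter_model_versions(MlflowClient(), "name = 'churn-model'"):
#       print(mv.version)
```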
For a comprehensive list of changes, see the release change log, and check out the latest documentation on mlflow.org.
We are happy to announce the availability of MLflow 2.2.0!
MLflow 2.2.0 includes several major features and improvements.
Features:
- [Recipes] Add support for score calibration to the classification recipe (#7744, @sunishsheth2009)
- [Recipes] Add automatic label encoding to the classification recipe (#7711, @sunishsheth2009)
- [Recipes] Support custom data splitting logic in the classification and regression recipes (#7815, #7588, @sunishsheth2009)
- [Recipes] Introduce customizable MLflow Run name prefixes to the classification and regression recipes (#7746, @kamalesh0406; #7763, @sunishsheth2009)
- [UI] Add a new Chart View to the MLflow Experiment Page for model performance insights (#7864, @hubertzub-db, @apurva-koti, @prithvikannan, @ridhimag11, @sunishsheth2009, @dbczumar)
- [UI] Modernize and improve parallel coordinates chart for model tuning (#7864, @hubertzub-db, @apurva-koti, @prithvikannan, @ridhimag11, @sunishsheth2009, @dbczumar)
- [UI] Add typeahead suggestions to the MLflow Experiment Page search bar (#7864, @hubertzub-db, @apurva-koti, @prithvikannan, @ridhimag11, @sunishsheth2009, @dbczumar)
- [UI] Improve performance of Experiments Sidebar for large numbers of experiments (#7804, @jmahlik)
- [Tracking] Introduce autologging support for native PyTorch models (#7627, @temporaer)
- [Tracking] Allow specifying model_format when autologging XGBoost models (#7781, @guyrosin)
- [Tracking] Add MLFLOW_ARTIFACT_UPLOAD_DOWNLOAD_TIMEOUT environment variable to configure artifact operation timeouts (#7783, @wamartin-aml)
- [Artifacts] Include Content-Type response headers for artifacts downloaded from mlflow server (#7827, @bali0019)
- [Model Registry] Introduce the searchModelVersions() API to the Java client (#7880, @gabrielfu)
- [Model Registry] Introduce max_results, order_by and page_token arguments to MlflowClient.search_model_versions() (#7623, @serena-ruan)
- [Models] Support logging large ONNX models by using external data (#7808, @dogeplusplus)
- [Models] Add support for logging Diviner models fit in Spark (#7800, @BenWilson2)
- [Models] Introduce MLFLOW_DEFAULT_PREDICTION_DEVICE environment variable to set the device for pyfunc model inference (#7922, @ankit-db)
- [Scoring] Publish official Docker images for the MLflow Model scoring server at github.com/mlflow/mlflow/pkgs (#7759, @dbczumar)
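The two environment variables introduced in this release can be set from Python before any MLflow code runs. The values below are assumptions for illustration only: the timeout is taken to be a number of seconds, and "cpu" is taken to be an accepted device string. The helper mlflow_env_overrides is hypothetical and only reports what has been set.

```python
import os

# Assumed to be a timeout in seconds for artifact upload/download requests.
os.environ["MLFLOW_ARTIFACT_UPLOAD_DOWNLOAD_TIMEOUT"] = "600"

# Assumed device string; a GPU host might use something like "cuda" instead.
os.environ["MLFLOW_DEFAULT_PREDICTION_DEVICE"] = "cpu"


def mlflow_env_overrides() -> dict:
    """Return all MLFLOW_* variables currently set, for logging or debugging."""
    return {k: v for k, v in os.environ.items() if k.startswith("MLFLOW_")}


overrides = mlflow_env_overrides()
```

Setting the variables in the process environment (or in the shell that launches it) before importing mlflow is the conservative choice, since configuration may be read at import or first use.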
Bug fixes:
- [Recipes] Fix dataset format validation in the ingest step for custom dataset sources (#7638, @sunishsheth2009)
- [Recipes] Fix bug in identification of worst performing examples during training (#7658, @sunishsheth2009)
- [Recipes] Ensure consistent rendering of the recipe graph when inspect() is called (#7852, @sunishsheth2009)
- [Recipes] Correctly respect positive_class configuration in the transform step (#7626, @sunishsheth2009)
- [Recipes] Make logged metric names consistent with mlflow.evaluate() (#7613, @sunishsheth2009)
- [Recipes] Add run_id and artifact_path keys to logged MLmodel files (#7651, @sunishsheth2009)
- [UI] Fix bugs in UI validation of experiment names, model names, and tag keys (#7818, @subramaniam02)
- [Tracking] Resolve artifact locations to absolute paths when creating experiments (#7670, @bali0019)
- [Tracking] Exclude Delta checkpoints from Spark datasource autologging (#7902, @harupy)
- [Tracking] Consistently return an empty list from GetMetricHistory when a metric does not exist (#7589, @bali0019; #7659, @harupy)
- [Artifacts] Fix support for artifact operations on Windows paths in UNC format (#7750, @bali0019)
- [Artifacts] Fix bug in HDFS artifact listing (#7581, @pwnywiz)
- [Model Registry] Disallow creation of model versions with local filesystem sources in mlflow server (#7908, @harupy)
- [Model Registry] Fix handling of deleted model versions in FileStore (#7716, @harupy)
- [Model Registry] Correctly initialize Model Registry SQL tables independently of MLflow Tracking (#7704, @harupy)
- [Models] Correctly move PyTorch model outputs from GPUs to CPUs during inference with pyfunc (#7885, @ankit-db)
- [Build] Fix compatibility issues with Python installations compiled using PYTHONOPTIMIZE=2 (#7791, @dbczumar)
- [Build] Fix compatibility issues with the upcoming pandas 2.0 release (#7899, @harupy; #7910, @dbczumar)
For a comprehensive list of changes, see the release change log, and check out the latest documentation on mlflow.org.
We are happy to announce the availability of MLflow 2.1.1!
MLflow 2.1.1 is a patch release containing the following bug fixes:
- [Scoring] Fix a type casting error in mlflow.pyfunc.spark_udf() on models with a ColSpec input schema, and make PyFuncModel.predict support dataframes with elements of numpy.ndarray type (#7592, @WeichenXu123)
- [Scoring] Make mlflow.pyfunc.scoring_server.client.ScoringServerClient support input dataframes with elements of numpy.ndarray type (#7594, @WeichenXu123)
- [Tracking] Ensure mlflow imports ML packages lazily (#7597, @harupy)
For a comprehensive list of changes, see the release change log, and check out the latest documentation on mlflow.org.
We are happy to announce the availability of MLflow 2.1.0!
MLflow 2.1.0 includes several major features and improvements.
Features:
- [Recipes] Introduce support for multi-class classification (#7458, @mshtelma)
- [Recipes] Extend the pyfunc representation of classification models to output scores in addition to labels (#7474, @sunishsheth2009)
- [UI] Add user ID and lifecycle stage quick search links to the Runs page (#7462, @jaeday)
- [Tracking] Paginate the GetMetricHistory API (#7523, #7415, @BenWilson2)
- [Tracking] Add Runs search aliases for Run name and start time that correspond to UI column names (#7492, @apurva-koti)
- [Tracking] Add a /version endpoint to mlflow server for querying the server's MLflow version (#7273, @joncarter1)
- [Model Registry] Add FileStore support for the Model Registry (#6605, @serena-ruan)
- [Model Registry] Introduce an mlflow.search_registered_models() fluent API (#7428, @TSienki)
- [Model Registry / Java] Add a getRegisteredModel() method to the Java client (#6602) (#7511, @drod331)
- [Model Registry / R] Add an mlflow_set_model_version_tag() method to the R client (#7401, @leeweijie)
- [Models] Introduce a metadata field to the MLmodel specification and log_model() methods (#7237, @jdonzallaz)
- [Models] Extend Model.load() to support loading MLmodel specifications from remote locations (#7517, @dbczumar)
- [Models] Pin the major version of MLflow in Models' requirements.txt and conda.yaml files (#7364, @BenWilson2)
- [Scoring] Extend mlflow.pyfunc.spark_udf() to support StructType results (#7527, @WeichenXu123)
- [Scoring] Extend TensorFlow and Keras Models to support multi-dimensional inputs with mlflow.pyfunc.spark_udf() (#7531, #7291, @WeichenXu123)
- [Scoring] Support specifying deployment environment variables and tags when deploying models to SageMaker (#7433, @jhallard)
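The /version endpoint added to mlflow server in this release can be queried with nothing more than the standard library. This is a minimal sketch, assuming the endpoint returns the version as a plain-text body. The helper names and the example address are hypothetical.

```python
from urllib.request import urlopen


def version_url(tracking_uri: str) -> str:
    """Build the URL of the server's /version endpoint."""
    return tracking_uri.rstrip("/") + "/version"


def get_server_version(tracking_uri: str, timeout: float = 5.0) -> str:
    """Fetch the MLflow version reported by a running tracking server."""
    with urlopen(version_url(tracking_uri), timeout=timeout) as resp:
        # Assumes a plain-text response body such as "2.1.0".
        return resp.read().decode("utf-8").strip()


# Example against a locally running server (address is a placeholder):
#   print(get_server_version("http://localhost:5000"))
```

Comparing this value with the client's own version is a quick way to spot client/server mismatches before debugging anything else.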
Bug fixes:
- [Recipes] Fix a bug that prevented use of custom early_stop functions during model tuning (#7538, @sunishsheth2009)
- [Recipes] Fix a bug in the logic used to create a Spark session during data ingestion (#7307, @WeichenXu123)
- [Tracking] Make the metric names produced by mlflow.autolog() consistent with mlflow.evaluate() (#7418, @wenfeiy-db)
- [Tracking] Fix an autologging bug that caused nested, redundant information to be logged for XGBoost and LightGBM models (#7404, @WeichenXu123)
- [Tracking] Correctly classify SQLAlchemy OperationalErrors as retryable HTTP errors (#7240, @barrywhart)
- [Artifacts] Correctly handle special characters in credentials when using FTP artifact storage (#7479, @HCTsai)
- [Models] Address an issue that prevented MLeap models from being saved on Windows (#6966, @dbczumar)
- [Scoring] Fix a permissions issue encountered when using NFS during model scoring with mlflow.pyfunc.spark_udf() (#7427, @WeichenXu123)
Documentation updates:
- [Docs] Add more examples to the Runs search documentation page (#7487, @apurva-koti)
- [Docs] Add documentation for Model flavors developed by the community (#7425, @mmerce)
- [Docs] Add an example for logging and scoring ONNX Models (#7398, @Rusteam)
- [Docs] Fix a typo in the model scoring REST API example for inputs with the dataframe_split format (#7540, @zhouyangyu)
- [Docs] Fix a typo in the model scoring REST API example for inputs with the dataframe_records format (#7361, @dbczumar)
For a comprehensive list of changes, see the release change log, and check out the latest documentation on mlflow.org.
We are happy to announce the availability of MLflow 2.0.1!
The 2.0.1 version of MLflow is a major milestone release that focuses on simplifying the management of end-to-end MLOps workflows, providing new feature-rich functionality, and expanding upon the production-ready MLOps capabilities offered by MLflow. Check out the MLflow 2.0 blog post for an in-depth walkthrough!
This release contains several important breaking changes from the 1.x API, as well as additional major features and improvements.
Features:
- [Recipes] MLflow Pipelines is now MLflow Recipes - a framework that enables data scientists to quickly develop high-quality models and deploy them to production
- [Recipes] Add support for classification models to MLflow Recipes (#7082, @bbarnes52)
- [UI] Introduce support for pinning runs within the experiments UI (#7177, @harupy)
- [UI] Simplify the layout and provide customized displays of metrics, parameters, and tags within the experiments UI (#7177, @harupy)
- [UI] Simplify run filtering and ordering of runs within the experiments UI (#7177, @harupy)
- [Tracking] Update mlflow.pyfunc.get_model_dependencies() to download all referenced requirements files for specified models (#6733, @harupy)
- [Tracking] Add support for selecting the Keras model save_format used by mlflow.tensorflow.autolog() (#7123, @balvisio)
- [Models] Set mlflow.evaluate() status to stable as it is now a production-ready API
- [Models] Simplify APIs for specifying custom metrics and custom artifacts during model evaluation with mlflow.evaluate() (#7142, @harupy)
- [Models] Correctly infer the positive label for binary classification within mlflow.evaluate() (#7149, @dbczumar)
- [Models] Enable automated signature logging for tensorflow and keras models when mlflow.tensorflow.autolog() is enabled (#6678, @BenWilson2)
- [Models] Add support for native Keras and TensorFlow Core models within mlflow.tensorflow (#6530, @WeichenXu123)
- [Models] Add support for defining the model_format used by mlflow.xgboost.save/log_model() (#7068, @AvikantSrivastava)
- [Scoring] Overhaul the model scoring REST API to introduce format indicators for inputs and support multiple output fields (#6575, @tomasatdatabricks; #7254, @adriangonz)
- [Scoring] Add support for ragged arrays in model signatures (#7135, @trangevi)
- [Java] Add getModelVersion API to the Java client (#6955, @wgottschalk)
Breaking Changes:
The following breaking changes are arranged by order of significance within each category.
- [Core] Support for Python 3.7 has been dropped. MLflow now requires Python >=3.8
- [Recipes] mlflow.pipelines APIs have been replaced with mlflow.recipes
- [Tracking / Registry] Remove /preview routes for Tracking and Model Registry REST APIs (#6667, @harupy)
- [Tracking] Remove deprecated list APIs for experiments, models, and runs from Python, Java, R, and REST APIs (#6785, #6786, #6787, #6788, #6800, #6868, @dbczumar)
- [Tracking] Remove deprecated runs response field from Get Experiment REST API response (#6541, #6524, @dbczumar)
- [Tracking] Remove deprecated MlflowClient.download_artifacts API (#6537, @WeichenXu123)
- [Tracking] Change the behavior of environment variable handling for MLFLOW_EXPERIMENT_NAME such that the value is always used when creating an experiment (#6674, @BenWilson2)
- [Tracking] Update mlflow server to run in --serve-artifacts mode by default (#6502, @harupy)
- [Tracking] Update Experiment ID generation for the FileStore backend to enable thread-safe concurrency (#7070, @BenWilson2)
- [Tracking] Remove dataset_name and on_data_{name | hash} suffixes from mlflow.evaluate() metric keys (#7042, @harupy)
- [Models / Scoring / Projects] Change default environment manager to virtualenv instead of conda for model inference and project execution (#6459, #6489, @harupy)
- [Models] Move Keras model logging APIs to the mlflow.tensorflow flavor and drop support for TensorFlow Estimators (#6530, @WeichenXu123)
- [Models] Remove deprecated mlflow.sklearn.eval_and_log_metrics() API in favor of mlflow.evaluate() API (#6520, @dbczumar)
- [Models] Require mlflow.evaluate() model inputs to be specified as URIs (#6670, @harupy)
- [Models] Drop support for returning custom metrics and artifacts from the same function when using mlflow.evaluate(), in favor of custom_artifacts (#7142, @harupy)
- [Models] Extend PyFuncModel spec to support conda and virtualenv subfields (#6684, @harupy)
- [Scoring] Remove support for defining input formats using the Content-Type header (#6575, @tomasatdatabricks; #7254, @adriangonz)
- [Scoring] Replace the --no-conda CLI option argument for native serving with --env-manager='local' (#6501, @harupy)
- [Scoring] Remove public APIs for mlflow.sagemaker.deploy() and mlflow.sagemaker.delete() in favor of MLflow deployments APIs, such as mlflow deployments -t sagemaker (#6650, @dbczumar)
- [Scoring] Rename input argument df to inputs in mlflow.deployments.predict() method (#6681, @BenWilson2)
- [Projects] Replace the use_conda argument with the env_manager argument within the run CLI command for MLflow Projects (#6654, @harupy)
- [Projects] Modify the MLflow Projects docker image build options by renaming --skip-image-build to --build-image with a default of False (#7011, @harupy)
- [Integrations/Azure] Remove deprecated mlflow.azureml modules from MLflow in favor of the azure-mlflow deployment plugin (#6691, @BenWilson2)
- [R] Remove conda integration with the R client (#6638, @harupy)
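For the scoring REST API change above, where the input format moves from the Content-Type header into the request body, a request payload might be built as sketched below. The top-level keys dataframe_split, dataframe_records and inputs are the format indicators named in these notes and in the MLflow serving documentation. The helper names and sample data are hypothetical.

```python
import json

# With format indicators in the body, a single JSON content type is sent.
JSON_HEADERS = {"Content-Type": "application/json"}


def dataframe_split_payload(columns, rows) -> str:
    """Encode tabular input in the pandas 'split' orientation."""
    body = {"dataframe_split": {"columns": list(columns), "data": [list(r) for r in rows]}}
    return json.dumps(body)


def dataframe_records_payload(records) -> str:
    """Encode tabular input as a list of {column: value} records."""
    return json.dumps({"dataframe_records": list(records)})


def tensor_inputs_payload(inputs) -> str:
    """Encode tensor-style input under the 'inputs' key."""
    return json.dumps({"inputs": inputs})


payload = dataframe_split_payload(["sepal_length", "sepal_width"], [(5.1, 3.5), (4.9, 3.0)])
# The payload would then be POSTed to the model server's /invocations route
# together with JSON_HEADERS.
```

Clients written against the 1.x scoring server, which selected the format through the Content-Type header, need to wrap their existing body in one of these keys.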
Bug fixes:
- [Recipes] Fix rendering issue with profile cards polyfill (#7154, @hubertzub-db)
- [Tracking] Set the MLflow Run name correctly when specified as part of the tags argument to mlflow.start_run() (#7228, @Cokral)
- [Tracking] Fix an issue with conflicting MLflow Run name assignment if the mlflow.runName tag is set (#7138, @harupy)
- [Scoring] Fix incorrect payload constructor error in SageMaker deployment client predict() API (#7193, @dbczumar)
- [Scoring] Fix an issue where DataCaptureConfig information was not preserved when updating a SageMaker deployment (#7281, @harupy)
For a comprehensive list of changes, see the release change log, and check out the latest documentation on mlflow.org.
We are happy to announce the availability of MLflow 1.30.0!
MLflow 1.30.0 includes several major features and improvements.
Features:
- [Pipelines] Introduce hyperparameter tuning support to MLflow Pipelines (#6859, @prithvikannan)
- [Pipelines] Introduce support for prediction outlier comparison to training data set (#6991, @jinzhang21)
- [Pipelines] Introduce support for recording all training parameters for reproducibility (#7026, #7094, @prithvikannan)
- [Pipelines] Add support for Delta tables as a datasource in the ingest step (#7010, @sunishsheth2009)
- [Pipelines] Add expanded support for data profiling up to 10,000 columns (#7035, @prithvikannan)
- [Pipelines] Add support for AutoML in MLflow Pipelines using FLAML (#6959, @mshtelma)
- [Pipelines] Add support for simplified transform step execution by allowing for unspecified configuration (#6909, @apurva-koti)
- [Pipelines] Introduce a data preview tab to the transform step card (#7033, @prithvikannan)
- [Tracking] Introduce run_name attribute for create_run, get_run and update_run APIs (#6782, #6798, @apurva-koti)
- [Tracking] Add support for searching by creation_time and last_update_time for the search_experiments API (#6979, @harupy)
- [Tracking] Add support for search terms run_id IN and run_id NOT IN for the search_runs API (#6945, @harupy)
- [Tracking] Add support for searching by user_id and end_time for the search_runs API (#6881, #6880, @subramaniam02)
- [Tracking] Add support for searching by run_name and run_id for the search_runs API (#6899, @harupy; #6952, @alexacole)
- [Tracking] Add support for synchronizing run name attribute and mlflow.runName tag (#6971, @BenWilson2)
- [Tracking] Add support for signed tracking server requests using AWSSigv4 and AWS IAM (#7044, @pdifranc)
- [Tracking] Introduce the update_run() API for modifying the status and name attributes of existing runs (#7013, @gabrielfu)
- [Tracking] Add support for experiment deletion in the mlflow gc CLI API (#6977, @shaikmoeed)
- [Models] Add support for environment restoration in the evaluate() API (#6728, @jerrylian-db)
- [Models] Remove restrictions on binary classification labels in the evaluate() API (#7077, @dbczumar)
- [Scoring] Add support for BooleanType to mlflow.pyfunc.spark_udf() (#6913, @BenWilson2)
- [SQLAlchemy] Add support for configurable Pool class options for SqlAlchemyStore (#6883, @mingyu89)
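The run ID search terms added to search_runs in this release can be assembled into a filter string as sketched below. The attributes.run_id prefix follows the form used in the MLflow search documentation. The helper name run_id_in_filter and the run IDs are made up for this example.

```python
from typing import Iterable


def run_id_in_filter(run_ids: Iterable[str], negate: bool = False) -> str:
    """Build a search_runs filter matching (or excluding) a set of run IDs."""
    ids = list(run_ids)
    if not ids:
        raise ValueError("at least one run ID is required")
    for rid in ids:
        # Run IDs are expected to be plain tokens; reject anything with quotes.
        if "'" in rid or '"' in rid:
            raise ValueError(f"unexpected quote in run ID: {rid!r}")
    quoted = ", ".join(f"'{rid}'" for rid in ids)
    operator = "NOT IN" if negate else "IN"
    return f"attributes.run_id {operator} ({quoted})"


include = run_id_in_filter(["a1b2c3d4", "e5f6a7b8"])
exclude = run_id_in_filter(["a1b2c3d4"], negate=True)

# These strings could then be passed along the lines of:
#   mlflow.search_runs(experiment_ids=["0"], filter_string=include)
```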
Bug fixes:
- [Pipelines] Enable Pipeline subprocess commands to create a new SparkSession if one does not exist (#6846, @prithvikannan)
- [Pipelines] Fix a rendering issue with bool column types in Step Card data profiles (#6907, @sunishsheth2009)
- [Pipelines] Add validation and an exception if required step files are missing (#7067, @mingyu89)
- [Pipelines] Change step configuration validation to only be performed during runtime execution of a step (#6967, @prithvikannan)
- [Tracking] Fix infinite recursion bug when inferring the model schema in mlflow.pyspark.ml.autolog() (#6831, @harupy)
- [UI] Remove the browser error notification when failing to fetch artifacts (#7001, @kevingreer)
- [Models] Allow mlflow-skinny package to serve as base requirement in MLmodel requirements (#6974, @BenWilson2)
- [Models] Fix an issue with code path resolution for loading SparkML models (#6968, @dbczumar)
- [Models] Fix an issue with dependency inference in logging SparkML models (#6912, @BenWilson2)
- [Models] Fix an issue involving potential duplicate downloads for SparkML models (#6903, @serena-ruan)
- [Models] Add missing pos_label to sklearn.metrics.precision_recall_curve in mlflow.evaluate() (#6854, @dbczumar)
- [SQLAlchemy] Fix a bug in SqlAlchemyStore where set_tag() updates the incorrect tags (#7027, @gabrielfu)
Documentation updates:
- [Models] Update details regarding the default Keras serialization format (#7022, @balvisio)
For a comprehensive list of changes, see the release change log, and check out the latest documentation on mlflow.org.