R API
The MLflow R API allows you to use MLflow Tracking, Projects, and Models.
For instance, you can use the R API to install MLflow, start the user interface, create and list experiments, save models, run projects, and serve models, among many other functions.
Table of Contents
- Crate a function to share with another process
- Is an object a crate?
- Active Run
- MLflow Command
- Create Experiment - Tracking Client
- Create Run
- Delete Experiment
- Delete a Run
- Download Artifacts
- Get Experiment by Name
- Get Experiment
- Get Run
- List artifacts
- List Experiments
- Log Artifact
- Log Metric
- Log Parameter
- Restore Experiment
- Restore a Run
- Set Tag
- Terminate a Run
- Initialize an MLflow client
- Create Experiment
- End a Run
- Get Remote Tracking URI
- Install MLflow
- Load MLflow Model Flavor
- Load MLflow Model
- Log Artifact
- Log Metric
- Log Model
- Log Parameter
- Read Command Line Parameter
- Predict over MLflow Model Flavor
- Generate prediction with MLflow model
- Restore Snapshot
- Predict using RFunc MLflow Model
- Serve an RFunc MLflow Model
- Run in MLflow
- Save MLflow Keras Model Flavor
- Save MLflow Model Flavor
- Save Model for MLflow
- Run the MLflow Tracking Server
- Set Experiment
- Set Tag
- Set Remote Tracking URI
- Dependencies Snapshot
- Source a Script with MLflow Params
- Start Run
- MLflow User Interface
- Uninstall MLflow
MLflow Command
Executes a generic MLflow command through the command line interface.
mlflow_cli(..., background = FALSE, echo = TRUE,
stderr_callback = NULL)
Arguments
| Argument | Description |
|---|---|
| `...` | The parameters to pass to the command line. |
| `background` | Should this command be triggered as a background task? Defaults to `FALSE`. |
| `echo` | Print the standard output and error to the screen? Defaults to `TRUE`; does not apply to background tasks. |
| `stderr_callback` | `NULL`, or a function to call for every chunk of the standard error. |
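As a minimal sketch, `mlflow_cli()` simply forwards its `...` arguments to the `mlflow` command line tool; the subcommand names below belong to the MLflow CLI itself and are an assumption, not part of this package's API:

```r
library(mlflow)

# Forward "experiments list" to the MLflow CLI and echo its output
mlflow_cli("experiments", "list")

# Run the same command as a background task without echoing output
mlflow_cli("experiments", "list", background = TRUE, echo = FALSE)
```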
Create Experiment - Tracking Client
Creates an MLflow experiment.
mlflow_client_create_experiment(client, name, artifact_location = NULL)
Create Run
Create a new run within an experiment. A run is usually a single execution of a machine learning or data ETL pipeline.
mlflow_client_create_run(client, experiment_id, user_id = NULL,
run_name = NULL, source_type = NULL, source_name = NULL,
entry_point_name = NULL, start_time = NULL, source_version = NULL,
tags = NULL)
Arguments
| Argument | Description |
|---|---|
| `client` | An `mlflow_client` object. |
| `experiment_id` | Unique identifier for the associated experiment. |
| `user_id` | User ID or LDAP for the user executing the run. |
| `run_name` | Human-readable name for the run. |
| `source_type` | Originating source for this run. One of Notebook, Job, Project, Local, or Unknown. |
| `source_name` | String descriptor for the source. For example, the name or description of the notebook, or the job name. |
| `entry_point_name` | Name of the entry point for the run. |
| `start_time` | Unix timestamp of when the run started, in milliseconds. |
| `source_version` | Git version of the source code used to create the run. |
| `tags` | Additional metadata for the run in key-value pairs. |
Details
MLflow uses runs to track Param, Metric, and RunTag entities associated with a single execution.
The Tracking Client family of functions require an MLflow client to be specified explicitly. These functions allow for greater control of where the operations take place in terms of services and runs, but are more verbose compared to the Fluent API.
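A minimal sketch of the Tracking Client workflow, assuming a client obtained from `mlflow_client()` (the experiment and run names are made up for illustration):

```r
library(mlflow)

# Initialize a client against the configured tracking service
client <- mlflow_client()

# Create an experiment and a run under it
exp_id <- mlflow_client_create_experiment(client, "demo-experiment")
run <- mlflow_client_create_run(client, exp_id, run_name = "first-run")

# Subsequent mlflow_client_log_* calls take the run id from this object
# (the exact field holding the id is an assumption of this sketch).
```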
Delete Experiment
Marks an experiment and associated metadata (runs, params, metrics, etc.) for deletion. If the experiment uses FileStore, artifacts associated with the experiment are also deleted.
mlflow_client_delete_experiment(client, experiment_id)
Download Artifacts
Download an artifact file or directory from a run to a local directory if applicable, and return a local path for it.
mlflow_client_download_artifacts(client, run_id, path)
Get Experiment by Name
Gets metadata for an experiment by name.
mlflow_client_get_experiment_by_name(client, name)
Get Experiment
Gets metadata for an experiment and a list of runs for the experiment.
mlflow_client_get_experiment(client, experiment_id)
Get Run
Gets metadata, params, tags, and metrics for a run. Only the last logged value for each metric is returned.
mlflow_client_get_run(client, run_id)
List artifacts
Lists the artifacts for a given run.
mlflow_client_list_artifacts(client, run_id, path = NULL)
List Experiments
Get a list of all experiments.
mlflow_client_list_experiments(client, view_type = c("ACTIVE_ONLY",
"DELETED_ONLY", "ALL"))
Log Artifact
Logs a specific file or directory as an artifact.
mlflow_client_log_artifact(client, run_id, path, artifact_path = NULL)
Arguments
| Argument | Description |
|---|---|
| `client` | An `mlflow_client` object. |
| `run_id` | Run ID. |
| `path` | The file or directory to log as an artifact. |
| `artifact_path` | Destination path within the run’s artifact URI. |
Details
The Tracking Client family of functions require an MLflow client to be specified explicitly. These functions allow for greater control of where the operations take place in terms of services and runs, but are more verbose compared to the Fluent API.
When logging to Amazon S3, ensure that the user has a proper policy attached. Additionally, at least the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables must be set to the corresponding key and secret provided by Amazon IAM.
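A short sketch of logging a file as an artifact with the client API; the run id below is a placeholder, not a real identifier:

```r
library(mlflow)

client <- mlflow_client()

# Write a small CSV and attach it to an existing run under "data/"
# ("some-run-id" is a placeholder for a real run id).
write.csv(iris, "iris.csv", row.names = FALSE)
mlflow_client_log_artifact(client, "some-run-id", "iris.csv",
                           artifact_path = "data")
```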
Log Metric
API to log a metric for a run. A metric is a key-value pair that records a single float measure. During a single execution of a run, a particular metric can be logged several times; the backend keeps track of historical values along with timestamps.
mlflow_client_log_metric(client, run_id, key, value, timestamp = NULL)
Log Parameter
API to log a parameter used for this run. Examples are params and hyperparams used for ML training, or constant dates and values used in an ETL pipeline. A param is a string key-value pair. For a run, a given parameter can be logged only once.
mlflow_client_log_param(client, run_id, key, value)
Restore Experiment
Restores an experiment marked for deletion. This also restores associated metadata, runs, metrics, and params. If the experiment uses FileStore, the underlying artifacts associated with the experiment are also restored.
mlflow_client_restore_experiment(client, experiment_id)
Arguments
| Argument | Description |
|---|---|
| `client` | An `mlflow_client` object. |
| `experiment_id` | ID of the associated experiment. This field is required. |
Details
Throws RESOURCE_DOES_NOT_EXIST if the experiment was never created or was permanently deleted.
The Tracking Client family of functions require an MLflow client to be specified explicitly. These functions allow for greater control of where the operations take place in terms of services and runs, but are more verbose compared to the Fluent API.
Set Tag
Set a tag on a run. Tags are run metadata that can be updated during and after a run completes.
mlflow_client_set_tag(client, run_id, key, value)
Terminate a Run
Terminates a run and sets its final status.
mlflow_client_set_terminated(client, run_id, status = c("FINISHED",
"SCHEDULED", "FAILED", "KILLED"), end_time = NULL)
Create Experiment
Creates an MLflow experiment.
mlflow_create_experiment(name, artifact_location = NULL)
End a Run
End an active MLflow run (if there is one).
mlflow_end_run(status = c("FINISHED", "SCHEDULED", "FAILED", "KILLED"))
Install MLflow
Installs MLflow for individual use.
mlflow_install()
Details
Note that MLflow requires Python and Conda to be installed; see https://www.python.org/getit/ and https://conda.io/docs/installation.html.
Load MLflow Model Flavor
Loads an MLflow model flavor, to be used by package authors to extend the supported MLflow models.
mlflow_load_flavor(model_path)
Load MLflow Model
MLflow models can have multiple model flavors. Not all flavors / models can be loaded in R. By default, this method searches for a flavor supported by R/MLflow.
mlflow_load_model(model_path, flavor = NULL, run_id = NULL)
Arguments
| Argument | Description |
|---|---|
| `model_path` | Path to the MLflow model. The path is relative to the run with the given run-id, or a local filesystem path without run-id. |
| `flavor` | Optional flavor specification. Can be used to load a particular flavor in case there are multiple flavors available. |
| `run_id` | Optional MLflow run-id. If supplied, the model will be fetched from the MLflow tracking server. |
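Loading a saved model and generating predictions can be sketched as follows, reusing the trivial roundtrip model from the examples later in this document (the model name is illustrative):

```r
library(mlflow)

# Save a trivial model that echoes its input, then load it back
mlflow_save_model(function(df) df, "mlflow_echo")
model <- mlflow_load_model("mlflow_echo")

# Generate predictions for a data frame
preds <- mlflow_predict_model(model, iris)
```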
Log Artifact
Logs a specific file or directory as an artifact.
mlflow_log_artifact(path, artifact_path = NULL)
Arguments
| Argument | Description |
|---|---|
| `path` | The file or directory to log as an artifact. |
| `artifact_path` | Destination path within the run’s artifact URI. |
Details
The fluent API family of functions operates with an implied MLflow client determined by the service set by mlflow_set_tracking_uri(). For operations involving a run, these functions adopt the current active run or, if one does not exist, start one through the implied service.
When logging to Amazon S3, ensure that the user has a proper policy attached. Additionally, at least the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables must be set to the corresponding key and secret provided by Amazon IAM.
Log Metric
API to log a metric for a run. A metric is a key-value pair that records a single float measure. During a single execution of a run, a particular metric can be logged several times; the backend keeps track of historical values along with timestamps.
mlflow_log_metric(key, value, timestamp = NULL)
Log Model
Logs a model in the given run. Similar to mlflow_save_model(), but stores the model as an artifact within the active run.
mlflow_log_model(fn, artifact_path)
Log Parameter
API to log a parameter used for this run. Examples are params and hyperparams used for ML training, or constant dates and values used in an ETL pipeline. A param is a string key-value pair. For a run, a given parameter can be logged only once.
mlflow_log_param(key, value)
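The fluent logging functions can be combined into a short tracking sketch; the param, metric, and tag names below are purely illustrative:

```r
library(mlflow)

# Start a run, log a param, a metric, and a tag, then end the run
mlflow_start_run()
mlflow_log_param("learning_rate", 0.01)
mlflow_log_metric("rmse", 0.25)
mlflow_set_tag("stage", "dev")
mlflow_end_run()
```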
Read Command Line Parameter
Reads a command line parameter.
mlflow_param(name, default = NULL, type = NULL, description = NULL)
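A sketch of a script that reads a parameter via mlflow_param() and is meant to be run non-interactively with Rscript or the MLflow CLI; the file name, parameter name, and the `type = "numeric"` value are assumptions of this sketch:

```r
# train.R -- intended to be run via `Rscript train.R`, not interactively
library(mlflow)

# Read a command line parameter with a default value
alpha <- mlflow_param("alpha", default = 0.5, type = "numeric",
                      description = "Regularization strength")

# Record the value that was actually used for this run
mlflow_log_param("alpha", alpha)
```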
Predict over MLflow Model Flavor
Performs prediction over a model loaded using mlflow_load_model(), to be used by package authors to extend the supported MLflow models.
mlflow_predict_flavor(model, data)
Generate prediction with MLflow model.
Generate prediction with MLflow model.
mlflow_predict_model(model, data)
Restore Snapshot
Restores a snapshot of all dependencies required to run the files in the current directory.
mlflow_restore_snapshot()
Predict using RFunc MLflow Model
Predict using an RFunc MLflow Model from a file or data frame.
mlflow_rfunc_predict(model_path, run_uuid = NULL, input_path = NULL,
output_path = NULL, data = NULL, restore = FALSE)
Arguments
| Argument | Description |
|---|---|
| `model_path` | The path to the MLflow model, as a string. |
| `run_uuid` | Run ID of the run to grab the model from. |
| `input_path` | Path to ‘JSON’ or ‘CSV’ file to be used for prediction. |
| `output_path` | ‘JSON’ or ‘CSV’ file where the prediction will be written to. |
| `data` | Data frame to be scored. This can be utilized for testing purposes and can only be specified when `input_path` is not specified. |
| `restore` | Should `mlflow_restore_snapshot()` be called before serving? |
Examples
```r
library(mlflow)

# save simple model which roundtrips data as prediction
mlflow_save_model(function(df) df, "mlflow_roundtrip")

# save data as json
jsonlite::write_json(iris, "iris.json")

# predict existing model from json data
mlflow_rfunc_predict("mlflow_roundtrip", "iris.json")
```
Serve an RFunc MLflow Model
Serve an RFunc MLflow Model as a local web api.
mlflow_rfunc_serve(model_path, run_uuid = NULL, host = "127.0.0.1",
port = 8090, daemonized = FALSE, browse = !daemonized,
restore = FALSE)
Arguments
| Argument | Description |
|---|---|
| `model_path` | The path to the MLflow model, as a string. |
| `run_uuid` | ID of the run to grab the model from. |
| `host` | Address to use to serve the model, as a string. |
| `port` | Port to use to serve the model, as numeric. |
| `daemonized` | Makes the ‘httpuv’ server daemonized so R interactive sessions are not blocked from handling requests. To terminate a daemonized server, call ‘httpuv::stopDaemonizedServer()’ with the handle returned from this call. |
| `browse` | Launch browser with serving landing page? |
| `restore` | Should `mlflow_restore_snapshot()` be called before serving? |
Examples
```r
library(mlflow)

# save simple model with constant prediction
mlflow_save_model(function(df) 1, "mlflow_constant")

# serve an existing model over a web interface
mlflow_rfunc_serve("mlflow_constant")

# request prediction from server
httr::POST("http://127.0.0.1:8090/predict/")
```
Run in MLflow
Wrapper for the `mlflow run` CLI command.
mlflow_run(entry_point = NULL, uri = ".", version = NULL,
param_list = NULL, experiment_id = NULL, mode = NULL,
cluster_spec = NULL, git_username = NULL, git_password = NULL,
no_conda = FALSE, storage_dir = NULL)
Arguments
| Argument | Description |
|---|---|
| `entry_point` | Entry point within the project; defaults to `main` if not specified. |
| `uri` | A directory containing modeling scripts; defaults to the current directory. |
| `version` | Version of the project to run, as a Git commit reference for Git projects. |
| `param_list` | A list of parameters. |
| `experiment_id` | ID of the experiment under which to launch the run. |
| `mode` | Execution mode to use for the run. |
| `cluster_spec` | Path to a JSON file describing the cluster to use when launching a run on Databricks. |
| `git_username` | Username for HTTP(S) Git authentication. |
| `git_password` | Password for HTTP(S) Git authentication. |
| `no_conda` | If specified, assume that MLflow is running within a Conda environment with the necessary dependencies for the current project, instead of attempting to create a new Conda environment. Only valid if running locally. |
| `storage_dir` | Only valid when `mode` is local. MLflow downloads artifacts from distributed URIs passed to parameters of type ‘path’ to subdirectories of `storage_dir`. |
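A sketch of launching a project run with this wrapper; the parameter name is illustrative, since valid parameters depend on the project's own MLproject definition:

```r
library(mlflow)

# Run the "main" entry point of the project in the current directory,
# passing one parameter; the parameter name must match the project's
# MLproject file ("alpha" here is an assumption).
mlflow_run(entry_point = "main", uri = ".",
           param_list = list(alpha = 0.5))
```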
Save MLflow Keras Model Flavor
Saves model in MLflow’s Keras flavor.
# S3 method for keras.engine.training.Model
mlflow_save_flavor(x, path = "model", r_dependencies = NULL, conda_env = NULL)
Arguments
| Argument | Description |
|---|---|
| `x` | The serving function or model that will perform a prediction. |
| `path` | Destination path where this MLflow compatible model will be saved. |
| `r_dependencies` | Optional vector of paths to dependency files to include in the model, such as `r-dependencies.txt` or `conda.yaml`. |
| `conda_env` | Path to the Conda dependencies file. |
Save MLflow Model Flavor
Saves model in MLflow’s flavor, to be used by package authors to extend the supported MLflow models.
mlflow_save_flavor(x, path = "model", r_dependencies = NULL,
conda_env = NULL)
Arguments
| Argument | Description |
|---|---|
| `x` | The serving function or model that will perform a prediction. |
| `path` | Destination path where this MLflow compatible model will be saved. |
| `r_dependencies` | Optional vector of paths to dependency files to include in the model, such as `r-dependencies.txt` or `conda.yaml`. |
| `conda_env` | Path to the Conda dependencies file. |
Save Model for MLflow
Saves model in MLflow’s format that can later be used for prediction and serving.
mlflow_save_model(x, path = "model", r_dependencies = NULL,
conda_env = NULL)
Arguments
| Argument | Description |
|---|---|
| `x` | The serving function or model that will perform a prediction. |
| `path` | Destination path where this MLflow compatible model will be saved. |
| `r_dependencies` | Optional vector of paths to dependency files to include in the model, such as `r-dependencies.txt` or `conda.yaml`. |
| `conda_env` | Path to the Conda dependencies file. |
Run the MLflow Tracking Server
Wrapper for the `mlflow server` CLI command.
mlflow_server(file_store = "mlruns", default_artifact_root = NULL,
host = "127.0.0.1", port = 5000, workers = 4,
static_prefix = NULL)
Arguments
| Argument | Description |
|---|---|
| `file_store` | The root of the backing file store for experiment and run data. |
| `default_artifact_root` | Local or S3 URI to store artifacts in, for newly created experiments. |
| `host` | The network address to listen on (default: 127.0.0.1). |
| `port` | The port to listen on (default: 5000). |
| `workers` | Number of gunicorn worker processes to handle requests (default: 4). |
| `static_prefix` | A prefix which will be prepended to the path of all static paths. |
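Starting a local tracking server can be sketched as follows; this call blocks the R session while the server runs:

```r
library(mlflow)

# Serve experiment/run data from the local "mlruns" store on port 5000
mlflow_server(file_store = "mlruns", host = "127.0.0.1", port = 5000)
```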
Set Experiment
Sets the given experiment as the active experiment. If the experiment does not exist, creates an experiment with the provided name.
mlflow_set_experiment(experiment_name)
Set Tag
Set a tag on a run. Tags are run metadata that can be updated during and after a run completes.
mlflow_set_tag(key, value)
Set Remote Tracking URI
Specifies the URI to the remote MLflow server that will be used to track experiments.
mlflow_set_tracking_uri(uri)
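A typical session sketch combining the two setters; the URI and experiment name below are placeholders:

```r
library(mlflow)

# Point the fluent API at a remote tracking server
# ("http://tracking-server:5000" is a placeholder URI)
mlflow_set_tracking_uri("http://tracking-server:5000")

# Pick (or create) the experiment that subsequent runs should go under
mlflow_set_experiment("my-experiment")
```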
Dependencies Snapshot
Creates a snapshot of all dependencies required to run the files in the current directory.
mlflow_snapshot()
Source a Script with MLflow Params
This function should not be used interactively. It is designed to be
called via Rscript
from the terminal or through the MLflow CLI.
mlflow_source(uri)
Start Run
Starts a new run within an experiment; it should be used within a `with` block.
mlflow_start_run(run_uuid = NULL, experiment_id = NULL,
source_name = NULL, source_version = NULL, entry_point_name = NULL,
source_type = "LOCAL")
Arguments
| Argument | Description |
|---|---|
| `run_uuid` | If specified, get the run with the specified UUID and log metrics and params under that run. The run’s end time is unset and its status is set to running, but the run’s other attributes remain unchanged. |
| `experiment_id` | Used only when `run_uuid` is unspecified. ID of the experiment under which to create the current run. If unspecified, the run is created under a new experiment with a randomly generated name. |
| `source_name` | Name of the source file or URI of the project to be associated with the run. Defaults to the current file if none is provided. |
| `source_version` | Optional Git commit hash to associate with the run. |
| `entry_point_name` | Optional name of the entry point for the current run. |
| `source_type` | Integer enum value describing the type of the run (“local”, “project”, etc.). |
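As noted in its description, mlflow_start_run() is designed for use inside a `with` block, which terminates the run when the block exits; a minimal sketch (the metric name is illustrative):

```r
library(mlflow)

# The run is automatically terminated when the with block exits
with(mlflow_start_run(), {
  mlflow_log_metric("accuracy", 0.91)
})
```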
MLflow User Interface
Launches the MLflow user interface.
mlflow_ui(x, ...)