R API
The MLflow R API lets you use MLflow Tracking, Projects, and Models from R.
With the R API you can install MLflow, start the user interface, create and list experiments, save models, run projects, and serve models, among the many other functions it provides.
Table of Contents
- Crate a function to share with another process
- Is an object a crate?
- Active Run
- MLflow Command
- Create Experiment - Tracking Client
- Create Run
- Delete Experiment
- Delete a Run
- Download Artifacts
- Get Experiment by Name
- Get Experiment
- Get Run
- List Artifacts
- List Experiments
- Log Artifact
- Log Metric
- Log Parameter
- Restore Experiment
- Restore a Run
- Set Tag
- Terminate a Run
- Initialize an MLflow Client
- Create Experiment
- End a Run
- Get Remote Tracking URI
- Install MLflow
- Load MLflow Model Flavor
- Load MLflow Model
- Log Artifact
- Log Metric
- Log Model
- Log Parameter
- Read Command-Line Parameter
- Predict over MLflow Model Flavor
- Generate Prediction with MLflow Model
- Restore Snapshot
- Predict using RFunc MLflow Model
- Serve an RFunc MLflow Model
- Run in MLflow
- Save MLflow Keras Model Flavor
- Save MLflow Model Flavor
- Save Model for MLflow
- Run MLflow Tracking Server
- Set Experiment
- Set Tag
- Set Remote Tracking URI
- Dependencies Snapshot
- Source a Script with MLflow Params
- Start Run
- Run MLflow User Interface
- Uninstall MLflow
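As a quick orientation before the individual entries, the sketch below shows a typical fluent-API session. This is illustrative only: the tracking URI, parameter names, and metric values are placeholders, not part of the reference.

```r
library(mlflow)

# One-time setup: install the Python MLflow components the R package uses
mlflow_install()

# Point the session at a tracking server (hypothetical local URI)
mlflow_set_tracking_uri("http://127.0.0.1:5000")

# Log a parameter and a metric; a run is started implicitly if none is active
mlflow_log_param("alpha", 0.5)
mlflow_log_metric("rmse", 0.21)
mlflow_end_run()
```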
MLflow Command
Runs a generic MLflow command through the command-line interface.
mlflow_cli(..., background = FALSE, echo = TRUE,
stderr_callback = NULL)
Arguments
Argument | Description
---|---
`...` | The parameters to pass to the command line.
`background` | Should this command be triggered as a background task? Defaults to `FALSE`.
`echo` | Print the standard output and error to the screen? Defaults to `TRUE`; does not apply to background tasks.
`stderr_callback` | `NULL`, or a function to call for every chunk of the standard error.
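For instance, a minimal sketch of invoking the CLI through this wrapper (the subcommand shown is illustrative; any `mlflow` CLI subcommand can be passed the same way):

```r
library(mlflow)

# Run `mlflow experiments list` via the CLI wrapper;
# echo = TRUE prints the command's output to the console
mlflow_cli("experiments", "list", echo = TRUE)

# Long-running commands can instead be sent to the background:
# mlflow_cli("server", background = TRUE)
```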
Create Experiment - Tracking Client
Creates an MLflow experiment.
mlflow_client_create_experiment(client, name, artifact_location = NULL)
Create Run
Create a new run within an experiment. A run is usually a single execution of a machine learning or data ETL pipeline.
mlflow_client_create_run(client, experiment_id, user_id = NULL,
run_name = NULL, source_type = NULL, source_name = NULL,
entry_point_name = NULL, start_time = NULL, source_version = NULL,
tags = NULL)
Arguments
Argument | Description
---|---
`client` | An `mlflow_client` object.
`experiment_id` | Unique identifier for the associated experiment.
`user_id` | User ID or LDAP for the user executing the run.
`run_name` | Human-readable name for the run.
`source_type` | Originating source for this run. One of Notebook, Job, Project, Local, or Unknown.
`source_name` | String descriptor for the source, for example the name or description of the notebook, or the job name.
`entry_point_name` | Name of the entry point for the run.
`start_time` | Unix timestamp of when the run started, in milliseconds.
`source_version` | Git version of the source code used to create the run.
`tags` | Additional metadata for the run, as key-value pairs.
Details
MLflow uses runs to track Param, Metric, and RunTag entities associated with a single execution.
The Tracking Client family of functions requires an MLflow client to be specified explicitly. These functions allow greater control over where operations take place in terms of services and runs, but are more verbose than the Fluent API.
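The explicit-client workflow can be sketched as follows. The experiment name is illustrative, and the exact shape of the object returned by `mlflow_client_create_run()` may vary by MLflow version:

```r
library(mlflow)

# Initialize a client against the tracking URI currently in effect
client <- mlflow_client()

# Create an experiment and a run under it; every subsequent
# tracking call names both the client and the run explicitly
exp_id <- mlflow_client_create_experiment(client, "my-experiment")
run    <- mlflow_client_create_run(client, exp_id, run_name = "first-run")
```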
Delete Experiment
Marks an experiment and its associated runs, params, metrics, etc. for deletion. If the experiment uses FileStore, artifacts associated with the experiment are also deleted.
mlflow_client_delete_experiment(client, experiment_id)
Download Artifacts
Downloads an artifact file or directory from a run to a local directory, if applicable, and returns a local path to it.
mlflow_client_download_artifacts(client, run_id, path)
Get Experiment by Name
Gets metadata for an experiment by name.
mlflow_client_get_experiment_by_name(client, name)
Get Experiment
Gets metadata for an experiment and a list of runs for the experiment.
mlflow_client_get_experiment(client, experiment_id)
Get Run
Gets metadata, params, tags, and metrics for a run. In the case where multiple metrics with the same key are logged for the run, returns only the value with the latest timestamp. If there are multiple values with the latest timestamp, returns the maximum of these values.
mlflow_client_get_run(client, run_id)
List Artifacts
Gets a list of artifacts.
mlflow_client_list_artifacts(client, run_id, path = NULL)
List Experiments
Gets a list of all experiments.
mlflow_client_list_experiments(client, view_type = c("ACTIVE_ONLY",
"DELETED_ONLY", "ALL"))
Log Artifact
Logs a specific file or directory as an artifact for a run.
mlflow_client_log_artifact(client, run_id, path, artifact_path = NULL)
Arguments
Argument | Description
---|---
`client` | An `mlflow_client` object.
`run_id` | Run ID.
`path` | The file or directory to log as an artifact.
`artifact_path` | Destination path within the run's artifact URI.
Details
The Tracking Client family of functions requires an MLflow client to be specified explicitly. These functions allow greater control over where operations take place in terms of services and runs, but are more verbose than the Fluent API.
When logging to Amazon S3, ensure that the user has a proper policy attached. Additionally, at least the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables must be set to the corresponding key and secret provided by Amazon IAM.
Log Metric
Logs a metric for a run. A metric is a key-value pair that records a single float measure. During a single execution of a run, a particular metric can be logged several times; the backend keeps track of the historical values along with their timestamps.
mlflow_client_log_metric(client, run_id, key, value, timestamp = NULL)
Log Parameter
Logs a parameter for a run. Examples are params and hyperparams used for ML training, or constant dates and values used in an ETL pipeline. A param is a STRING key-value pair. For a run, a single parameter is allowed to be logged only once.
mlflow_client_log_param(client, run_id, key, value)
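A brief sketch of client-side logging, assuming a `client` and `run_id` obtained from `mlflow_client()` and `mlflow_client_create_run()` as described above (the names and values here are illustrative):

```r
# A param is logged once per run
mlflow_client_log_param(client, run_id, "learning_rate", "0.01")

# A metric may be logged repeatedly; the backend keeps the history
for (epoch in 1:3) {
  mlflow_client_log_metric(client, run_id, "loss", 1 / epoch)
}
```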
Restore Experiment
Restores an experiment marked for deletion. This also restores associated metadata, runs, metrics, and params. If the experiment uses FileStore, the underlying artifacts associated with the experiment are also restored.
mlflow_client_restore_experiment(client, experiment_id)
Arguments
Argument | Description
---|---
`client` | An `mlflow_client` object.
`experiment_id` | ID of the associated experiment. This field is required.
Details
Throws `RESOURCE_DOES_NOT_EXIST` if the experiment was never created or was permanently deleted.
The Tracking Client family of functions requires an MLflow client to be specified explicitly. These functions allow greater control over where operations take place in terms of services and runs, but are more verbose than the Fluent API.
Set Tag
Sets a tag on a run. Tags are run metadata that can be updated during a run and after a run completes.
mlflow_client_set_tag(client, run_id, key, value)
Terminate a Run
Terminates a run.
mlflow_client_set_terminated(client, run_id, status = c("FINISHED",
"SCHEDULED", "FAILED", "KILLED"), end_time = NULL)
Create Experiment
Creates an MLflow experiment.
mlflow_create_experiment(name, artifact_location = NULL)
End a Run
Ends an active MLflow run (if there is one).
mlflow_end_run(status = c("FINISHED", "SCHEDULED", "FAILED", "KILLED"))
Install MLflow
Installs MLflow for individual use.
mlflow_install()
Details
MLflow requires Python and Conda to be installed. See https://www.python.org/getit/ and https://docs.conda.io/projects/conda/en/latest/user-guide/install/.
Load MLflow Model Flavor
Loads an MLflow model flavor, to be used by package authors to extend the supported MLflow models.
mlflow_load_flavor(model_path)
Load MLflow Model
Loads an MLflow model. MLflow models can have multiple model flavors. Not all flavors / models can be loaded in R. This method by default searches for a flavor supported by R/MLflow.
mlflow_load_model(model_path, flavor = NULL, run_id = NULL)
Arguments
Argument | Description
---|---
`model_path` | Path to the MLflow model. The path is relative to the run with the given `run_id`, or a local filesystem path when no `run_id` is supplied.
`flavor` | Optional flavor specification. Can be used to load a particular flavor when multiple flavors are available.
`run_id` | Optional MLflow run ID. If supplied, the model is fetched from the MLflow tracking server.
Log Artifact
Logs a specific file or directory as an artifact for this run.
mlflow_log_artifact(path, artifact_path = NULL)
Arguments
Argument | Description
---|---
`path` | The file or directory to log as an artifact.
`artifact_path` | Destination path within the run's artifact URI.
Details
The fluent API family of functions operates with an implied MLflow client determined by the service set by mlflow_set_tracking_uri(). For operations involving a run, it adopts the current active run or, if one does not exist, starts one through the implied service.
When logging to Amazon S3, ensure that the user has a proper policy attached. Additionally, at least the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables must be set to the corresponding key and secret provided by Amazon IAM.
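For example, a local file can be written and then attached to the active run (the file name and subdirectory here are illustrative):

```r
library(mlflow)

# Write a file locally, then log it under the active run
write.csv(iris, "iris.csv", row.names = FALSE)
mlflow_log_artifact("iris.csv")

# Or place it under a subdirectory of the run's artifact URI
mlflow_log_artifact("iris.csv", artifact_path = "data")
```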
Log Metric
Logs a metric for this run. A metric is a key-value pair that records a single float measure. During a single execution of a run, a particular metric can be logged several times; the backend keeps track of the historical values along with their timestamps.
mlflow_log_metric(key, value, timestamp = NULL)
Log Model
Logs a model for this run. Similar to mlflow_save_model() but stores the model as an artifact within the active run.
mlflow_log_model(fn, artifact_path)
Log Parameter
Logs a parameter for this run. Examples are params and hyperparams used for ML training, or constant dates and values used in an ETL pipeline. A param is a STRING key-value pair. For a run, a single parameter is allowed to be logged only once.
mlflow_log_param(key, value)
Read Command-Line Parameter
Reads a command-line parameter.
mlflow_param(name, default = NULL, type = NULL, description = NULL)
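A sketch of how this is typically used inside a project script run through `mlflow run` or `mlflow_source()`; the parameter name and default are illustrative:

```r
# train.R -- intended to be run non-interactively, e.g. via the MLflow CLI;
# the default applies when no value is passed on the command line.
library(mlflow)

alpha <- mlflow_param("alpha", default = 0.5, type = "numeric",
                      description = "Regularization strength")
mlflow_log_param("alpha", alpha)
```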
Predict over MLflow Model Flavor
Performs prediction over a model loaded using mlflow_load_model(), to be used by package authors to extend the supported MLflow models.
mlflow_predict_flavor(model, data)
Generate Prediction with MLflow Model
Generates a prediction with an MLflow model.
mlflow_predict_model(model, data)
Restore Snapshot
Restores a snapshot of all dependencies required to run the files in the current directory.
mlflow_restore_snapshot()
Predict using RFunc MLflow Model
Performs prediction using an RFunc MLflow model from a file or data frame.
mlflow_rfunc_predict(model_path, run_uuid = NULL, input_path = NULL,
output_path = NULL, data = NULL, restore = FALSE)
Arguments
Argument | Description
---|---
`model_path` | The path to the MLflow model, as a string.
`run_uuid` | Run ID of the run to grab the model from.
`input_path` | Path to a JSON or CSV file to be used for prediction.
`output_path` | JSON or CSV file where the prediction will be written.
`data` | Data frame to be scored. This can be used for testing purposes and can only be specified when `input_path` is not specified.
`restore` | Should `mlflow_restore_snapshot()` be called before serving?
Examples
```r
library(mlflow)

# save simple model which roundtrips data as prediction
mlflow_save_model(function(df) df, "mlflow_roundtrip")

# save data as json
jsonlite::write_json(iris, "iris.json")

# predict existing model from json data
mlflow_rfunc_predict("mlflow_roundtrip", input_path = "iris.json")
```
Serve an RFunc MLflow Model
Serves an RFunc MLflow model as a local web API.
mlflow_rfunc_serve(model_path, run_uuid = NULL, host = "127.0.0.1",
port = 8090, daemonized = FALSE, browse = !daemonized,
restore = FALSE)
Arguments
Argument | Description
---|---
`model_path` | The path to the MLflow model, as a string.
`run_uuid` | ID of the run to grab the model from.
`host` | Address to use to serve the model, as a string.
`port` | Port to use to serve the model, as numeric.
`daemonized` | Makes the httpuv server daemonized so interactive R sessions are not blocked while handling requests. To terminate a daemonized server, call `httpuv::stopDaemonizedServer()` with the handle returned from this call.
`browse` | Launch a browser with the serving landing page?
`restore` | Should `mlflow_restore_snapshot()` be called before serving?
Examples
```r
library(mlflow)

# save simple model with constant prediction
mlflow_save_model(function(df) 1, "mlflow_constant")

# serve an existing model over a web interface
mlflow_rfunc_serve("mlflow_constant")

# request prediction from server
httr::POST("http://127.0.0.1:8090/predict/")
```
Run in MLflow
Wrapper for the `mlflow run` CLI command.
mlflow_run(entry_point = NULL, uri = ".", version = NULL,
param_list = NULL, experiment_id = NULL, mode = NULL,
cluster_spec = NULL, git_username = NULL, git_password = NULL,
no_conda = FALSE, storage_dir = NULL)
Arguments
Argument | Description
---|---
`entry_point` | Entry point within the project; defaults to `main` if not specified.
`uri` | A directory containing modeling scripts; defaults to the current directory.
`version` | Version of the project to run, as a Git commit reference for Git projects.
`param_list` | A list of parameters.
`experiment_id` | ID of the experiment under which to launch the run.
`mode` | Execution mode to use for the run.
`cluster_spec` | Path to a JSON file describing the cluster to use when launching a run on Databricks.
`git_username` | Username for HTTP(S) Git authentication.
`git_password` | Password for HTTP(S) Git authentication.
`no_conda` | If specified, assume that MLflow is running within a Conda environment with the necessary dependencies for the current project instead of attempting to create a new Conda environment. Only valid if running locally.
`storage_dir` | Valid only when `mode` is local. MLflow downloads artifacts from distributed URIs passed to parameters of type `path` to subdirectories of `storage_dir`.
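A minimal sketch of launching a project through this wrapper; the entry point name and parameter list are illustrative:

```r
library(mlflow)

# Run the project in the current directory with an overridden parameter
mlflow_run(uri = ".", entry_point = "main",
           param_list = list(alpha = 0.1))

# A project can also be run straight from a Git repository, e.g.:
# mlflow_run(uri = "https://github.com/mlflow/mlflow-example")
```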
Save MLflow Keras Model Flavor
Saves model in MLflow Keras flavor.
## S3 method for class 'keras.engine.training.Model'
mlflow_save_flavor(x, path = "model", r_dependencies = NULL,
conda_env = NULL)
Arguments
Argument | Description
---|---
`x` | The serving function or model that will perform a prediction.
`path` | Destination path where this MLflow-compatible model will be saved.
`r_dependencies` | Optional vector of paths to dependency files to include in the model, such as `r-dependencies.txt` or `conda.yaml`.
`conda_env` | Path to a Conda dependencies file.
Save MLflow Model Flavor
Saves model in MLflow flavor, to be used by package authors to extend the supported MLflow models.
mlflow_save_flavor(x, path = "model", r_dependencies = NULL,
conda_env = NULL)
Arguments
Argument | Description
---|---
`x` | The serving function or model that will perform a prediction.
`path` | Destination path where this MLflow-compatible model will be saved.
`r_dependencies` | Optional vector of paths to dependency files to include in the model, such as `r-dependencies.txt` or `conda.yaml`.
`conda_env` | Path to a Conda dependencies file.
Save Model for MLflow
Saves model in MLflow format that can later be used for prediction and serving.
mlflow_save_model(x, path = "model", r_dependencies = NULL,
conda_env = NULL)
Arguments
Argument | Description
---|---
`x` | The serving function or model that will perform a prediction.
`path` | Destination path where this MLflow-compatible model will be saved.
`r_dependencies` | Optional vector of paths to dependency files to include in the model, such as `r-dependencies.txt` or `conda.yaml`.
`conda_env` | Path to a Conda dependencies file.
Run MLflow Tracking Server
Wrapper for the `mlflow server` CLI command.
mlflow_server(file_store = "mlruns", default_artifact_root = NULL,
host = "127.0.0.1", port = 5000, workers = 4,
static_prefix = NULL)
Arguments
Argument | Description
---|---
`file_store` | The root of the backing file store for experiment and run data.
`default_artifact_root` | Local or S3 URI to store artifacts in, for newly created experiments.
`host` | The network address to listen on (default: 127.0.0.1).
`port` | The port to listen on (default: 5000).
`workers` | Number of gunicorn worker processes to handle requests (default: 4).
`static_prefix` | A prefix which will be prepended to the path of all static paths.
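For instance, a local tracking server can be started with the defaults shown above; the follow-up call is how a separate R session would point at it:

```r
library(mlflow)

# Start a local tracking server backed by the ./mlruns file store
mlflow_server(file_store = "mlruns", host = "127.0.0.1", port = 5000)

# Other R sessions can then track against it:
# mlflow_set_tracking_uri("http://127.0.0.1:5000")
```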
Set Experiment
Sets an experiment as the active experiment. If the experiment does not exist, creates an experiment with the provided name.
mlflow_set_experiment(experiment_name)
Set Tag
Sets a tag on a run. Tags are run metadata that can be updated during a run and after a run completes.
mlflow_set_tag(key, value)
Set Remote Tracking URI
Specifies the URI to the remote MLflow server that will be used to track experiments.
mlflow_set_tracking_uri(uri)
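A short sketch; the server address is a placeholder, and the follow-up call retrieves the URI now in effect:

```r
library(mlflow)

# Track experiments against a remote server (hypothetical address)
mlflow_set_tracking_uri("http://tracking.example.com:5000")

# Confirm which URI subsequent fluent calls will use
mlflow_get_tracking_uri()
```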
Dependencies Snapshot
Creates a snapshot of all dependencies required to run the files in the current directory.
mlflow_snapshot()
Source a Script with MLflow Params
This function should not be used interactively. It is designed to be called via `Rscript` from the terminal or through the MLflow CLI.
mlflow_source(uri)
Start Run
Starts a new run within an experiment. Should be used within a `with` block.
mlflow_start_run(run_uuid = NULL, experiment_id = NULL,
source_name = NULL, source_version = NULL, entry_point_name = NULL,
source_type = "LOCAL")
Arguments
Argument | Description
---|---
`run_uuid` | If specified, get the run with the specified UUID and log metrics and params under that run. The run's end time is unset and its status is set to running, but the run's other attributes remain unchanged.
`experiment_id` | Used only when `run_uuid` is unspecified. ID of the experiment under which to create the current run. If unspecified, the run is created under a new experiment with a randomly generated name.
`source_name` | Name of the source file or URI of the project to be associated with the run. Defaults to the current file if none provided.
`source_version` | Optional Git commit hash to associate with the run.
`entry_point_name` | Optional name of the entry point for the current run.
`source_type` | Integer enum value describing the type of the run ("local", "project", etc.).
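The recommended `with` usage can be sketched as follows; the logged names and values are illustrative:

```r
library(mlflow)

# with() ends the run automatically when the block exits,
# so no explicit mlflow_end_run() call is needed
with(mlflow_start_run(), {
  mlflow_log_param("layers", 4)
  mlflow_log_metric("accuracy", 0.93)
})
```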
Run MLflow User Interface
Launches the MLflow user interface.
mlflow_ui(x, ...)