mlflow.types

The mlflow.types module defines data types and utilities that other MLflow components use to describe interfaces independently of other frameworks or languages.

class mlflow.types.Schema(cols: List[mlflow.types.schema.ColSpec])[source]

Bases: object

Specification of types and column names in a dataset.

A Schema is represented as a list of ColSpec. The columns in a schema can be named, with a unique, non-empty name for every column, or unnamed, with an implicit integer index defined by their position in the list. Combining named and unnamed columns is not allowed.
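The naming rule above can be sketched in plain Python. This is a minimal illustration of the rule as stated, not MLflow's validation code: `check_column_names` is a hypothetical helper, and it takes a bare list of names (with `None` for an unnamed column) rather than ColSpec objects.

```python
from typing import List, Optional


def check_column_names(names: List[Optional[str]]) -> bool:
    """Return True if every column is named, False if none are.

    Hypothetical helper: raises ValueError when the documented rule is broken.
    """
    named = [n is not None for n in names]
    # Mixing named and unnamed columns is not allowed.
    if any(named) and not all(named):
        raise ValueError("cannot combine named and unnamed columns")
    if names and all(named):
        # Named columns need non-empty, unique names.
        if any(n == "" for n in names):
            raise ValueError("column names must be non-empty")
        if len(set(names)) != len(names):
            raise ValueError("column names must be unique")
        return True
    return False
```

For example, `check_column_names(["a", "b"])` accepts a fully named schema, `check_column_names([None, None])` accepts a fully unnamed one, and `check_column_names(["a", None])` raises.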

as_spark_schema()[source]

Convert to a Spark schema. If this schema is a single unnamed column, it is converted directly to the corresponding Spark data type; otherwise it is returned as a struct (missing column names are filled with an integer sequence).

column_names() → List[Union[str, int]][source]

Get the list of column names, or a range of indices if the schema has no column names.
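The fallback from names to indices can be illustrated with a small stand-in. This is a sketch under assumptions, not the library's implementation: `column_names_or_indices` is a hypothetical function operating on a plain list of names, where `None` marks an unnamed column.

```python
from typing import List, Optional, Union


def column_names_or_indices(names: List[Optional[str]]) -> List[Union[str, int]]:
    """Hypothetical stand-in for the documented behaviour of column_names()."""
    # No names declared: fall back to the positional indices 0..n-1.
    if all(n is None for n in names):
        return list(range(len(names)))
    # Names declared: return them in column order.
    return list(names)
```

The return type mirrors the `List[Union[str, int]]` annotation in the signature: strings for a named schema, integers for an unnamed one.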

column_types() → List[mlflow.types.schema.DataType][source]

Get column types of the columns in the dataset.

property columns

The list of columns that defines this schema.

classmethod from_json(json_str: str)[source]

Deserialize from a JSON string.

has_column_names() → bool[source]

Return True if and only if this schema declares column names, False otherwise.

numpy_types() → List[numpy.dtype][source]

Convenience shortcut to get the data types as NumPy types.

pandas_types() → List[numpy.dtype][source]

Convenience shortcut to get the data types as pandas types.

to_dict() → List[Dict[str, Any]][source]

Serialize into a JSON-serializable list of dictionaries.

to_json() → str[source]

Serialize into a JSON string.
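The relationship between to_dict(), to_json(), and from_json() can be sketched with the standard library alone. The snippet below assumes a serialized form of one dictionary per column; the key names "name" and "type" and the sample columns are illustrative assumptions, not taken from the text above.

```python
import json

# Assumed shape of a to_dict() result: a list with one dict per column.
# The keys "name" and "type" are hypothetical, chosen for illustration.
columns = [
    {"name": "age", "type": "integer"},
    {"name": "score", "type": "double"},
]

# What to_json() is described as producing: a JSON string.
json_str = json.dumps(columns)

# What from_json() would start from: parsing that string back into a list.
restored = json.loads(json_str)
```

Because the structure holds only strings, lists, and dictionaries, the round trip through JSON is lossless.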

class mlflow.types.ColSpec(type: mlflow.types.schema.DataType, name: Optional[str] = None)[source]

Bases: object

Specification of name and type of a single column in a dataset.

property name

The column name, or None if the column is unnamed.

property type

The column data type.

class mlflow.types.DataType[source]

Bases: enum.Enum

MLflow data types.

binary = 7

Sequence of raw bytes.

boolean = 1

Logical data (True, False).

double = 5

64-bit floating point numbers.

float = 4

32-bit floating point numbers.

integer = 2

32-bit signed integer numbers.

long = 3

64-bit signed integer numbers.

string = 6

Text data.

to_numpy() → numpy.dtype[source]

Get equivalent numpy data type.

to_pandas() → numpy.dtype[source]

Get equivalent pandas data type.
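Since DataType derives from enum.Enum, members can be looked up by name or by value. The stand-in below mirrors the member names and values listed above so it runs without MLflow installed. The `NUMPY_NAME` table and `numpy_name` helper are hypothetical: they map the members with a documented bit width to the NumPy dtype name that width implies, and deliberately leave out string and binary, for which the text gives no width.

```python
from enum import Enum
from typing import Optional


class DataType(Enum):
    """Stand-in enum copying the names and values documented above."""
    boolean = 1
    integer = 2
    long = 3
    float = 4
    double = 5
    string = 6
    binary = 7


# Assumed NumPy dtype names, inferred from the documented widths only.
NUMPY_NAME = {
    DataType.boolean: "bool",
    DataType.integer: "int32",
    DataType.long: "int64",
    DataType.float: "float32",
    DataType.double: "float64",
}


def numpy_name(dtype: DataType) -> Optional[str]:
    """Hypothetical helper: dtype name for fixed-width members, else None."""
    return NUMPY_NAME.get(dtype)
```

With this stand-in, `DataType["double"]` and `DataType(5)` resolve to the same member, which is standard enum.Enum behaviour.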