Redacting Sensitive Data from Traces
Traces capture powerful insights for debugging and monitoring your application. However, they may contain sensitive data, such as Personally Identifiable Information (PII), that you don't want to share with others. MLflow provides a fully configurable way to mask sensitive data in traces before they are saved to the backend.
How It Works
MLflow lets you configure a list of post-processing hooks, called span processors, that are applied to each span in a trace. Each span processor is a function that takes a span as input and updates it in place.
- Define a custom filtering function and call mlflow.tracing.configure to register it.
- Whenever a new span is created, the registered filters are applied to it sequentially.
- MLflow sends the filtered span to the backend.
Because the filters run on the client side before the span is sent to the backend, the sensitive data never leaves your application.
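The flow above can be pictured with a small stand-alone simulation. This sketch does not use MLflow itself: DemoSpan, apply_filters, mask_password, and mask_token are hypothetical names introduced only to show filters mutating a span one after another before anything is sent out.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class DemoSpan:
    """Hypothetical stand-in for a span; this is not MLflow's Span class."""
    name: str
    inputs: Dict[str, Any] = field(default_factory=dict)
    outputs: Dict[str, Any] = field(default_factory=dict)


def mask_password(span: DemoSpan) -> None:
    # Mutates the span in place and returns nothing.
    if "password" in span.inputs:
        span.inputs["password"] = "[REDACTED]"


def mask_token(span: DemoSpan) -> None:
    if "token" in span.outputs:
        span.outputs["token"] = "[REDACTED]"


def apply_filters(span: DemoSpan, filters: List[Callable[[DemoSpan], None]]) -> None:
    # Filters run in registration order, each seeing the previous one's changes.
    for span_filter in filters:
        span_filter(span)


span = DemoSpan(
    name="authenticate",
    inputs={"user": "alice", "password": "hunter2"},
    outputs={"token": "abc123"},
)
apply_filters(span, [mask_password, mask_token])
# Only at this point would the (already filtered) span be exported.
```

Because each filter sees the result of the one before it, order matters when two filters touch the same field.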
Filtering Function
A filtering function must take a single argument, a Span object. It can mutate the span in place and must not return a value.
from mlflow.entities import Span

def filter_function(span: Span) -> None:
    ...
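As an illustration of that signature, here is a minimal sketch of a filter that masks e-mail addresses in a span's inputs. It assumes the span exposes an inputs property and a set_inputs method. SpanLike, FakeSpan, scrub, and redact_emails are hypothetical names, and the regular expression is a rough approximation rather than a full e-mail validator. FakeSpan exists only so the example runs without MLflow installed.

```python
import re
from typing import Any, Protocol

# Rough e-mail pattern for illustration; not a complete validator.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


class SpanLike(Protocol):
    """Assumed subset of the span interface that this filter relies on."""
    inputs: Any

    def set_inputs(self, inputs: Any) -> None: ...


def scrub(value: Any) -> Any:
    """Recursively replace e-mail addresses in strings, dicts, and lists."""
    if isinstance(value, str):
        return EMAIL_RE.sub("[EMAIL]", value)
    if isinstance(value, dict):
        return {key: scrub(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [scrub(item) for item in value]
    return value


def redact_emails(span: SpanLike) -> None:
    # Single span argument, in-place update, no return value.
    if span.inputs is not None:
        span.set_inputs(scrub(span.inputs))


class FakeSpan:
    """Hypothetical test double standing in for a real span."""

    def __init__(self, inputs: Any) -> None:
        self.inputs = inputs

    def set_inputs(self, inputs: Any) -> None:
        self.inputs = inputs


span = FakeSpan({"question": "Email bob@example.com about it", "tags": ["a@b.io", 3]})
redact_emails(span)

# With MLflow installed, registration would look roughly like this
# (the keyword name is an assumption; check the API reference):
# mlflow.tracing.configure(span_processors=[redact_emails])
```

Walking nested structures matters because span inputs are often dictionaries or lists of messages rather than flat strings.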