`whylogs.mlflow.patcher`¶

Module Contents¶

Classes¶

WhyLogsRun

Functions¶

`_new_mlflow_conda_env`(path=None, additional_conda_deps=None, additional_pip_deps=None, additional_conda_channels=None, install_mlflow=True)
`_new_add_to_model`(model, loader_module, data=None, code=None, env=None, **kwargs)	Replaces MLFlow's original add_to_model
`new_model_log`(**kwargs)	Hijack the mlflow.models.Model.log method and upload the .whylogs.yaml configuration to the model path
`enable_mlflow`(session=None) → bool	Enable whylogs in `mlflow` module via `mlflow.whylogs`.
`disable_mlflow`()

Attributes¶

`logger`
`_mlflow`
`_original_end_run`
`_active_whylogs`
`_is_patched`
`_original_mlflow_conda_env`
`_original_add_to_model`
`_original_model_log`
`WHYLOG_YAML`

whylogs.mlflow.patcher.logger¶

whylogs.mlflow.patcher._mlflow¶

whylogs.mlflow.patcher._original_end_run¶

whylogs.mlflow.patcher._active_whylogs = []¶

whylogs.mlflow.patcher._is_patched = False¶

whylogs.mlflow.patcher._original_mlflow_conda_env¶

whylogs.mlflow.patcher._original_add_to_model¶

whylogs.mlflow.patcher._original_model_log¶

class whylogs.mlflow.patcher.WhyLogsRun(session=None)¶

Bases: object

_session¶

_active_run_id¶

_loggers :Dict[str, whylogs.app.logger.Logger]¶

_create_logger(self, dataset_name: Optional[str] = None, dataset_timestamp: Optional[datetime.datetime] = None)¶

log_pandas(self, df: pandas.DataFrame, dataset_name: Optional[str] = None, dataset_timestamp: Optional[datetime.datetime] = None)¶

Log the statistics of a Pandas dataframe. Note that this method is additive within a run: calling it again with the same dataset name does not generate a new profile; instead, the data is aggregated into the existing profile.

To create a new profile, specify a different dataset_name.

Parameters

df – the Pandas dataframe to log
dataset_name – the name of the dataset (Optional). If not specified, the experiment name is used
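
The additive behavior described above can be illustrated with a minimal sketch. It does not use whylogs or MLflow: `ToyRun`, `log_rows`, and `DEFAULT_NAME` are hypothetical stand-ins, and the "profile" here is only a row count, assuming one profile is kept per dataset name.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for the MLflow experiment name used as the default.
DEFAULT_NAME = "default-experiment"


class ToyRun:
    """Hypothetical stand-in for WhyLogsRun; each 'profile' is just a row count."""

    def __init__(self) -> None:
        self._profiles: Dict[str, int] = {}

    def log_rows(self, rows: List[dict], dataset_name: Optional[str] = None) -> None:
        # Fall back to the default name when no dataset name is given.
        name = dataset_name or DEFAULT_NAME
        # Aggregate into the existing profile instead of creating a new one.
        self._profiles[name] = self._profiles.get(name, 0) + len(rows)


run = ToyRun()
rows = [{"a": 1}, {"a": 2}]
run.log_rows(rows)                     # creates the default profile
run.log_rows(rows)                     # aggregates into the same profile
run.log_rows(rows, "another dataset")  # a new name creates a second profile
print(run._profiles)  # {'default-experiment': 4, 'another dataset': 2}
```

In this sketch, two calls under the default name yield one profile covering four rows, while the named call yields a separate profile.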

log(self, features: Optional[Dict[str, any]] = None, feature_name: Optional[str] = None, value: any = None, dataset_name: Optional[str] = None)¶

Logs a collection of features or a single feature (must specify one or the other).

Parameters

features – a map of feature names to values for model input
feature_name – name of a single feature. Cannot be specified if ‘features’ is specified
value – value of a single feature. Cannot be specified if ‘features’ is specified
dataset_name – the name of the dataset. If not specified, we fall back to using the experiment name
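
The "one or the other" rule for these parameters can be sketched as a small validation helper. This is an illustration only, not the whylogs implementation: `resolve_features` is a hypothetical name, and the choice to raise `ValueError` is an assumption.

```python
from typing import Any, Dict, Optional


def resolve_features(
    features: Optional[Dict[str, Any]] = None,
    feature_name: Optional[str] = None,
    value: Any = None,
) -> Dict[str, Any]:
    """Hypothetical helper: normalize both calling styles into a single dict."""
    if features is not None:
        # A feature map excludes the single-feature arguments.
        if feature_name is not None or value is not None:
            raise ValueError("specify either 'features' or 'feature_name'/'value', not both")
        return dict(features)
    if feature_name is not None:
        return {feature_name: value}
    raise ValueError("specify either 'features' or 'feature_name'")


print(resolve_features(features={"age": 31, "country": "NZ"}))
print(resolve_features(feature_name="age", value=31))
```

Normalizing to a dict first means the rest of a logging routine only has to handle one input shape.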

_get_or_create_logger(self, dataset_name: Optional[str] = None, dataset_timestamp: Optional[datetime.datetime] = None)¶

_close(self)¶

whylogs.mlflow.patcher._new_mlflow_conda_env(path=None, additional_conda_deps=None, additional_pip_deps=None, additional_conda_channels=None, install_mlflow=True)¶

whylogs.mlflow.patcher._new_add_to_model(model, loader_module, data=None, code=None, env=None, **kwargs)¶

Replaces MLFlow’s original add_to_model: https://github.com/mlflow/mlflow/blob/4e68f960d4520ade6b64a28c297816f622adc83e/mlflow/pyfunc/__init__.py#L242

Accepts the same signature as MLFlow’s original add_to_model call. We inject our loader module.

We also inject whylogs into the Conda environment by patching _mlflow_conda_env.

Parameters

model – Existing model.
loader_module – The module to be used to load the model.
data – Path to the model data.
code – Path to the code dependencies.
env – Conda environment.
kwargs – Additional key-value pairs to include in the pyfunc flavor specification. Values must be YAML-serializable.

Returns

Updated model configuration.

whylogs.mlflow.patcher.WHYLOG_YAML = '.whylogs.yaml'¶

whylogs.mlflow.patcher.new_model_log(**kwargs)¶

Hijack the mlflow.models.Model.log method and upload the .whylogs.yaml configuration to the model path. This allows the configuration to be picked up later under the /opt/ml/model/.whylogs.yaml path.

whylogs.mlflow.patcher.enable_mlflow(session=None) → bool¶

Enable whylogs in mlflow module via mlflow.whylogs.

Returns: True if MLFlow has been patched. False otherwise.

Example of whylogs and MLFlow¶

import mlflow
import whylogs

whylogs.enable_mlflow()

import pandas as pd
pdf = pd.DataFrame(
    data=[[1, 2, 3, 4, True, "x", bytes([1])]],
    columns=["b", "d", "a", "c", "e", "g", "f"],
    dtype=object,  # np.object was removed in NumPy 1.24; the builtin object is equivalent
)

active_run = mlflow.start_run()

# log a Pandas dataframe under default name
mlflow.whylogs.log_pandas(pdf)

# log a Pandas dataframe with custom name
mlflow.whylogs.log_pandas(pdf, "another dataset")

# Finish the MLFlow run
mlflow.end_run()

whylogs.mlflow.patcher.disable_mlflow()¶
