whylogs.viz.drift.column_drift_algorithms#
Module Contents#
Classes#
| DriftAlgorithmScore | Dataclass for storing drift algorithm score. |
| ColumnDriftAlgorithm | Abstract class for column drift algorithms. |
| Hellinger | Hellinger distance algorithm for column drift detection. |
| ChiSquare | Chi-Squared test algorithm for column drift detection. |
| KS | Kolmogorov-Smirnov test algorithm for column drift detection. |
Functions#
| calculate_drift_scores | Calculate drift scores for all columns in the target dataset profile. |
- class whylogs.viz.drift.column_drift_algorithms.DriftAlgorithmScore#
Dataclass for storing drift algorithm score.
- thresholds: Optional[whylogs.viz.drift.configs.DriftThresholds]#
- to_dict()#
- class whylogs.viz.drift.column_drift_algorithms.ColumnDriftAlgorithm(parameter_config: Optional[Any] = None)#
Bases: abc.ABC
Abstract class for column drift algorithms.
- Parameters
parameter_config (Optional[Any]) –
- abstract calculate(target_column_view: whylogs.core.view.column_profile_view.ColumnProfileView, reference_column_view: whylogs.core.view.column_profile_view.ColumnProfileView, with_thresholds: bool) Optional[DriftAlgorithmScore] #
Calculates drift score for a given column.
If with_thresholds is True, the thresholds defined in the parameter config are also returned, along with the final drift category.
- Parameters
target_column_view (whylogs.core.view.column_profile_view.ColumnProfileView) –
reference_column_view (whylogs.core.view.column_profile_view.ColumnProfileView) –
with_thresholds (bool) –
- Return type
Optional[DriftAlgorithmScore]
- abstract set_parameters(parameter_config: Any)#
- Parameters
parameter_config (Any) –
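The contract described above (an abstract `calculate` that may return `None`, a `set_parameters` hook, and thresholds attached only when `with_thresholds` is True) can be sketched in plain Python. This is a minimal illustration, not the whylogs implementation: `SketchScore`, `SketchDriftAlgorithm`, `MeanShift`, the `"drift"` threshold key, and the `"DRIFT"`/`"NO_DRIFT"` labels are all hypothetical names chosen for the example.

```python
from abc import ABC, abstractmethod
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional, Sequence


@dataclass
class SketchScore:
    """Hypothetical stand-in for DriftAlgorithmScore (fields are assumptions)."""

    algorithm: str
    statistic: float
    thresholds: Optional[Dict[str, float]] = None
    drift_category: Optional[str] = None

    def to_dict(self) -> Dict[str, Any]:
        return asdict(self)


class SketchDriftAlgorithm(ABC):
    """Hypothetical stand-in for ColumnDriftAlgorithm."""

    def __init__(self, parameter_config: Optional[Dict[str, float]] = None):
        # Fall back to an illustrative default threshold when no config is given.
        default = {"drift": 1.0}
        self.set_parameters(parameter_config if parameter_config is not None else default)

    @abstractmethod
    def calculate(
        self, target: Sequence[float], reference: Sequence[float], with_thresholds: bool
    ) -> Optional[SketchScore]:
        ...

    @abstractmethod
    def set_parameters(self, parameter_config: Dict[str, float]) -> None:
        ...


class MeanShift(SketchDriftAlgorithm):
    """Toy algorithm: absolute difference between the two column means."""

    def set_parameters(self, parameter_config: Dict[str, float]) -> None:
        self._config = dict(parameter_config)

    def calculate(self, target, reference, with_thresholds=False):
        if not target or not reference:
            return None  # required data is missing, so no score can be computed
        statistic = abs(sum(target) / len(target) - sum(reference) / len(reference))
        score = SketchScore(algorithm="mean_shift", statistic=statistic)
        if with_thresholds:
            # Attach the configured thresholds and derive a category from them.
            score.thresholds = dict(self._config)
            drifted = statistic >= self._config["drift"]
            score.drift_category = "DRIFT" if drifted else "NO_DRIFT"
        return score
```

Calling `MeanShift({"drift": 0.5}).calculate([1.0, 2.0, 3.0], [2.0, 3.0, 4.0], with_thresholds=True)` yields a score whose `thresholds` and `drift_category` are populated; with `with_thresholds=False` both stay `None`.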
- class whylogs.viz.drift.column_drift_algorithms.Hellinger(parameter_config: Optional[whylogs.viz.drift.configs.HellingerConfig] = None)#
Bases: ColumnDriftAlgorithm
Hellinger distance algorithm for column drift detection.
Requires the target and reference columns to have non-empty distribution metrics. The statistic is the Hellinger distance between the two distributions, which ranges from 0 (identical distributions) to 1 (no overlap).
- Parameters
parameter_config (Optional[whylogs.viz.drift.configs.HellingerConfig]) –
- calculate(target_column_view: whylogs.core.view.column_profile_view.ColumnProfileView, reference_column_view: whylogs.core.view.column_profile_view.ColumnProfileView, with_thresholds=False) Optional[DriftAlgorithmScore] #
Calculates drift score for a given column.
- Parameters
target_column_view (ColumnProfileView) – Column view of the target profile
reference_column_view (ColumnProfileView) – Column view of the reference profile
with_thresholds (bool, optional) – By default False. If True, the thresholds defined in the parameter config are also returned in the DriftAlgorithmScore object, along with the final drift category.
- Returns
Returns a DriftAlgorithmScore object containing the Hellinger distance statistic. If with_thresholds is True, also returns the thresholds defined in the parameter config and the final drift category. The drift category is determined by the statistic and the thresholds defined in the parameter config.
- Return type
Optional[DriftAlgorithmScore]
- abstract set_parameters(parameter_config: Any)#
- Parameters
parameter_config (Any) –
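To make the Hellinger statistic concrete, the sketch below computes the textbook discrete form, H(P, Q) = sqrt(0.5 * Σ (√p_i − √q_i)²), from two histograms that share the same bins. It is an illustration under stated assumptions: the function name `hellinger_distance` and the use of raw bin counts are choices made for this example, and how whylogs derives bins from its distribution metrics is not shown here.

```python
import math
from typing import Sequence


def hellinger_distance(target_counts: Sequence[float], reference_counts: Sequence[float]) -> float:
    """Hellinger distance between two histograms defined over identical bins."""
    if len(target_counts) != len(reference_counts):
        raise ValueError("both histograms must use the same bins")
    target_total = sum(target_counts)
    reference_total = sum(reference_counts)
    if target_total <= 0 or reference_total <= 0:
        raise ValueError("both histograms must be non-empty")

    # Normalize counts to probabilities, then sum squared differences of square roots.
    squared = sum(
        (math.sqrt(t / target_total) - math.sqrt(r / reference_total)) ** 2
        for t, r in zip(target_counts, reference_counts)
    )
    return math.sqrt(squared / 2.0)
```

Identical histograms give a distance of 0, and histograms with no shared mass (for example `[1, 0]` against `[0, 1]`) give 1, matching the range stated above.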
- class whylogs.viz.drift.column_drift_algorithms.ChiSquare(parameter_config: Optional[whylogs.viz.drift.configs.ChiSquareConfig] = None)#
Bases: ColumnDriftAlgorithm
Chi-Squared test algorithm for column drift detection.
- Parameters
parameter_config (Optional[whylogs.viz.drift.configs.ChiSquareConfig]) –
- calculate(target_column_view: whylogs.core.view.column_profile_view.ColumnProfileView, reference_column_view: whylogs.core.view.column_profile_view.ColumnProfileView, with_thresholds=False) Optional[DriftAlgorithmScore] #
Calculates drift score for a given column.
If with_thresholds is True, the thresholds defined in the parameter config are also returned, along with the final drift category.
- Parameters
target_column_view (whylogs.core.view.column_profile_view.ColumnProfileView) –
reference_column_view (whylogs.core.view.column_profile_view.ColumnProfileView) –
- Return type
Optional[DriftAlgorithmScore]
- abstract set_parameters(parameter_config: Any)#
- Parameters
parameter_config (Any) –
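As a rough illustration of what a Chi-Squared drift test measures, the sketch below computes Pearson's statistic for categorical counts, using the reference frequencies scaled to the target size as the expected counts. This is the textbook formula, not a description of whylogs internals; the function name `chi_square_statistic`, the dictionary inputs, and the decision to reject categories unseen in the reference are assumptions of this example.

```python
from typing import Dict


def chi_square_statistic(target_counts: Dict[str, int], reference_counts: Dict[str, int]) -> float:
    """Pearson chi-squared statistic of target counts against reference proportions."""
    target_total = sum(target_counts.values())
    reference_total = sum(reference_counts.values())
    if target_total == 0 or reference_total == 0:
        raise ValueError("both columns must contain at least one value")

    # A category absent from the reference has an expected count of zero,
    # which leaves the statistic undefined; this sketch rejects that case.
    unseen = set(target_counts) - {c for c, n in reference_counts.items() if n > 0}
    if unseen:
        raise ValueError(f"categories missing from reference: {sorted(unseen)}")

    statistic = 0.0
    for category, reference_count in reference_counts.items():
        if reference_count == 0:
            continue
        # Expected count if the target followed the reference distribution.
        expected = target_total * reference_count / reference_total
        observed = target_counts.get(category, 0)
        statistic += (observed - expected) ** 2 / expected
    return statistic
```

A p-value would then come from the chi-squared distribution with (number of categories − 1) degrees of freedom, for instance via `scipy.stats.chi2.sf` if SciPy is available.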
- class whylogs.viz.drift.column_drift_algorithms.KS(parameter_config: Optional[whylogs.viz.drift.configs.KSTestConfig] = None)#
Bases: ColumnDriftAlgorithm
Kolmogorov-Smirnov test algorithm for column drift detection.
- Parameters
parameter_config (Optional[whylogs.viz.drift.configs.KSTestConfig]) –
- calculate(target_column_view: whylogs.core.view.column_profile_view.ColumnProfileView, reference_column_view: whylogs.core.view.column_profile_view.ColumnProfileView, with_thresholds=False) Optional[DriftAlgorithmScore] #
Computes the Kolmogorov-Smirnov test for two distributions. Requires the target and reference column views to have a distribution metric.
- Parameters
target_column_view (ColumnProfileView) – Column view of the target profile
reference_column_view (ColumnProfileView) – Column view of the reference profile
with_thresholds (bool, optional) – By default False. If True, the thresholds defined in the parameter config are also returned in the DriftAlgorithmScore object, along with the final drift category.
- Returns
Returns a DriftAlgorithmScore object containing the p-value and the KS statistic. If with_thresholds is True, also returns the thresholds defined in the parameter config and the final drift category. The drift category is determined by the p-value and the thresholds defined in the parameter config.
- Return type
Optional[DriftAlgorithmScore]
- set_parameters(parameter_config: Any)#
- Parameters
parameter_config (Any) –
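The KS statistic itself is the largest vertical gap between two empirical cumulative distribution functions. The sketch below computes it from raw samples as a way to see what the number means. It is an assumption-laden illustration: `ks_statistic` is a hypothetical helper, and it works on raw values, whereas the class above operates on the distribution metrics stored in column profile views.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(target: Sequence[float], reference: Sequence[float]) -> float:
    """Two-sample KS statistic: the maximum gap between the two empirical CDFs."""
    if not target or not reference:
        raise ValueError("both samples must be non-empty")
    target_sorted = sorted(target)
    reference_sorted = sorted(reference)

    max_gap = 0.0
    # Both empirical CDFs are step functions that only change at observed values,
    # so the maximum gap is attained at one of those values.
    for x in set(target_sorted) | set(reference_sorted):
        target_cdf = bisect_right(target_sorted, x) / len(target_sorted)
        reference_cdf = bisect_right(reference_sorted, x) / len(reference_sorted)
        max_gap = max(max_gap, abs(target_cdf - reference_cdf))
    return max_gap
```

Identical samples give 0 and fully separated samples give 1; the p-value reported alongside the statistic is derived from it and the two sample sizes.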
- whylogs.viz.drift.column_drift_algorithms.calculate_drift_scores(target_view: whylogs.core.view.dataset_profile_view.DatasetProfileView, reference_view: whylogs.core.view.dataset_profile_view.DatasetProfileView, drift_map: Optional[Dict[str, ColumnDriftAlgorithm]] = None, with_thresholds=False) Dict[str, Optional[Dict[str, Any]]] #
Calculate drift scores for all columns in the target dataset profile.
If a drift map is provided, each column listed in the map uses the algorithm the map assigns to it. Columns not in the map (or all columns, if no map is provided) use the default drift algorithm selection logic. If a column does not have the metrics required by the selected algorithm, None is returned for that column. For example, if KS or Hellinger is selected for a column with string values, None will be returned.
If with_thresholds is True, the configured algorithm’s thresholds are returned in the DriftAlgorithmScore.
Returns a dictionary of column names to drift scores.
Examples
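The per-column selection described above can be illustrated without whylogs itself. The following is a self-contained sketch of the dispatch logic only: `sketch_drift_scores`, `numeric_mean_shift`, `category_overlap`, and `default_algorithm` are hypothetical names, the toy statistics are not the ones whylogs computes, plain lists stand in for profile views, and returning None for a column missing from the reference is a choice made for this sketch.

```python
from typing import Any, Callable, Dict, List, Optional

Column = List[Any]
Score = Optional[Dict[str, Any]]
Algorithm = Callable[[Column, Column], Score]


def _is_numeric(column: Column) -> bool:
    return bool(column) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in column
    )


def numeric_mean_shift(target: Column, reference: Column) -> Score:
    """Toy numeric algorithm; returns None when the data it needs is missing."""
    if not (_is_numeric(target) and _is_numeric(reference)):
        return None
    shift = abs(sum(target) / len(target) - sum(reference) / len(reference))
    return {"algorithm": "numeric_mean_shift", "statistic": shift}


def category_overlap(target: Column, reference: Column) -> Score:
    """Toy categorical algorithm: share of target values unseen in the reference."""
    if not target or not reference:
        return None
    seen = set(reference)
    unseen = sum(1 for value in target if value not in seen)
    return {"algorithm": "category_overlap", "statistic": unseen / len(target)}


def default_algorithm(column: Column) -> Algorithm:
    """Illustrative default selection: numeric columns get the numeric algorithm."""
    return numeric_mean_shift if _is_numeric(column) else category_overlap


def sketch_drift_scores(
    target: Dict[str, Column],
    reference: Dict[str, Column],
    drift_map: Optional[Dict[str, Algorithm]] = None,
) -> Dict[str, Score]:
    drift_map = drift_map or {}
    scores: Dict[str, Score] = {}
    for name, target_column in target.items():
        reference_column = reference.get(name)
        if reference_column is None:
            scores[name] = None  # assumption of this sketch
            continue
        # The map wins when it names the column; otherwise use the default logic.
        algorithm = drift_map.get(name) or default_algorithm(target_column)
        scores[name] = algorithm(target_column, reference_column)
    return scores
```

With `target = {"age": [30, 40], "city": ["a", "b"]}` and `reference = {"age": [20, 30], "city": ["a", "a"]}`, the default logic scores both columns, while passing `drift_map={"city": numeric_mean_shift}` makes the `city` entry None because a numeric algorithm was forced onto string values.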
- Parameters
target_view (whylogs.core.view.dataset_profile_view.DatasetProfileView) –
reference_view (whylogs.core.view.dataset_profile_view.DatasetProfileView) –
drift_map (Optional[Dict[str, ColumnDriftAlgorithm]]) –
- Return type