whylogs.core.constraints.factories
Submodules
whylogs.core.constraints.factories.cardinality_metrics
whylogs.core.constraints.factories.condition_counts
whylogs.core.constraints.factories.count_metrics
whylogs.core.constraints.factories.distribution_metrics
whylogs.core.constraints.factories.frequent_items
whylogs.core.constraints.factories.multi_metrics
whylogs.core.constraints.factories.types_metrics
Package Contents
Functions
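distinct_number_in_range | Number of distinct categories must be between lower and upper values (inclusive). |
condition_count_below | |
condition_meets | Checks that all values in the column match the predicate. |
condition_never_meets | Checks that no values in the column match the predicate. |
count_below_number | Number of elements in the column must be below the given number. |
no_missing_values | Checks that there are no missing values in the column. |
null_percentage_below_number | Percentage of null values must be below the given number. |
null_values_below_number | Number of null values must be below the given number. |
greater_than_number | Minimum value of the given column must be above the defined number. |
is_in_range | Checks that all of the column's values are in the defined range (inclusive). |
is_non_negative | Checks that a column is non-negative. |
mean_between_range | Estimated mean must be within the range defined by the lower and upper bounds. |
quantile_between_range | Q-th quantile value must be within the range defined by the lower and upper bounds. |
smaller_than_number | Maximum value of the given column must be below the defined number. |
stddev_between_range | Estimated standard deviation must be within the range defined by the lower and upper bounds. |
frequent_strings_in_reference_set | Determines whether every frequent string of a string column appears in the reference set. |
n_most_common_items_in_set | Validates that the top n most common items in the column appear in the reference set. |
column_is_probably_unique | |
column_has_non_zero_types | |
column_has_zero_count_types | |
column_is_nullable_boolean | |
column_is_nullable_datatype | Checks that the column contains only records of a specific datatype. |
column_is_nullable_fractional | |
column_is_nullable_integral | |
column_is_nullable_object | |
column_is_nullable_string | |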
Attributes
- whylogs.core.constraints.factories.distinct_number_in_range(column_name: str, lower: Union[int, float], upper: Union[int, float]) -> whylogs.core.constraints.metric_constraints.MetricConstraint
Number of distinct categories must be between lower and upper values (inclusive).
- Parameters
column_name (str) –
lower (Union[int, float]) –
upper (Union[int, float]) –
- Return type
whylogs.core.constraints.metric_constraints.MetricConstraint
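The inclusive bound check described for distinct_number_in_range can be illustrated with a small plain-Python sketch. This is not the whylogs implementation: the helper name `distinct_count_in_range` is hypothetical, the count here is exact, whereas a profile's cardinality metric may be an estimate, and treating `None` as "not a category" is an assumption.

```python
def distinct_count_in_range(values, lower, upper):
    """Return True when the number of distinct non-null values is in [lower, upper]."""
    # Assumption: nulls (None) are not counted as a category.
    distinct = {v for v in values if v is not None}
    # Both bounds are inclusive, as stated in the description above.
    return lower <= len(distinct) <= upper


colors = ["red", "green", "red", None, "blue"]
print(distinct_count_in_range(colors, lower=2, upper=3))   # True: 3 distinct values
print(distinct_count_in_range(colors, lower=4, upper=10))  # False: only 3 distinct values
```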
- whylogs.core.constraints.factories.condition_count_below(column_name: str, condition_name: str, max_count: int) -> whylogs.core.constraints.metric_constraints.MetricConstraint
- Parameters
column_name (str) –
condition_name (str) –
max_count (int) –
- Return type
whylogs.core.constraints.metric_constraints.MetricConstraint
- whylogs.core.constraints.factories.condition_meets(column_name: str, condition_name: str) -> whylogs.core.constraints.metric_constraints.MetricConstraint
Checks that all values in the column match the predicate.
- Parameters
column_name (str) –
condition_name (str) –
- Return type
whylogs.core.constraints.metric_constraints.MetricConstraint
- whylogs.core.constraints.factories.condition_never_meets(column_name: str, condition_name: str) -> whylogs.core.constraints.metric_constraints.MetricConstraint
Checks that no values in the column match the predicate.
- Parameters
column_name (str) –
condition_name (str) –
- Return type
whylogs.core.constraints.metric_constraints.MetricConstraint
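The difference between condition_meets and condition_never_meets comes down to "every value matches" versus "no value matches". A minimal plain-Python sketch of the two rules follows. The helper names `all_match` and `none_match` and the `is_positive` predicate are hypothetical, and the predicate is passed in directly here, whereas the factories above refer to a condition by its `condition_name`.

```python
def all_match(values, predicate):
    """Rule behind condition_meets: every value satisfies the predicate."""
    return all(predicate(v) for v in values)


def none_match(values, predicate):
    """Rule behind condition_never_meets: no value satisfies the predicate."""
    return not any(predicate(v) for v in values)


def is_positive(x):
    # Hypothetical predicate used only for this illustration.
    return x > 0


amounts = [12.5, 3.0, 40.0]
print(all_match(amounts, is_positive))   # True: every amount is positive
print(none_match(amounts, is_positive))  # False: at least one amount is positive
```

Note that the two rules are not complements of each other: a column with a mix of matching and non-matching values fails both.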
- whylogs.core.constraints.factories.count_below_number(column_name: str, number: int) -> whylogs.core.constraints.metric_constraints.MetricConstraint
Number of elements in the column must be below the given number.
- Parameters
column_name (str) –
number (int) –
- Return type
whylogs.core.constraints.metric_constraints.MetricConstraint
- whylogs.core.constraints.factories.no_missing_values(column_name: str) -> whylogs.core.constraints.metric_constraints.MetricConstraint
Checks that there are no missing values in the column.
- Parameters
column_name (str) – Column the constraint is applied to
- Return type
whylogs.core.constraints.metric_constraints.MetricConstraint
- whylogs.core.constraints.factories.null_percentage_below_number(column_name: str, number: float) -> whylogs.core.constraints.metric_constraints.MetricConstraint
Percentage of null values must be below the given number.
- Parameters
column_name (str) –
number (float) –
- Return type
whylogs.core.constraints.metric_constraints.MetricConstraint
- whylogs.core.constraints.factories.null_values_below_number(column_name: str, number: int) -> whylogs.core.constraints.metric_constraints.MetricConstraint
Number of null values must be below the given number.
- Parameters
column_name (str) –
number (int) –
- Return type
whylogs.core.constraints.metric_constraints.MetricConstraint
- whylogs.core.constraints.factories.greater_than_number(column_name: str, number: Union[float, int], skip_missing: bool = True) -> whylogs.core.constraints.MetricConstraint
Minimum value of the given column must be above the defined number.
- Parameters
column_name (str) –
number (Union[float, int]) –
skip_missing (bool) –
- Return type
whylogs.core.constraints.MetricConstraint
- whylogs.core.constraints.factories.is_in_range(column_name: str, lower: Union[float, int], upper: Union[float, int], skip_missing: bool = True) -> whylogs.core.constraints.MetricConstraint
Checks that all of the column’s values are in the defined range (inclusive).
For the constraint to pass, the column’s minimum value must be greater than or equal to lower and its maximum value must be less than or equal to upper.
- Parameters
column_name (str) – Column the constraint is applied to
lower (float) – Lower bound of the defined range
upper (float) – Upper bound of the defined range
skip_missing (bool) – If skip_missing is True, missing distribution metrics will make the check pass. If False, the check will fail on missing metrics, such as on an empty dataset.
- Return type
whylogs.core.constraints.MetricConstraint
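The pass condition for is_in_range, including the effect of skip_missing, can be sketched in plain Python. This is an illustration of the documented rule only, not the library's code: the helper name `values_in_range` is hypothetical, and it works on raw values, whereas the constraint is evaluated against a profile's distribution metrics.

```python
def values_in_range(values, lower, upper, skip_missing=True):
    """Pass when min(values) >= lower and max(values) <= upper."""
    # Assumption: None stands in for a missing value and is ignored.
    observed = [v for v in values if v is not None]
    if not observed:
        # No distribution to check: pass or fail according to skip_missing.
        return skip_missing
    return min(observed) >= lower and max(observed) <= upper


ages = [18, 35, 64]
print(values_in_range(ages, lower=18, upper=64))          # True: both bounds are inclusive
print(values_in_range(ages, lower=21, upper=64))          # False: minimum is 18
print(values_in_range([], 0, 1, skip_missing=False))      # False: nothing to check
```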
- whylogs.core.constraints.factories.is_non_negative(column_name: str, skip_missing: bool = True) -> whylogs.core.constraints.MetricConstraint
Checks that a column is non-negative.
- Parameters
column_name (str) –
skip_missing (bool) –
- Return type
whylogs.core.constraints.MetricConstraint
- whylogs.core.constraints.factories.mean_between_range(column_name: str, lower: float, upper: float, skip_missing: bool = True) -> whylogs.core.constraints.MetricConstraint
Estimated mean must be within the range defined by the lower and upper bounds.
- Parameters
column_name (str) – Column the constraint is applied to
lower (float) – Lower bound of the defined range
upper (float) – Upper bound of the defined range
skip_missing (bool) – If skip_missing is True, missing distribution metrics will make the check pass. If False, the check will fail on missing metrics, such as on an empty dataset.
- Return type
whylogs.core.constraints.MetricConstraint
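A rough plain-Python picture of the mean_between_range rule is shown here. The helper name `mean_in_range` is hypothetical, the mean is computed exactly from raw values rather than taken from a profile estimate, and treating both bounds as inclusive is an assumption, since the description above does not say.

```python
from statistics import fmean


def mean_in_range(values, lower, upper, skip_missing=True):
    """Pass when the mean of the observed values lies between lower and upper."""
    observed = [v for v in values if v is not None]
    if not observed:
        # Mirrors the documented skip_missing behaviour for missing metrics.
        return skip_missing
    # Assumption: both bounds are inclusive.
    return lower <= fmean(observed) <= upper


scores = [2.0, 4.0, 6.0]
print(mean_in_range(scores, lower=3.0, upper=5.0))  # True: the mean is 4.0
print(mean_in_range(scores, lower=4.5, upper=5.0))  # False: the mean is below 4.5
```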
- whylogs.core.constraints.factories.quantile_between_range(column_name: str, quantile: float, lower: float, upper: float, skip_missing: bool = True) -> whylogs.core.constraints.MetricConstraint
Q-th quantile value must be within the range defined by the lower and upper bounds.
- Parameters
column_name (str) – Column the constraint is applied to
quantile (float) – Quantile value. For example, the median corresponds to quantile=0.5
lower (float) – Lower bound of the defined range
upper (float) – Upper bound of the defined range
skip_missing (bool) – If skip_missing is True, missing distribution metrics will make the check pass. If False, the check will fail on missing metrics, such as on an empty dataset.
- Return type
whylogs.core.constraints.MetricConstraint
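To make the meaning of the quantile parameter concrete, here is a plain-Python sketch of a quantile range check. It is only an illustration under stated assumptions: `exact_quantile` and `quantile_in_range` are hypothetical helpers, the quantile is computed exactly with linear interpolation between sorted values, whereas a profile would supply an estimate, and the bounds are assumed inclusive.

```python
def exact_quantile(values, q):
    """Quantile with linear interpolation between the two nearest sorted values."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    low = int(position)
    high = min(low + 1, len(ordered) - 1)
    fraction = position - low
    return ordered[low] * (1 - fraction) + ordered[high] * fraction


def quantile_in_range(values, q, lower, upper, skip_missing=True):
    """Pass when the q-th quantile lies between lower and upper."""
    observed = [v for v in values if v is not None]
    if not observed:
        return skip_missing
    # Assumption: both bounds are inclusive.
    return lower <= exact_quantile(observed, q) <= upper


latencies = [10, 20, 30, 40, 50]
print(exact_quantile(latencies, 0.5))                # 30.0, the median
print(quantile_in_range(latencies, 0.5, 25, 35))     # True
print(quantile_in_range(latencies, 0.9, 0, 45))      # False: the 0.9 quantile is about 46
```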
- whylogs.core.constraints.factories.smaller_than_number(column_name: str, number: float, skip_missing: bool = True) -> whylogs.core.constraints.MetricConstraint
Maximum value of the given column must be below the defined number.
- Parameters
column_name (str) –
number (float) –
skip_missing (bool) –
- Return type
whylogs.core.constraints.MetricConstraint
- whylogs.core.constraints.factories.stddev_between_range(column_name: str, lower: float, upper: float, skip_missing: bool = True)
Estimated standard deviation must be within the range defined by the lower and upper bounds.
- Parameters
column_name (str) – Column the constraint is applied to
lower (float) – Lower bound of the defined range
upper (float) – Upper bound of the defined range
skip_missing (bool) – If skip_missing is True, missing distribution metrics will make the check pass. If False, the check will fail on missing metrics, such as on an empty dataset.
- whylogs.core.constraints.factories.frequent_strings_in_reference_set(column_name: str, reference_set: dict) -> whylogs.core.constraints.metric_constraints.MetricConstraint
Determines whether the frequent strings of a string column all appear in a reference set. Every item in the frequent strings must be in the defined reference set.
- Parameters
column_name (str) –
reference_set (dict) –
- Return type
whylogs.core.constraints.metric_constraints.MetricConstraint
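The rule "every item in frequent strings must be in the reference set" is a subset test. A minimal plain-Python sketch follows. The helper name `frequent_strings_in_set` is hypothetical, and the list of frequent strings is supplied by hand here rather than read from a profile.

```python
def frequent_strings_in_set(frequent_strings, reference_set):
    """Pass when every frequent string is a member of the reference set."""
    return set(frequent_strings).issubset(reference_set)


allowed_countries = {"US", "CA", "MX"}
print(frequent_strings_in_set(["US", "CA"], allowed_countries))  # True
print(frequent_strings_in_set(["US", "BR"], allowed_countries))  # False: "BR" is not allowed
```

The check is one-directional: the reference set may contain items that never occur in the column.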
- whylogs.core.constraints.factories.n_most_common_items_in_set(column_name: str, n: int, reference_set: dict) -> whylogs.core.constraints.metric_constraints.MetricConstraint
Validates that the top n most common items in the column appear in the reference set.
- Parameters
column_name (str) –
n (int) –
reference_set (dict) –
- Return type
whylogs.core.constraints.metric_constraints.MetricConstraint
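One plausible reading of n_most_common_items_in_set, inferred from its name and signature, is that the n most frequent items must all belong to the reference set. The sketch below illustrates that reading in plain Python. The helper name `top_n_in_set` is hypothetical, counts are exact rather than estimated, and the interpretation itself is an assumption.

```python
from collections import Counter


def top_n_in_set(values, n, reference_set):
    """Pass when each of the n most common values is in the reference set."""
    top_items = [item for item, _count in Counter(values).most_common(n)]
    return set(top_items).issubset(reference_set)


plans = ["free", "free", "free", "pro", "pro", "trial"]
print(top_n_in_set(plans, n=2, reference_set={"free", "pro"}))  # True: top 2 are "free" and "pro"
print(top_n_in_set(plans, n=3, reference_set={"free", "pro"}))  # False: "trial" is the third item
```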
- whylogs.core.constraints.factories.column_is_probably_unique(column_name: str, hll_stddev: int = 3) -> whylogs.core.constraints.MetricConstraint
- Parameters
column_name (str) –
hll_stddev (int) –
- Return type
whylogs.core.constraints.MetricConstraint
- whylogs.core.constraints.factories.column_has_non_zero_types(column_name: str, types_list: List[str]) -> whylogs.core.constraints.metric_constraints.MetricConstraint
- Parameters
column_name (str) –
types_list (List[str]) –
- Return type
whylogs.core.constraints.metric_constraints.MetricConstraint
- whylogs.core.constraints.factories.column_has_zero_count_types(column_name: str, types_list: List[str]) -> whylogs.core.constraints.metric_constraints.MetricConstraint
- Parameters
column_name (str) –
types_list (List[str]) –
- Return type
whylogs.core.constraints.metric_constraints.MetricConstraint
- whylogs.core.constraints.factories.column_is_nullable_boolean(column_name: str) -> whylogs.core.constraints.metric_constraints.MetricConstraint
- Parameters
column_name (str) –
- Return type
whylogs.core.constraints.metric_constraints.MetricConstraint
- whylogs.core.constraints.factories.column_is_nullable_datatype(column_name: str, datatype: str) -> whylogs.core.constraints.metric_constraints.MetricConstraint
Checks that the column contains only records of a specific datatype. Datatypes can be: integral, fractional, boolean, string, object.
Returns True if there is at least one record of type datatype and there are no records of the remaining types.
- Parameters
column_name (str) –
datatype (str) –
- Return type
whylogs.core.constraints.metric_constraints.MetricConstraint
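The pass condition stated for column_is_nullable_datatype (at least one record of the requested type, none of any other type) can be sketched against a dictionary of per-type counts. The helper name `is_nullable_datatype` and the `type_counts` layout are hypothetical. The sketch also assumes that null records are not counted under any of the five datatypes, which is how the "nullable" in the name is read here.

```python
def is_nullable_datatype(type_counts, datatype):
    """Pass when `datatype` has at least one record and every other type has none."""
    other_records = sum(
        count for name, count in type_counts.items() if name != datatype
    )
    return type_counts.get(datatype, 0) > 0 and other_records == 0


# Hypothetical per-type record counts for one column; nulls are not counted here.
counts = {"integral": 120, "fractional": 0, "boolean": 0, "string": 0, "object": 0}
print(is_nullable_datatype(counts, "integral"))    # True: only integral records
print(is_nullable_datatype(counts, "fractional"))  # False: no fractional records

mixed = {"integral": 120, "string": 3}
print(is_nullable_datatype(mixed, "integral"))     # False: 3 string records present
```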
- whylogs.core.constraints.factories.column_is_nullable_fractional(column_name: str) -> whylogs.core.constraints.metric_constraints.MetricConstraint
- Parameters
column_name (str) –
- Return type
whylogs.core.constraints.metric_constraints.MetricConstraint
- whylogs.core.constraints.factories.column_is_nullable_integral(column_name: str) -> whylogs.core.constraints.metric_constraints.MetricConstraint
- Parameters
column_name (str) –
- Return type
whylogs.core.constraints.metric_constraints.MetricConstraint
- whylogs.core.constraints.factories.column_is_nullable_object(column_name: str) -> whylogs.core.constraints.metric_constraints.MetricConstraint
- Parameters
column_name (str) –
- Return type
whylogs.core.constraints.metric_constraints.MetricConstraint
- whylogs.core.constraints.factories.column_is_nullable_string(column_name: str) -> whylogs.core.constraints.metric_constraints.MetricConstraint
- Parameters
column_name (str) –
- Return type
whylogs.core.constraints.metric_constraints.MetricConstraint
- whylogs.core.constraints.factories.ALL