whylogs.experimental.core.metrics.udf_metric

Module Contents

Classes

DeclarativeSubmetricSchema

The DeclarativeSubmetricSchema allows one to customize the set of metrics

UdfMetricConfig

Configure UDFs & submetrics for UdfMetric

UdfMetric

Applies the specified UDFs to the input column values and tracks the metrics

Functions

default_schema(→ DeclarativeSubmetricSchema)

register_metric_udf(→ Callable[[Any], Any])

Decorator to easily configure UdfMetrics for your data set. Decorate your UDF

generate_udf_resolvers(...)

Generates a list of ResolverSpecs that implement the UdfMetrics specified

generate_udf_schema(...)

udf_metric_schema(→ whylogs.core.schema.DeclarativeSchema)

Generates a DeclarativeSchema that implements the UdfMetrics specified

Attributes

whylogs.experimental.core.metrics.udf_metric.logger
class whylogs.experimental.core.metrics.udf_metric.DeclarativeSubmetricSchema(resolvers: List[whylogs.core.resolvers.ResolverSpec], default_config: Optional[whylogs.core.metrics.metrics.MetricConfig] = None)

Bases: whylogs.core.resolvers.DeclarativeResolverBase, whylogs.core.metrics.multimetric.SubmetricSchema

The DeclarativeSubmetricSchema allows one to customize the set of metrics tracked for each UDF computed by a UdfMetric. Pass its constructor a list of ResolverSpecs, which specify the UDF name or data type to match and the list of MetricSpecs to instantiate for matching UDFs. Each MetricSpec specifies the Metric class and MetricConfig to instantiate. Omit the MetricSpec::config to use the default MetricConfig. Setting ResolverSpec::exclude to True will exclude the listed metrics from the matched UDFs.

For example, DeclarativeSubmetricSchema(resolvers=STANDARD_RESOLVER) implements the same schema as DatasetSchema(), i.e., it uses the default MetricConfig, StandardTypeMapper, StandardResolver, etc. STANDARD_RESOLVER is defined in whylogs/python/whylogs/core/resolvers.py.
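The matching rule described above can be illustrated with a small stand-alone model. This is a sketch of the idea only, not whylogs code: SpecSketch, resolve_sketch, and the metric names are hypothetical. It assumes that a spec matches on either the UDF name or the data type, and that a spec with exclude=True removes its metrics from whatever the other matching specs contribute.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class SpecSketch:
    """Hypothetical stand-in for a ResolverSpec: match by name or by type."""

    name: Optional[str] = None
    type_: Optional[type] = None
    metrics: List[str] = field(default_factory=list)
    exclude: bool = False

    def matches(self, udf_name: str, udf_type: type) -> bool:
        # A name match takes priority; otherwise fall back to the type.
        if self.name is not None:
            return self.name == udf_name
        return self.type_ is not None and issubclass(udf_type, self.type_)


def resolve_sketch(specs: List[SpecSketch], udf_name: str, udf_type: type) -> Set[str]:
    """Collect metrics from matching specs, then drop the excluded ones."""
    included: Set[str] = set()
    excluded: Set[str] = set()
    for spec in specs:
        if spec.matches(udf_name, udf_type):
            (excluded if spec.exclude else included).update(spec.metrics)
    return included - excluded


specs = [
    SpecSketch(type_=int, metrics=["counts", "distribution"]),
    SpecSketch(name="add5", metrics=["distribution"], exclude=True),
]
add5_metrics = resolve_sketch(specs, "add5", int)
other_metrics = resolve_sketch(specs, "double", int)
```

Here the UDF named "add5" keeps only the "counts" metric, while any other integer-valued UDF keeps both.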

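Parameters
  • resolvers (List[whylogs.core.resolvers.ResolverSpec]) –

  • default_config (Optional[whylogs.core.metrics.metrics.MetricConfig]) –

resolve(name: str, why_type: whylogs.core.datatypes.DataType, fi_disabled: bool = False) → Dict[str, whylogs.core.metrics.metrics.Metric]
Parameters
  • name (str) –

  • why_type (whylogs.core.datatypes.DataType) –

  • fi_disabled (bool) –
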
Return type

Dict[str, whylogs.core.metrics.metrics.Metric]

whylogs.experimental.core.metrics.udf_metric.STANDARD_UDF_RESOLVER: List[whylogs.core.resolvers.ResolverSpec]
whylogs.experimental.core.metrics.udf_metric.DEFAULT_UDF_RESOLVER: List[whylogs.core.resolvers.ResolverSpec]
whylogs.experimental.core.metrics.udf_metric.default_schema() → DeclarativeSubmetricSchema
Return type

DeclarativeSubmetricSchema

class whylogs.experimental.core.metrics.udf_metric.UdfMetricConfig

Bases: whylogs.core.metrics.metrics.MetricConfig

Configure UDFs & submetrics for UdfMetric

udfs

Maps submetric name to the UDF that computes the value to track.

submetric_schema [optional]

Determines the set of metrics tracked for each computed value.

type_mapper [optional]

Maps Python types to whylogs DataType.

udfs: Dict[str, Callable[[Any], Any]]
submetric_schema: whylogs.core.metrics.multimetric.SubmetricSchema
type_mapper: whylogs.core.datatypes.TypeMapper
class whylogs.experimental.core.metrics.udf_metric.UdfMetric(submetrics: Dict[str, Dict[str, whylogs.core.metrics.metrics.Metric]], udfs: Optional[Dict[str, Callable[[Any], Any]]] = None, submetric_schema: Optional[whylogs.core.metrics.multimetric.SubmetricSchema] = None, type_mapper: Optional[whylogs.core.datatypes.TypeMapper] = None, fi_disabled: bool = False)

Bases: whylogs.core.metrics.multimetric.MultiMetric

Applies the specified UDFs to the input column values and tracks the metrics specified by the submetric_schema to their output.
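The behaviour described above can be pictured with a simplified, stand-alone model. This is only a sketch: apply_udfs_sketch is a hypothetical helper, and the count/min/max summaries stand in for whatever metrics the submetric_schema would actually select.

```python
from typing import Any, Callable, Dict, List


def apply_udfs_sketch(
    values: List[Any], udfs: Dict[str, Callable[[Any], Any]]
) -> Dict[str, Dict[str, Any]]:
    """Apply each UDF to every value and keep simple per-submetric summaries."""
    summaries: Dict[str, Dict[str, Any]] = {}
    for submetric_name, udf in udfs.items():
        outputs = [udf(v) for v in values]
        summaries[submetric_name] = {
            "count": len(outputs),
            # default=None keeps the sketch safe for an empty column
            "min": min(outputs, default=None),
            "max": max(outputs, default=None),
        }
    return summaries


result = apply_udfs_sketch(
    [1, 2, 3],
    {"add5": lambda x: x + 5, "square": lambda x: x * x},
)
```

Each key in the result corresponds to one submetric name, mirroring how the udfs mapping associates a name with the function that computes the tracked value.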

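Parameters
  • submetrics (Dict[str, Dict[str, whylogs.core.metrics.metrics.Metric]]) –

  • udfs (Optional[Dict[str, Callable[[Any], Any]]]) –

  • submetric_schema (Optional[whylogs.core.metrics.multimetric.SubmetricSchema]) –

  • type_mapper (Optional[whylogs.core.datatypes.TypeMapper]) –

  • fi_disabled (bool) –

property namespace: str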
Return type

str

merge(other: UdfMetric) → UdfMetric
Parameters

other (UdfMetric) –

Return type

UdfMetric

columnar_update(view: whylogs.core.preprocessing.PreprocessedColumn) → whylogs.core.metrics.metrics.OperationResult
Parameters

view (whylogs.core.preprocessing.PreprocessedColumn) –

Return type

whylogs.core.metrics.metrics.OperationResult

classmethod zero(config: Optional[whylogs.core.metrics.metrics.MetricConfig] = None) → UdfMetric
Parameters

config (Optional[whylogs.core.metrics.metrics.MetricConfig]) –

Return type

UdfMetric

whylogs.experimental.core.metrics.udf_metric.register_metric_udf(col_name: Optional[str] = None, col_type: Optional[whylogs.core.datatypes.DataType] = None, submetric_name: Optional[str] = None, submetric_schema: Optional[whylogs.core.metrics.multimetric.SubmetricSchema] = None, type_mapper: Optional[whylogs.core.datatypes.TypeMapper] = None, namespace: Optional[str] = None, schema_name: str = '') → Callable[[Any], Any]

Decorator to easily configure UdfMetrics for your data set. Decorate your UDF functions, then call generate_udf_schema() to generate a list of ResolverSpecs that include the UdfMetrics configured by your decorator parameters.

You must specify exactly one of either col_name or col_type. col_name will attach a UdfMetric to the named input column. col_type will attach a UdfMetric to all input columns of the specified type. The decorated function will automatically be a UDF in the UdfMetric.
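The "exactly one of col_name or col_type" rule and the registration step can be sketched with a plain-Python decorator. This is an illustrative model, not the whylogs implementation: register_udf_sketch and the two registry dictionaries are hypothetical, and raising ValueError for an invalid combination is an assumption of the sketch.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical registries; whylogs keeps its own internal bookkeeping.
by_column_name: Dict[str, Dict[str, Callable[[Any], Any]]] = {}
by_column_type: Dict[type, Dict[str, Callable[[Any], Any]]] = {}


def register_udf_sketch(col_name: Optional[str] = None, col_type: Optional[type] = None):
    """Record the decorated function under a column name or a column type."""
    # Both set, or both missing, is rejected (assumed error type).
    if (col_name is None) == (col_type is None):
        raise ValueError("specify exactly one of col_name or col_type")

    def decorator(func: Callable[[Any], Any]) -> Callable[[Any], Any]:
        if col_name is not None:
            by_column_name.setdefault(col_name, {})[func.__name__] = func
        else:
            by_column_type.setdefault(col_type, {})[func.__name__] = func
        return func  # the sketch returns the function unchanged

    return decorator


@register_udf_sketch(col_name="col1")
def add5(x):
    return x + 5


@register_udf_sketch(col_type=str)
def upper(x):
    return x.upper()
```

After the two decorators run, "add5" is recorded for the column "col1" and "upper" is recorded for every column of type str.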

Specify submetric_name to give the output of the UDF a name. submetric_name defaults to the name of the decorated function. Note that all lambdas are named “lambda”, so omitting submetric_name on more than one lambda will result in name collisions. If you pass a namespace, it will be prepended to the UDF name.
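The naming rule and the lambda collision can be demonstrated in plain Python. This sketch assumes the default name is taken from the function's __name__ attribute, which for every lambda is the string "<lambda>". derive_name_sketch is a hypothetical helper, and the "." separator between namespace and name is an assumption.

```python
from typing import Any, Callable, Dict, Optional


def derive_name_sketch(
    func: Callable[[Any], Any],
    submetric_name: Optional[str] = None,
    namespace: Optional[str] = None,
) -> str:
    """Default to the function's __name__; prefix with the namespace if given."""
    name = submetric_name or func.__name__
    # The "." separator is an assumption made for this sketch.
    return f"{namespace}.{name}" if namespace else name


def add5(x):
    return x + 5


registry: Dict[str, Callable[[Any], Any]] = {}
for udf in (lambda x: x + 1, lambda x: x * 2):
    # Both lambdas share the same default name, so the second replaces the first.
    registry[derive_name_sketch(udf)] = udf

named = derive_name_sketch(lambda x: x * 2, submetric_name="double", namespace="math")
```

Passing an explicit submetric_name for each lambda avoids the collision, since the two entries then get distinct keys.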

You can optionally pass submetric_schema to specify and configure the metrics to be tracked for each UDF. This defaults to the STANDARD_RESOLVER metrics.

You can optionally pass type_mapper to control how Python types are mapped to whylogs DataTypes. This defaults to the StandardTypeMapper.

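Parameters
  • col_name (Optional[str]) –

  • col_type (Optional[whylogs.core.datatypes.DataType]) –

  • submetric_name (Optional[str]) –

  • submetric_schema (Optional[whylogs.core.metrics.multimetric.SubmetricSchema]) –

  • type_mapper (Optional[whylogs.core.datatypes.TypeMapper]) –

  • namespace (Optional[str]) –

  • schema_name (str) –
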
Return type

Callable[[Any], Any]

whylogs.experimental.core.metrics.udf_metric.generate_udf_resolvers(schema_name: Union[str, List[str]] = '', include_default_schema: bool = True) → List[whylogs.core.resolvers.ResolverSpec]

Generates a list of ResolverSpecs that implement the UdfMetrics specified by the @register_metric_udf decorators. The result only includes the UdfMetric, so you may want to append it to a list of ResolverSpecs defining the other metrics you wish to track.

For example:

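    import whylogs as why
    from whylogs.core.datatypes import String
    from whylogs.core.resolvers import STANDARD_RESOLVER
    from whylogs.core.schema import DeclarativeSchema
    from whylogs.experimental.core.metrics.udf_metric import generate_udf_resolvers, register_metric_udf

    @register_metric_udf(col_name="col1")
    def add5(x):
        return x + 5

    @register_metric_udf(col_type=String)
    def upper(x):
        return x.upper()

    # data is the input to profile, for example a pandas DataFrame
    schema = DeclarativeSchema(STANDARD_RESOLVER + generate_udf_resolvers())
    why.log(data, schema=schema)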

This will attach a UdfMetric to column “col1” that will include a submetric named “add5” tracking the values in “col1” incremented by 5, and a UdfMetric for each string column that will include a submetric named “upper” tracking the uppercased strings in the input columns. Since these are appended to the STANDARD_RESOLVER, the default metrics are also tracked for every column.

Parameters
  • schema_name (Union[str, List[str]]) –

  • include_default_schema (bool) –

Return type

List[whylogs.core.resolvers.ResolverSpec]

whylogs.experimental.core.metrics.udf_metric.generate_udf_schema() → List[whylogs.core.resolvers.ResolverSpec]
Return type

List[whylogs.core.resolvers.ResolverSpec]

whylogs.experimental.core.metrics.udf_metric.udf_metric_schema(non_udf_resolvers: Optional[List[whylogs.core.resolvers.ResolverSpec]] = None, types: Optional[Dict[str, Any]] = None, default_config: Optional[whylogs.core.metrics.metrics.MetricConfig] = None, type_mapper: Optional[whylogs.core.datatypes.TypeMapper] = None, cache_size: int = 1024, schema_based_automerge: bool = False, segments: Optional[Dict[str, whylogs.core.segmentation_partition.SegmentationPartition]] = None, validators: Optional[Dict[str, List[whylogs.core.validators.validator.Validator]]] = None, schema_name: Union[str, List[str]] = '', include_default_schema: bool = True) → whylogs.core.schema.DeclarativeSchema

Generates a DeclarativeSchema that implements the UdfMetrics specified by the @register_metric_udf decorators (in addition to any non_udf_resolvers passed in).

For example:

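    import whylogs as why
    from whylogs.core.datatypes import String
    from whylogs.experimental.core.metrics.udf_metric import register_metric_udf, udf_metric_schema

    @register_metric_udf(col_name="col1")
    def add5(x):
        return x + 5

    @register_metric_udf(col_type=String)
    def upper(x):
        return x.upper()

    # data is the input to profile, for example a pandas DataFrame
    why.log(data, schema=udf_metric_schema())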

This will attach a UdfMetric to column “col1” that will include a submetric named “add5” tracking the values in “col1” incremented by 5, and a UdfMetric for each string column that will include a submetric named “upper” tracking the uppercased strings in the input columns. Since these are appended to the STANDARD_RESOLVER, the default metrics are also tracked for every column.

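Parameters
  • non_udf_resolvers (Optional[List[whylogs.core.resolvers.ResolverSpec]]) –

  • types (Optional[Dict[str, Any]]) –

  • default_config (Optional[whylogs.core.metrics.metrics.MetricConfig]) –

  • type_mapper (Optional[whylogs.core.datatypes.TypeMapper]) –

  • cache_size (int) –

  • schema_based_automerge (bool) –

  • segments (Optional[Dict[str, whylogs.core.segmentation_partition.SegmentationPartition]]) –

  • validators (Optional[Dict[str, List[whylogs.core.validators.validator.Validator]]]) –

  • schema_name (Union[str, List[str]]) –

  • include_default_schema (bool) –
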
Return type

whylogs.core.schema.DeclarativeSchema