whylogs.core.columnprofile
Defines the ColumnProfile class for tracking per-column statistics
Module Contents
Classes
ColumnProfile | Statistics tracking for a column (i.e. a feature)
Attributes
- whylogs.core.columnprofile._TYPES
- whylogs.core.columnprofile._NUMERIC_TYPES
- whylogs.core.columnprofile._UNIQUE_COUNT_BOUNDS_STD = 1
- class whylogs.core.columnprofile.ColumnProfile(name: str, number_tracker: whylogs.core.statistics.NumberTracker = None, string_tracker: whylogs.core.statistics.datatypes.StringTracker = None, schema_tracker: whylogs.core.statistics.SchemaTracker = None, counters: whylogs.core.statistics.CountersTracker = None, frequent_items: whylogs.util.dsketch.FrequentItemsSketch = None, cardinality_tracker: whylogs.core.statistics.hllsketch.HllSketch = None, constraints: whylogs.core.statistics.constraints.ValueConstraints = None)
Statistics tracking for a column (i.e. a feature)
The primary method for
- Parameters
name (str, required) – Name of the column profile
number_tracker (NumberTracker) – Implements numeric data statistics tracking
string_tracker (StringTracker) – Implements string data-type statistics tracking
schema_tracker (SchemaTracker) – Implements tracking of schema-related information
counters (CountersTracker) – Keep count of various things
frequent_items (FrequentItemsSketch) – Keep track of all frequent items, even for mixed datatype features
cardinality_tracker (HllSketch) – Track feature cardinality (even for mixed data types)
constraints (ValueConstraints) – Static assertions to be applied to numeric data tracked in this column
TODO –
Proper TypedDataConverter type checking
Multi-threading/parallelism
- track(self, value)
Add value to tracking statistics.
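As a rough illustration of what per-column tracking involves, the sketch below uses a hypothetical `ToyColumnProfile` class (not part of whylogs) that updates a few running statistics each time a value is tracked. The attribute names and the min/max summary are assumptions chosen for the example; the real `ColumnProfile` delegates this work to the trackers listed in its parameters.

```python
from collections import Counter


class ToyColumnProfile:
    """Hypothetical stand-in for illustration only; not the whylogs class."""

    def __init__(self, name):
        self.name = name
        self.count = 0                # total values seen, including nulls
        self.null_count = 0           # values that were None
        self.type_counts = Counter()  # Python type name -> occurrences
        self.numeric_min = None
        self.numeric_max = None

    def track(self, value):
        """Update the running statistics with a single value."""
        self.count += 1
        if value is None:
            self.null_count += 1
            return
        self.type_counts[type(value).__name__] += 1
        # bool is a subclass of int, so exclude it from the numeric summary
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            if self.numeric_min is None or value < self.numeric_min:
                self.numeric_min = value
            if self.numeric_max is None or value > self.numeric_max:
                self.numeric_max = value


profile = ToyColumnProfile("age")
for v in [31, 47.5, None, "unknown", 18]:
    profile.track(v)
```

Values are processed one at a time and only aggregates are kept, so memory use does not grow with the number of numeric values tracked.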
- _unique_count_summary(self) → whylogs.proto.UniqueCountSummary
- to_summary(self)
Generate a summary of the statistics.
- Returns
summary – Protobuf summary message.
- Return type
ColumnSummary
- generate_constraints(self) → whylogs.core.statistics.constraints.SummaryConstraints
- merge(self, other)
Merge this column profile with another.
- Parameters
other (ColumnProfile) – The column profile to merge with this one
- Returns
merged – A new, merged column profile.
- Return type
ColumnProfile
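The "new, merged column profile" behavior can be pictured with a small sketch. `ToyColumnProfile` below is a hypothetical stand-in, not the whylogs class. It assumes that merging amounts to adding counts and combining item frequencies, and that merging profiles with different names is rejected; both are assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ToyColumnProfile:
    """Hypothetical stand-in for illustration only; not the whylogs class."""
    name: str
    count: int = 0
    frequent_items: Counter = field(default_factory=Counter)

    def track(self, value):
        self.count += 1
        self.frequent_items[value] += 1

    def merge(self, other):
        """Return a new profile; neither input is modified."""
        if self.name != other.name:
            raise ValueError("cannot merge profiles of different columns")
        return ToyColumnProfile(
            name=self.name,
            count=self.count + other.count,
            # Counter addition builds a new Counter with summed counts
            frequent_items=self.frequent_items + other.frequent_items,
        )


a = ToyColumnProfile("color")
b = ToyColumnProfile("color")
for v in ["red", "blue"]:
    a.track(v)
b.track("red")
merged = a.merge(b)
```

Returning a new object rather than mutating `self` means profiles built on separate data partitions can be combined without disturbing the originals.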
- to_protobuf(self)
Return the object serialized as a protobuf message.
- Returns
message
- Return type
ColumnMessage
- static from_protobuf(message)
Load from a protobuf message.
- Returns
column_profile
- Return type
ColumnProfile
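The serialize/deserialize pairing (an instance method that produces a message and a static method that rebuilds a profile from one) can be sketched as a round trip. The example below is only a pattern illustration: `ToyColumnProfile`, `to_message`, and `from_message` are hypothetical names, and JSON bytes stand in for the protobuf `ColumnMessage` so the snippet runs without whylogs installed.

```python
import json


class ToyColumnProfile:
    """Hypothetical stand-in for illustration only; not the whylogs class."""

    def __init__(self, name, count=0):
        self.name = name
        self.count = count

    def to_message(self):
        # Stand-in for to_protobuf(): JSON bytes instead of a ColumnMessage
        payload = {"name": self.name, "count": self.count}
        return json.dumps(payload).encode("utf-8")

    @staticmethod
    def from_message(message):
        # Stand-in for from_protobuf(): rebuild a profile from the message
        fields = json.loads(message.decode("utf-8"))
        return ToyColumnProfile(fields["name"], fields["count"])


original = ToyColumnProfile("age", count=3)
restored = ToyColumnProfile.from_message(original.to_message())
```

Because the loader is static, a stored message can be turned back into a profile without first constructing an empty instance.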