whylogs.core.statistics.schematracker

Module Contents

Classes

SchemaTracker

Track information about a column's schema and present datatypes

Attributes

Type

whylogs.core.statistics.schematracker.Type
class whylogs.core.statistics.schematracker.SchemaTracker(type_counts: dict = None, legacy_null_count=0)

Track information about a column’s schema and present datatypes

type_counts : dict

If specified, a dictionary containing information about the counts of all data types.

UNKNOWN_TYPE
NULL_TYPE
CANDIDATE_MIN_FRAC = 0.7
_non_null_type_counts(self)
track(self, item_type)

Track an item type

get_count(self, item_type)

Return the count of a given item type
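The counting behaviour described by track and get_count can be illustrated with a small stand-in. This is a minimal sketch, not the whylogs implementation: the class name TypeCountSketch and the string labels used as type keys are hypothetical, and the real tracker keys its counts by InferredType.Type values.

```python
from collections import Counter


class TypeCountSketch:
    """Hypothetical stand-in for a type-count tracker; not the whylogs class."""

    def __init__(self, type_counts=None):
        # Copy into a Counter so the caller's dictionary is never mutated.
        self.type_counts = Counter(type_counts or {})

    def track(self, item_type):
        # Increment the counter for the observed type label.
        self.type_counts[item_type] += 1

    def get_count(self, item_type):
        # Types that were never tracked report zero instead of raising.
        return self.type_counts.get(item_type, 0)


tracker = TypeCountSketch()
for label in ["INTEGRAL", "INTEGRAL", "STRING"]:
    tracker.track(label)
```

With this sketch, tracker.get_count("INTEGRAL") reflects the two tracked items, and a label that was never tracked yields a count of zero.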

infer_type(self)

Generate a guess at what type the tracked values are.

Returns

type_guess – The guessed type. See InferredType.Type for candidates.

Return type

object
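One plausible reading of how CANDIDATE_MIN_FRAC could drive the guess is a simple majority threshold over the non-null counts. The sketch below is an assumption about the general approach, not the library's actual rule, which may apply further tie-breaking between types. The function name infer_type_sketch and the "NULL" and "UNKNOWN" string labels are hypothetical.

```python
CANDIDATE_MIN_FRAC = 0.7  # threshold value listed on the class


def infer_type_sketch(type_counts, null_label="NULL", unknown_label="UNKNOWN"):
    """Return (type_guess, ratio) using a plain majority threshold (assumed rule)."""
    # Null observations carry no type information, so they are excluded.
    non_null = {t: c for t, c in type_counts.items() if t != null_label and c > 0}
    total = sum(non_null.values())
    if total == 0:
        return unknown_label, 0.0

    # Pick the most frequent type and measure its share of non-null items.
    candidate, count = max(non_null.items(), key=lambda kv: kv[1])
    ratio = count / total

    # Accept the candidate only when it is sufficiently dominant.
    if ratio >= CANDIDATE_MIN_FRAC:
        return candidate, ratio
    return unknown_label, ratio
```

Under this assumed rule, a column with 8 integral values, 2 strings, and any number of nulls would be guessed as integral, while an even split between two types would fall back to the unknown label.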

merge(self, other)

Merge another schema tracker with this one and return a new tracker. Does not alter this object.

Parameters

other (SchemaTracker) – The tracker to merge with this one.

Returns

merged – Merged tracker

Return type

SchemaTracker
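The non-mutating merge described above can be sketched on plain dictionaries of type counts. This is an illustration under the assumption that merging amounts to summing per-type counts; the helper name merge_type_counts and the string labels are hypothetical and not part of whylogs.

```python
from collections import Counter


def merge_type_counts(left, right):
    """Sum two type-count mappings into a new dict, leaving both inputs untouched."""
    merged = Counter(left)   # copies the left counts
    merged.update(right)     # Counter.update adds counts rather than replacing them
    return dict(merged)


a = {"INTEGRAL": 2}
b = {"INTEGRAL": 1, "STRING": 3}
merged = merge_type_counts(a, b)
```

Returning a fresh mapping mirrors the documented contract that merge does not alter the object it is called on.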

copy(self)

Return a copy of this tracker

to_protobuf(self)

Return the object serialized as a protobuf message

Returns

message

Return type

SchemaMessage

static from_protobuf(message, legacy_null_count=0)

Load from a protobuf message

Returns

schema_tracker

Return type

SchemaTracker

to_summary(self)

Generate a summary of the statistics

Returns

summary – Protobuf summary message.

Return type

SchemaSummary