whylogs.core.statistics.datatypes

Define classes for tracking statistics for various data types

Package Contents

Classes

FloatTracker

Track statistics for floating point numbers

IntTracker

Track statistics for integers

StringTracker

Track statistics for strings

VarianceTracker

Class that implements variance estimates for streaming data and for batched data.

Attributes

__ALL__

class whylogs.core.statistics.datatypes.FloatTracker(min: float = None, max: float = None, sum: float = None, count: int = None)

Track statistics for floating point numbers

Parameters
  • min (float) – Current min value

  • max (float) – Current max value

  • sum (float) – Sum of the numbers

  • count (int) – Total count of numbers

update(self, value: float)

Add a number to the tracking statistics

add_integers(self, tracker)

Copy data from an IntTracker into this object, overwriting the current values.

Parameters

tracker (IntTracker) – The integer tracker to copy values from

mean(self)

Calculate the current mean

merge(self, other)

Merge this tracker with another.

Parameters

other (FloatTracker) – The other float tracker

Returns

merged – A new float tracker

Return type

FloatTracker
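
The update, mean, and merge behaviour described above can be sketched with a small stand-alone class. This is a minimal illustration under stated assumptions, not the whylogs implementation: the class name SimpleFloatTracker and the field names minimum, maximum, and total are hypothetical, and returning None for the mean of an empty tracker is an assumption borrowed from the IntTracker description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SimpleFloatTracker:
    """Hypothetical stand-in for a min/max/sum/count tracker."""

    minimum: Optional[float] = None
    maximum: Optional[float] = None
    total: float = 0.0
    count: int = 0

    def update(self, value: float) -> None:
        # Fold one observation into the running statistics.
        self.minimum = value if self.minimum is None else min(self.minimum, value)
        self.maximum = value if self.maximum is None else max(self.maximum, value)
        self.total += value
        self.count += 1

    def mean(self) -> Optional[float]:
        # Assumption: an empty tracker has no mean.
        return self.total / self.count if self.count else None

    def merge(self, other: "SimpleFloatTracker") -> "SimpleFloatTracker":
        # Build a new tracker; neither input is modified.
        mins = [m for m in (self.minimum, other.minimum) if m is not None]
        maxs = [m for m in (self.maximum, other.maximum) if m is not None]
        return SimpleFloatTracker(
            minimum=min(mins) if mins else None,
            maximum=max(maxs) if maxs else None,
            total=self.total + other.total,
            count=self.count + other.count,
        )


a = SimpleFloatTracker()
for v in (1.5, 4.0):
    a.update(v)

b = SimpleFloatTracker()
b.update(-2.0)

merged = a.merge(b)
```

Because min, max, sum, and count each combine associatively, trackers built on separate data partitions can be merged in any order and give the same result as a single pass over all values.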

to_protobuf(self)

Return the object serialized as a protobuf message

Returns

message

Return type

DoublesMessage

static from_protobuf(message)

Load from a protobuf message

Returns

number_tracker

Return type

FloatTracker

class whylogs.core.statistics.datatypes.IntTracker(min: int = None, max: int = None, sum: int = None, count: int = None)

Track statistics for integers

Parameters
  • min – Current min value

  • max – Current max value

  • sum – Sum of the numbers

  • count – Total count of numbers

DEFAULTS

set_defaults(self)

Set attribute values to defaults

mean(self)

Calculate the current mean. Returns None if self.count == 0.

update(self, value)

Add a number to the tracking statistics

merge(self, other)

Merge values of another IntTracker with this one.

Parameters

other (IntTracker) – Other tracker

Returns

new – New, merged tracker

Return type

IntTracker

to_protobuf(self)

Return the object serialized as a protobuf message

Returns

message

Return type

LongsMessage

static from_protobuf(message)

Load from a protobuf message

Returns

number_tracker

Return type

IntTracker

class whylogs.core.statistics.datatypes.StringTracker(count: int = None, items: datasketches.frequent_strings_sketch = None, theta_sketch: whylogs.core.statistics.thetasketch.ThetaSketch = None)

Track statistics for strings

Parameters
  • count (int) – Total number of processed values

  • items (frequent_strings_sketch) – Sketch for tracking string counts

  • theta_sketch (ThetaSketch) – Sketch for approximate cardinality tracking

update(self, value: str)

Add a string to the tracking statistics.

If value is None, nothing will be done

merge(self, other)

Merge the values of this string tracker with another

Parameters

other (StringTracker) – The other StringTracker

Returns

new – Merged values

Return type

StringTracker
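
To make the merge semantics concrete, the sketch below swaps the approximate sketches for exact standard-library structures: collections.Counter stands in for the frequent strings sketch and a plain set stands in for the theta sketch. This substitution is an assumption made purely for illustration; the class name ExactStringTracker and the attribute name distinct are hypothetical, and the real tracker uses bounded-memory approximations instead.

```python
from collections import Counter


class ExactStringTracker:
    """Hypothetical exact stand-in for a sketch-based string tracker."""

    def __init__(self):
        self.count = 0           # total number of processed values
        self.items = Counter()   # exact per-string counts
        self.distinct = set()    # exact set of distinct strings

    def update(self, value):
        # Mirror the documented behaviour: None is ignored.
        if value is None:
            return
        self.count += 1
        self.items[value] += 1
        self.distinct.add(value)

    def merge(self, other):
        # Return a new tracker holding the combined statistics.
        new = ExactStringTracker()
        new.count = self.count + other.count
        new.items = self.items + other.items
        new.distinct = self.distinct | other.distinct
        return new


left = ExactStringTracker()
for s in ("cat", "dog", "cat", None):
    left.update(s)

right = ExactStringTracker()
for s in ("dog", "bird"):
    right.update(s)

combined = left.merge(right)
```

The exact version grows with the number of distinct strings, which is the cost the sketch-based structures are designed to avoid.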

to_protobuf(self)

Return the object serialized as a protobuf message

Returns

message

Return type

StringsMessage

static from_protobuf(message: whylogs.proto.StringsMessage)

Load from a protobuf message

Returns

string_tracker

Return type

StringTracker

to_summary(self)

Generate a summary of the statistics

Returns

summary – Protobuf summary message.

Return type

StringsSummary

class whylogs.core.statistics.datatypes.VarianceTracker(count=0, sum=0.0, mean=0.0)

Class that implements variance estimates for streaming data and for batched data.

Parameters
  • count – Number of tracked elements

  • sum – Sum of all numbers

  • mean – Current estimate of the mean

update(self, new_value)

Add a number to tracking estimates

Based on https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Welford's_online_algorithm

Parameters

new_value (int, float) – The number to add
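
As a point of reference, Welford's online algorithm from the linked article can be sketched as follows. This is a generic illustration, not the whylogs source: the class name WelfordSketch and the attribute name m2 (the running sum of squared deviations from the mean) are hypothetical, and returning NaN for fewer than two values is an assumption.

```python
import math


class WelfordSketch:
    """Hypothetical streaming variance estimator using Welford's method."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, new_value):
        # One pass, constant memory: adjust the mean, then accumulate
        # the product of the deviations before and after the adjustment.
        self.count += 1
        delta = new_value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (new_value - self.mean)

    def variance(self):
        # Sample variance (divides by n - 1); undefined below two values.
        if self.count < 2:
            return float("nan")
        return self.m2 / (self.count - 1)

    def stddev(self):
        return math.sqrt(self.variance())


tracker = WelfordSketch()
for x in [2, 4, 4, 4, 5, 5, 7, 9]:
    tracker.update(x)
```

For these eight values the mean is 5 and the sum of squared deviations is 32, so the sample variance is 32/7. Updating the mean incrementally avoids the cancellation error that the naive sum-of-squares formula can suffer on large values.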

stddev(self)

Return an estimate of the sample standard deviation

variance(self)

Return an estimate of the sample variance

merge(self, other: VarianceTracker)

Merge statistics from another VarianceTracker into this one

See: https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Parallel_algorithm

Parameters

other (VarianceTracker) – Other variance tracker

Returns

merged – A new variance tracker from the merged statistics

Return type

VarianceTracker
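
The parallel algorithm from the linked article combines two partial results without revisiting the raw data. The sketch below works on plain (count, mean, m2) tuples, where m2 is the sum of squared deviations from the mean. The function names merge_moments and moments are hypothetical helpers for this illustration, not part of whylogs.

```python
def merge_moments(a, b):
    """Combine two (count, mean, m2) triples into one."""
    count_a, mean_a, m2_a = a
    count_b, mean_b, m2_b = b
    # An empty side contributes nothing.
    if count_a == 0:
        return b
    if count_b == 0:
        return a
    count = count_a + count_b
    delta = mean_b - mean_a
    mean = mean_a + delta * count_b / count
    # Cross term accounts for the gap between the two partial means.
    m2 = m2_a + m2_b + delta * delta * count_a * count_b / count
    return (count, mean, m2)


def moments(values):
    """Two-pass reference computation of (count, mean, m2)."""
    n = len(values)
    if n == 0:
        return (0, 0.0, 0.0)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values)
    return (n, mean, m2)


left_part = moments([2.0, 4.0, 4.0, 4.0])
right_part = moments([5.0, 5.0, 7.0, 9.0])
merged_moments = merge_moments(left_part, right_part)
sample_variance = merged_moments[2] / (merged_moments[0] - 1)
```

Merging the two halves gives the same count, mean, and m2 as computing the moments over all eight values at once, which is what makes the approach suitable for batched or distributed data.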

copy(self)

Return a copy of this tracker

to_protobuf(self)

Return the object serialized as a protobuf message

Returns

message

Return type

VarianceMessage

static from_protobuf(message)

Load from a protobuf message

Returns

variance_tracker

Return type

VarianceTracker

whylogs.core.statistics.datatypes.__ALL__