whylogs.core.statistics.thetasketch

Module Contents

Classes

ThetaSketch

A sketch for approximate cardinality tracking.

Functions

_copy_union(union)

numbers_summary(sketch: ThetaSketch, num_std_devs=1)

Generate a summary protobuf message from a thetasketch based on numeric values

whylogs.core.statistics.thetasketch._copy_union(union)
class whylogs.core.statistics.thetasketch.ThetaSketch(theta_sketch=None, union=None, compact_theta=None)

A sketch for approximate cardinality tracking.

A wrapper class for datasketches.update_theta_sketch which implements merging for updatable theta sketches.

Currently, datasketches only implements merging for compact (read-only) theta sketches.
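The wrapper pattern described above can be illustrated with a toy model written in plain Python. This is a minimal sketch under stated assumptions, not the whylogs or datasketches implementation: the names ToyThetaSketch, ToyCompactSketch, _hash and _trim are hypothetical, and the hashing and trimming logic is a simplified stand-in for a real theta sketch. It shows one way an updatable sketch can support merging by first converting both sides to read-only compact form and then taking their union.

```python
import hashlib

MAX_HASH = 2 ** 64  # hashes are treated as uniform integers in [0, MAX_HASH)


def _hash(value):
    """Map a value to a pseudo-uniform 64-bit integer (hypothetical helper)."""
    digest = hashlib.sha256(repr(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def _trim(hashes, theta, k):
    """Keep at most the k smallest hashes below theta, lowering theta if needed."""
    kept = sorted(h for h in set(hashes) if h < theta)
    if len(kept) > k:
        theta = kept[k]  # the (k+1)-th smallest hash becomes the new threshold
        kept = kept[:k]
    return kept, theta


class ToyCompactSketch:
    """Read-only snapshot: retained hashes plus the threshold theta."""

    def __init__(self, hashes, theta):
        self.hashes = frozenset(hashes)
        self.theta = theta

    def get_estimate(self):
        # Retained count scaled by the fraction of hash space being sampled.
        return len(self.hashes) * MAX_HASH / self.theta


class ToyThetaSketch:
    """Toy wrapper: a live updatable part plus previously merged compact state."""

    def __init__(self, k=1024, merged=None):
        self.k = k
        self._live = set()           # stands in for the updatable sketch
        self._live_theta = MAX_HASH
        self._merged = merged        # stands in for the union (or None)

    def update(self, value):
        h = _hash(value)
        if h < self._live_theta:
            self._live.add(h)
            if len(self._live) > self.k:
                kept, self._live_theta = _trim(self._live, self._live_theta, self.k)
                self._live = set(kept)

    def get_result(self):
        """Combine live and merged state into one read-only compact sketch."""
        hashes, theta = set(self._live), self._live_theta
        if self._merged is not None:
            hashes |= self._merged.hashes
            theta = min(theta, self._merged.theta)
        kept, theta = _trim(hashes, theta, self.k)
        return ToyCompactSketch(kept, theta)

    def merge(self, other):
        """Return a new sketch; neither input is modified."""
        a, b = self.get_result(), other.get_result()
        kept, theta = _trim(a.hashes | b.hashes, min(a.theta, b.theta), self.k)
        return ToyThetaSketch(k=self.k, merged=ToyCompactSketch(kept, theta))


left, right = ToyThetaSketch(), ToyThetaSketch()
for i in range(0, 600):
    left.update(i)
for i in range(400, 1000):
    right.update(i)

merged = left.merge(right)
# Fewer than k distinct values were seen, so the toy sketch is still exact.
print(merged.get_result().get_estimate())  # 1000.0
```

The merged object remains updatable in this toy model: later calls to update go to the live part, and get_result folds the live and merged parts together on demand.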

update(self, value)

Update the sketch with a new value

Parameters

value (object) – Value to track

merge(self, other)

Merge another ThetaSketch with this one, returning a new object

Parameters

other (ThetaSketch) – Other theta sketch

Returns

new – New theta sketch with merged statistics

Return type

ThetaSketch

get_result(self)

Generate a theta sketch

Returns

compact_sketch – Read-only compact theta sketch with full statistics.

Return type

datasketches.compact_theta_sketch

serialize(self)

Serialize this object.

Note that serialization only preserves the object approximately.

Returns

msg – The serialized sketch, as bytes

Return type

bytes

static deserialize(msg: bytes)

Deserialize from a serialized message.

Parameters

msg (bytes) – Serialized object. Can be a serialized version of:
  • ThetaSketch

  • datasketches.update_theta_sketch

  • datasketches.compact_theta_sketch

Returns

sketch – ThetaSketch object

Return type

ThetaSketch

to_summary(self, num_std_devs=1)

Generate a summary protobuf message

Parameters

num_std_devs (float) – For estimating bounds

Returns

summary – Summary protobuf message

Return type

UniqueCountSummary
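To illustrate the role of num_std_devs, the following minimal sketch builds an estimate with lower and upper bounds. Everything in it is an assumption made for illustration: ToyUniqueCountSummary and toy_summary are hypothetical names, the parameter k (nominal sketch size) is not part of the documented signature, and the rule of thumb that the relative standard error is roughly 1/sqrt(k) is a simplification. datasketches computes its bounds differently, so treat this only as a picture of how a wider num_std_devs widens the interval.

```python
import math
from dataclasses import dataclass


@dataclass
class ToyUniqueCountSummary:
    """Hypothetical stand-in for a summary with an estimate and its bounds."""
    estimate: float
    lower: float
    upper: float


def toy_summary(estimate, k, num_std_devs=1):
    """Approximate bounds as estimate +/- num_std_devs standard errors.

    Assumes a relative standard error of about 1/sqrt(k); this is a rule of
    thumb, not the calculation performed by datasketches.
    """
    if estimate <= k:
        # Assumed exact mode: fewer distinct values than the sketch can hold.
        return ToyUniqueCountSummary(estimate, estimate, estimate)
    spread = num_std_devs * (1.0 / math.sqrt(k)) * estimate
    return ToyUniqueCountSummary(estimate, max(0.0, estimate - spread), estimate + spread)


summary = toy_summary(100_000.0, k=4096, num_std_devs=2)
print(summary.lower, summary.estimate, summary.upper)  # 96875.0 100000.0 103125.0
```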

whylogs.core.statistics.thetasketch.numbers_summary(sketch: ThetaSketch, num_std_devs=1)

Generate a summary protobuf message from a thetasketch based on numeric values

Parameters
  • sketch (ThetaSketch) – Theta sketch to summarize

  • num_std_devs (float) – For estimating bounds

Returns

summary – Summary protobuf message

Return type

UniqueCountSummary