whylogs.app.logger

Class and functions for whylogs logging

Module Contents

Classes

Logger

Class for logging whylogs statistics.

Functions

hash_segment(seg: List[Dict]) → str

Attributes

SegmentTag

Segment

_TAG_PREFIX

_TAG_KEY

_TAG_VALUE

logger

whylogs.app.logger.SegmentTag
whylogs.app.logger.Segment
whylogs.app.logger._TAG_PREFIX = whylogs.tag.
whylogs.app.logger._TAG_KEY = key
whylogs.app.logger._TAG_VALUE = value
whylogs.app.logger.logger
class whylogs.app.logger.Logger(session_id: str, dataset_name: str, dataset_timestamp: Optional[datetime.datetime] = None, session_timestamp: Optional[datetime.datetime] = None, tags: Dict[str, str] = None, metadata: Dict[str, str] = None, writers: List[whylogs.app.writers.Writer] = None, metadata_writer: whylogs.app.metadata_writer.MetadataWriter = None, verbose: bool = False, with_rotation_time: Optional[str] = None, interval: int = 1, cache_size: int = 1, segments: Optional[Union[List[Segment], List[str], str]] = None, profile_full_dataset: bool = False, constraints: whylogs.core.statistics.constraints.DatasetConstraints = None)

Class for logging whylogs statistics.

Parameters
  • session_id – The session ID value. Should be set by the Session object

  • dataset_name – The name of the dataset. Gets included in the DatasetProfile metadata and can be used in generated filenames.

  • dataset_timestamp – Optional. The timestamp that the logger represents

  • session_timestamp – Optional. The time the session was created

  • tags – Optional. Dictionary of key, value for aggregating data upstream

  • metadata – Optional. Dictionary of key, value. Useful for debugging (associated with every single dataset profile)

  • writers – Optional. List of Writer objects used to write out the data

  • metadata_writer – Optional. MetadataWriter object used to write non-profile information

  • with_rotation_time – Optional. Log rotation interval, consisting of digits with a unit specification, e.g. 30s, 2h, d. Units are seconds (“s”), minutes (“m”), hours (“h”), or days (“d”). Output filenames will have a suffix reflecting the rotation interval.

  • interval – Deprecated: Interval multiplier for with_rotation_time, defaults to 1.

  • verbose – enable debug logging

  • cache_size – number of dataset profiles to cache

  • segments

    Can be either:
    • Autosegmentation source, one of [“auto”, “local”]

    • List of tag key value pairs for tracking data segments

    • List of tag keys for which we will track every value

    • None, no segments will be used

  • profile_full_dataset – when segmenting dataset, an option to keep the full unsegmented profile of the dataset.

  • constraints – static assertions to be applied to streams and summaries.
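The with_rotation_time format described above (optional digits followed by a unit) can be illustrated with a small parser. This is a minimal sketch, not the whylogs implementation: parse_rotation_spec is a hypothetical helper, and treating a bare unit such as "d" as a count of 1 is an assumption based on the "d" example in the parameter description.

```python
import re

# Seconds per unit for the four documented unit letters.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_rotation_spec(spec: str) -> int:
    """Hypothetical helper: return the rotation interval in seconds.

    Accepts specs such as '30s', '2h', or 'd'.
    """
    match = re.fullmatch(r"(\d*)([smhd])", spec.strip().lower())
    if match is None:
        raise ValueError(f"invalid rotation spec: {spec!r}")
    digits, unit = match.groups()
    # Assumption: a bare unit with no digits means a count of 1.
    count = int(digits) if digits else 1
    return count * _UNIT_SECONDS[unit]
```

For example, parse_rotation_spec("2h") evaluates to 7200 under these assumptions.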

__enter__(self)
__exit__(self, exc_type, exc_val, exc_tb)
property profile(self) → whylogs.core.DatasetProfile
Returns

the last backing dataset profile

Return type

DatasetProfile

tracking_checks(self)
property segmented_profiles(self) → Dict[str, whylogs.core.DatasetProfile]
Returns

a dictionary of the last backing segmented dataset profiles

Return type

Dict[str, DatasetProfile]

get_segment(self, segment: Segment) → Optional[whylogs.core.DatasetProfile]
set_segments(self, segments: Union[List[Segment], List[str], str]) → None
_retrieve_local_segments(self) → Union[List[Segment], List[str], str]

Retrieves local segments

_intialize_profiles(self, dataset_timestamp: Optional[datetime.datetime] = datetime.datetime.now(datetime.timezone.utc)) → None
_set_rotation(self, with_rotation_time: str = None)
rotate_when(self, time)
should_rotate(self)
_rotate_time(self)

Rotate with time, adding a suffix to the output.

flush(self, rotation_suffix: str = None)

Synchronously perform all remaining write tasks

full_profile_check(self) → bool

Returns a bool indicating whether the unsegmented dataset should be profiled.

close(self) → Optional[whylogs.core.DatasetProfile]

Flush and close out the logger, outputs the last profile

Returns

the resulting dataset profile, or None if the logger is already closed
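The lifecycle implied by __enter__, __exit__, close, and is_active can be sketched with a stand-in class. MiniLogger below is hypothetical and is not the whylogs Logger: it only illustrates the pattern in which leaving a with block closes the logger, and a second close returns None. That __exit__ calls close is an assumption made for this sketch.

```python
from typing import List, Optional


class MiniLogger:
    """Hypothetical stand-in used only to illustrate the lifecycle."""

    def __init__(self) -> None:
        self._active = True
        self._records: List[dict] = []

    def __enter__(self) -> "MiniLogger":
        return self

    def __exit__(self, exc_type, exc_val, exc_tb) -> None:
        # Assumption: leaving the with block closes the logger.
        self.close()

    def is_active(self) -> bool:
        return self._active

    def log(self, record: dict) -> None:
        if not self._active:
            raise RuntimeError("logger is closed")
        self._records.append(record)

    def close(self) -> Optional[List[dict]]:
        if not self._active:
            return None  # already closed, nothing left to return
        self._active = False
        return self._records
```

Used as `with MiniLogger() as lg: lg.log({"a": 1})`, the stand-in is active inside the block and inactive afterwards.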

log(self, features: Optional[Dict[str, any]] = None, feature_name: str = None, value: any = None, character_list: str = None, token_method: Optional[Callable] = None)

Logs a collection of features or a single feature (must specify one or the other).

Parameters
  • features – a dictionary of key->value for multiple features, for example a model input. Each entry represents a single columnar feature. Cannot be specified if ‘feature_name’ is specified

  • feature_name – name of a single feature. Cannot be specified if ‘features’ is specified

  • value – value of a single feature. Cannot be specified if ‘features’ is specified
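The two calling styles of log (a features dictionary, or a single feature_name with a value) can be sketched as an argument check. normalize_log_args is a hypothetical helper written for illustration; raising ValueError when neither style is used is an assumption, and the real Logger.log may handle that case differently.

```python
from typing import Any, Dict, Optional


def normalize_log_args(
    features: Optional[Dict[str, Any]] = None,
    feature_name: Optional[str] = None,
    value: Any = None,
) -> Dict[str, Any]:
    """Hypothetical helper: return a {feature_name: value} mapping for either style."""
    if features is not None and feature_name is not None:
        raise ValueError("specify either 'features' or 'feature_name', not both")
    if features is None and feature_name is None:
        # Assumption: treat a call with neither argument as an error.
        raise ValueError("specify 'features' or 'feature_name'")
    if features is not None:
        return dict(features)  # copy so the caller's dict is not shared
    return {feature_name: value}
```

Both normalize_log_args(features={"age": 31}) and normalize_log_args(feature_name="age", value=31) produce the same mapping.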

log_segment_datum(self, feature_name, value, character_list: str = None, token_method: Optional[Callable] = None)
log_metrics(self, targets, predictions, scores=None, model_type: whylogs.proto.ModelType = None, target_field=None, prediction_field=None, score_field=None)
log_image(self, image, feature_transforms: Optional[List[Callable]] = None, metadata_attributes: Optional[List[str]] = METADATA_DEFAULT_ATTRIBUTES, feature_name: str = '')

API to track an image, either in PIL format or as an input path

Parameters
  • feature_name – name of the feature

  • metadata_attributes – metadata attributes to extract for the images

  • feature_transforms – a list of callables to transform the input into metrics

log_local_dataset(self, root_dir, folder_feature_name='folder_feature', image_feature_transforms=None, show_progress=False)

Log a local folder dataset. It will log data from the files, along with structural file data such as metadata and magic numbers. If the folder has a single layer of child folders, the folder names are picked up as a segmented feature.

Parameters
  • show_progress – showing the progress bar

  • image_feature_transforms – image transform that you would like to use with the image log

  • root_dir (str) – directory where dataset is located.

  • folder_feature_name (str, optional) – Name for the subfolder features, i.e. class, store etc.
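The single-layer folder convention described for log_local_dataset can be illustrated with a directory walk that pairs each file with its parent folder name. This is a sketch under stated assumptions, not the library's traversal: iter_folder_features is a hypothetical helper, and ignoring files that sit directly in root_dir is a choice made here for simplicity.

```python
from pathlib import Path
from typing import Iterator, Tuple


def iter_folder_features(root_dir: str) -> Iterator[Tuple[str, Path]]:
    """Hypothetical helper: yield (subfolder_name, file_path) pairs.

    Only files exactly one level below root_dir are yielded.
    """
    root = Path(root_dir)
    for child in sorted(root.iterdir()):
        if not child.is_dir():
            continue  # assumption: skip files placed directly in root_dir
        for file_path in sorted(child.iterdir()):
            if file_path.is_file():
                yield child.name, file_path
```

With a layout such as root/cats/a.jpg and root/dogs/b.jpg, the folder names "cats" and "dogs" would play the role of the folder_feature_name values.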

log_annotation(self, annotation_data)

Log structured annotation data, i.e. JSON-like structures

Parameters

annotation_data (Dict or List) – Description

log_csv(self, filepath_or_buffer: Union[str, pathlib.Path, IO[AnyStr]], segments: Optional[Union[List[Segment], List[str]]] = None, profile_full_dataset: bool = False, **kwargs)

Log a CSV file. This supports the same parameters as the pandas.read_csv function.

Parameters
  • filepath_or_buffer – the path to the CSV or a CSV buffer

  • segments – define either a list of segment keys or a list of segment tags: [ {“key”: <featurename>, “value”: <featurevalue>}, … ]

  • profile_full_dataset – when segmenting dataset, an option to keep the full unsegmented profile of the dataset

  • **kwargs – additional keyword arguments passed to pandas.read_csv
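The relationship between a list of segment keys and the key/value tag form can be sketched on plain rows. The helpers below are hypothetical and use only the standard library; they show one way rows could be grouped by the values of the chosen keys, with each group labelled by tags of the form {"key": ..., "value": ...}. The library's own grouping may differ.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]
SegmentTags = List[Dict[str, Any]]


def tags_for(row: Row, keys: List[str]) -> SegmentTags:
    """Build the key/value tag list describing the segment a row belongs to."""
    return [{"key": k, "value": row[k]} for k in keys]


def split_by_segment_keys(rows: List[Row], keys: List[str]) -> List[Tuple[SegmentTags, List[Row]]]:
    """Hypothetical helper: group rows by the values of the given segment keys."""
    groups: Dict[tuple, Tuple[SegmentTags, List[Row]]] = {}
    for row in rows:
        tags = tags_for(row, keys)
        group_id = tuple((t["key"], t["value"]) for t in tags)
        # Create the group on first sight, then append the row to it.
        groups.setdefault(group_id, (tags, []))[1].append(row)
    return list(groups.values())
```

Passing keys=["store"] over rows with store values "A" and "B" yields one group per store, each carrying its tag list.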

log_dataframe(self, df, segments: Optional[Union[List[Segment], List[str]]] = None, profile_full_dataset: bool = False)

Generate and log a whylogs DatasetProfile from a pandas dataframe.

Parameters
  • segments – specify the tag key value pairs for segments

  • df – the Pandas dataframe to log

  • profile_full_dataset – when segmenting dataset, an option to keep the full unsegmented profile of the dataset.

log_segments(self, data)
log_segments_keys(self, data)
log_fixed_segments(self, data)
log_df_segment(self, df, segment: Segment)
is_active(self)

Return the boolean state of the logger

static _prefix_segment_tags(segment_key_values)
whylogs.app.logger.hash_segment(seg: List[Dict]) → str
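Neither _prefix_segment_tags nor hash_segment is described beyond its signature, so the following is only a sketch of what such helpers could look like. The prefix constant mirrors the documented _TAG_PREFIX value; hashing a JSON serialization with SHA-256 is an assumption, and both function names below are hypothetical.

```python
import hashlib
import json
from typing import Any, Dict, List

TAG_PREFIX = "whylogs.tag."  # mirrors the documented _TAG_PREFIX value


def prefix_segment_tags_sketch(segment: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Hypothetical: prepend the tag prefix to every segment key."""
    return [{"key": TAG_PREFIX + tag["key"], "value": tag["value"]} for tag in segment]


def hash_segment_sketch(seg: List[Dict[str, Any]]) -> str:
    """Hypothetical: derive a stable string ID from a segment.

    Assumption: SHA-256 over a JSON serialization with sorted keys.
    """
    payload = json.dumps(seg, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

A stable string of this kind is suitable as a dictionary key, which matches the Dict[str, DatasetProfile] return type of segmented_profiles.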