whylogs.app.session

whylogs logging session

Module Contents

Classes

_LoggerKey

Create a new logger or return an existing one for a given dataset name.

Session

A whylogs logging session. The project parameter sets the project name, which is used as the default dataset name when logging a dataset if no dataset name is specified.

Functions

session_from_config(config: whylogs.app.config.SessionConfig = None, config_path: Optional[str] = '') → Session

Construct a whylogs session from a SessionConfig or from a config_path

reset_default_session()

Reset and deactivate the global whylogs logging session.

start_whylabs_session(path_to_config: Optional[str] = None, report_progress: Optional[bool] = False)

get_or_create_session(path_to_config: Optional[str] = None, report_progress: Optional[bool] = False)

Retrieve the current active global session.

get_session()

Retrieve the logging session without altering or activating it.

get_logger()

Retrieve the global session logger

Attributes

defaultLoggerArgs

_use_whylabs_client

_session

class whylogs.app.session._LoggerKey

Create a new logger or return an existing one for a given dataset name. If no dataset_name is specified, the project name is used.

Parameters
  • dataset_name – str Name of the dataset. Default is the project name

  • dataset_timestamp – datetime.datetime, optional The timestamp associated with the dataset. Could be the timestamp for the batch, or the timestamp for the window that you are tracking

  • tags – dict Tag the data with groupable information. For example, you might want to tag your data with the stage information (development, testing, production etc…)

  • metadata – dict Useful to debug the data source. You can associate non-groupable information in this field, such as the hostname.

  • session_timestamp – datetime.datetime, optional Override the timestamp associated with the session. Normally you shouldn’t need to override this value

  • segments – Can be either a list of tag key-value pairs for tracking data segments, or a list of tag keys that whylogs uses to split up the data in the backend

dataset_name :Optional[str]
dataset_timestamp :Optional[datetime.datetime]
session_timestamp :Optional[datetime.datetime]
tags :Dict[str, str]
metadata :Dict[str, str]
segments :Optional[Union[List[Dict], List[str]]]
profile_full_dataset :bool = False
with_rotation_time :str
cache_size :int = 1
constraints :whylogs.core.statistics.constraints.DatasetConstraints
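
The "create a new logger or return an existing one" behaviour described above can be pictured as a dictionary lookup on a hashable key. The following is a minimal sketch under stated assumptions, not the whylogs implementation: `LoggerKeySketch` and `LoggerRegistrySketch` are hypothetical names, the key carries only three of the fields listed above, and a plain `object()` stands in for a real logger.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class LoggerKeySketch:
    """Hypothetical, simplified stand-in for _LoggerKey (three fields only)."""

    dataset_name: Optional[str] = None
    dataset_timestamp: Optional[datetime] = None
    # Tags are stored as a sorted tuple of pairs so the key stays hashable.
    tags: Tuple[Tuple[str, str], ...] = ()


class LoggerRegistrySketch:
    """Returns the cached object for a key, creating it on first use."""

    def __init__(self, project: str):
        self.project = project
        self._loggers: Dict[LoggerKeySketch, object] = {}

    def logger(self, dataset_name=None, dataset_timestamp=None, tags=None):
        key = LoggerKeySketch(
            # Fall back to the project name when no dataset name is given.
            dataset_name=dataset_name or self.project,
            dataset_timestamp=dataset_timestamp,
            tags=tuple(sorted((tags or {}).items())),
        )
        if key not in self._loggers:
            self._loggers[key] = object()  # placeholder for a real logger
        return self._loggers[key]


registry = LoggerRegistrySketch(project="demo-project")
a = registry.logger()                             # defaults to the project name
b = registry.logger(dataset_name="demo-project")  # same key, same object
c = registry.logger(dataset_name="other")         # different key, new object
```

Because the key is a frozen dataclass, two requests with equal field values hash to the same entry, which is what makes the second call return the existing object.
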
whylogs.app.session.defaultLoggerArgs
class whylogs.app.session.Session(project: Optional[str] = None, pipeline: Optional[str] = None, writers: Optional[List[whylogs.app.writers.Writer]] = None, metadata_writer: Optional[whylogs.app.metadata_writer.MetadataWriter] = None, verbose: bool = False, with_rotation_time: str = None, cache_size: int = None, report_progress: bool = False)
Parameters
  • project (str) – The project name. We will default to the project name when logging a dataset if the dataset name is not specified

  • pipeline (str) – Name of the pipeline associated with this session

  • writers (list) – configuration for the output writers. This is where the log data will go

  • verbose (bool) – enable verbose logging or not. Default is False

__enter__(self)
__exit__(self, tpe, value, traceback)
__repr__(self)

Return repr(self).

get_config(self)
is_active(self)
logger(self, dataset_name: Optional[str] = None, dataset_timestamp: Optional[datetime.datetime] = None, session_timestamp: Optional[datetime.datetime] = None, tags: Dict[str, str] = None, metadata: Dict[str, str] = None, segments: Optional[Union[List[Dict], List[str], str]] = None, profile_full_dataset: bool = False, with_rotation_time: str = None, cache_size: int = 1, constraints: whylogs.core.statistics.constraints.DatasetConstraints = None) whylogs.app.logger.Logger

Create a new logger or return an existing one for a given dataset name. If no dataset_name is specified, the project name is used.

Parameters
  • dataset_name – name of the dataset

  • dataset_timestamp – timestamp of the dataset. Default to now

  • session_timestamp – timestamp of the session. Inherits from the session

  • tags – metadata associated with the profile

  • metadata – same as tags. Will be deprecated

  • segments – slice of data that the profile belongs to

  • profile_full_dataset – when segmenting dataset, an option to keep the full unsegmented profile of the dataset

  • with_rotation_time – rotation time in minutes or hours (“1m”, “1h”)

  • cache_size – size of the segment cache

  • constraints – whylogs constraints to monitor against
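
To make the `with_rotation_time` format concrete, here is a small sketch of how strings such as "1m" and "1h" could be turned into a duration. `parse_rotation_time` is a hypothetical helper written for illustration; it is not part of whylogs, and the library may accept units beyond the two mentioned above.

```python
import re
from datetime import timedelta

# Only the two units named in the parameter description are handled here.
_UNITS = {"m": "minutes", "h": "hours"}


def parse_rotation_time(value: str) -> timedelta:
    """Convert a string like '1m' or '2h' into a timedelta (illustrative only)."""
    match = re.fullmatch(r"(\d+)([mh])", value.strip().lower())
    if match is None:
        raise ValueError(f"unsupported rotation time: {value!r}")
    amount, unit = int(match.group(1)), match.group(2)
    return timedelta(**{_UNITS[unit]: amount})
```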

get_logger(self, dataset_name: str = None)
log_dataframe(self, df: pandas.DataFrame, dataset_name: Optional[str] = None, dataset_timestamp: Optional[datetime.datetime] = None, session_timestamp: Optional[datetime.datetime] = None, tags: Dict[str, str] = None, metadata: Dict[str, str] = None, segments: Optional[Union[List[Dict], List[str], str]] = None, profile_full_dataset: bool = False, constraints: whylogs.core.statistics.constraints.DatasetConstraints = None) Optional[whylogs.core.DatasetProfile]

Perform statistics calculations and log a pandas dataframe

Parameters
  • df – the dataframe to profile

  • dataset_name – name of the dataset

  • dataset_timestamp – the timestamp for the dataset

  • session_timestamp – the timestamp for the session. Override the default one

  • tags – the tags for the profile. Useful when merging

  • metadata – information about this current profile. Can be discarded when merging

  • segments – One of: an autosegmentation source (“auto” or “local”); a list of tag key-value pairs for tracking data segments; a list of tag keys for which every value is tracked; or None, in which case no segments are used

  • profile_full_dataset – when segmenting dataset, an option to keep the full unsegmented profile of the dataset

Returns

a dataset profile if the session is active
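
The "list of tag keys for which we will track every value" form of `segments` amounts to grouping rows by the values they hold for those keys. The sketch below illustrates that idea on plain dictionaries; `split_by_keys` is a hypothetical helper and says nothing about how whylogs stores or profiles segments internally.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Dict[str, str]
SegmentKey = Tuple[Tuple[str, str], ...]


def split_by_keys(rows: List[Row], keys: List[str]) -> Dict[SegmentKey, List[Row]]:
    """Group rows by the values they hold for the given tag keys."""
    groups: Dict[SegmentKey, List[Row]] = defaultdict(list)
    for row in rows:
        # One segment per distinct combination of values for the chosen keys.
        segment = tuple((key, row[key]) for key in keys)
        groups[segment].append(row)
    return dict(groups)


rows = [
    {"stage": "prod", "region": "us"},
    {"stage": "dev", "region": "us"},
    {"stage": "prod", "region": "eu"},
]
by_stage = split_by_keys(rows, ["stage"])
by_stage_and_region = split_by_keys(rows, ["stage", "region"])
```

Each resulting group would correspond to one segment profile; adding more keys multiplies the number of segments, which is why an upper bound on segment combinations matters.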

profile_dataframe(self, df: pandas.DataFrame, dataset_name: Optional[str] = None, dataset_timestamp: Optional[datetime.datetime] = None, session_timestamp: Optional[datetime.datetime] = None, tags: Dict[str, str] = None, metadata: Dict[str, str] = None) Optional[whylogs.core.DatasetProfile]

Profile a Pandas dataframe without actually writing data to disk. This is useful when you just want to quickly capture and explore a dataset profile.

Parameters
  • df – the dataframe to profile

  • dataset_name – name of the dataset

  • dataset_timestamp – the timestamp for the dataset

  • session_timestamp – the timestamp for the session. Override the default one

  • tags – the tags for the profile. Useful when merging

  • metadata – information about this current profile. Can be discarded when merging

Returns

a dataset profile if the session is active

new_profile(self, dataset_name: Optional[str] = None, dataset_timestamp: Optional[datetime.datetime] = None, session_timestamp: Optional[datetime.datetime] = None, tags: Dict[str, str] = None, metadata: Dict[str, str] = None) Optional[whylogs.core.DatasetProfile]

Create an empty dataset profile with the metadata from the session.

Parameters
  • dataset_name – name of the dataset

  • dataset_timestamp – the timestamp for the dataset

  • session_timestamp – the timestamp for the session. Override the default one

  • tags – the tags for the profile. Useful when merging

  • metadata – information about this current profile. Can be discarded when merging

Returns

a dataset profile if the session is active

estimate_segments(self, df: pandas.DataFrame, name: str, target_field: str = None, max_segments: int = 30, dry_run: bool = False) Optional[Union[List[Dict], List[str]]]

Estimates the most important features and values on which to segment data profiling using entropy-based methods.

Parameters
  • df – the dataframe of data to profile

  • name – name for discovery in the logger, automatically applied to loggers with same dataset_name

  • target_field – target field (optional)

  • max_segments – upper threshold for total combinations of segments, default 30

  • dry_run – run calculation but do not write results to metadata

Returns

a list of segmentation feature names
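
As a rough illustration of what an entropy-based choice of segmentation features might look like, the sketch below ranks columns by Shannon entropy and adds them while the number of value combinations stays within `max_segments`. This is a deliberately simplified stand-in and not the algorithm whylogs uses: the function names are hypothetical, it ignores `target_field`, and it works on a list of dictionaries rather than a pandas DataFrame.

```python
import math
from collections import Counter
from typing import Dict, Hashable, Iterable, List


def shannon_entropy(values: Iterable[Hashable]) -> float:
    """Shannon entropy, in bits, of the empirical distribution of values."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def estimate_segment_keys(rows: List[Dict[str, Hashable]], max_segments: int = 30) -> List[str]:
    """Pick columns by descending entropy while capping value combinations."""
    if not rows:
        return []
    columns = list(rows[0].keys())
    entropy = {c: shannon_entropy(r[c] for r in rows) for c in columns}
    distinct = {c: len({r[c] for r in rows}) for c in columns}

    chosen: List[str] = []
    combinations = 1
    for col in sorted(columns, key=lambda c: entropy[c], reverse=True):
        if entropy[col] == 0.0:
            continue  # a constant column cannot split the data
        if combinations * distinct[col] > max_segments:
            continue  # adding this column would exceed the segment budget
        chosen.append(col)
        combinations *= distinct[col]
    return chosen


rows = [
    {"id": 1, "stage": "prod", "source": "batch"},
    {"id": 2, "stage": "prod", "source": "batch"},
    {"id": 3, "stage": "dev", "source": "batch"},
    {"id": 4, "stage": "dev", "source": "batch"},
]
tight = estimate_segment_keys(rows, max_segments=3)
loose = estimate_segment_keys(rows, max_segments=30)
```

With a budget of 3 the unique `id` column (four distinct values) is rejected and only `stage` survives; with the default budget both are kept, while the constant `source` column is never selected.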

close(self)

Deactivate this session and flush all associated loggers
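
Since the class defines `__enter__` and `__exit__`, a session can presumably be used in a `with` block so that it is closed on exit. The sketch below shows that pattern with a hypothetical `SessionSketch` class: lists stand in for loggers, and "flushing" just copies buffered records. It illustrates the close-on-exit idea only and is not the whylogs implementation.

```python
from typing import Dict, List, Tuple


class SessionSketch:
    """Hypothetical stand-in showing close-on-exit behaviour."""

    def __init__(self):
        self._active = True
        self._loggers: Dict[str, List[dict]] = {}
        self.flushed: List[Tuple[str, List[dict]]] = []

    def __enter__(self):
        return self

    def __exit__(self, tpe, value, traceback):
        # Leaving the with-block always closes the session.
        self.close()

    def is_active(self) -> bool:
        return self._active

    def logger(self, dataset_name: str) -> List[dict]:
        if not self._active:
            raise RuntimeError("session is closed")
        return self._loggers.setdefault(dataset_name, [])

    def close(self) -> None:
        if not self._active:
            return  # closing twice is a no-op
        self._active = False
        for name, buffered in self._loggers.items():
            self.flushed.append((name, list(buffered)))  # "flush" each logger
        self._loggers.clear()


with SessionSketch() as session:
    session.logger("demo").append({"x": 1})
```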

remove_logger(self, dataset_name: str)

Remove a logger from the dataset. This is called by the logger when it’s being closed

Parameters
  • dataset_name – the name of the dataset. Used to identify the logger

Returns

None

whylogs.app.session._use_whylabs_client = False
whylogs.app.session.session_from_config(config: whylogs.app.config.SessionConfig = None, config_path: Optional[str] = '') Session

Construct a whylogs session from a SessionConfig or from a config_path

whylogs.app.session._session
whylogs.app.session.reset_default_session()

Reset and deactivate the global whylogs logging session.

whylogs.app.session.start_whylabs_session(path_to_config: Optional[str] = None, report_progress: Optional[bool] = False)
whylogs.app.session.get_or_create_session(path_to_config: Optional[str] = None, report_progress: Optional[bool] = False)

Retrieve the current active global session.

If no active session exists, attempt to load config and create a new session.

If an active session exists, return the session without loading new config.

Returns

The global active session

Return type

Session
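
The get-or-create behaviour described above follows a common module-level singleton pattern. The sketch below reproduces the stated rules (reuse an active session without loading new config; otherwise create one) using hypothetical names with a `_sketch` suffix and a trivial `GlobalSessionSketch` class. Config loading is reduced to remembering the path, which is an assumption made for brevity.

```python
from typing import Optional


class GlobalSessionSketch:
    """Hypothetical minimal session that only remembers its config path."""

    def __init__(self, config_path: Optional[str] = None):
        self.config_path = config_path
        self._active = True

    def is_active(self) -> bool:
        return self._active

    def close(self) -> None:
        self._active = False


_session_sketch: Optional[GlobalSessionSketch] = None


def get_or_create_session_sketch(path_to_config: Optional[str] = None) -> GlobalSessionSketch:
    """Return the active global session, creating one only if none is active."""
    global _session_sketch
    if _session_sketch is not None and _session_sketch.is_active():
        return _session_sketch  # existing session wins; the new path is ignored
    _session_sketch = GlobalSessionSketch(config_path=path_to_config)
    return _session_sketch


def reset_default_session_sketch() -> None:
    """Deactivate and forget the global session."""
    global _session_sketch
    if _session_sketch is not None:
        _session_sketch.close()
    _session_sketch = None


first = get_or_create_session_sketch("a.yaml")
second = get_or_create_session_sketch("b.yaml")  # returns the first session
reset_default_session_sketch()
third = get_or_create_session_sketch("b.yaml")   # a fresh session after reset
```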

whylogs.app.session.get_session()

Retrieve the logging session without altering or activating it.

Returns

session – The global session

Return type

Session

whylogs.app.session.get_logger()

Retrieve the global session logger

Returns

ylog – The global session logger

Return type

whylogs.app.logger.Logger