whylogs.app.session¶
whylogs logging session
Module Contents¶
Classes¶
_LoggerKey – Create a new logger or return an existing one for a given dataset name.
Session
Functions¶
session_from_config() – Construct a whylogs session from a SessionConfig or from a config_path
reset_default_session() – Reset and deactivate the global whylogs logging session.
start_whylabs_session()
get_or_create_session() – Retrieve the current active global session.
get_session() – Retrieve the logging session without altering or activating it.
get_logger() – Retrieve the global session logger
Attributes¶
defaultLoggerArgs
_use_whylabs_client
_session
- class whylogs.app.session._LoggerKey¶
Create a new logger or return an existing one for a given dataset name. If no dataset_name is specified, the project name is used.
- Parameters
dataset_name (str) – Name of the dataset. Default is the project name
dataset_timestamp (datetime.datetime, optional) – The timestamp associated with the dataset. Could be the timestamp for the batch, or the timestamp for the window that you are tracking
tags (dict) – Tag the data with groupable information. For example, you might want to tag your data with the stage information (development, testing, production, etc.)
metadata (dict) – Useful to debug the data source. You can associate non-groupable information in this field, such as the hostname
session_timestamp (datetime.datetime, optional) – Override the timestamp associated with the session. Normally you shouldn’t need to override this value
segments – Can be either:
- List of tag key value pairs for tracking data segments
- List of tag keys for whylogs to split up the data in the backend
- dataset_name :Optional[str]¶
- dataset_timestamp :Optional[datetime.datetime]¶
- session_timestamp :Optional[datetime.datetime]¶
- tags :Dict[str, str]¶
- metadata :Dict[str, str]¶
- segments :Optional[Union[List[Dict], List[str]]]¶
- profile_full_dataset :bool = False¶
- with_rotation_time :str¶
- cache_size :int = 1¶
- constraints :whylogs.core.statistics.constraints.DatasetConstraints¶
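The fields above act as the lookup key that lets a session hand back a logger it has already created. The following is a minimal sketch of that create-or-return behaviour, not the whylogs implementation: `LoggerKeySketch`, `LoggerCacheSketch`, and the JSON-based `digest` are hypothetical names introduced for illustration, and only three of the fields are modelled.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, Optional


@dataclass
class LoggerKeySketch:
    """Hypothetical stand-in for a logger key; not the whylogs class."""

    dataset_name: Optional[str] = None
    tags: Dict[str, str] = field(default_factory=dict)
    cache_size: int = 1

    def digest(self) -> str:
        # Serialize with sorted keys so equal field values give equal digests.
        return json.dumps(asdict(self), sort_keys=True)


class LoggerCacheSketch:
    """Returns the same logger object for keys with identical fields."""

    def __init__(self, project: str) -> None:
        self.project = project
        self._loggers: Dict[str, object] = {}

    def logger(self, dataset_name: Optional[str] = None, **kwargs) -> object:
        # Fall back to the project name when no dataset name is given.
        key = LoggerKeySketch(dataset_name=dataset_name or self.project, **kwargs)
        return self._loggers.setdefault(key.digest(), object())


cache = LoggerCacheSketch(project="demo-project")
first = cache.logger()                              # keyed by the project name
second = cache.logger(dataset_name="demo-project")  # same key, same logger
other = cache.logger(dataset_name="other-dataset")  # different key
```

In this sketch, two calls that resolve to the same field values share one logger, while any differing field (here, the dataset name) yields a separate one.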
- whylogs.app.session.defaultLoggerArgs¶
- class whylogs.app.session.Session(project: Optional[str] = None, pipeline: Optional[str] = None, writers: Optional[List[whylogs.app.writers.Writer]] = None, metadata_writer: Optional[whylogs.app.metadata_writer.MetadataWriter] = None, verbose: bool = False, with_rotation_time: str = None, cache_size: int = None, report_progress: bool = False)¶
- Parameters
project (str) – The project name. We will default to the project name when logging a dataset if the dataset name is not specified
pipeline (str) – Name of the pipeline associated with this session
writers (list) – configuration for the output writers. This is where the log data will go
verbose (bool) – whether to enable verbose logging. Default is False
- __enter__(self)¶
- __exit__(self, tpe, value, traceback)¶
- __repr__(self)¶
Return repr(self).
- get_config(self)¶
- is_active(self)¶
- logger(self, dataset_name: Optional[str] = None, dataset_timestamp: Optional[datetime.datetime] = None, session_timestamp: Optional[datetime.datetime] = None, tags: Dict[str, str] = None, metadata: Dict[str, str] = None, segments: Optional[Union[List[Dict], List[str], str]] = None, profile_full_dataset: bool = False, with_rotation_time: str = None, cache_size: int = 1, constraints: whylogs.core.statistics.constraints.DatasetConstraints = None) whylogs.app.logger.Logger ¶
Create a new logger or return an existing one for a given dataset name. If no dataset_name is specified, the project name is used.
- Parameters
dataset_name – name of the dataset
dataset_timestamp – timestamp of the dataset. Default to now
session_timestamp – timestamp of the session. Inherits from the session
tags – metadata associated with the profile
metadata – same as tags. Will be deprecated
segments – slice of data that the profile belongs to
profile_full_dataset – when segmenting dataset, an option to keep the full unsegmented profile of the dataset
with_rotation_time – rotation time in minutes or hours (“1m”, “1h”)
cache_size – size of the segment cache
constraints – whylogs constraints to monitor against
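The `with_rotation_time` strings above pair an integer with a unit suffix. As a rough illustration of how such a value could be interpreted, the sketch below converts it to a `timedelta`. `parse_rotation_time` is a hypothetical helper, not a whylogs function, and it assumes only the two suffixes named in the description (`m` and `h`).

```python
import re
from datetime import timedelta

# Assumption: only the two suffixes named in the parameter description.
_UNITS = {"m": "minutes", "h": "hours"}


def parse_rotation_time(value: str) -> timedelta:
    """Hypothetical helper: turn a string such as '1m' or '1h' into a timedelta."""
    match = re.fullmatch(r"(\d+)([mh])", value.strip().lower())
    if match is None:
        raise ValueError(f"unsupported rotation time: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


one_minute = parse_rotation_time("1m")
one_hour = parse_rotation_time("1h")
```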
- get_logger(self, dataset_name: str = None)¶
- log_dataframe(self, df: pandas.DataFrame, dataset_name: Optional[str] = None, dataset_timestamp: Optional[datetime.datetime] = None, session_timestamp: Optional[datetime.datetime] = None, tags: Dict[str, str] = None, metadata: Dict[str, str] = None, segments: Optional[Union[List[Dict], List[str], str]] = None, profile_full_dataset: bool = False, constraints: whylogs.core.statistics.constraints.DatasetConstraints = None) Optional[whylogs.core.DatasetProfile] ¶
Perform statistics calculations and log a pandas dataframe
- Parameters
df – the dataframe to profile
dataset_name – name of the dataset
dataset_timestamp – the timestamp for the dataset
session_timestamp – the timestamp for the session. Override the default one
tags – the tags for the profile. Useful when merging
metadata – information about this current profile. Can be discarded when merging
segments – Can be either:
- Autosegmentation source, one of [“auto”, “local”]
- List of tag key value pairs for tracking data segments
- List of tag keys for which we will track every value
- None, no segments will be used
profile_full_dataset – when segmenting dataset, an option to keep the full unsegmented profile of the dataset
- Returns
a dataset profile if the session is active
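To make the two list forms of `segments` concrete, here is a small pure-Python sketch of how rows might be grouped under each. It is an illustration only: `split_into_segments` is a hypothetical helper, rows are plain dictionaries rather than a dataframe, and the `{"key": ..., "value": ...}` shape for key/value pairs is an assumption rather than something stated above.

```python
from collections import defaultdict
from typing import Dict, List, Tuple, Union

Row = Dict[str, str]
Segments = Union[List[Dict[str, str]], List[str]]


def split_into_segments(rows: List[Row], segments: Segments) -> Dict[Tuple, List[Row]]:
    """Hypothetical helper: group rows by segment; not whylogs code."""
    grouped: Dict[Tuple, List[Row]] = defaultdict(list)
    if segments and isinstance(segments[0], str):
        # List of tag keys: every distinct value combination is its own segment.
        for row in rows:
            grouped[tuple((key, row[key]) for key in segments)].append(row)
    else:
        # List of key/value pairs: keep only rows that match a listed pair.
        for row in rows:
            for pair in segments:
                if row.get(pair["key"]) == pair["value"]:
                    grouped[((pair["key"], pair["value"]),)].append(row)
    return dict(grouped)


rows = [
    {"stage": "prod", "region": "us"},
    {"stage": "dev", "region": "us"},
    {"stage": "prod", "region": "eu"},
]
by_keys = split_into_segments(rows, ["stage"])
by_pairs = split_into_segments(rows, [{"key": "region", "value": "eu"}])
```

With a list of keys, every value of `stage` gets its own group. With explicit pairs, only the rows matching a listed pair are tracked.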
- profile_dataframe(self, df: pandas.DataFrame, dataset_name: Optional[str] = None, dataset_timestamp: Optional[datetime.datetime] = None, session_timestamp: Optional[datetime.datetime] = None, tags: Dict[str, str] = None, metadata: Dict[str, str] = None) Optional[whylogs.core.DatasetProfile] ¶
Profile a Pandas dataframe without actually writing data to disk. This is useful when you just want to quickly capture and explore a dataset profile.
- Parameters
df – the dataframe to profile
dataset_name – name of the dataset
dataset_timestamp – the timestamp for the dataset
session_timestamp – the timestamp for the session. Override the default one
tags – the tags for the profile. Useful when merging
metadata – information about this current profile. Can be discarded when merging
- Returns
a dataset profile if the session is active
- new_profile(self, dataset_name: Optional[str] = None, dataset_timestamp: Optional[datetime.datetime] = None, session_timestamp: Optional[datetime.datetime] = None, tags: Dict[str, str] = None, metadata: Dict[str, str] = None) Optional[whylogs.core.DatasetProfile] ¶
Create an empty dataset profile with the metadata from the session.
- Parameters
dataset_name – name of the dataset
dataset_timestamp – the timestamp for the dataset
session_timestamp – the timestamp for the session. Override the default one
tags – the tags for the profile. Useful when merging
metadata – information about this current profile. Can be discarded when merging
- Returns
a dataset profile if the session is active
- estimate_segments(self, df: pandas.DataFrame, name: str, target_field: str = None, max_segments: int = 30, dry_run: bool = False) Optional[Union[List[Dict], List[str]]] ¶
Estimates the most important features and values on which to segment data profiling using entropy-based methods.
- Parameters
df – the dataframe of data to profile
name – name for discovery in the logger, automatically applied to loggers with same dataset_name
target_field – target field (optional)
max_segments – upper threshold for total combinations of segments, default 30
dry_run – run calculation but do not write results to metadata
- Returns
a list of segmentation feature names
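The description above does not spell out the entropy-based method, so the following is only a sketch of one plausible reading, not the whylogs algorithm. It ranks categorical columns by Shannon entropy and keeps adding features while the product of their cardinalities stays within `max_segments`. Both function names and the ranking rule are assumptions, and `target_field` is ignored.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def shannon_entropy(values: Sequence[str]) -> float:
    """Entropy in bits of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def rank_segment_candidates(columns: Dict[str, List[str]], max_segments: int = 30) -> List[str]:
    """Hypothetical ranking: highest-entropy columns first, within a budget."""
    ranked = sorted(columns, key=lambda name: shannon_entropy(columns[name]), reverse=True)
    chosen: List[str] = []
    combinations = 1
    for name in ranked:
        cardinality = len(set(columns[name]))
        # Skip constant columns and any feature that would push the number
        # of segment combinations past the cap.
        if cardinality < 2 or combinations * cardinality > max_segments:
            continue
        chosen.append(name)
        combinations *= cardinality
    return chosen


columns = {
    "stage": ["prod", "dev", "prod", "dev"],
    "host": ["a", "b", "c", "d"],
    "constant": ["x", "x", "x", "x"],
}
chosen = rank_segment_candidates(columns, max_segments=8)
```

Here `host` (four values) and `stage` (two values) together give eight combinations, which fits the cap of 8. Lowering the cap to 4 would leave room for `host` alone.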
- close(self)¶
Deactivate this session and flush all associated loggers
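Because the class defines `__enter__` and `__exit__`, a session can be used in a `with` block. The sketch below shows the general pattern under the assumption that leaving the block calls `close`, which deactivates the session and flushes its loggers. `SessionSketch` and `LoggerSketch` are hypothetical stand-ins, not the whylogs classes.

```python
from typing import Dict


class LoggerSketch:
    """Hypothetical logger that records whether it has been flushed."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.flushed = False

    def flush(self) -> None:
        self.flushed = True


class SessionSketch:
    """Hypothetical session; mirrors only the enter/exit/close behaviour."""

    def __init__(self) -> None:
        self._active = True
        self._loggers: Dict[str, LoggerSketch] = {}

    def __enter__(self) -> "SessionSketch":
        return self

    def __exit__(self, tpe, value, traceback) -> None:
        # Assumption: leaving the block closes the session.
        self.close()

    def is_active(self) -> bool:
        return self._active

    def logger(self, dataset_name: str) -> LoggerSketch:
        if not self._active:
            raise RuntimeError("session is not active")
        return self._loggers.setdefault(dataset_name, LoggerSketch(dataset_name))

    def close(self) -> None:
        # Deactivate first, then flush every logger created by this session.
        self._active = False
        for logger in self._loggers.values():
            logger.flush()
        self._loggers.clear()


with SessionSketch() as session:
    ylog = session.logger("demo")
```

After the block exits, `ylog` has been flushed and `session.is_active()` returns `False`.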
- remove_logger(self, dataset_name: str)¶
Remove a logger from the dataset. This is called by the logger when it’s being closed
- Parameters
dataset_name – the name of the dataset. Used to identify the logger
- Returns
None
- whylogs.app.session._use_whylabs_client = False¶
- whylogs.app.session.session_from_config(config: whylogs.app.config.SessionConfig = None, config_path: Optional[str] = '') Session ¶
Construct a whylogs session from a SessionConfig or from a config_path
- whylogs.app.session._session¶
- whylogs.app.session.reset_default_session()¶
Reset and deactivate the global whylogs logging session.
- whylogs.app.session.start_whylabs_session(path_to_config: Optional[str] = None, report_progress: Optional[bool] = False)¶
- whylogs.app.session.get_or_create_session(path_to_config: Optional[str] = None, report_progress: Optional[bool] = False)¶
Retrieve the current active global session.
If no active session exists, attempt to load config and create a new session.
If an active session exists, return the session without loading new config.
- Returns
The global active session
- Return type
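The get-or-create and reset behaviour described for the global session follows a familiar module-level singleton pattern. The sketch below illustrates that pattern only. `GlobalSessionSketch` and the `*_sketch` functions are hypothetical names, and config loading is omitted entirely.

```python
from typing import Optional


class GlobalSessionSketch:
    """Hypothetical session object holding only an active flag."""

    def __init__(self) -> None:
        self.active = True

    def close(self) -> None:
        self.active = False


_session_sketch: Optional[GlobalSessionSketch] = None


def get_or_create_session_sketch() -> GlobalSessionSketch:
    """Return the active global session, creating one if none is active."""
    global _session_sketch
    if _session_sketch is not None and _session_sketch.active:
        return _session_sketch
    _session_sketch = GlobalSessionSketch()
    return _session_sketch


def reset_default_session_sketch() -> None:
    """Deactivate and forget the global session, if there is one."""
    global _session_sketch
    if _session_sketch is not None:
        _session_sketch.close()
    _session_sketch = None


a = get_or_create_session_sketch()
b = get_or_create_session_sketch()  # reuses the active session
reset_default_session_sketch()      # deactivates it
c = get_or_create_session_sketch()  # a fresh session is created
```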
- whylogs.app.session.get_session()¶
Retrieve the logging session without altering or activating it.
- Returns
session – The global session
- Return type
- whylogs.app.session.get_logger()¶
Retrieve the global session logger
- Returns
ylog – The global session logger
- Return type