whylogs.api
Subpackages
whylogs.api.fugue
whylogs.api.logger
whylogs.api.reader
whylogs.api.store
whylogs.api.usage_stats
whylogs.api.whylabs
whylogs.api.writer
whylogs.api.writer.gcs
whylogs.api.writer.local
whylogs.api.writer.mlflow
whylogs.api.writer.s3
whylogs.api.writer.whylabs
whylogs.api.writer.whylabs_base
whylogs.api.writer.whylabs_batch_writer
whylogs.api.writer.whylabs_client
whylogs.api.writer.whylabs_estimation_result_writer
whylogs.api.writer.whylabs_reference_writer
whylogs.api.writer.whylabs_transaction_writer
whylogs.api.writer.writer
Submodules
Package Contents
Classes
ResultSet | A holder object for profiling results.
Functions
log_classification_metrics | Function to track metrics based on validation data.
log_regression_metrics | Function to track regression metrics based on validation data.
- whylogs.api.profiling(*, schema: Optional[whylogs.core.DatasetSchema] = None)
- Parameters
schema (Optional[whylogs.core.DatasetSchema]) –
- class whylogs.api.ResultSet
Bases: whylogs.api.writer.writer._Writable, abc.ABC
A holder object for profiling results.
A whylogs.log call can result in more than one profile. This wrapper class simplifies the navigation among these profiles.
Note that currently we only hold one profile but we’re planning to add other kinds of profiles such as segmented profiles here.
- property performance_metrics: Optional[whylogs.core.model_performance_metrics.ModelPerformanceMetrics]
- Return type
Optional[whylogs.core.model_performance_metrics.ModelPerformanceMetrics]
- abstract view() → Optional[whylogs.core.DatasetProfileView]
- Return type
Optional[whylogs.core.DatasetProfileView]
- abstract profile() → Optional[whylogs.core.DatasetProfile]
- Return type
Optional[whylogs.core.DatasetProfile]
- get_writables() → Optional[List[whylogs.api.writer.writer._Writable]]
- Return type
Optional[List[whylogs.api.writer.writer._Writable]]
- set_dataset_timestamp(dataset_timestamp: datetime.datetime) → None
- Parameters
dataset_timestamp (datetime.datetime) –
- Return type
None
- add_model_performance_metrics(metrics: whylogs.core.model_performance_metrics.ModelPerformanceMetrics) → None
- Parameters
metrics (whylogs.core.model_performance_metrics.ModelPerformanceMetrics) –
- Return type
None
- add_metric(name: str, metric: whylogs.core.metrics.metrics.Metric) → None
- Parameters
name (str) –
metric (whylogs.core.metrics.metrics.Metric) –
- Return type
None
- whylogs.api.log(obj: Any = None, *, pandas: Optional[whylogs.core.stubs.pd.DataFrame] = None, row: Optional[Dict[str, Any]] = None, schema: Optional[whylogs.core.DatasetSchema] = None, name: Optional[str] = None, multiple: Optional[Dict[str, Loggable]] = None, dataset_timestamp: Optional[datetime.datetime] = None, trace_id: Optional[str] = None, tags: Optional[List[str]] = None, segment_key_values: Optional[Dict[str, str]] = None, debug_event: Optional[Dict[str, Any]] = None) → result_set.ResultSet
- Parameters
obj (Any) –
pandas (Optional[whylogs.core.stubs.pd.DataFrame]) –
row (Optional[Dict[str, Any]]) –
schema (Optional[whylogs.core.DatasetSchema]) –
name (Optional[str]) –
multiple (Optional[Dict[str, Loggable]]) –
dataset_timestamp (Optional[datetime.datetime]) –
trace_id (Optional[str]) –
tags (Optional[List[str]]) –
segment_key_values (Optional[Dict[str, str]]) –
debug_event (Optional[Dict[str, Any]]) –
- Return type
result_set.ResultSet
- whylogs.api.log_classification_metrics(data: whylogs.core.stubs.pd.DataFrame, target_column: str, prediction_column: str, score_column: Optional[str] = None, schema: Optional[whylogs.core.DatasetSchema] = None, log_full_data: bool = False, dataset_timestamp: Optional[datetime.datetime] = None) → result_set.ResultSet
Function to track metrics based on validation data. The user may also pass the column names associated with the target, prediction, and/or score.
- Parameters
data (pd.DataFrame) – Dataframe with the data to log.
target_column (str) – Column name for the actual validated values.
prediction_column (str) – Column name for the predicted values.
score_column (Optional[str], optional) – Column name for the score associated with each prediction. If None, all scores are set to 1. By default None.
schema (Optional[DatasetSchema], optional) – Defines the schema for tracking metrics in whylogs. By default None.
log_full_data (bool, optional) – Whether to log the complete dataframe. If True, the complete dataframe is logged in addition to the classification metrics. If False, only the calculated classification metrics are logged. In a typical production use case, the ground truth might not be available when the remaining data is generated; to avoid profiling the input features twice, consider leaving this as False. By default False.
dataset_timestamp (Optional[datetime], optional) – Timestamp of the dataset. By default None.
- Return type
result_set.ResultSet
Examples
import pandas as pd
import whylogs as why

data = {
    "product": ["milk", "carrot", "cheese", "broccoli"],
    "category": ["dairies", "vegetables", "dairies", "vegetables"],
    "output_discount": [0, 0, 1, 1],
    "output_prediction": [0, 0, 0, 1],
}
df = pd.DataFrame(data)

# Log the classification metrics together with a profile of the full dataframe.
results = why.log_classification_metrics(
    df,
    target_column="output_discount",
    prediction_column="output_prediction",
    log_full_data=True,
)
- whylogs.api.log_regression_metrics(data: whylogs.core.stubs.pd.DataFrame, target_column: str, prediction_column: str, schema: Optional[whylogs.core.DatasetSchema] = None, log_full_data: bool = False, dataset_timestamp: Optional[datetime.datetime] = None) → result_set.ResultSet
Function to track regression metrics based on validation data. The user may also pass the column names associated with the target and prediction.
- Parameters
data (pd.DataFrame) – Dataframe with the data to log.
target_column (str) – Column name for the target values.
prediction_column (str) – Column name for the predicted values.
schema (Optional[DatasetSchema], optional) – Defines the schema for tracking metrics in whylogs. By default None.
log_full_data (bool, optional) – Whether to log the complete dataframe. If True, the complete dataframe is logged in addition to the regression metrics. If False, only the calculated regression metrics are logged. In a typical production use case, the ground truth might not be available when the remaining data is generated; to avoid profiling the input features twice, consider leaving this as False. By default False.
dataset_timestamp (Optional[datetime], optional) – Timestamp of the dataset. By default None.
- Return type
result_set.ResultSet
Examples
import pandas as pd
import whylogs as why

# One row per observation; column names must match the arguments below.
df = pd.DataFrame({
    "target_temperature": [10.5, 24.3, 15.6],
    "predicted_temperature": [9.12, 26.42, 13.12],
})

results = why.log_regression_metrics(
    df,
    target_column="target_temperature",
    prediction_column="predicted_temperature",
)
- whylogs.api.write(profile: whylogs.core.DatasetProfile, base_dir: Optional[str] = None, filename: Optional[str] = None) → None
- Parameters
profile (whylogs.core.DatasetProfile) –
base_dir (Optional[str]) –
filename (Optional[str]) –
- Return type
None