`whylogs.api`#

Submodules#

whylogs.api.annotations

Package Contents#

Classes#

ResultSet

A holder object for profiling results.

Functions#

`profiling`(*[, schema])
`log`(→ result_set.ResultSet)
`log_classification_metrics`(→ result_set.ResultSet) – Function to track metrics based on validation data.
`log_regression_metrics`(→ result_set.ResultSet) – Function to track regression metrics based on validation data.
`read`(→ result_set.ResultSet)
`reader`(→ result_set.ResultSetReader)
`write`(→ None)

whylogs.api.profiling(*, schema: Optional[whylogs.core.DatasetSchema] = None)#

Parameters: schema (Optional[whylogs.core.DatasetSchema]) –

class whylogs.api.ResultSet#

Bases: abc.ABC

A holder object for profiling results.

A whylogs.log call can produce more than one profile. This wrapper class simplifies navigating among those profiles.

Note that a ResultSet currently holds only one profile; other kinds of profiles, such as segmented profiles, are planned.

property metadata: Optional[Dict[str, str]]#

Return type: Optional[Dict[str, str]]

property count: int#

Return type: int

property performance_metrics: Optional[whylogs.core.model_performance_metrics.ModelPerformanceMetrics]#

Return type: Optional[whylogs.core.model_performance_metrics.ModelPerformanceMetrics]

static read(multi_profile_file: str) → ResultSet#

Parameters: multi_profile_file (str) –
Return type: ResultSet

static reader(name: str = 'local') → ResultSetReader#

Parameters: name (str) –
Return type: ResultSetReader

writer(name: str = 'local') → ResultSetWriter#

Parameters: name (str) –
Return type: ResultSetWriter

abstract view() → Optional[whylogs.core.DatasetProfileView]#

Return type: Optional[whylogs.core.DatasetProfileView]

abstract profile() → Optional[whylogs.core.DatasetProfile]#

Return type: Optional[whylogs.core.DatasetProfile]

get_writables() → Optional[List[whylogs.api.writer.writer.Writable]]#

Return type: Optional[List[whylogs.api.writer.writer.Writable]]

set_dataset_timestamp(dataset_timestamp: datetime.datetime) → None#

Parameters: dataset_timestamp (datetime.datetime) –
Return type: None

add_model_performance_metrics(metrics: whylogs.core.model_performance_metrics.ModelPerformanceMetrics) → None#

Parameters: metrics (whylogs.core.model_performance_metrics.ModelPerformanceMetrics) –
Return type: None

add_metric(name: str, metric: whylogs.core.metrics.metrics.Metric) → None#

Parameters

name (str) –
metric (whylogs.core.metrics.metrics.Metric) –

Return type

None

abstract merge(other: ResultSet) → ResultSet#

Parameters: other (ResultSet) –
Return type: ResultSet

whylogs.api.log(obj: Any = None, *, pandas: Optional[whylogs.core.stubs.pd.DataFrame] = None, row: Optional[Dict[str, Any]] = None, schema: Optional[whylogs.core.DatasetSchema] = None, name: Optional[str] = None, multiple: Optional[Dict[str, Loggable]] = None, dataset_timestamp: Optional[datetime.datetime] = None, trace_id: Optional[str] = None, tags: Optional[List[str]] = None, segment_key_values: Optional[Dict[str, str]] = None, debug_event: Optional[Dict[str, Any]] = None) → result_set.ResultSet#

Parameters

obj (Any) –
pandas (Optional[whylogs.core.stubs.pd.DataFrame]) –
row (Optional[Dict[str, Any]]) –
schema (Optional[whylogs.core.DatasetSchema]) –
name (Optional[str]) –
multiple (Optional[Dict[str, Loggable]]) –
dataset_timestamp (Optional[datetime.datetime]) –
trace_id (Optional[str]) –
tags (Optional[List[str]]) –
segment_key_values (Optional[Dict[str, str]]) –
debug_event (Optional[Dict[str, Any]]) –

Return type

result_set.ResultSet

whylogs.api.log_classification_metrics(data: whylogs.core.stubs.pd.DataFrame, target_column: str, prediction_column: str, score_column: Optional[str] = None, schema: Optional[whylogs.core.DatasetSchema] = None, log_full_data: bool = False, dataset_timestamp: Optional[datetime.datetime] = None) → result_set.ResultSet#

Function to track metrics based on validation data. The user may also pass the column names associated with the target, prediction, and/or score.

Parameters

data (pd.DataFrame) – Dataframe with the data to log.
target_column (str) – Column name for the actual validated values.
prediction_column (str) – Column name for the predicted values.
score_column (Optional[str], optional) – Column name for the score associated with each prediction. If None, all scores are set to 1. Defaults to None.
schema (Optional[DatasetSchema], optional) – Defines the schema for tracking metrics in whylogs. Defaults to None.
log_full_data (bool, optional) – Whether to log the complete dataframe. If True, the complete dataframe is logged in addition to the classification metrics. If False, only the calculated classification metrics are logged. In a typical production use case, the ground truth might not be available when the remaining data is generated; to avoid profiling the input features twice, consider leaving this as False. Defaults to False.
dataset_timestamp (Optional[datetime], optional) – The dataset’s timestamp. Defaults to None.

Return type

result_set.ResultSet

Examples

import pandas as pd
import whylogs as why

data = {
    "product": ["milk", "carrot", "cheese", "broccoli"],
    "category": ["dairies", "vegetables", "dairies", "vegetables"],
    "output_discount": [0, 0, 1, 1],
    "output_prediction": [0, 0, 0, 1],
}
df = pd.DataFrame(data)

# log_full_data=True also profiles the product and category columns.
results = why.log_classification_metrics(
    df,
    target_column="output_discount",
    prediction_column="output_prediction",
    log_full_data=True,
)

whylogs.api.log_regression_metrics(data: whylogs.core.stubs.pd.DataFrame, target_column: str, prediction_column: str, schema: Optional[whylogs.core.DatasetSchema] = None, log_full_data: bool = False, dataset_timestamp: Optional[datetime.datetime] = None) → result_set.ResultSet#

Function to track regression metrics based on validation data. The user may also pass the column names associated with the target and prediction values.

Parameters

data (pd.DataFrame) – Dataframe with the data to log.
target_column (str) – Column name for the target values.
prediction_column (str) – Column name for the predicted values.
schema (Optional[DatasetSchema], optional) – Defines the schema for tracking metrics in whylogs. Defaults to None.
log_full_data (bool, optional) – Whether to log the complete dataframe. If True, the complete dataframe is logged in addition to the regression metrics. If False, only the calculated regression metrics are logged. In a typical production use case, the ground truth might not be available when the remaining data is generated; to avoid profiling the input features twice, consider leaving this as False. Defaults to False.
dataset_timestamp (Optional[datetime], optional) – The dataset’s timestamp. Defaults to None.

Return type

result_set.ResultSet

Examples

import pandas as pd
import whylogs as why

# One row per observation; the column names passed below must match the DataFrame.
df = pd.DataFrame({"target_temperature": [10.5, 24.3, 15.6], "predicted_temperature": [9.12, 26.42, 13.12]})
results = why.log_regression_metrics(df, target_column="target_temperature", prediction_column="predicted_temperature")

whylogs.api.read(path: str) → result_set.ResultSet#

Parameters: path (str) –
Return type: result_set.ResultSet

whylogs.api.reader(name: str) → result_set.ResultSetReader#

Parameters: name (str) –
Return type: result_set.ResultSetReader

whylogs.api.write(profile: whylogs.core.DatasetProfile, base_dir: str) → None#

Parameters

profile (whylogs.core.DatasetProfile) –
base_dir (str) –

Return type

None
