Examples

Welcome to our examples! If you want to get your hands dirty, check out the Getting Started Notebook.

๐Ÿง‘๐Ÿผโ€๐Ÿซ Basic examples#

The table below lists use cases that will help you get started with whylogs and understand how it can make your data and ML pipelines more reliable and sustainable.

Example

Description

Visualizing Profiles

Compare profiles to detect distribution shifts, visualize histograms and bar charts, and explore your data.

Logging Data

See the different ways you can log your data with whylogs.

Inspecting Profiles

A deeper dive into the metrics generated by whylogs.

Schema Configuration for Tracking Metrics

Configure tracking metrics according to data type or column features.

Constraints Suite

A collection of simple out-of-the-box constraints for the most common use cases.

Merging Profiles

Merge your profiles logged across different computing instances, time periods or data segments.
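
The Merging Profiles entry above depends on profiles being mergeable summaries: statistics collected separately can be combined without revisiting the raw rows. The following is a minimal conceptual sketch in plain Python, not whylogs code. The `ColumnSummary` class and `summarize` helper are hypothetical names introduced only for illustration, and real whylogs profiles track many more metrics than these four.

```python
from dataclasses import dataclass
import math


@dataclass
class ColumnSummary:
    """Hypothetical mergeable summary of one numeric column (not a whylogs class)."""

    count: int = 0
    total: float = 0.0
    minimum: float = math.inf
    maximum: float = -math.inf

    def track(self, value: float) -> None:
        # Update the running statistics with a single observed value.
        self.count += 1
        self.total += value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    def merge(self, other: "ColumnSummary") -> "ColumnSummary":
        # Each statistic combines without access to the original rows.
        return ColumnSummary(
            count=self.count + other.count,
            total=self.total + other.total,
            minimum=min(self.minimum, other.minimum),
            maximum=max(self.maximum, other.maximum),
        )

    @property
    def mean(self) -> float:
        return self.total / self.count if self.count else math.nan


def summarize(values):
    """Build a summary from an iterable of numbers (hypothetical helper)."""
    summary = ColumnSummary()
    for value in values:
        summary.track(value)
    return summary


# Two batches summarized independently, e.g. on different days or machines.
monday = summarize([3.0, 5.0, 4.0])
tuesday = summarize([10.0, 2.0])
merged = monday.merge(tuesday)

# Summarizing all rows at once gives the same result as merging.
whole = summarize([3.0, 5.0, 4.0, 10.0, 2.0])
print(merged == whole)
```

The same property is what makes it possible to profile data on separate computing instances, time periods, or segments and combine the results afterwards.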

🌉 Whylogs Integrations

Welcome! In this section you will find examples of how to integrate whylogs with different tools and platforms.

Data Pipelines

Integration

Description

Apache Spark

Profile data in an Apache Spark environment

BigQuery

Profile data queried from a Google BigQuery table

Dask

Profile data in parallel with Dask

Databricks

Learn how to configure and run whylogs on a Databricks cluster

Fugue

Use Fugue to unify parallel whylogs profiling tasks

Kafka

Learn how to consume and profile streaming data from an existing Kafka topic

Ray

Profile big data in parallel with the Ray integration

Storage

Integration

Description

S3

See how to write your whylogs profiles to AWS S3 object storage

GCS

See how to write your whylogs profiles to Google Cloud Storage

Model lifecycle and deployment

Integration

Description

Apache Airflow

Use Airflow Operators to create drift reports and run constraint validations on your data

BentoML

Learn how to monitor ML models managed and served with BentoML

FastAPI

Learn how to monitor ML models served with FastAPI

Feast

Learn how to log features from your Feature Store with Feast and whylogs

Flask

See how you can create a Flask app with this whylogs + WhyLabs integration

Flyte

Learn how to use whylogs' DatasetProfileView type natively in your Flyte workflows

GitHub Actions

Monitor your ML datasets as part of your GitOps CI/CD pipeline

MLflow

Log your whylogs profiles to an MLflow experiment

ZenML

Combine different MLOps tools together with ZenML and whylogs!

WhyLabs

You can monitor your profiles continuously with the WhyLabs Observability Platform and get a single view of your different projects, data, and ML models. To learn how to combine whylogs with WhyLabs and send over different profiles, refer to the following integration examples:

Integration

Description

Writing profiles

Send profiles to your WhyLabs Dashboard

Reference Profile

Send profiles as Reference (Static) Profiles to WhyLabs

Regression Metrics

Monitor Regression Model Performance Metrics with whylogs and WhyLabs

Classification Metrics

Monitor Classification Model Performance Metrics with whylogs and WhyLabs

Ranking Metrics

Monitor Ranking Model Performance Metrics with whylogs and WhyLabs (experimental)

Writing Feature Weights

Send Feature Weights / Feature Importance information to your WhyLabs Dashboard

Others

Integration

Description

whylogs Container

A low-code solution to profile your data with a Docker container deployed to your environment

Java

Profile data with whylogs in Java

๐Ÿง‘๐Ÿผโ€๐Ÿ”ฌ Advanced examples#

Here you will find more advanced use cases for whylogs and learn how to make the most of the profiles you create. Pick any example in the table below to get started.

Example

Description

Streaming Data with Log Rotation

Generate profiles automatically at fixed intervals with rolling loggers

Condition Count Metrics

Create simple counter metrics with user-defined conditions

Condition Validators

Real-time Data Validation with Condition Validators.

Data Constraints

Set constraints on your data to ensure its quality.

Custom Metrics

Create your own metrics and metric components

String Tracking

Track Unicode ranges and character length distribution metrics for your textual features.

Image Logging

Log image properties and EXIF tags into profiles and send them to WhyLabs

Segments

Segment your data to improve visibility to the sub-group level

Metric Constraints with Condition Count Metrics

Build Metric Constraints on top of Condition Count Metrics

Drift Algorithm Configuration

Choose different drift algorithms and internal parameters for drift detection

Converting profiles from v0 to v1

Convert whylogs v0 profiles to v1 profiles

🧪 Experimental

Here you will find examples of features that are still at an experimental stage. Expect changes to the API and the functionality of these features.

Example

Description

Performance Estimation - Estimating Accuracy for Binary Classification Problems

Estimate accuracy for unlabeled target datasets for binary classification problems

Extracting and Monitoring Audio Samples

Extract features from audio samples for the purpose of monitoring for drift/quality

NLP Summarization

Monitor a document summarization task with whylogs

Embeddings Distance Logging

Profile embedding values by comparing them to reference data points

Condition Validator UDFs

Easily create condition validators based on user-defined functions

📓 Benchmarks

Here you will find experiments that benchmark different aspects of the whylogs package, such as computational performance and different statistical algorithms.

Example

Description

Understanding Kolmogorov-Smirnov (KS) Tests for Data Drift on Profiled Data

Experiments comparing whylogs' Kolmogorov-Smirnov implementation on profiled data with the traditional implementation on complete data
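
For orientation, the "traditional implementation on complete data" mentioned above can be sketched in a few lines of plain Python. This is an illustrative two-sample KS statistic computed from raw samples. It is not the whylogs implementation, which works on profiled data, and the function name `ks_statistic` is an assumption made for this sketch.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a = sorted(sample_a)
    b = sorted(sample_b)
    largest_gap = 0.0
    # The empirical CDFs only change at observed values, so checking those suffices.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


reference = [1, 2, 3, 4]
shifted = [5, 6, 7, 8]

print(ks_statistic(reference, reference))  # identical samples: no gap
print(ks_statistic(reference, shifted))    # fully separated samples: maximal gap
```

A statistic near 0 means the two samples have similar distributions, while a value near 1 means they barely overlap.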

๐Ÿซ Tutorials#

Here you will find tutorials that can span two or more concepts discussed in the previous sections. They are meant to be more in-depth, and possibly domain-specific, explanations of those concepts.

Example

Description

Data Validation for Spark Dataframes with whylogs

Profile a Spark Dataframe and Perform Data Validation with Condition Count Metrics and Metric Constraints

Monitoring Embeddings for Text Data

Monitor Embeddings, Tokens and Performance of your text classifier application

Data Validation at Scale - Detecting and Responding to Data Misbehavior

Log, validate, and debug failed conditions with Metric Constraints, Condition Count Metrics and Condition Validators

Get in touch

If you want to get more involved with whylogs and interact with other practitioners, make sure to join our community Slack.