🚩 Create a free WhyLabs account to get more value out of whylogs!

Did you know you can store, visualize, and monitor whylogs profiles with the WhyLabs Observability Platform? Sign up for a free WhyLabs account to leverage the power of whylogs and WhyLabs together!

(why)Logging#


whylogs lets you log different types of data so that the data can then be monitored. We’ll go through examples of the types of data you can log and look more closely at the available options. Before we get going, though, let’s install whylogs.

[5]:
# Note: you may need to restart the kernel to use updated packages.
%pip install whylogs

Table of Contents#

Log Pandas Dataframe | Log Dictionary | Display Logs#

Log Pandas DataFrame#

We will generate a log by reading data from a CSV file into a pandas DataFrame and then logging it with the whylogs Python library.

[6]:
import pandas as pd

# Read a CSV file; this one comes from a public S3 bucket
retail_daily = pd.read_csv('https://whylabs-public.s3.us-west-2.amazonaws.com/whylogs_examples/retail-daily-features.csv')
retail_daily
[6]:
| | Transaction ID | Customer ID | Product Subcategory Code | Product Category Code | Item Price | Total Tax | Total Amount | Store Type | Product Category | Product Subcategory | Date of Birth | Gender | City Code | Age at Transaction Date | Purchase Canceled | Transaction Day of Week | Transaction Week | Transaction Batch |
| 0 | T25601292314 | C268458 | 12 | 6 | 114.9 | 24.1290 | 253.9290 | TeleShop | Home and kitchen | Tools | 1976-10-08 | M | 1.0 | 36.0 | 0.0 | 0 | 0 | 0 |
| 1 | T1465175267 | C271344 | 3 | 5 | 107.7 | 22.6170 | 238.0170 | e-Shop | Books | Comics | 1970-01-29 | F | 5.0 | 43.0 | 0.0 | 0 | 0 | 0 |
| 2 | T4968790114 | C272305 | 4 | 3 | 14.6 | 7.6650 | 80.6650 | e-Shop | Electronics | Mobiles | 1975-08-25 | F | 10.0 | 37.0 | 0.0 | 0 | 0 | 0 |
| 3 | T50504166310 | C275057 | 4 | 4 | 15.7 | 4.9455 | 52.0455 | MBR | Bags | Women | 1980-09-17 | M | 7.0 | 32.0 | 0.0 | 0 | 0 | 0 |
| 4 | T10877729712 | C270074 | 10 | 5 | 144.1 | 45.3915 | 477.6915 | e-Shop | Books | Non-Fiction | 1983-02-20 | M | 10.0 | 30.0 | 0.0 | 0 | 0 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 904 | T89167826318 | C274270 | 10 | 5 | 68.2 | 14.3220 | 150.7220 | Flagship store | Books | Non-Fiction | 1972-06-06 | F | 1.0 | 40.0 | 0.0 | 0 | 0 | 0 |
| 905 | T87193008634 | C271051 | 11 | 6 | 124.2 | 52.1640 | 548.9640 | e-Shop | Home and kitchen | Bath | 1976-02-13 | F | 3.0 | 36.0 | 0.0 | 0 | 0 | 0 |
| 906 | T84036395834 | C270763 | 11 | 5 | 77.9 | 16.3590 | 172.1590 | TeleShop | Books | Children | 1991-02-10 | F | 8.0 | 21.0 | 0.0 | 0 | 0 | 0 |
| 907 | T72150045625 | C270432 | 11 | 5 | 11.8 | 4.9560 | 52.1560 | e-Shop | Books | Children | 1982-09-17 | M | 7.0 | 30.0 | 1.0 | 0 | 0 | 0 |
| 908 | T97942600110 | C269559 | 4 | 3 | 67.9 | 35.6475 | 375.1475 | e-Shop | Electronics | Mobiles | 1972-06-24 | M | 1.0 | 40.0 | 0.0 | 0 | 0 | 0 |

909 rows × 18 columns

[7]:
import whylogs as why

# Log the DataFrame. This is equivalent to why.log(retail_daily) and why.log(data=retail_daily)
results = why.log(pandas=retail_daily)

# Get the profile from the results
profile = results.profile()

# See the "Display the Logs" section below for an explanation
profile.view().to_pandas()
[7]:
| column | counts/n | counts/null | types/integral | types/fractional | types/boolean | types/string | types/object | cardinality/est | cardinality/upper_1 | cardinality/lower_1 | ... | distribution/min | distribution/q_10 | distribution/q_25 | distribution/median | distribution/q_75 | distribution/q_90 | type | ints/max | ints/min | frequent_items/frequent_strings |
| Purchase Canceled | 909 | 72 | 0 | 837 | 0 | 0 | 0 | 2.000000 | 2.000100 | 2.000000 | ... | 0.000 | 0.0000 | 0.000 | 0.0000 | 0.0000 | 0.0000 | SummaryType.COLUMN | NaN | NaN | NaN |
| Age at Transaction Date | 909 | 0 | 0 | 909 | 0 | 0 | 0 | 25.000001 | 25.001250 | 25.000000 | ... | 19.000 | 21.0000 | 25.000 | 31.0000 | 37.0000 | 40.0000 | SummaryType.COLUMN | NaN | NaN | NaN |
| Transaction Week | 909 | 0 | 909 | 0 | 0 | 0 | 0 | 1.000000 | 1.000050 | 1.000000 | ... | 0.000 | 0.0000 | 0.000 | 0.0000 | 0.0000 | 0.0000 | SummaryType.COLUMN | 0.0 | 0.0 | [FrequentItem(value='0.000000', est=909, upper... |
| Store Type | 909 | 0 | 0 | 0 | 0 | 909 | 0 | 4.000000 | 4.000200 | 4.000000 | ... | NaN | NaN | NaN | NaN | NaN | NaN | SummaryType.COLUMN | NaN | NaN | [FrequentItem(value='e-Shop', est=375, upper=3... |
| Product Category | 909 | 0 | 0 | 0 | 0 | 909 | 0 | 6.000000 | 6.000300 | 6.000000 | ... | NaN | NaN | NaN | NaN | NaN | NaN | SummaryType.COLUMN | NaN | NaN | [FrequentItem(value='Books', est=232, upper=23... |
| Gender | 909 | 0 | 0 | 0 | 0 | 909 | 0 | 2.000000 | 2.000100 | 2.000000 | ... | NaN | NaN | NaN | NaN | NaN | NaN | SummaryType.COLUMN | NaN | NaN | [FrequentItem(value='M', est=455, upper=455, l... |
| Transaction ID | 909 | 0 | 0 | 0 | 0 | 909 | 0 | 904.722898 | 916.565225 | 893.168643 | ... | NaN | NaN | NaN | NaN | NaN | NaN | SummaryType.COLUMN | NaN | NaN | [FrequentItem(value='T40336799311', est=3, upp... |
| Item Price | 909 | 0 | 0 | 909 | 0 | 0 | 0 | 672.542875 | 681.346093 | 663.953801 | ... | 7.100 | 18.2000 | 43.200 | 80.1000 | 116.2000 | 137.0000 | SummaryType.COLUMN | NaN | NaN | NaN |
| Total Tax | 909 | 0 | 0 | 909 | 0 | 0 | 0 | 800.975225 | 811.459552 | 790.745935 | ... | 0.861 | 4.8825 | 10.017 | 20.5170 | 36.1725 | 54.2640 | SummaryType.COLUMN | NaN | NaN | NaN |
| Product Category Code | 909 | 0 | 909 | 0 | 0 | 0 | 0 | 6.000000 | 6.000300 | 6.000000 | ... | 1.000 | 1.0000 | 2.000 | 4.0000 | 5.0000 | 6.0000 | SummaryType.COLUMN | 6.0 | 1.0 | [FrequentItem(value='5.000000', est=232, upper... |
| Transaction Day of Week | 909 | 0 | 909 | 0 | 0 | 0 | 0 | 1.000000 | 1.000050 | 1.000000 | ... | 0.000 | 0.0000 | 0.000 | 0.0000 | 0.0000 | 0.0000 | SummaryType.COLUMN | 0.0 | 0.0 | [FrequentItem(value='0.000000', est=909, upper... |
| City Code | 909 | 1 | 0 | 908 | 0 | 0 | 0 | 10.000000 | 10.000500 | 10.000000 | ... | 1.000 | 1.0000 | 3.000 | 5.0000 | 8.0000 | 10.0000 | SummaryType.COLUMN | NaN | NaN | NaN |
| Transaction Batch | 909 | 0 | 909 | 0 | 0 | 0 | 0 | 1.000000 | 1.000050 | 1.000000 | ... | 0.000 | 0.0000 | 0.000 | 0.0000 | 0.0000 | 0.0000 | SummaryType.COLUMN | 0.0 | 0.0 | [FrequentItem(value='0.000000', est=909, upper... |
| Date of Birth | 909 | 0 | 0 | 0 | 0 | 909 | 0 | 801.978113 | 812.475567 | 791.736016 | ... | NaN | NaN | NaN | NaN | NaN | NaN | SummaryType.COLUMN | NaN | NaN | [FrequentItem(value='1981-03-29', est=4, upper... |
| Customer ID | 909 | 0 | 0 | 0 | 0 | 909 | 0 | 847.420398 | 858.512667 | 836.597955 | ... | NaN | NaN | NaN | NaN | NaN | NaN | SummaryType.COLUMN | NaN | NaN | [FrequentItem(value='C274278', est=4, upper=3,... |
| Total Amount | 909 | 0 | 0 | 909 | 0 | 0 | 0 | 842.098548 | 853.121157 | 831.344071 | ... | -767.975 | 25.3045 | 82.433 | 188.4025 | 358.6830 | 555.2625 | SummaryType.COLUMN | NaN | NaN | NaN |
| Product Subcategory | 909 | 0 | 0 | 0 | 0 | 909 | 0 | 18.000001 | 18.000899 | 18.000000 | ... | NaN | NaN | NaN | NaN | NaN | NaN | SummaryType.COLUMN | NaN | NaN | [FrequentItem(value='Women', est=133, upper=13... |
| Product Subcategory Code | 909 | 0 | 909 | 0 | 0 | 0 | 0 | 12.000000 | 12.000599 | 12.000000 | ... | 1.000 | 1.0000 | 3.000 | 5.0000 | 10.0000 | 11.0000 | SummaryType.COLUMN | 12.0 | 1.0 | [FrequentItem(value='4.000000', est=148, upper... |

18 rows × 24 columns

Log Dictionary#

Sometimes a quick log is all you need and you don’t want to set up a DataFrame. We can log a dictionary as if it were a single row of data. This works best when the dictionary’s values are scalars. Collection or nested values are tracked with only a basic type counter, and those entries are counted under the object type.

Suppose we want to log art prints that are being shown to see what sells best.

[8]:
import whylogs as why

example_data = {"height": 100, "length": 1000, "status": "sold", "price": 58.00, "medium": ["watercolor", "digital"]}

# Log the dictionary; this is equivalent to why.log(example_data)
dict_results = why.log(row=example_data)

# Retrieve the profile
profile_from_dict = dict_results.profile()

# See the "Display the Logs" section below for an explanation
profile_from_dict.view().to_pandas()
[8]:
| column | counts/n | counts/null | types/integral | types/fractional | types/boolean | types/string | types/object | cardinality/est | cardinality/upper_1 | cardinality/lower_1 | ... | distribution/n | distribution/max | distribution/min | distribution/q_10 | distribution/q_25 | distribution/median | distribution/q_75 | distribution/q_90 | ints/max | ints/min |
| status | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1.0 | 1.00005 | 1.0 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| medium | 1 | 0 | 0 | 0 | 0 | 0 | 1 | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| price | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 1.0 | 1.00005 | 1.0 | ... | 1.0 | 58.0 | 58.0 | 58.0 | 58.0 | 58.0 | 58.0 | 58.0 | NaN | NaN |
| height | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1.0 | 1.00005 | 1.0 | ... | 1.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 |
| length | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1.0 | 1.00005 | 1.0 | ... | 1.0 | 1000.0 | 1000.0 | 1000.0 | 1000.0 | 1000.0 | 1000.0 | 1000.0 | 1000.0 | 1000.0 |

5 rows × 24 columns
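
In the output above, the list-valued medium entry is counted only under types/object, while the scalar entries land in the integral, fractional, and string counters. The following is a minimal plain-Python sketch of that kind of bucketing, not the whylogs implementation: the type_bucket helper is hypothetical, its rules are an assumption inferred from this output, and it ignores missing values.

```python
from collections import Counter
from numbers import Integral, Real


def type_bucket(value):
    """Hypothetical helper: guess which types/* counter a value falls under."""
    # bool is checked first because Python's bool is a subclass of int.
    if isinstance(value, bool):
        return "types/boolean"
    if isinstance(value, Integral):
        return "types/integral"
    if isinstance(value, Real):
        return "types/fractional"
    if isinstance(value, str):
        return "types/string"
    # Lists, dicts, and other nested values are only counted as objects.
    return "types/object"


# Same dictionary as in the cell above.
example_data = {"height": 100, "length": 1000, "status": "sold", "price": 58.00, "medium": ["watercolor", "digital"]}

# Map each key to its assumed bucket, then tally the buckets.
buckets = {name: type_bucket(value) for name, value in example_data.items()}
totals = Counter(buckets.values())

for name, bucket in buckets.items():
    print(f"{name}: {bucket}")
```

Under these assumed rules, only medium falls into the object bucket, which matches the single types/object count in the table.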

Display the Logs#

There are many ways to display the data! The examples in “Integrations”, “WhyLabs”, and “Use Cases” showcase a variety of tools for viewing your data. The Notebook_Profile_Visualizer also helps you display the profile with a variety of charts.

The log calls above return results that include the profile. It’s this profile that we can view and export as a pandas DataFrame.

[9]:
# Note: run the DataFrame example above first so that `results` is defined

# Grab the profile from the result set
profile = results.profile()

# Grab a 'view' of the profile for inspection
prof_view = profile.view()

# Inspect the profile as a pandas DataFrame
prof_df = prof_view.to_pandas()
prof_df
[9]:
| column | counts/n | counts/null | types/integral | types/fractional | types/boolean | types/string | types/object | cardinality/est | cardinality/upper_1 | cardinality/lower_1 | ... | distribution/min | distribution/q_10 | distribution/q_25 | distribution/median | distribution/q_75 | distribution/q_90 | type | ints/max | ints/min | frequent_items/frequent_strings |
| Purchase Canceled | 909 | 72 | 0 | 837 | 0 | 0 | 0 | 2.000000 | 2.000100 | 2.000000 | ... | 0.000 | 0.0000 | 0.000 | 0.0000 | 0.0000 | 0.0000 | SummaryType.COLUMN | NaN | NaN | NaN |
| Age at Transaction Date | 909 | 0 | 0 | 909 | 0 | 0 | 0 | 25.000001 | 25.001250 | 25.000000 | ... | 19.000 | 21.0000 | 25.000 | 31.0000 | 37.0000 | 40.0000 | SummaryType.COLUMN | NaN | NaN | NaN |
| Transaction Week | 909 | 0 | 909 | 0 | 0 | 0 | 0 | 1.000000 | 1.000050 | 1.000000 | ... | 0.000 | 0.0000 | 0.000 | 0.0000 | 0.0000 | 0.0000 | SummaryType.COLUMN | 0.0 | 0.0 | [FrequentItem(value='0.000000', est=909, upper... |
| Store Type | 909 | 0 | 0 | 0 | 0 | 909 | 0 | 4.000000 | 4.000200 | 4.000000 | ... | NaN | NaN | NaN | NaN | NaN | NaN | SummaryType.COLUMN | NaN | NaN | [FrequentItem(value='e-Shop', est=375, upper=3... |
| Product Category | 909 | 0 | 0 | 0 | 0 | 909 | 0 | 6.000000 | 6.000300 | 6.000000 | ... | NaN | NaN | NaN | NaN | NaN | NaN | SummaryType.COLUMN | NaN | NaN | [FrequentItem(value='Books', est=232, upper=23... |
| Gender | 909 | 0 | 0 | 0 | 0 | 909 | 0 | 2.000000 | 2.000100 | 2.000000 | ... | NaN | NaN | NaN | NaN | NaN | NaN | SummaryType.COLUMN | NaN | NaN | [FrequentItem(value='M', est=455, upper=455, l... |
| Transaction ID | 909 | 0 | 0 | 0 | 0 | 909 | 0 | 904.722898 | 916.565225 | 893.168643 | ... | NaN | NaN | NaN | NaN | NaN | NaN | SummaryType.COLUMN | NaN | NaN | [FrequentItem(value='T40336799311', est=3, upp... |
| Item Price | 909 | 0 | 0 | 909 | 0 | 0 | 0 | 672.542875 | 681.346093 | 663.953801 | ... | 7.100 | 18.2000 | 43.200 | 80.1000 | 116.2000 | 137.0000 | SummaryType.COLUMN | NaN | NaN | NaN |
| Total Tax | 909 | 0 | 0 | 909 | 0 | 0 | 0 | 800.975225 | 811.459552 | 790.745935 | ... | 0.861 | 4.8825 | 10.017 | 20.5170 | 36.1725 | 54.2640 | SummaryType.COLUMN | NaN | NaN | NaN |
| Product Category Code | 909 | 0 | 909 | 0 | 0 | 0 | 0 | 6.000000 | 6.000300 | 6.000000 | ... | 1.000 | 1.0000 | 2.000 | 4.0000 | 5.0000 | 6.0000 | SummaryType.COLUMN | 6.0 | 1.0 | [FrequentItem(value='5.000000', est=232, upper... |
| Transaction Day of Week | 909 | 0 | 909 | 0 | 0 | 0 | 0 | 1.000000 | 1.000050 | 1.000000 | ... | 0.000 | 0.0000 | 0.000 | 0.0000 | 0.0000 | 0.0000 | SummaryType.COLUMN | 0.0 | 0.0 | [FrequentItem(value='0.000000', est=909, upper... |
| City Code | 909 | 1 | 0 | 908 | 0 | 0 | 0 | 10.000000 | 10.000500 | 10.000000 | ... | 1.000 | 1.0000 | 3.000 | 5.0000 | 8.0000 | 10.0000 | SummaryType.COLUMN | NaN | NaN | NaN |
| Transaction Batch | 909 | 0 | 909 | 0 | 0 | 0 | 0 | 1.000000 | 1.000050 | 1.000000 | ... | 0.000 | 0.0000 | 0.000 | 0.0000 | 0.0000 | 0.0000 | SummaryType.COLUMN | 0.0 | 0.0 | [FrequentItem(value='0.000000', est=909, upper... |
| Date of Birth | 909 | 0 | 0 | 0 | 0 | 909 | 0 | 801.978113 | 812.475567 | 791.736016 | ... | NaN | NaN | NaN | NaN | NaN | NaN | SummaryType.COLUMN | NaN | NaN | [FrequentItem(value='1981-03-29', est=4, upper... |
| Customer ID | 909 | 0 | 0 | 0 | 0 | 909 | 0 | 847.420398 | 858.512667 | 836.597955 | ... | NaN | NaN | NaN | NaN | NaN | NaN | SummaryType.COLUMN | NaN | NaN | [FrequentItem(value='C274278', est=4, upper=3,... |
| Total Amount | 909 | 0 | 0 | 909 | 0 | 0 | 0 | 842.098548 | 853.121157 | 831.344071 | ... | -767.975 | 25.3045 | 82.433 | 188.4025 | 358.6830 | 555.2625 | SummaryType.COLUMN | NaN | NaN | NaN |
| Product Subcategory | 909 | 0 | 0 | 0 | 0 | 909 | 0 | 18.000001 | 18.000899 | 18.000000 | ... | NaN | NaN | NaN | NaN | NaN | NaN | SummaryType.COLUMN | NaN | NaN | [FrequentItem(value='Women', est=133, upper=13... |
| Product Subcategory Code | 909 | 0 | 909 | 0 | 0 | 0 | 0 | 12.000000 | 12.000599 | 12.000000 | ... | 1.000 | 1.0000 | 3.000 | 5.0000 | 10.0000 | 11.0000 | SummaryType.COLUMN | 12.0 | 1.0 | [FrequentItem(value='4.000000', est=148, upper... |

18 rows × 24 columns
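
As a rough reading aid, one row of this summary can be checked by hand. The sketch below copies three values from the Purchase Canceled row into a plain dictionary (the row variable is only for illustration) and derives the share of missing values. That the non-null count equals types/fractional is an observation from this particular output, not a documented guarantee.

```python
# Values copied from the "Purchase Canceled" row of the summary above.
row = {"counts/n": 909, "counts/null": 72, "types/fractional": 837}

# Records that carried an actual value for this column.
non_null = row["counts/n"] - row["counts/null"]

# Share of records where the column was missing.
null_ratio = row["counts/null"] / row["counts/n"]

print(f"non-null values: {non_null}")   # 837, the same as types/fractional
print(f"null ratio: {null_ratio:.2%}")  # 7.92%
```

The same arithmetic applies to any row of the DataFrame returned by to_pandas(), since counts/n and counts/null are present for every logged column in the outputs shown here.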