whylogs.datasets.weather

Module Contents

Classes

Weather

Weather Forecast Dataset

WeatherDatasetIterator

Iterator to retrieve inference batches, when multiple batches are required.

Attributes

whylogs.datasets.weather.logger
whylogs.datasets.weather.base_config
class whylogs.datasets.weather.Weather(version: str = 'in_domain')

Bases: whylogs.datasets.base.Dataset

Weather Forecast Dataset

The Weather Forecast Dataset contains meteorological features at a particular place (defined by latitude and longitude features) and time. This dataset can present data distribution shifts over both time and space.

The original data was sourced from the Weather Prediction Dataset. The source data was then transformed through feature renaming, feature selection, and subsampling. The original dataset is described in Shifts: A Dataset of Real Distributional Shift Across Multiple Large-Scale Tasks, by Malinin, Andrey, et al.

For a detailed description, use the dataset’s describe() method or visit the whylogs documentation website.

Parameters

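version (str) – Dataset version to load, by default 'in_domain'. Available versions can be listed with describe_versions().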

baseline_df: pandas.DataFrame
inference_df: pandas.DataFrame
inference_interval: str = '1d'
number_days: int = 1
unit: str = 'D'
url: str
baseline_timestamp: Union[datetime.date, datetime.datetime]
inference_start_timestamp: Union[datetime.date, datetime.datetime]
original: bool = False
dataset_config: Optional[whylogs.datasets.configs.DatasetConfig]
classmethod config() → whylogs.datasets.configs.DatasetConfig
Return type

whylogs.datasets.configs.DatasetConfig

get_baseline() → whylogs.datasets.base.Batch

Get baseline Batch object.

Returns

A batch object representing the complete baseline data.

Return type

Batch
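A minimal sketch of inspecting the baseline. It assumes the returned Batch exposes `timestamp` and `data` attributes (the latter a pandas DataFrame); `summarize_baseline` is a hypothetical helper, not part of whylogs. In practice `dataset` would be a `whylogs.datasets.weather.Weather` instance.

```python
def summarize_baseline(dataset):
    """Return (timestamp, row_count) for the dataset's baseline batch.

    `dataset` is any object providing get_baseline(), as Weather does.
    Assumption: the Batch has `timestamp` and `data` attributes.
    """
    baseline = dataset.get_baseline()
    # len() on a pandas DataFrame gives its number of rows.
    return baseline.timestamp, len(baseline.data)
```

Taking the dataset as an argument keeps the helper independent of how the dataset was constructed or which version was loaded.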

get_inference_data(target_date: Optional[Union[datetime.date, datetime.datetime]] = None, number_batches: Optional[int] = None) → Union[whylogs.datasets.base.Batch, Iterable[whylogs.datasets.base.Batch]]

Get batch(es) from inference dataset.

Parameters
  • target_date (Optional[Union[date, datetime]], optional) – Target date for a single batch. If a datetime is passed, only its date part is considered. By default None

  • number_batches (Optional[int], optional) – Number of batches to be retrieved. Each batch covers the time interval defined by inference_interval (see set_parameters). By default None

Returns

A single batch or an iterator of batches, depending on the input parameters.

Return type

Union[Batch, Iterable[Batch]]
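Because the return type depends on which argument is supplied, calling code may want a uniform shape. The sketch below wraps both modes in a list. `collect_inference_batches` is a hypothetical helper, and it assumes that exactly one of `target_date` or `number_batches` should be passed per call; the source does not state how the method behaves when both or neither are given.

```python
from datetime import date, datetime
from typing import Any, List, Optional, Union


def collect_inference_batches(
    dataset: Any,
    target_date: Optional[Union[date, datetime]] = None,
    number_batches: Optional[int] = None,
) -> List[Any]:
    """Return inference batches as a list, whichever form the call produces."""
    # Assumption: exactly one of the two arguments selects the mode.
    if (target_date is None) == (number_batches is None):
        raise ValueError("pass exactly one of target_date or number_batches")
    if target_date is not None:
        # Single-batch mode: one Batch for the given date.
        return [dataset.get_inference_data(target_date=target_date)]
    # Multi-batch mode: an iterable of Batch objects.
    return list(dataset.get_inference_data(number_batches=number_batches))
```

With a `Weather` instance, `collect_inference_batches(dataset, number_batches=5)` would materialize five consecutive batches, each spanning one `inference_interval`.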

set_parameters(inference_interval: Optional[str] = None, baseline_timestamp: Optional[Union[datetime.date, datetime.datetime]] = None, inference_start_timestamp: Optional[Union[datetime.date, datetime.datetime]] = None, original: Optional[bool] = None) → None

Set timestamp and interval parameters for the dataset object.

Parameters
  • inference_interval (Optional[str], optional) – Interval for the inference batches. If None is passed, daily inference batches are returned. By default None

  • baseline_timestamp (Optional[Union[date, datetime]], optional) – Timestamp for the baseline dataset. If None is passed, the current day is used. By default None

  • inference_start_timestamp (Optional[Union[date, datetime]], optional) – Timestamp for the start of the inference dataset. If None is passed, tomorrow’s date is used. By default None

  • original (Optional[bool], optional) – If True, sets both baseline and inference timestamps to the dataset’s original timestamp. By default None

Return type

None
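As an illustration of the parameters above, the sketch below places the baseline one week in the past and starts the inference batches at the current time. `align_to_recent_week` and its `now` argument are hypothetical names introduced for this example; only `set_parameters` and its keyword arguments come from the documented signature.

```python
from datetime import datetime, timedelta, timezone


def align_to_recent_week(dataset, now=None):
    """Place the baseline one week back and start inference batches at `now`."""
    now = now or datetime.now(timezone.utc)
    dataset.set_parameters(
        inference_interval="1d",  # daily batches, matching the attribute default
        baseline_timestamp=now - timedelta(days=7),
        inference_start_timestamp=now,
    )
    return now
```

Accepting `now` as an optional argument makes the chosen timestamps reproducible, which helps when comparing profiles across runs.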

classmethod describe_versions() → Tuple[str]

Describe available versions for the given dataset.

Return type

Tuple[str]

classmethod describe() → Optional[str]

Display overall dataset description.

Return type

Optional[str]
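Since `describe_versions()` is a classmethod, it can be used to validate a version name before the dataset is constructed. The following is a sketch under that assumption; `load_version` is a hypothetical helper, and `dataset_cls` stands for a dataset class such as `Weather`.

```python
def load_version(dataset_cls, preferred="in_domain"):
    """Instantiate `dataset_cls` with `preferred` if that version is listed."""
    # describe_versions() is documented to return a tuple of version names.
    available = dataset_cls.describe_versions()
    if preferred not in available:
        raise ValueError(f"unknown version {preferred!r}; available: {available}")
    return dataset_cls(version=preferred)
```

Checking the name first gives a clear error that lists the valid choices.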

class whylogs.datasets.weather.WeatherDatasetIterator(df: pandas.DataFrame, number_days: int, number_batches: int, version: str, config=DatasetConfig)

Iterator to retrieve inference batches, when multiple batches are required.

Parameters