Time Series Visualization

This module provides interactive time series visualization using Plotly, with support for multiple datasets and flexible customization options.

Overview

The interactive, Plotly-based part of the module consists of two main functions:

  • visualize_ts_plotly() - Visualize multiple time series datasets with Plotly
  • visualize_ts_comparison() - Compare datasets with optional statistical overlays

These functions provide a flexible, interactive way to explore time series data with support for train/validation/test splits or any custom dataset groupings.

Installation

The time series visualization functions require plotly:

Using pip:

pip install plotly

Using uv:

uv pip install plotly

Quick Start

Basic Visualization

import pandas as pd
import numpy as np
from spotforecast2.preprocessing.time_series_visualization import visualize_ts_plotly

# Create sample datasets
np.random.seed(42)
dates_train = pd.date_range('2024-01-01', periods=100, freq='h')
dates_val = pd.date_range('2024-05-11', periods=50, freq='h')
dates_test = pd.date_range('2024-07-01', periods=30, freq='h')

data_train = pd.DataFrame({
    'temperature': np.random.normal(20, 5, 100),
    'humidity': np.random.normal(60, 10, 100)
}, index=dates_train)

data_val = pd.DataFrame({
    'temperature': np.random.normal(22, 5, 50),
    'humidity': np.random.normal(55, 10, 50)
}, index=dates_val)

data_test = pd.DataFrame({
    'temperature': np.random.normal(25, 5, 30),
    'humidity': np.random.normal(50, 10, 30)
}, index=dates_test)

# Visualize all datasets
dataframes = {
    'Train': data_train,
    'Validation': data_val,
    'Test': data_test
}

visualize_ts_plotly(dataframes)

Single Dataset Visualization

# Visualize a single dataset
dataframes_single = {'Data': data_train}
visualize_ts_plotly(dataframes_single, columns=['temperature'])

Custom Styling

# Customize colors and template
visualize_ts_plotly(
    dataframes,  # From Basic Visualization
    template='plotly_dark',
    colors={
        'Train': 'blue',
        'Validation': 'green',
        'Test': 'red'
    },
    figsize=(1400, 600)
)

API Reference

visualize_ts_plotly()

Visualize multiple time series datasets interactively with Plotly.

Signature:

def visualize_ts_plotly(
    dataframes: Dict[str, pd.DataFrame],
    columns: Optional[List[str]] = None,
    title_suffix: str = "",
    figsize: tuple[int, int] = (1000, 500),
    template: str = "plotly_white",
    colors: Optional[Dict[str, str]] = None,
    **kwargs: Any,
) -> None

Parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| dataframes | Dict[str, DataFrame] | Required | Dictionary mapping dataset names to DataFrames with datetime index |
| columns | list[str] | None | Columns to visualize. If None, all columns are used |
| title_suffix | str | "" | Suffix to append to column names in titles (e.g., "[°C]") |
| figsize | tuple[int, int] | (1000, 500) | Figure size as (width, height) in pixels |
| template | str | "plotly_white" | Plotly template name ("plotly_white", "plotly_dark", "ggplot2", etc.) |
| colors | Dict[str, str] | None | Dictionary mapping dataset names to colors. If None, uses default colors |
| **kwargs | Any | - | Additional arguments passed to go.Scatter() (e.g., fill='tozeroy') |

Returns:

None. Displays Plotly figures.

Raises:

  • ValueError - If dataframes dict is empty, contains empty DataFrames, or if specified columns don’t exist
  • ImportError - If plotly is not installed
  • TypeError - If dataframes parameter is not a dictionary
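
The error conditions listed above can also be checked up front, before any figure is built. The helper below is a minimal sketch and not the library's implementation: `check_dataframes` is a hypothetical name, and the exact checks and messages inside `visualize_ts_plotly()` may differ.

```python
from typing import Dict, List, Optional

import pandas as pd


def check_dataframes(
    dataframes: Dict[str, pd.DataFrame],
    columns: Optional[List[str]] = None,
) -> None:
    """Hypothetical pre-flight check mirroring the documented error cases."""
    if not isinstance(dataframes, dict):
        raise TypeError("dataframes must be a dictionary")
    if not dataframes:
        raise ValueError("dataframes is empty")
    for name, df in dataframes.items():
        if df.empty:
            raise ValueError(f"dataset '{name}' is empty")
        # Column names are only checked when the caller restricts the selection
        missing = [c for c in (columns or []) if c not in df.columns]
        if missing:
            raise ValueError(f"dataset '{name}' lacks columns: {missing}")


# Usage: passes silently for valid input, raises otherwise
ok = pd.DataFrame(
    {"temperature": [20.0, 21.5]},
    index=pd.date_range("2024-01-01", periods=2, freq="h"),
)
check_dataframes({"Data": ok}, columns=["temperature"])
```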

Example:

import pandas as pd
import numpy as np
from spotforecast2.preprocessing.time_series_visualization import visualize_ts_plotly

# Create sample data
np.random.seed(42)
dates_api = pd.date_range('2024-01-01', periods=100, freq='h')
df_api = pd.DataFrame({
    'temperature': np.random.normal(20, 5, 100),
    'humidity': np.random.normal(60, 10, 100)
}, index=dates_api)

# Visualize single dataset
visualize_ts_plotly({'Data': df_api})

visualize_ts_comparison()

Compare multiple datasets with optional statistical overlays.

Signature:

def visualize_ts_comparison(
    dataframes: Dict[str, pd.DataFrame],
    columns: Optional[List[str]] = None,
    title_suffix: str = "",
    figsize: tuple[int, int] = (1000, 500),
    template: str = "plotly_white",
    colors: Optional[Dict[str, str]] = None,
    show_mean: bool = False,
    **kwargs: Any,
) -> None

Parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| dataframes | Dict[str, DataFrame] | Required | Dictionary mapping dataset names to DataFrames |
| columns | list[str] | None | Columns to visualize. If None, all columns are used |
| title_suffix | str | "" | Suffix to append to titles |
| figsize | tuple[int, int] | (1000, 500) | Figure size as (width, height) in pixels |
| template | str | "plotly_white" | Plotly template |
| colors | Dict[str, str] | None | Dictionary mapping dataset names to colors |
| show_mean | bool | False | If True, overlay the mean of all datasets |
| **kwargs | Any | - | Additional arguments for go.Scatter() |

Returns:

None. Displays Plotly figures.

Raises:

  • ValueError - If dataframes dict is empty
  • ImportError - If plotly is not installed
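
This page does not spell out how the `show_mean` overlay is computed. One plausible reading, shown here purely as an assumption, is a per-timestamp average of the same column across all datasets. The sketch uses only pandas and does not call the library.

```python
import pandas as pd

idx = pd.date_range("2024-01-01", periods=4, freq="h")
df_a = pd.DataFrame({"value": [1.0, 2.0, 3.0, 4.0]}, index=idx)
df_b = pd.DataFrame({"value": [3.0, 4.0, 5.0, 6.0]}, index=idx)
dataframes = {"A": df_a, "B": df_b}

# Put the same column of every dataset side by side, aligned on the index
aligned = pd.concat(
    {name: df["value"] for name, df in dataframes.items()}, axis=1
)

# Row-wise mean; a timestamp present in only one dataset keeps that value
# because pandas skips NaN by default
mean_line = aligned.mean(axis=1)
print(mean_line.tolist())  # [2.0, 3.0, 4.0, 5.0]
```

Under this reading, datasets that cover disjoint time ranges produce a mean line that simply follows whichever dataset is present at each timestamp.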

Example:

import pandas as pd
import numpy as np
from spotforecast2.preprocessing.time_series_visualization import visualize_ts_comparison

# Create sample data
np.random.seed(42)
dates1 = pd.date_range('2024-01-01', periods=100, freq='h')
dates2 = pd.date_range('2024-05-11', periods=100, freq='h')

df1 = pd.DataFrame({
    'value': np.random.normal(20, 5, 100)
}, index=dates1)

df2 = pd.DataFrame({
    'value': np.random.normal(22, 5, 100)
}, index=dates2)

# Compare with mean overlay
visualize_ts_comparison(
    {'Dataset1': df1, 'Dataset2': df2},
    show_mean=True
)

API Reference (Matplotlib)

visualize_ts_plotly() produces interactive figures for the browser. Reports and papers often need static, publication-ready vector graphics instead, which the Matplotlib-based functions in this section provide.

plot_zoomed_timeseries()

Creates a two-panel plot: the top panel shows the full time series with the zoom region highlighted, and the bottom panel shows that region in detail.

Signature:

def plot_zoomed_timeseries(
    data: pd.DataFrame,
    target: str,
    zoom: tuple[str, str],
    title: Optional[str] = None,
    figsize: tuple[int, int] = (8, 4),
    show: bool = True,
) -> plt.Figure

Example:

import pandas as pd
import matplotlib.pyplot as plt
from spotforecast2.preprocessing.time_series_visualization import plot_zoomed_timeseries

dates = pd.date_range("2024-01-01", periods=100, freq="h")
df_zoom = pd.DataFrame({"value": range(100)}, index=dates)

fig1 = plot_zoomed_timeseries(
    data=df_zoom,
    target="value",
    zoom=("2024-01-02 00:00", "2024-01-03 00:00"),
    show=False
)
plt.close(fig1) # Prevent automatic display during test suites
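
The `zoom` tuple can be read as a label-based slice on the datetime index. The sketch below shows which rows such a window selects for the example data. It assumes, without confirmation from the source, that `plot_zoomed_timeseries()` treats both endpoints as inclusive, as pandas does.

```python
import pandas as pd

dates = pd.date_range("2024-01-01", periods=100, freq="h")
df = pd.DataFrame({"value": range(100)}, index=dates)

zoom = ("2024-01-02 00:00", "2024-01-03 00:00")

# Label-based slicing on a DatetimeIndex includes both endpoints
segment = df.loc[zoom[0]:zoom[1], "value"]

print(len(segment))                       # 25 hourly points
print(segment.iloc[0], segment.iloc[-1])  # 24 48
```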

plot_seasonality()

Plots boxplots of the target grouped by seasonal cycle (annual, weekly, daily), showing how its distribution and spread vary within each cycle.

Signature:

def plot_seasonality(
    data: pd.DataFrame,
    target: str,
    figsize: tuple[int, int] = (8, 5),
    show: bool = True,
    logscale: Union[bool, list[bool]] = False,
) -> plt.Figure

Example:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from spotforecast2.preprocessing.time_series_visualization import plot_seasonality

dates = pd.date_range("2024-01-01", periods=1000, freq="h")
# Simulate a daily cycle: a sine wave with a 24-hour period
cyclical_data = np.sin(2 * np.pi * np.arange(1000) / 24) * 10
df_season = pd.DataFrame({"value": cyclical_data}, index=dates)

fig2 = plot_seasonality(
    data=df_season,
    target="value",
    logscale=False,
    show=False
)
plt.close(fig2)
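
To make the grouping behind such seasonal boxplots concrete, the sketch below splits a series by month, weekday, and hour of day using pandas alone. It illustrates the general idea and is not the internals of `plot_seasonality()`, which are not shown on this page.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2024-01-01", periods=1000, freq="h")
# Sine wave with an exact 24-hour period
values = 10 * np.sin(2 * np.pi * np.arange(1000) / 24)
df = pd.DataFrame({"value": values}, index=dates)

# Group the target by its position within each cycle
by_month = df.groupby(df.index.month)["value"]
by_weekday = df.groupby(df.index.dayofweek)["value"]
by_hour = df.groupby(df.index.hour)["value"]

# Each group holds the samples that one box in a boxplot would summarize
hourly_median = by_hour.median()
print(hourly_median.idxmax())  # 6, the hour where the sine wave peaks
```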

Complete Workflow Examples

Train/Validation/Test Split Visualization

import pandas as pd
import numpy as np
from spotforecast2.preprocessing.time_series_visualization import visualize_ts_plotly

# Create time series data
np.random.seed(42)
wf_data = pd.DataFrame({
    'temperature': np.sin(np.linspace(0, 10, 300)) + np.random.normal(0, 0.1, 300),
    'humidity': np.cos(np.linspace(0, 10, 300)) * 100 + np.random.normal(50, 5, 300)
}, index=pd.date_range('2024-01-01', periods=300, freq='h'))

# Split data
split1 = int(0.6 * len(wf_data))
split2 = int(0.8 * len(wf_data))

wf_train = wf_data.iloc[:split1]
wf_val = wf_data.iloc[split1:split2]
wf_test = wf_data.iloc[split2:]

# Visualize
visualize_ts_plotly(
    {
        'Train': wf_train,
        'Validation': wf_val,
        'Test': wf_test
    },
    template='plotly_white',
    figsize=(1200, 600)
)

Multiple Datasets Comparison

from spotforecast2.preprocessing.time_series_visualization import visualize_ts_comparison

# Create datasets from different time periods
dates_w = pd.date_range('2024-01-01', periods=100, freq='h')
dates_s = pd.date_range('2024-04-01', periods=100, freq='h')
dates_m = pd.date_range('2024-07-01', periods=100, freq='h')

df_w = pd.DataFrame({'temperature': np.random.normal(15, 3, 100)}, index=dates_w)
df_s = pd.DataFrame({'temperature': np.random.normal(22, 3, 100)}, index=dates_s)
df_m = pd.DataFrame({'temperature': np.random.normal(25, 3, 100)}, index=dates_m)

# Compare with mean
visualize_ts_comparison(
    {
        'Winter': df_w,
        'Spring': df_s,
        'Summer': df_m
    },
    show_mean=True,
    colors={'Winter': 'blue', 'Spring': 'green', 'Summer': 'red'}
)

Dynamic Dataset Handling

# Function works with any number of datasets
dataframes_dyn = {}

for i in range(5):
    dates_dyn = pd.date_range(f'2024-{i+1:02d}-01', periods=50, freq='h')
    dataframes_dyn[f'Month_{i+1}'] = pd.DataFrame({
        'sales': np.random.gamma(2, 2, 50) * 1000
    }, index=dates_dyn)

visualize_ts_plotly(
    dataframes_dyn,
    title_suffix='[USD]',
    figsize=(1400, 600)
)

Parameters and Configuration

figsize Parameter

Figure size as (width, height) in pixels:

# Small figure
visualize_ts_plotly(dataframes_dyn, figsize=(800, 400))

# Large figure for detailed inspection
visualize_ts_plotly(dataframes_dyn, figsize=(1600, 800))
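
The Plotly functions take `figsize` in pixels, while the Matplotlib functions documented earlier default to values such as (8, 4). Assuming those follow Matplotlib's usual convention of inches (the source does not say), the two scales are related by the DPI. The helper name `pixels_to_inches` is hypothetical.

```python
def pixels_to_inches(size_px, dpi=100.0):
    """Convert a (width, height) in pixels to inches at the given DPI.

    The default of 100 matches Matplotlib's default figure DPI.
    """
    width_px, height_px = size_px
    return (width_px / dpi, height_px / dpi)


print(pixels_to_inches((1000, 500)))       # (10.0, 5.0)
print(pixels_to_inches((1600, 800), 200))  # (8.0, 4.0)
```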

Template Options

Plotly provides several built-in templates:

# Light theme (default)
visualize_ts_plotly(dataframes_dyn, template='plotly_white')

# Dark theme
visualize_ts_plotly(dataframes_dyn, template='plotly_dark')

# Standard Plotly theme
visualize_ts_plotly(dataframes_dyn, template='plotly')

# Other themes
visualize_ts_plotly(dataframes_dyn, template='ggplot2')
visualize_ts_plotly(dataframes_dyn, template='seaborn')

Color Customization

Define custom colors for each dataset:

colors_custom = {
    'Train': '#1f77b4',      # Blue
    'Validation': '#ff7f0e', # Orange
    'Test': '#2ca02c'        # Green
}

visualize_ts_plotly({
    'Train': wf_train,
    'Validation': wf_val,
    'Test': wf_test
}, colors=colors_custom)
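
When the number of datasets is not known in advance, the `colors` dictionary can be generated instead of written by hand. The sketch below cycles through a fixed palette. The palette and the helper name `assign_colors` are arbitrary illustrative choices, not part of the library.

```python
from itertools import cycle

# Arbitrary example palette (the first six Matplotlib "tab10" hex codes)
PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd", "#8c564b"]


def assign_colors(names, palette=PALETTE):
    """Map each dataset name to a color, reusing the palette if needed."""
    return dict(zip(names, cycle(palette)))


colors_auto = assign_colors([f"Month_{i + 1}" for i in range(8)])
print(colors_auto["Month_1"])  # #1f77b4
print(colors_auto["Month_7"])  # #1f77b4 again, the palette wrapped around
```

The resulting dictionary has the same shape as `colors_custom` above, so it can be passed as the `colors` argument.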

Advanced Scatter Customization

Pass additional options to Plotly Scatter:

visualize_ts_plotly(
    dataframes_dyn,
    fill='tozeroy',           # Fill area under line
    line=dict(width=2),       # Line width
    opacity=0.8               # Transparency
)

Best Practices

1. Use Datetime Index

Always use pandas datetime index for proper time axis handling:

# Good
df_good = pd.DataFrame(df_api.values, columns=df_api.columns, index=pd.date_range('2024-01-01', periods=len(df_api), freq='h'))

# Avoid
df_bad = pd.DataFrame(df_api.values, columns=df_api.columns)  # Will use default integer index
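
Data often arrives with timestamps stored in an ordinary column rather than in the index. A minimal sketch of the conversion follows. The column name `timestamp` is an assumption made for illustration.

```python
import pandas as pd

raw = pd.DataFrame({
    "timestamp": ["2024-01-01 00:00", "2024-01-01 01:00", "2024-01-01 02:00"],
    "temperature": [20.1, 19.8, 19.5],
})

# Parse the strings, then promote the column to a sorted DatetimeIndex
df_indexed = raw.assign(timestamp=pd.to_datetime(raw["timestamp"]))
df_indexed = df_indexed.set_index("timestamp").sort_index()

print(isinstance(df_indexed.index, pd.DatetimeIndex))  # True
```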

2. Consistent Data Shapes

Ensure all DataFrames have consistent columns for comparison:

# Verify columns match
columns_shared = set(df_w.columns) & set(df_s.columns) & set(df_m.columns)
if not columns_shared:
    raise ValueError("DataFrames have no common columns")
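
The same check generalizes to a dictionary holding any number of DataFrames. The helper below is a sketch under the assumption that the column order of the first dataset should be preserved. `shared_columns` is a hypothetical name.

```python
import pandas as pd


def shared_columns(dataframes):
    """Return columns present in every DataFrame, in first-dataset order."""
    frames = list(dataframes.values())
    if not frames:
        return []
    common = set(frames[0].columns).intersection(
        *(df.columns for df in frames[1:])
    )
    return [col for col in frames[0].columns if col in common]


a = pd.DataFrame({"temperature": [1.0], "humidity": [2.0]})
b = pd.DataFrame({"humidity": [3.0], "temperature": [4.0], "wind": [5.0]})
print(shared_columns({"A": a, "B": b}))  # ['temperature', 'humidity']
```

The returned list can be passed as the `columns` argument so that only columns available in every dataset are plotted.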

3. Handle Large Datasets

For large time series, consider subsampling:

# Subsample every 10th point
df_sub = wf_data.iloc[::10]
visualize_ts_plotly({'Data': df_sub})
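
Rather than hard-coding the stride, it can be derived from a target number of points. This is a minimal sketch and `thin_to_max_points` is a hypothetical helper. Plain striding can skip short peaks, so an aggregation such as `resample(...).mean()` may be preferable when those matter.

```python
import math

import numpy as np
import pandas as pd


def thin_to_max_points(df, max_points):
    """Keep every k-th row so that at most max_points rows remain."""
    step = max(1, math.ceil(len(df) / max_points))
    return df.iloc[::step]


df_big = pd.DataFrame(
    {"value": np.arange(300.0)},
    index=pd.date_range("2024-01-01", periods=300, freq="h"),
)
print(len(thin_to_max_points(df_big, 50)))   # 50
print(len(thin_to_max_points(df_big, 500)))  # 300, already small enough
```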

4. Meaningful Dataset Names

Use descriptive names for datasets:

# Good
dataframes_good = {
    'Training (2023)': wf_train,
    'Validation (Jan 2024)': wf_val,
    'Testing (Feb 2024)': wf_test
}

# Avoid
dataframes_bad = {
    'A': wf_train,
    'B': wf_val,
    'C': wf_test
}

Troubleshooting

Issue: Overlapping Datasets

If datasets overlap in time, their traces can obscure one another. Plotting one column per call keeps each figure readable:

# Visualize one column at a time
first_df = next(iter(dataframes_dyn.values()))
for col in first_df.columns:
    visualize_ts_plotly(dataframes_dyn, columns=[col])
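
Before deciding how to plot, it can help to find out which datasets actually overlap in time. The sketch below compares the first and last timestamp of each pair. `overlapping_pairs` is a hypothetical helper and the dataset names are made up for the example.

```python
from itertools import combinations

import pandas as pd


def overlapping_pairs(dataframes):
    """Return pairs of dataset names whose time ranges intersect."""
    pairs = []
    for (name_a, df_a), (name_b, df_b) in combinations(dataframes.items(), 2):
        # Two closed intervals overlap when each starts before the other ends
        a_starts_in_time = df_a.index.min() <= df_b.index.max()
        b_starts_in_time = df_b.index.min() <= df_a.index.max()
        if a_starts_in_time and b_starts_in_time:
            pairs.append((name_a, name_b))
    return pairs


def make_frame(start):
    """Build a 48-hour constant series starting at the given date."""
    index = pd.date_range(start, periods=48, freq="h")
    return pd.DataFrame({"v": 1.0}, index=index)


frames = {
    "Jan": make_frame("2024-01-01"),
    "Jan overlap": make_frame("2024-01-02"),
    "Feb": make_frame("2024-02-01"),
}
print(overlapping_pairs(frames))  # [('Jan', 'Jan overlap')]
```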

Issue: Memory Issues with Large Datasets

Downsample before visualization:

# Downsample hourly data to daily means (using the hourly season data from earlier)
df_downsampled = df_season.resample('1D').mean()
fig_down = plot_seasonality(df_downsampled, target="value", show=False)
plt.close(fig_down)

Issue: Missing Data in Visualization

Handle missing values before visualization:

# Forward fill missing values
df_filled = df_api.ffill()
visualize_ts_plotly({'Data': df_filled})
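
Forward filling only helps when missing values are present as NaN rows. If timestamps are absent altogether, the gap is invisible until the index is made regular. The sketch below, using pandas only, makes the gap explicit and then limits how far values are carried forward. The limit of one step is an arbitrary choice for illustration.

```python
import pandas as pd

idx = pd.date_range("2024-01-01", periods=6, freq="h")
df_gappy = pd.DataFrame({"value": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}, index=idx)
# Drop two rows to simulate timestamps that were never recorded
df_gappy = df_gappy.drop(idx[[2, 3]])

# Reindex to a regular hourly grid so the gap becomes explicit NaN rows
df_regular = df_gappy.asfreq("h")
print(int(df_regular["value"].isna().sum()))  # 2

# Fill at most one consecutive missing step; longer gaps stay visible
df_limited = df_regular.ffill(limit=1)
print(int(df_limited["value"].isna().sum()))  # 1
```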

Testing

This module includes comprehensive pytest tests validating all documentation examples and API functionality. Tests are located in tests/test_docs_time_series_visualization_examples.py.

Running Tests

Run all time series visualization tests:

uv run pytest tests/test_docs_time_series_visualization_examples.py -v

Run specific test class:

uv run pytest tests/test_docs_time_series_visualization_examples.py::TestVisualizeTimeSeriesPlotlyBasic -v

Test Coverage

The test suite includes 50 comprehensive tests covering:

  • Basic Visualization (9 tests): Single/multiple dataset visualization, column selection, custom parameters
  • Comparison Functionality (6 tests): Dataset comparison, statistical overlays, customization
  • Complete Workflows (3 tests): Train/val/test split visualization, multi-dataset comparison, dynamic datasets
  • Parameters & Configuration (8 tests): figsize options, template variations, color customization
  • Best Practices (4 tests): Datetime index handling, consistent shapes, subsampling for large datasets
  • Edge Cases (7 tests): Single value, constant values, NaN handling, negative/large values, many columns
  • API Examples (5 tests): Quick start examples, API function validation
  • Data Integrity (3 tests): Index preservation, data value preservation, dataset independence
  • Safety-Critical (5 tests): Error handling, empty input validation, determinism

Test Validation Command

Verify all time series visualization tests pass:

uv run pytest tests/test_docs_time_series_visualization_examples.py --tb=short -q

Expected output: 50 passed

Documentation Examples Tested

All code examples in this documentation have been validated with pytest:

  • Quick start examples (all variants)
  • Complete workflow examples (train/val/test split, comparison, dynamic)
  • Parameter configuration examples (figsize, templates, colors)
  • Best practices examples (datetime index, consistent shapes, large datasets)
  • Troubleshooting examples (overlapping datasets, memory issues, missing data)

See Also

References

  • Plotly Python (Plotly.py) documentation: https://plotly.com/python/
  • Pandas datetime index: https://pandas.pydata.org/docs/user_guide/timeseries.html