Time Series Visualization
This module provides interactive time series visualization using Plotly, with support for multiple datasets and flexible customization options.
Overview
The time series visualization module includes two main functions:
- visualize_ts_plotly() - Visualize multiple time series datasets with Plotly
- visualize_ts_comparison() - Compare datasets with optional statistical overlays
These functions provide a flexible, interactive way to explore time series data with support for train/validation/test splits or any custom dataset groupings.
Installation
The time series visualization functions require plotly:
Using pip:
pip install plotly
Using uv:
uv pip install plotly
Quick Start
Basic Visualization
import pandas as pd
import numpy as np
from spotforecast2.preprocessing.time_series_visualization import visualize_ts_plotly
# Create sample datasets
np.random.seed(42)
dates_train = pd.date_range('2024-01-01', periods=100, freq='h')
dates_val = pd.date_range('2024-05-11', periods=50, freq='h')
dates_test = pd.date_range('2024-07-01', periods=30, freq='h')
data_train = pd.DataFrame({
    'temperature': np.random.normal(20, 5, 100),
    'humidity': np.random.normal(60, 10, 100)
}, index=dates_train)
data_val = pd.DataFrame({
    'temperature': np.random.normal(22, 5, 50),
    'humidity': np.random.normal(55, 10, 50)
}, index=dates_val)
data_test = pd.DataFrame({
    'temperature': np.random.normal(25, 5, 30),
    'humidity': np.random.normal(50, 10, 30)
}, index=dates_test)
# Visualize all datasets
dataframes = {
    'Train': data_train,
    'Validation': data_val,
    'Test': data_test
}
visualize_ts_plotly(dataframes)
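Before plotting train/validation/test splits, it can help to confirm that the datasets are in chronological order and do not overlap. The following is a minimal sketch using only pandas; `check_chronological` is a hypothetical helper written for this illustration and is not part of spotforecast2.

```python
import pandas as pd


def check_chronological(dataframes):
    """Hypothetical helper: True if each dataset ends before the next one starts."""
    frames = list(dataframes.values())
    return all(a.index.max() < b.index.min() for a, b in zip(frames, frames[1:]))


# Small stand-in splits with the same date ranges as the quick start example
splits = {
    "Train": pd.DataFrame({"y": range(100)}, index=pd.date_range("2024-01-01", periods=100, freq="h")),
    "Validation": pd.DataFrame({"y": range(50)}, index=pd.date_range("2024-05-11", periods=50, freq="h")),
    "Test": pd.DataFrame({"y": range(30)}, index=pd.date_range("2024-07-01", periods=30, freq="h")),
}

# Report the time span covered by each dataset
for name, df in splits.items():
    print(f"{name}: {df.index.min()} -> {df.index.max()} ({len(df)} rows)")

print("Chronological and non-overlapping:", check_chronological(splits))
```

Dictionaries preserve insertion order, so the check compares the datasets in the order they were added.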
Single Dataset Visualization
# Visualize a single dataset
dataframes_single = {'Data': data_train}
visualize_ts_plotly(dataframes_single, columns=['temperature'])
Custom Styling
# Customize colors and template
visualize_ts_plotly(
    dataframes,  # From Basic Visualization
    template='plotly_dark',
    colors={
        'Train': 'blue',
        'Validation': 'green',
        'Test': 'red'
    },
    figsize=(1400, 600)
)
API Reference
visualize_ts_plotly()
Visualize multiple time series datasets interactively with Plotly.
Signature:
def visualize_ts_plotly(
    dataframes: Dict[str, pd.DataFrame],
    columns: Optional[List[str]] = None,
    title_suffix: str = "",
    figsize: tuple[int, int] = (1000, 500),
    template: str = "plotly_white",
    colors: Optional[Dict[str, str]] = None,
    **kwargs: Any,
) -> None
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| dataframes | Dict[str, DataFrame] | Required | Dictionary mapping dataset names to DataFrames with datetime index |
| columns | list[str] | None | Columns to visualize. If None, all columns are used |
| title_suffix | str | "" | Suffix to append to column names in titles (e.g., "[°C]") |
| figsize | tuple[int, int] | (1000, 500) | Figure size as (width, height) in pixels |
| template | str | "plotly_white" | Plotly template name ("plotly_white", "plotly_dark", "ggplot2", etc.) |
| colors | Dict[str, str] | None | Dictionary mapping dataset names to colors. If None, uses default colors |
| **kwargs | Any | - | Additional arguments passed to go.Scatter() (e.g., fill='tozeroy') |
Returns:
None. Displays Plotly figures.
Raises:
- ValueError - If dataframes dict is empty, contains empty DataFrames, or if specified columns don’t exist
- ImportError - If plotly is not installed
- TypeError - If dataframes parameter is not a dictionary
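The TypeError and ValueError conditions listed above can be mirrored in a small pre-check that runs before any plotting. This is a sketch only: `validate_dataframes` is a hypothetical helper, and its error messages are assumptions that will not match the library's own wording.

```python
import pandas as pd


def validate_dataframes(dataframes, columns=None):
    """Hypothetical pre-check mirroring the documented error conditions."""
    if not isinstance(dataframes, dict):
        raise TypeError("dataframes must be a dictionary")
    if not dataframes:
        raise ValueError("dataframes dictionary is empty")
    for name, df in dataframes.items():
        if df.empty:
            raise ValueError(f"DataFrame '{name}' is empty")
        if columns is not None:
            missing = [c for c in columns if c not in df.columns]
            if missing:
                raise ValueError(f"DataFrame '{name}' is missing columns: {missing}")


df_ok = pd.DataFrame(
    {"temperature": [20.0, 21.0]},
    index=pd.date_range("2024-01-01", periods=2, freq="h"),
)
validate_dataframes({"Data": df_ok}, columns=["temperature"])  # passes silently
```

Running such a check up front gives an early, readable failure in pipelines where the plotting call happens much later.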
Example:
import pandas as pd
import numpy as np
from spotforecast2.preprocessing.time_series_visualization import visualize_ts_plotly
# Create sample data
np.random.seed(42)
dates_api = pd.date_range('2024-01-01', periods=100, freq='h')
df_api = pd.DataFrame({
    'temperature': np.random.normal(20, 5, 100),
    'humidity': np.random.normal(60, 10, 100)
}, index=dates_api)
# Visualize single dataset
visualize_ts_plotly({'Data': df_api})
visualize_ts_comparison()
Compare multiple datasets with optional statistical overlays.
Signature:
def visualize_ts_comparison(
    dataframes: Dict[str, pd.DataFrame],
    columns: Optional[List[str]] = None,
    title_suffix: str = "",
    figsize: tuple[int, int] = (1000, 500),
    template: str = "plotly_white",
    colors: Optional[Dict[str, str]] = None,
    show_mean: bool = False,
    **kwargs: Any,
) -> None
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| dataframes | Dict[str, DataFrame] | Required | Dictionary mapping dataset names to DataFrames |
| columns | list[str] | None | Columns to visualize. If None, all columns are used |
| title_suffix | str | "" | Suffix to append to titles |
| figsize | tuple[int, int] | (1000, 500) | Figure size as (width, height) in pixels |
| template | str | "plotly_white" | Plotly template |
| colors | Dict[str, str] | None | Dictionary mapping dataset names to colors |
| show_mean | bool | False | If True, overlay the mean of all datasets |
| **kwargs | Any | - | Additional arguments for go.Scatter() |
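The description of show_mean does not say how the mean across datasets is computed. One plausible reading, shown here purely as an assumption, is a per-timestamp mean over the datasets after aligning them on their index. The sketch below illustrates that interpretation with plain pandas; it is not taken from the library's implementation.

```python
import pandas as pd

idx = pd.date_range("2024-01-01", periods=4, freq="h")
datasets = {
    "A": pd.DataFrame({"value": [1.0, 2.0, 3.0, 4.0]}, index=idx),
    "B": pd.DataFrame({"value": [3.0, 4.0, 5.0, 6.0]}, index=idx),
}

# Align the 'value' column of every dataset on the timestamp index (one column per dataset)
aligned = pd.concat({name: df["value"] for name, df in datasets.items()}, axis=1)

# Row-wise mean across datasets; timestamps missing from a dataset are skipped (NaN-aware)
mean_series = aligned.mean(axis=1)
print(mean_series)
```

Under this reading, a timestamp that appears in only one dataset has a mean equal to that dataset's own value, so for datasets from disjoint periods the overlay simply traces each series.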
Returns:
None. Displays Plotly figures.
Raises:
- ValueError - If dataframes dict is empty
- ImportError - If plotly is not installed
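Since the Plotly-based functions are documented to raise ImportError when plotly is missing, a script can check for the package up front instead of failing at plot time. A minimal sketch using only the standard library follows; `plotly_available` is a hypothetical helper name.

```python
import importlib.util


def plotly_available() -> bool:
    """Return True if the plotly package can be found in this environment."""
    return importlib.util.find_spec("plotly") is not None


if plotly_available():
    print("plotly found; interactive plots can be created")
else:
    print("plotly is not installed; run `pip install plotly` first")
```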
Example:
import pandas as pd
import numpy as np
from spotforecast2.preprocessing.time_series_visualization import visualize_ts_comparison
# Create sample data
np.random.seed(42)
dates1 = pd.date_range('2024-01-01', periods=100, freq='h')
dates2 = pd.date_range('2024-05-11', periods=100, freq='h')
df1 = pd.DataFrame({
    'value': np.random.normal(20, 5, 100)
}, index=dates1)
df2 = pd.DataFrame({
    'value': np.random.normal(22, 5, 100)
}, index=dates2)
# Compare with mean overlay
visualize_ts_comparison(
    {'Dataset1': df1, 'Dataset2': df2},
    show_mean=True
)
API Reference (Matplotlib)
While visualize_ts_plotly produces interactive figures for the browser, reports and publications often call for static vector graphics, which Matplotlib supports natively.
plot_zoomed_timeseries()
Creates a two-panel plot: the top panel shows the full time series with the zoom region highlighted, and the bottom panel shows that region in detail.
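To see which rows a given zoom window covers, the same window can be selected directly with pandas. The sketch below assumes the zoom tuple is interpreted as an inclusive (start, end) timestamp range; how plot_zoomed_timeseries slices internally is not stated here.

```python
import pandas as pd

dates = pd.date_range("2024-01-01", periods=100, freq="h")
df = pd.DataFrame({"value": range(100)}, index=dates)

zoom = ("2024-01-02 00:00", "2024-01-03 00:00")

# Label-based slicing on a DatetimeIndex includes both endpoints
start, end = pd.Timestamp(zoom[0]), pd.Timestamp(zoom[1])
zoomed = df.loc[start:end]

print(len(zoomed))                            # number of hourly rows in the window
print(zoomed.index.min(), zoomed.index.max())
```

With hourly data, a 24-hour window with inclusive endpoints contains 25 rows.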
Signature:
def plot_zoomed_timeseries(
    data: pd.DataFrame,
    target: str,
    zoom: tuple[str, str],
    title: Optional[str] = None,
    figsize: tuple[int, int] = (8, 4),
    show: bool = True,
) -> plt.Figure:
Example:
import pandas as pd
import matplotlib.pyplot as plt
from spotforecast2.preprocessing.time_series_visualization import plot_zoomed_timeseries
dates = pd.date_range("2024-01-01", periods=100, freq="h")
df_zoom = pd.DataFrame({"value": range(100)}, index=dates)
fig1 = plot_zoomed_timeseries(
    data=df_zoom,
    target="value",
    zoom=("2024-01-02 00:00", "2024-01-03 00:00"),
    show=False
)
plt.close(fig1)  # Prevent automatic display during test suites
plot_seasonality()
Shows how the target variable is distributed across annual, weekly, and daily cycles, using boxplots to make the spread within each cycle visible.
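The numbers behind such seasonal boxplots can be inspected with a plain pandas groupby. The sketch below assumes the annual, weekly, and daily cycles correspond to grouping by month, day of week, and hour of day; those grouping keys are an assumption for illustration, not confirmed details of plot_seasonality.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2024-01-01", periods=1000, freq="h")
df = pd.DataFrame({"value": np.sin(np.linspace(0, 50, 1000)) * 10}, index=dates)

# Summary statistics per calendar unit (count, mean, quartiles, ...)
by_month = df.groupby(df.index.month)["value"].describe()
by_weekday = df.groupby(df.index.dayofweek)["value"].describe()
by_hour = df.groupby(df.index.hour)["value"].describe()

print(by_month[["count", "mean", "std"]])
print(by_hour[["25%", "50%", "75%"]].head())
```

With only 1000 hourly points (about six weeks), the monthly grouping contains just January and February, so the annual panel is informative only for series spanning a year or more.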
Signature:
def plot_seasonality(
    data: pd.DataFrame,
    target: str,
    figsize: tuple[int, int] = (8, 5),
    show: bool = True,
    logscale: Union[bool, list[bool]] = False,
) -> plt.Figure:
Example:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from spotforecast2.preprocessing.time_series_visualization import plot_seasonality
dates = pd.date_range("2024-01-01", periods=1000, freq="h")
# Simulate a simple cyclical signal (a sine wave with a period of roughly five days)
cyclical_data = np.sin(np.linspace(0, 50, 1000)) * 10
df_season = pd.DataFrame({"value": cyclical_data}, index=dates)
fig2 = plot_seasonality(
    data=df_season,
    target="value",
    logscale=False,
    show=False
)
plt.close(fig2)
Complete Workflow Examples
Train/Validation/Test Split Visualization
import pandas as pd
import numpy as np
from spotforecast2.preprocessing.time_series_visualization import visualize_ts_plotly
# Create time series data
np.random.seed(42)
wf_data = pd.DataFrame({
    'temperature': np.sin(np.linspace(0, 10, 300)) + np.random.normal(0, 0.1, 300),
    'humidity': np.cos(np.linspace(0, 10, 300)) * 100 + np.random.normal(50, 5, 300)
}, index=pd.date_range('2024-01-01', periods=300, freq='h'))
# Split data: 60% train, 20% validation, 20% test
split1 = int(0.6 * len(wf_data))
split2 = int(0.8 * len(wf_data))
wf_train = wf_data.iloc[:split1]
wf_val = wf_data.iloc[split1:split2]
wf_test = wf_data.iloc[split2:]
# Visualize
visualize_ts_plotly(
    {
        'Train': wf_train,
        'Validation': wf_val,
        'Test': wf_test
    },
    template='plotly_white',
    figsize=(1200, 600)
)
Multiple Datasets Comparison
from spotforecast2.preprocessing.time_series_visualization import visualize_ts_comparison
# Create datasets from different time periods
dates_w = pd.date_range('2024-01-01', periods=100, freq='h')
dates_s = pd.date_range('2024-04-01', periods=100, freq='h')
dates_m = pd.date_range('2024-07-01', periods=100, freq='h')
df_w = pd.DataFrame({'temperature': np.random.normal(15, 3, 100)}, index=dates_w)
df_s = pd.DataFrame({'temperature': np.random.normal(22, 3, 100)}, index=dates_s)
df_m = pd.DataFrame({'temperature': np.random.normal(25, 3, 100)}, index=dates_m)
# Compare with mean
visualize_ts_comparison(
    {
        'Winter': df_w,
        'Spring': df_s,
        'Summer': df_m
    },
    show_mean=True,
    colors={'Winter': 'blue', 'Spring': 'green', 'Summer': 'red'}
)
Dynamic Dataset Handling
# Function works with any number of datasets
dataframes_dyn = {}
for i in range(5):
    # One 50-hour dataset per month, January through May
    dates_dyn = pd.date_range(f'2024-{i+1:02d}-01', periods=50, freq='h')
    dataframes_dyn[f'Month_{i+1}'] = pd.DataFrame({
        'sales': np.random.gamma(2, 2, 50) * 1000
    }, index=dates_dyn)
visualize_ts_plotly(
    dataframes_dyn,
    title_suffix='[USD]',
    figsize=(1400, 600)
)
Parameters and Configuration
figsize Parameter
Figure size as (width, height) in pixels:
# Small figure
visualize_ts_plotly(dataframes_dyn, figsize=(800, 400))
# Large figure for detailed inspection
visualize_ts_plotly(dataframes_dyn, figsize=(1600, 800))
Template Options
Plotly provides several built-in templates:
# Light theme (default)
visualize_ts_plotly(dataframes_dyn, template='plotly_white')
# Dark theme
visualize_ts_plotly(dataframes_dyn, template='plotly_dark')
# Plotly's standard theme
visualize_ts_plotly(dataframes_dyn, template='plotly')
# Other themes
visualize_ts_plotly(dataframes_dyn, template='ggplot2')
visualize_ts_plotly(dataframes_dyn, template='seaborn')
Color Customization
Define custom colors for each dataset:
colors_custom = {
    'Train': '#1f77b4',       # Blue
    'Validation': '#ff7f0e',  # Orange
    'Test': '#2ca02c'         # Green
}
visualize_ts_plotly({
    'Train': wf_train,
    'Validation': wf_val,
    'Test': wf_test
}, colors=colors_custom)
Advanced Scatter Customization
Pass additional options to Plotly Scatter:
visualize_ts_plotly(
    dataframes_dyn,
    fill='tozeroy',      # Fill area under line
    line=dict(width=2),  # Line width
    opacity=0.8          # Transparency
)
Best Practices
1. Use Datetime Index
Always use pandas datetime index for proper time axis handling:
# Good
df_good = pd.DataFrame(
    df_api.values,
    columns=df_api.columns,
    index=pd.date_range('2024-01-01', periods=len(df_api), freq='h')
)
# Avoid
df_bad = pd.DataFrame(df_api.values, columns=df_api.columns)  # Will use default integer index
2. Consistent Data Shapes
Ensure all DataFrames have consistent columns for comparison:
# Verify columns match
columns_shared = set(df_w.columns) & set(df_s.columns) & set(df_m.columns)
if not columns_shared:
    raise ValueError("DataFrames have no common columns")
3. Handle Large Datasets
For large time series, consider subsampling:
# Subsample every 10th point
df_sub = wf_data.iloc[::10]
visualize_ts_plotly({'Data': df_sub})
4. Meaningful Dataset Names
Use descriptive names for datasets:
# Good
dataframes_good = {
    'Training (2023)': wf_train,
    'Validation (Jan 2024)': wf_val,
    'Testing (Feb 2024)': wf_test
}
# Avoid
dataframes_bad = {
    'A': wf_train,
    'B': wf_val,
    'C': wf_test
}
Troubleshooting
Issue: Overlapping Datasets
If datasets overlap in time, plot each column in its own figure so the overlapping traces are easier to tell apart:
# Visualize one column at a time, using the columns of the first dataset
first_df = next(iter(dataframes_dyn.values()))
for col in first_df.columns:
    visualize_ts_plotly(dataframes_dyn, columns=[col])
Issue: Memory Issues with Large Datasets
Downsample before visualization:
# Downsample the hourly df_season data (from the plot_seasonality example) to daily means
df_downsampled = df_season.resample('1D').mean()
fig_down = plot_seasonality(df_downsampled, target="value", show=False)
plt.close(fig_down)
Issue: Missing Data in Visualization
Handle missing values before visualization:
# Forward fill missing values
df_filled = df_api.ffill()
visualize_ts_plotly({'Data': df_filled})
Testing
This module includes comprehensive pytest tests validating all documentation examples and API functionality. Tests are located in tests/test_docs_time_series_visualization_examples.py.
Running Tests
Run all time series visualization tests:
uv run pytest tests/test_docs_time_series_visualization_examples.py -v
Run specific test class:
uv run pytest tests/test_docs_time_series_visualization_examples.py::TestVisualizeTimeSeriesPlotlyBasic -v
Test Coverage
The test suite includes 50 comprehensive tests covering:
- Basic Visualization (9 tests): Single/multiple dataset visualization, column selection, custom parameters
- Comparison Functionality (6 tests): Dataset comparison, statistical overlays, customization
- Complete Workflows (3 tests): Train/val/test split visualization, multi-dataset comparison, dynamic datasets
- Parameters & Configuration (8 tests): figsize options, template variations, color customization
- Best Practices (4 tests): Datetime index handling, consistent shapes, subsampling for large datasets
- Edge Cases (7 tests): Single value, constant values, NaN handling, negative/large values, many columns
- API Examples (5 tests): Quick start examples, API function validation
- Data Integrity (3 tests): Index preservation, data value preservation, dataset independence
- Safety-Critical (5 tests): Error handling, empty input validation, determinism
Test Validation Command
Verify all time series visualization tests pass:
uv run pytest tests/test_docs_time_series_visualization_examples.py --tb=short -q
Expected output: 50 passed
Documentation Examples Tested
All code examples in this documentation have been validated with pytest:
- Quick start examples (all variants)
- Complete workflow examples (train/val/test split, comparison, dynamic)
- Parameter configuration examples (figsize, templates, colors)
- Best practices examples (datetime index, consistent shapes, large datasets)
- Troubleshooting examples (overlapping datasets, memory issues, missing data)
References
- Plotly Dash and Plotly.py documentation: https://plotly.com/python/
- Pandas datetime index: https://pandas.pydata.org/docs/user_guide/timeseries.html