preprocessing.time_series_visualization

Time series visualization.

Functions

Name	Description
plot_forecast	Plot model forecast against actuals and display CV metrics.
plot_predictions	Plot actual values against one or more prediction series.
plot_seasonality	Plot seasonal patterns (annual, weekly, daily) for a given target.
plot_zoomed_timeseries	Plot a time series with a zoomed-in focus area.
visualize_ts_comparison	Visualize time series with optional statistical overlays.
visualize_ts_plotly	Visualize multiple time series datasets interactively with Plotly.

plot_forecast

preprocessing.time_series_visualization.plot_forecast(
    model,
    X,
    y,
    cv_results=None,
    title='Forecast',
    figsize=None,
    show=True,
    nrows=None,
    ncols=1,
    sharex=True,
)

Plot model forecast against actuals and display CV metrics.

Parameters

Name	Type	Description	Default
model	Any	Fitted scikit-learn model.	required
X	pd.DataFrame	Feature matrix (e.g., test set).	required
y	Union[pd.Series, pd.DataFrame]	Target series or DataFrame (e.g., test set).	required
cv_results	Optional[Dict[str, Any]]	Optional dictionary of cross-validation results from `evaluate()` or `sklearn.model_selection.cross_validate()`.	`None`
title	str	Title of the plot. Defaults to “Forecast”.	`'Forecast'`
figsize	Optional[tuple]	Figure dimensions.	`None`
show	bool	Whether to display the plot. Defaults to True.	`True`
nrows	Optional[int]	Number of rows for subplots (multivariate).	`None`
ncols	int	Number of columns for subplots (multivariate).	`1`
sharex	bool	Whether to share x-axis for subplots. Defaults to True.	`True`

Returns

Name	Type	Description
	`plt.Figure`	The matplotlib Figure object.

Examples

>>> import matplotlib.pyplot as plt
>>> import pandas as pd
>>> import numpy as np
>>> from sklearn.linear_model import LinearRegression
>>> from spotforecast2.preprocessing.time_series_visualization import plot_forecast
>>> # Create sample data
>>> dates = pd.date_range("2023-01-01", periods=10, freq="D")
>>> X = pd.DataFrame({"feat": np.arange(10)}, index=dates)
>>> y = pd.Series(np.arange(10), index=dates)
>>> model = LinearRegression().fit(X, y)
>>> # Plot forecast
>>> fig = plot_forecast(model, X, y, show=False)
>>> plt.close(fig)
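
The `cv_results` argument accepts the dictionary returned by `sklearn.model_selection.cross_validate()`, which stores per-fold scores under keys prefixed with `test_`. How `plot_forecast` formats these metrics is not documented here, so the following is only a minimal sketch of one way to reduce such a dictionary to mean and standard deviation per metric. The helper `summarize_cv` and the score values are hypothetical.

```python
from statistics import mean, stdev

# Hypothetical cv_results, shaped like the dict returned by
# sklearn.model_selection.cross_validate (values invented for illustration).
cv_results = {
    "fit_time": [0.010, 0.012, 0.011],
    "score_time": [0.002, 0.002, 0.003],
    "test_score": [0.90, 0.80, 0.85],
}


def summarize_cv(results):
    """Return {metric: (mean, std)} for every key starting with 'test_'."""
    summary = {}
    for key, values in results.items():
        if key.startswith("test_"):
            values = [float(v) for v in values]
            # A single fold has no spread, so fall back to 0.0.
            spread = stdev(values) if len(values) > 1 else 0.0
            summary[key] = (mean(values), spread)
    return summary


cv_summary = summarize_cv(cv_results)
```

Timing entries such as `fit_time` are skipped because they do not start with `test_`.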

plot_predictions

preprocessing.time_series_visualization.plot_predictions(
    y_true,
    predictions,
    slice_seq=None,
    title='Predictions vs Actuals',
    figsize=None,
    show=True,
    nrows=None,
    ncols=1,
    sharex=True,
)

Plot actual values against one or more prediction series.

Allows visualizing model performance by overlaying predictions on top of actual data. Supports slicing to focus on a specific time range (e.g., the recent test set). Handles both univariate and multivariate targets by creating subplots for multiple targets.

Parameters

Name	Type	Description	Default
y_true	Union[pd.Series, pd.DataFrame]	Series or DataFrame containing the actual target values.	required
predictions	Dict[str, Union[pd.Series, pd.DataFrame, np.ndarray]]	Dictionary where keys are labels (e.g., model names) and values are the corresponding predictions. If arrays are provided, they must have the same length as the sliced `y_true`.	required
slice_seq	Optional[slice]	Optional slice object to select a subset of the data. If None, the entire series is plotted. Example: `slice(-96, None)` to select the last 96 points.	`None`
title	str	Title of the plot. Defaults to “Predictions vs Actuals”.	`'Predictions vs Actuals'`
figsize	Optional[tuple]	Tuple defining figure width and height. If None, automatically calculated based on number of subplots.	`None`
show	bool	Whether to display the plot. Defaults to True.	`True`
nrows	Optional[int]	Number of rows for subplots (multivariate). Defaults to n_targets.	`None`
ncols	int	Number of columns for subplots (multivariate). Defaults to 1.	`1`
sharex	bool	Whether to share x-axis for subplots. Defaults to True.	`True`

Returns

Name	Type	Description
	`plt.Figure`	The matplotlib Figure object containing the plot.

Examples

>>> import matplotlib.pyplot as plt
>>> import pandas as pd
>>> import numpy as np
>>> from spotforecast2.preprocessing.time_series_visualization import plot_predictions
>>> # Create sample data
>>> dates = pd.date_range("2023-01-01", periods=10, freq="D")
>>> y_true = pd.Series(np.arange(10), index=dates, name="Target")
>>> predictions = {"Model A": y_true + 0.5}
>>> # Plot predictions
>>> fig = plot_predictions(y_true, predictions, show=False)
>>> plt.close(fig)
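
The `slice_seq` argument is an ordinary Python `slice`, so its effect can be checked without plotting. The sketch below uses a plain list as a stand-in for the target series (the 15-minute resolution is an assumption made only for illustration) and shows why an array passed in `predictions` must match the length of the sliced `y_true`.

```python
# Stand-in for four days of 15-minute observations (4 * 96 = 384 points).
y_values = list(range(384))

# Same slice as in the parameter description: the last 96 points.
slice_seq = slice(-96, None)

y_window = y_values[slice_seq]

# An array passed in `predictions` has to match the sliced length,
# so it is built from the window rather than from the full series.
pred_window = [v + 0.5 for v in y_window]
```

A pandas Series accepts the same slice object positionally through `.iloc`, for example `y_true.iloc[slice_seq]`.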

plot_seasonality

preprocessing.time_series_visualization.plot_seasonality(
    data,
    target,
    figsize=(8, 5),
    show=True,
    logscale=False,
)

Plot seasonal patterns (annual, weekly, daily) for a given target.

Creates a 2x2 grid of plots:

1. Distribution by month (boxplot + median).
2. Distribution by day of week (boxplot + median).
3. Distribution by hour of day (boxplot + median).
4. Mean target value by day of week and hour.

Parameters

Name	Type	Description	Default
data	pd.DataFrame	DataFrame containing the time series data. Must have a DatetimeIndex or an index convertible to datetime.	required
target	str	Name of the column to plot.	required
figsize	tuple[int, int]	Figure dimensions (width, height). Defaults to (8, 5).	`(8, 5)`
show	bool	Whether to display the plot immediately. Defaults to True.	`True`
logscale	Union[bool, list[bool]]	Whether to use a log scale for the y-axis. Can be a single boolean (applies to all 4 plots) or a list of 4 booleans (applies to each plot individually). Defaults to False.	`False`

Returns

Name	Type	Description
	`plt.Figure`	The matplotlib Figure object.

Examples

>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> from spotforecast2.preprocessing.time_series_visualization import plot_seasonality
>>> # Create sample data
>>> dates = pd.date_range("2023-01-01", periods=1000, freq="h")
>>> df = pd.DataFrame({"value": range(1, 1001)}, index=dates)
>>> # Plot seasonality with log scale for all plots
>>> fig = plot_seasonality(data=df, target="value", logscale=True, show=False)
>>> plt.close(fig)
>>> # Plot seasonality with log scale for the first plot only
>>> fig = plot_seasonality(
...     data=df,
...     target="value",
...     logscale=[True, False, False, False],
...     show=False
... )
>>> plt.close(fig)
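
The two accepted forms of `logscale` can be thought of as one flag per panel. The following is a minimal sketch of that expansion, not the library's implementation. The helper name `normalize_logscale` and the length check are assumptions.

```python
def normalize_logscale(logscale, n_panels=4):
    """Expand a bool or a list of bools into one flag per panel."""
    if isinstance(logscale, bool):
        # A single boolean applies to every panel.
        return [logscale] * n_panels
    flags = list(logscale)
    if len(flags) != n_panels:
        raise ValueError(f"logscale needs {n_panels} entries, got {len(flags)}")
    return flags


all_log = normalize_logscale(True)
first_only = normalize_logscale([True, False, False, False])
```

With matplotlib, a flag of `True` would typically translate to `ax.set_yscale("log")` on the corresponding axes.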

plot_zoomed_timeseries

preprocessing.time_series_visualization.plot_zoomed_timeseries(
    data,
    target,
    zoom,
    title=None,
    figsize=(8, 4),
    show=True,
)

Plot a time series with a zoomed-in focus area.

Creates a two-panel plot:

1. Top panel: Full time series with the zoom area highlighted.
2. Bottom panel: Zoomed-in view of the specified time range.

Parameters

Name	Type	Description	Default
data	pd.DataFrame	DataFrame containing the time series data. Must have a DatetimeIndex or an index convertible to datetime.	required
target	str	Name of the column to plot.	required
zoom	tuple[str, str]	Tuple of (start_date, end_date) strings defining the zoom range.	required
title	Optional[str]	Optional title for the plot. If None, defaults to target name.	`None`
figsize	tuple[int, int]	Figure dimensions (width, height). Defaults to (8, 4).	`(8, 4)`
show	bool	Whether to display the plot immediately. Defaults to True.	`True`

Returns

Name	Type	Description
	`plt.Figure`	The matplotlib Figure object.

Examples

>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> from spotforecast2.preprocessing.time_series_visualization import plot_zoomed_timeseries
>>> # Create sample data
>>> dates = pd.date_range("2023-01-01", periods=100, freq="h")
>>> df = pd.DataFrame({"value": range(100)}, index=dates)
>>> # Plot with zoom
>>> fig = plot_zoomed_timeseries(
...     data=df,
...     target="value",
...     zoom=("2023-01-02 00:00", "2023-01-03 00:00"),
...     show=False
... )
>>> plt.close(fig)
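
To see which observations a `zoom` tuple selects, the window can be reproduced with the standard library alone. This sketch rebuilds the hourly index from the example above and assumes that both endpoints are included, as with pandas label-based slicing. Whether `plot_zoomed_timeseries` treats the end date as inclusive is not stated in this reference.

```python
from datetime import datetime, timedelta

# Same hourly index as the example above: 100 points from 2023-01-01 00:00.
start = datetime(2023, 1, 1)
index = [start + timedelta(hours=h) for h in range(100)]
values = list(range(100))

zoom = ("2023-01-02 00:00", "2023-01-03 00:00")
zoom_start, zoom_end = (datetime.fromisoformat(s) for s in zoom)

# Assumption: both endpoints are included, as with pandas label slicing.
zoomed = [(t, v) for t, v in zip(index, values) if zoom_start <= t <= zoom_end]
```

Under that assumption the bottom panel would cover the values at hours 24 through 48 of the series.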

visualize_ts_comparison

preprocessing.time_series_visualization.visualize_ts_comparison(
    dataframes,
    columns=None,
    title_suffix='',
    figsize=(1000, 500),
    template='plotly_white',
    colors=None,
    show_mean=False,
    **kwargs,
)

Visualize time series with optional statistical overlays.

Similar to visualize_ts_plotly but adds options for statistical overlays like mean values across all datasets.

Parameters

Name	Type	Description	Default
dataframes	Dict[str, pd.DataFrame]	Dictionary mapping dataset names to pandas DataFrames.	required
columns	Optional[List[str]]	List of column names to visualize. If None, all columns are used. Default: None.	`None`
title_suffix	str	Suffix to append to column names. Default: `''` (empty string).	`''`
figsize	tuple[int, int]	Figure size as (width, height) in pixels. Default: (1000, 500).	`(1000, 500)`
template	str	Plotly template. Default: `'plotly_white'`.	`'plotly_white'`
colors	Optional[Dict[str, str]]	Dictionary mapping dataset names to colors. Default: None.	`None`
show_mean	bool	If True, overlay the mean of all datasets. Default: False.	`False`
**kwargs	Any	Additional keyword arguments for go.Scatter().	`{}`

Returns

Name	Type	Description
	None	None. Displays Plotly figures.

Raises

Name	Type	Description
	ValueError	If dataframes is empty.
	ImportError	If plotly is not installed.

Examples

>>> import pandas as pd
>>> import numpy as np
>>> from spotforecast2.preprocessing.time_series_visualization import visualize_ts_comparison
>>>
>>> # Create sample data
>>> np.random.seed(42)
>>> dates1 = pd.date_range('2024-01-01', periods=100, freq='h')
>>> dates2 = pd.date_range('2024-05-11', periods=100, freq='h')
>>>
>>> df1 = pd.DataFrame({
...     'temperature': np.random.normal(20, 5, 100)
... }, index=dates1)
>>>
>>> df2 = pd.DataFrame({
...     'temperature': np.random.normal(22, 5, 100)
... }, index=dates2)
>>>
>>> # Compare with mean overlay
>>> visualize_ts_comparison(
...     {'Dataset1': df1, 'Dataset2': df2},
...     show_mean=True
... )

visualize_ts_plotly

preprocessing.time_series_visualization.visualize_ts_plotly(
    dataframes,
    columns=None,
    title_suffix='',
    figsize=(1000, 500),
    template='plotly_white',
    colors=None,
    **kwargs,
)

Visualize multiple time series datasets interactively with Plotly.

Creates interactive Plotly scatter plots for specified columns across multiple datasets (e.g., train, validation, test splits). Each dataset is displayed as a separate line with a unique color and name in the legend.

Parameters

Name	Type	Description	Default
dataframes	Dict[str, pd.DataFrame]	Dictionary mapping dataset names to pandas DataFrames with a datetime index. Example: `{'Train': df_train, 'Validation': df_val, 'Test': df_test}`.	required
columns	Optional[List[str]]	List of column names to visualize. If None, all columns are used. Default: None.	`None`
title_suffix	str	Suffix to append to the column name in the title. Useful for adding units or descriptions. Default: `''` (empty string).	`''`
figsize	tuple[int, int]	Figure size as (width, height) in pixels. Default: (1000, 500).	`(1000, 500)`
template	str	Plotly template name for styling. Options include `'plotly_white'`, `'plotly_dark'`, `'plotly'`, `'ggplot2'`, etc. Default: `'plotly_white'`.	`'plotly_white'`
colors	Optional[Dict[str, str]]	Dictionary mapping dataset names to colors. If None, uses Plotly default colors. Example: `{'Train': 'blue', 'Validation': 'orange'}`. Default: None.	`None`
**kwargs	Any	Additional keyword arguments passed to go.Scatter() (e.g., `mode='lines+markers'`, `line=dict(dash='dash')`).	`{}`

Returns

Name	Type	Description
	None	None. Displays Plotly figures.

Raises

Name	Type	Description
	ValueError	If dataframes dict is empty, contains no columns, or if specified columns don’t exist in all dataframes.
	ImportError	If plotly is not installed.
	TypeError	If dataframes parameter is not a dictionary.

Examples

>>> import pandas as pd
>>> import numpy as np
>>> from spotforecast2.preprocessing.time_series_visualization import visualize_ts_plotly
>>>
>>> # Create sample time series data
>>> np.random.seed(42)
>>> dates_train = pd.date_range('2024-01-01', periods=100, freq='h')
>>> dates_val = pd.date_range('2024-05-11', periods=50, freq='h')
>>> dates_test = pd.date_range('2024-07-01', periods=30, freq='h')
>>>
>>> data_train = pd.DataFrame({
...     'temperature': np.random.normal(20, 5, 100),
...     'humidity': np.random.normal(60, 10, 100)
... }, index=dates_train)
>>>
>>> data_val = pd.DataFrame({
...     'temperature': np.random.normal(22, 5, 50),
...     'humidity': np.random.normal(55, 10, 50)
... }, index=dates_val)
>>>
>>> data_test = pd.DataFrame({
...     'temperature': np.random.normal(25, 5, 30),
...     'humidity': np.random.normal(50, 10, 30)
... }, index=dates_test)
>>>
>>> # Visualize all datasets
>>> dataframes = {
...     'Train': data_train,
...     'Validation': data_val,
...     'Test': data_test
... }
>>> visualize_ts_plotly(dataframes)

Single dataset example:

>>> # Visualize a single dataset; a separate name keeps `dataframes` intact
>>> single_dataset = {'Data': data_train}
>>> visualize_ts_plotly(single_dataset, columns=['temperature'])

Custom styling:

>>> visualize_ts_plotly(
...     dataframes,
...     columns=['temperature'],
...     template='plotly_dark',
...     colors={'Train': 'blue', 'Validation': 'green', 'Test': 'red'},
...     mode='lines+markers'
... )
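
The conditions listed under Raises can also be checked before calling the function, for instance in a pipeline that should fail early. The sketch below is an illustration only, not the library's implementation. The helper `check_columns` is hypothetical. It works on a mapping from dataset name to column names, which for real data could be built as `{name: list(df.columns) for name, df in dataframes.items()}`. Treating "all columns" as the columns of the first dataset is an assumption.

```python
def check_columns(column_names, columns=None):
    """Hypothetical pre-flight check mirroring the Raises table."""
    if not isinstance(column_names, dict):
        raise TypeError("expected a dictionary of dataset names")
    if not column_names:
        raise ValueError("no datasets given")
    if columns is None:
        # Assumption: "all columns" means the columns of the first dataset.
        columns = next(iter(column_names.values()))
    if not columns:
        raise ValueError("no columns to plot")
    for name, available in column_names.items():
        missing = [c for c in columns if c not in available]
        if missing:
            raise ValueError(f"{name} is missing columns: {missing}")
    return list(columns)


# Column layout of the three example datasets above.
column_names = {
    "Train": ["temperature", "humidity"],
    "Validation": ["temperature", "humidity"],
    "Test": ["temperature", "humidity"],
}
checked = check_columns(column_names)
```

Requesting a column that is absent from any dataset, such as `check_columns(column_names, ["pressure"])`, raises `ValueError` in this sketch.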