preprocessing.time_series_visualization
Time series visualization.
Functions
| Name | Description |
|---|---|
| plot_forecast | Plot model forecast against actuals and display CV metrics. |
| plot_predictions | Plot actual values against one or more prediction series. |
| plot_seasonality | Plot seasonal patterns (annual, weekly, daily) for a given target. |
| plot_zoomed_timeseries | Plot a time series with a zoomed-in focus area. |
| visualize_ts_comparison | Visualize time series with optional statistical overlays. |
| visualize_ts_plotly | Visualize multiple time series datasets interactively with Plotly. |
plot_forecast
preprocessing.time_series_visualization.plot_forecast(
model,
X,
y,
cv_results=None,
title='Forecast',
figsize=None,
show=True,
nrows=None,
ncols=1,
sharex=True,
)
Plot model forecast against actuals and display CV metrics.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| model | Any | Fitted scikit-learn model. | required |
| X | pd.DataFrame | Feature matrix (e.g., test set). | required |
| y | Union[pd.Series, pd.DataFrame] | Target series or DataFrame (e.g., test set). | required |
| cv_results | Optional[Dict[str, Any]] | Optional dictionary of cross-validation results from evaluate() or sklearn.model_selection.cross_validate(). | None |
| title | str | Title of the plot. Defaults to “Forecast”. | 'Forecast' |
| figsize | Optional[tuple] | Figure dimensions. | None |
| show | bool | Whether to display the plot. Defaults to True. | True |
| nrows | Optional[int] | Number of rows for subplots (multivariate). | None |
| ncols | int | Number of columns for subplots (multivariate). | 1 |
| sharex | bool | Whether to share x-axis for subplots. Defaults to True. | True |
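The `cv_results` argument accepts the dictionary returned by `sklearn.model_selection.cross_validate()`. The following is a minimal sketch of producing such a dictionary on toy data. The use of `TimeSeriesSplit` with three splits and the MAE scorer are assumptions made for illustration; which keys `plot_forecast` reads from the dictionary is not specified here.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import TimeSeriesSplit, cross_validate

# Toy data: ten daily observations with a single feature (illustrative only).
dates = pd.date_range("2023-01-01", periods=10, freq="D")
X = pd.DataFrame({"feat": np.arange(10)}, index=dates)
y = pd.Series(np.arange(10, dtype=float), index=dates)

# Time-ordered splits: each training fold precedes its test fold.
cv_results = cross_validate(
    LinearRegression(),
    X,
    y,
    cv=TimeSeriesSplit(n_splits=3),
    scoring="neg_mean_absolute_error",
)

# cross_validate returns a dict of arrays with one entry per fold.
print(sorted(cv_results))  # ['fit_time', 'score_time', 'test_score']

# The dictionary could then be passed on, for example:
# fig = plot_forecast(model, X, y, cv_results=cv_results, show=False)
```

With a single scorer given as a string, the per-fold scores are stored under the `test_score` key.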
Returns
| Name | Type | Description |
|---|---|---|
| | plt.Figure | The matplotlib Figure object. |
Examples
>>> import matplotlib.pyplot as plt
>>> import pandas as pd
>>> import numpy as np
>>> from sklearn.linear_model import LinearRegression
>>> from spotforecast2.preprocessing.time_series_visualization import plot_forecast
>>> # Create sample data
>>> dates = pd.date_range("2023-01-01", periods=10, freq="D")
>>> X = pd.DataFrame({"feat": np.arange(10)}, index=dates)
>>> y = pd.Series(np.arange(10), index=dates)
>>> model = LinearRegression().fit(X, y)
>>> # Plot forecast
>>> fig = plot_forecast(model, X, y, show=False)
>>> plt.close(fig)
plot_predictions
preprocessing.time_series_visualization.plot_predictions(
y_true,
predictions,
slice_seq=None,
title='Predictions vs Actuals',
figsize=None,
show=True,
nrows=None,
ncols=1,
sharex=True,
)
Plot actual values against one or more prediction series.
Allows visualizing model performance by overlaying predictions on top of actual data. Supports slicing to focus on a specific time range (e.g., the recent test set). Handles both univariate and multivariate targets by creating subplots for multiple targets.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| y_true | Union[pd.Series, pd.DataFrame] | Series or DataFrame containing the actual target values. | required |
| predictions | Dict[str, Union[pd.Series, pd.DataFrame, np.ndarray]] | Dictionary where keys are labels (e.g., model names) and values are the corresponding predictions. If arrays are provided, they must have the same length as the sliced y_true. | required |
| slice_seq | Optional[slice] | Optional slice object to select a subset of the data. If None, the entire series is plotted. Example: slice(-96, None) to select the last 96 points. | None |
| title | str | Title of the plot. Defaults to “Predictions vs Actuals”. | 'Predictions vs Actuals' |
| figsize | Optional[tuple] | Tuple defining figure width and height. If None, automatically calculated based on number of subplots. | None |
| show | bool | Whether to display the plot. Defaults to True. | True |
| nrows | Optional[int] | Number of rows for subplots (multivariate). Defaults to n_targets. | None |
| ncols | int | Number of columns for subplots (multivariate). Defaults to 1. | 1 |
| sharex | bool | Whether to share x-axis for subplots. Defaults to True. | True |
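The sketch below shows how `slice_seq` and array-valued predictions fit together. It assumes the slice is applied positionally (as with `Series.iloc`); the model labels and offsets are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hourly toy series with 200 points (illustrative data).
dates = pd.date_range("2023-01-01", periods=200, freq="h")
y_true = pd.Series(np.arange(200, dtype=float), index=dates, name="Target")

# Select the last 96 points.
slice_seq = slice(-96, None)
y_window = y_true.iloc[slice_seq]  # assumption: positional slicing

# An array-valued prediction must match the length of the sliced y_true.
predictions = {
    "Model A": y_true + 0.5,               # Series: carries its own index
    "Model B": y_window.to_numpy() - 0.5,  # ndarray: 96 values, no index
}

print(len(y_window), len(predictions["Model B"]))  # 96 96
```

Passing a Series avoids the length constraint because it carries its own index, whereas a bare array has to line up with the sliced window by position.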
Returns
| Name | Type | Description |
|---|---|---|
| | plt.Figure | The matplotlib Figure object containing the plot. |
Examples
>>> import matplotlib.pyplot as plt
>>> import pandas as pd
>>> import numpy as np
>>> from spotforecast2.preprocessing.time_series_visualization import plot_predictions
>>> # Create sample data
>>> dates = pd.date_range("2023-01-01", periods=10, freq="D")
>>> y_true = pd.Series(np.arange(10), index=dates, name="Target")
>>> predictions = {"Model A": y_true + 0.5}
>>> # Plot predictions
>>> fig = plot_predictions(y_true, predictions, show=False)
>>> plt.close(fig)
plot_seasonality
preprocessing.time_series_visualization.plot_seasonality(
data,
target,
figsize=(8, 5),
show=True,
logscale=False,
)
Plot seasonal patterns (annual, weekly, daily) for a given target.
Creates a 2x2 grid of plots:
1. Distribution by month (boxplot + median).
2. Distribution by weekday (boxplot + median).
3. Distribution by hour of day (boxplot + median).
4. Mean target value by day of week and hour.
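The four panels correspond to simple group-by summaries of the target. The sketch below computes comparable aggregates with pandas; it illustrates the groupings only and is not the library's implementation.

```python
import pandas as pd

# Hourly toy data (illustrative only).
dates = pd.date_range("2023-01-01", periods=1000, freq="h")
df = pd.DataFrame({"value": range(1, 1001)}, index=dates)
idx = df.index

# Panels 1-3: median of the target per month, weekday, and hour of day.
median_by_month = df["value"].groupby(idx.month).median()
median_by_weekday = df["value"].groupby(idx.dayofweek).median()  # 0 = Monday
median_by_hour = df["value"].groupby(idx.hour).median()

# Panel 4: mean target value for every (weekday, hour) combination.
mean_by_weekday_hour = (
    df["value"].groupby([idx.dayofweek, idx.hour]).mean().unstack()
)

print(mean_by_weekday_hour.shape)  # (7, 24)
```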
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| data | pd.DataFrame | DataFrame containing the time series data. Must have a DatetimeIndex or an index convertible to datetime. | required |
| target | str | Name of the column to plot. | required |
| figsize | tuple[int, int] | Figure dimensions (width, height). Defaults to (8, 5). | (8, 5) |
| show | bool | Whether to display the plot immediately. Defaults to True. | True |
| logscale | Union[bool, list[bool]] | Whether to use a log scale for the y-axis. Can be a single boolean (applies to all 4 plots) or a list of 4 booleans (applies to each plot individually). Defaults to False. | False |
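Because `logscale` accepts either a single boolean or a list of four, a caller-side helper can make the per-panel setting explicit. `expand_logscale` below is a hypothetical helper written for illustration; it is not part of the library.

```python
from typing import List, Union


def expand_logscale(
    logscale: Union[bool, List[bool]], n_panels: int = 4
) -> List[bool]:
    """Return one log-scale flag per panel (hypothetical helper)."""
    if isinstance(logscale, bool):
        # A single boolean applies to every panel.
        return [logscale] * n_panels
    flags = list(logscale)
    if len(flags) != n_panels:
        raise ValueError(f"expected {n_panels} flags, got {len(flags)}")
    return flags


print(expand_logscale(True))                         # [True, True, True, True]
print(expand_logscale([True, False, False, False]))  # [True, False, False, False]
```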
Returns
| Name | Type | Description |
|---|---|---|
| | plt.Figure | The matplotlib Figure object. |
Examples
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> from spotforecast2.preprocessing.time_series_visualization import plot_seasonality
>>> # Create sample data
>>> dates = pd.date_range("2023-01-01", periods=1000, freq="h")
>>> df = pd.DataFrame({"value": range(1, 1001)}, index=dates)
>>> # Plot seasonality with log scale for all plots
>>> fig = plot_seasonality(data=df, target="value", logscale=True, show=False)
>>> plt.close(fig)
>>> # Plot seasonality with log scale for the first plot only
>>> fig = plot_seasonality(
... data=df,
... target="value",
... logscale=[True, False, False, False],
... show=False
... )
>>> plt.close(fig)
plot_zoomed_timeseries
preprocessing.time_series_visualization.plot_zoomed_timeseries(
data,
target,
zoom,
title=None,
figsize=(8, 4),
show=True,
)
Plot a time series with a zoomed-in focus area.
Creates a two-panel plot:
1. Top panel: Full time series with the zoom area highlighted.
2. Bottom panel: Zoomed-in view of the specified time range.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| data | pd.DataFrame | DataFrame containing the time series data. Must have a DatetimeIndex or an index convertible to datetime. | required |
| target | str | Name of the column to plot. | required |
| zoom | tuple[str, str] | Tuple of (start_date, end_date) strings defining the zoom range. | required |
| title | Optional[str] | Optional title for the plot. If None, defaults to target name. | None |
| figsize | tuple[int, int] | Figure dimensions (width, height). Defaults to (8, 4). | (8, 4) |
| show | bool | Whether to display the plot immediately. Defaults to True. | True |
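To preview which rows a given `zoom` tuple covers, the range can be selected directly with pandas. This is a sketch under the assumption that the zoom bounds behave like label-based slicing, which includes both endpoints; the function's own trimming may differ.

```python
import pandas as pd

# Hourly toy data (illustrative only).
dates = pd.date_range("2023-01-01", periods=100, freq="h")
df = pd.DataFrame({"value": range(100)}, index=dates)

zoom = ("2023-01-02 00:00", "2023-01-03 00:00")
start, end = pd.Timestamp(zoom[0]), pd.Timestamp(zoom[1])

# Label-based slicing on a DatetimeIndex includes both endpoints.
zoomed = df.loc[start:end, "value"]

print(len(zoomed))                      # 25
print(zoomed.iloc[0], zoomed.iloc[-1])  # 24 48
```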
Returns
| Name | Type | Description |
|---|---|---|
| | plt.Figure | The matplotlib Figure object. |
Examples
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> from spotforecast2.preprocessing.time_series_visualization import plot_zoomed_timeseries
>>> # Create sample data
>>> dates = pd.date_range("2023-01-01", periods=100, freq="h")
>>> df = pd.DataFrame({"value": range(100)}, index=dates)
>>> # Plot with zoom
>>> fig = plot_zoomed_timeseries(
... data=df,
... target="value",
... zoom=("2023-01-02 00:00", "2023-01-03 00:00"),
... show=False
... )
>>> plt.close(fig)
visualize_ts_comparison
preprocessing.time_series_visualization.visualize_ts_comparison(
dataframes,
columns=None,
title_suffix='',
figsize=(1000, 500),
template='plotly_white',
colors=None,
show_mean=False,
**kwargs,
)
Visualize time series with optional statistical overlays.
Similar to visualize_ts_plotly but adds options for statistical overlays like mean values across all datasets.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| dataframes | Dict[str, pd.DataFrame] | Dictionary mapping dataset names to pandas DataFrames. | required |
| columns | Optional[List[str]] | List of column names to visualize. If None, all columns are used. Default: None. | None |
| title_suffix | str | Suffix to append to column names. Default: ''. | '' |
| figsize | tuple[int, int] | Figure size as (width, height) in pixels. Default: (1000, 500). | (1000, 500) |
| template | str | Plotly template. Default: 'plotly_white'. | 'plotly_white' |
| colors | Optional[Dict[str, str]] | Dictionary mapping dataset names to colors. Default: None. | None |
| show_mean | bool | If True, overlay the mean of all datasets. Default: False. | False |
| **kwargs | Any | Additional keyword arguments for go.Scatter(). | {} |
Returns
| Name | Type | Description |
|---|---|---|
| | None | None. Displays Plotly figures. |
Raises
| Name | Type | Description |
|---|---|---|
| | ValueError | If dataframes is empty. |
| | ImportError | If plotly is not installed. |
Examples
>>> import pandas as pd
>>> import numpy as np
>>> from spotforecast2.preprocessing.time_series_visualization import visualize_ts_comparison
>>>
>>> # Create sample data
>>> np.random.seed(42)
>>> dates1 = pd.date_range('2024-01-01', periods=100, freq='h')
>>> dates2 = pd.date_range('2024-05-11', periods=100, freq='h')
>>>
>>> df1 = pd.DataFrame({
... 'temperature': np.random.normal(20, 5, 100)
... }, index=dates1)
>>>
>>> df2 = pd.DataFrame({
... 'temperature': np.random.normal(22, 5, 100)
... }, index=dates2)
>>>
>>> # Compare with mean overlay
>>> visualize_ts_comparison(
... {'Dataset1': df1, 'Dataset2': df2},
... show_mean=True
... )
visualize_ts_plotly
preprocessing.time_series_visualization.visualize_ts_plotly(
dataframes,
columns=None,
title_suffix='',
figsize=(1000, 500),
template='plotly_white',
colors=None,
**kwargs,
)
Visualize multiple time series datasets interactively with Plotly.
Creates interactive Plotly scatter plots for specified columns across multiple datasets (e.g., train, validation, test splits). Each dataset is displayed as a separate line with a unique color and name in the legend.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| dataframes | Dict[str, pd.DataFrame] | Dictionary mapping dataset names to pandas DataFrames with datetime index. Example: {'Train': df_train, 'Validation': df_val, 'Test': df_test} | required |
| columns | Optional[List[str]] | List of column names to visualize. If None, all columns are used. Default: None. | None |
| title_suffix | str | Suffix to append to the column name in the title. Useful for adding units or descriptions. Default: ''. | '' |
| figsize | tuple[int, int] | Figure size as (width, height) in pixels. Default: (1000, 500). | (1000, 500) |
| template | str | Plotly template name for styling. Options include 'plotly_white', 'plotly_dark', 'plotly', 'ggplot2', etc. Default: 'plotly_white'. | 'plotly_white' |
| colors | Optional[Dict[str, str]] | Dictionary mapping dataset names to colors. If None, uses Plotly default colors. Example: {'Train': 'blue', 'Validation': 'orange'}. Default: None. | None |
| **kwargs | Any | Additional keyword arguments passed to go.Scatter() (e.g., mode='lines+markers', line=dict(dash='dash')). | {} |
Returns
| Name | Type | Description |
|---|---|---|
| | None | None. Displays Plotly figures. |
Raises
| Name | Type | Description |
|---|---|---|
| | ValueError | If dataframes dict is empty, contains no columns, or if specified columns don’t exist in all dataframes. |
| | ImportError | If plotly is not installed. |
| | TypeError | If dataframes parameter is not a dictionary. |
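The conditions in the Raises table can be checked up front before plotting. `check_dataframes` below is a hypothetical helper that mirrors the documented conditions; it is a sketch, not the library's validation code, and the fallback to the first dataset's columns is an assumption.

```python
from typing import Dict, List, Optional

import pandas as pd


def check_dataframes(
    dataframes: Dict[str, pd.DataFrame],
    columns: Optional[List[str]] = None,
) -> List[str]:
    """Validate inputs and return the columns to plot (hypothetical helper)."""
    if not isinstance(dataframes, dict):
        raise TypeError("dataframes must be a dictionary")
    if not dataframes:
        raise ValueError("dataframes is empty")
    if columns is None:
        # Assumption: default to the columns of the first dataset.
        columns = list(next(iter(dataframes.values())).columns)
    if not columns:
        raise ValueError("no columns to plot")
    for name, df in dataframes.items():
        missing = [c for c in columns if c not in df.columns]
        if missing:
            raise ValueError(f"{name!r} is missing columns: {missing}")
    return columns


train = pd.DataFrame({"temperature": [20.0, 21.0], "humidity": [60.0, 58.0]})
test = pd.DataFrame({"temperature": [25.0, 24.0]})

print(check_dataframes({"Train": train}))  # ['temperature', 'humidity']
try:
    check_dataframes({"Train": train, "Test": test}, columns=["humidity"])
except ValueError as exc:
    print(exc)  # 'Test' is missing columns: ['humidity']
```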
Examples
>>> import pandas as pd
>>> import numpy as np
>>> from spotforecast2.preprocessing.time_series_visualization import visualize_ts_plotly
>>>
>>> # Create sample time series data
>>> np.random.seed(42)
>>> dates_train = pd.date_range('2024-01-01', periods=100, freq='h')
>>> dates_val = pd.date_range('2024-05-11', periods=50, freq='h')
>>> dates_test = pd.date_range('2024-07-01', periods=30, freq='h')
>>>
>>> data_train = pd.DataFrame({
... 'temperature': np.random.normal(20, 5, 100),
... 'humidity': np.random.normal(60, 10, 100)
... }, index=dates_train)
>>>
>>> data_val = pd.DataFrame({
... 'temperature': np.random.normal(22, 5, 50),
... 'humidity': np.random.normal(55, 10, 50)
... }, index=dates_val)
>>>
>>> data_test = pd.DataFrame({
... 'temperature': np.random.normal(25, 5, 30),
... 'humidity': np.random.normal(50, 10, 30)
... }, index=dates_test)
>>>
>>> # Visualize all datasets
>>> dataframes = {
... 'Train': data_train,
... 'Validation': data_val,
... 'Test': data_test
... }
>>> visualize_ts_plotly(dataframes)
Single dataset example:
>>> # Visualize single dataset
>>> dataframes = {'Data': data_train}
>>> visualize_ts_plotly(dataframes, columns=['temperature'])
Custom styling:
>>> visualize_ts_plotly(
... dataframes,
... columns=['temperature'],
... template='plotly_dark',
... colors={'Train': 'blue', 'Validation': 'green', 'Test': 'red'},
... mode='lines+markers'
... )