data.data
Data structures for input and processed data.
Classes
Data
Container for input time series data.
Period
Class abstraction for the information required to encode a period.
Data
Container for input time series data.
Attributes
data
pd.DataFrame
pandas DataFrame containing the input time series data.
Methods
from_csv
data.data.Data.from_csv(
    csv_path,
    timezone,
    columns=None,
    parse_dates=True,
    index_col=0,
    **kwargs,
)
Load data from a CSV file.
The CSV must contain a datetime column that becomes the DataFrame index. The index is localized to the provided timezone if it is naive, and then converted to UTC.
Parameters
csv_path
Path
Path to the CSV file.
required
timezone
Optional[str]
Timezone to assign if the index has no timezone. Must be provided if the index is naive.
required
columns
Optional[List[str]]
List of column names to include. If provided, only these columns will be loaded from the CSV (optimizes reading speed). If None, all columns are loaded.
None
parse_dates
bool or list
Passed to pd.read_csv. Defaults to True.
True
index_col
int or str
Column to use as index. Defaults to 0.
0
**kwargs
Any
Additional keyword arguments forwarded to pd.read_csv.
{}
Returns
Data
Data
Instance containing the loaded DataFrame.
Raises
ValueError
If the CSV does not yield a DatetimeIndex.
ValueError
If the index is timezone-naive and no timezone is provided.
Examples
>>> from pathlib import Path
>>> from spotforecast2_safe.data import Data
>>> data = Data.from_csv(
...     Path("data.csv"),
...     timezone="UTC",
...     columns=["target_col"],
... )
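The loading behaviour described above (column selection, date parsing, localizing a naive index, converting to UTC) can be approximated with plain pandas. This is a minimal sketch, not the library's implementation: the CSV content, the column names, and the `Europe/Berlin` timezone are made up for illustration, and it assumes the requested columns are forwarded to `pd.read_csv` via `usecols`.

```python
import io

import pandas as pd

# Hypothetical CSV content; the first column holds naive timestamps.
csv_text = (
    "timestamp,target_col,other_col\n"
    "2024-01-01 00:00:00,1.0,10\n"
    "2024-01-01 01:00:00,2.0,20\n"
)

# Assumption: column selection maps to usecols, which must also list
# the column that becomes the index.
df = pd.read_csv(
    io.StringIO(csv_text),
    usecols=["timestamp", "target_col"],
    parse_dates=True,
    index_col="timestamp",
)

# A naive index is first localized to the given timezone, then converted to UTC.
if df.index.tz is None:
    df.index = df.index.tz_localize("Europe/Berlin")
df.index = df.index.tz_convert("UTC")

# Midnight in Berlin on 1 January is 23:00 UTC on the previous day.
print(df)
```

Only `target_col` survives as a data column here, because `other_col` was never read from the file.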
from_dataframe
data.data.Data.from_dataframe(df, timezone, columns=None)
Create a new Data instance from an existing DataFrame.
The DataFrame must have a datetime index. The index is localized to the provided timezone if it is naive, and then converted to UTC.
Parameters
df
pd.DataFrame
Input DataFrame containing data.
required
timezone
Optional[str]
Timezone to assign if the index is naive. Must be provided if the index has no timezone.
required
columns
Optional[List[str]]
List of column names to include. If provided, only these columns will be selected from the DataFrame. If None, all columns are used.
None
Returns
Data
Data
Instance containing the provided DataFrame.
Raises
ValueError
If the DataFrame index is not a DatetimeIndex.
ValueError
If the index is timezone-naive and no timezone is provided.
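The documented checks and conversions for `from_dataframe` can be sketched as a standalone function. The helper name `normalize_to_utc`, the error messages, and the sample data are hypothetical; the sketch only mirrors the behaviour listed above (two `ValueError` cases, optional column selection, localize then convert to UTC) and is not taken from the library's source.

```python
from typing import List, Optional

import pandas as pd


def normalize_to_utc(
    df: pd.DataFrame,
    timezone: Optional[str],
    columns: Optional[List[str]] = None,
) -> pd.DataFrame:
    """Hypothetical helper mirroring the documented from_dataframe behaviour."""
    if not isinstance(df.index, pd.DatetimeIndex):
        raise ValueError("DataFrame index must be a DatetimeIndex.")
    # Select the requested columns; copy so the caller's frame is untouched.
    out = df[columns].copy() if columns is not None else df.copy()
    if out.index.tz is None:
        if timezone is None:
            raise ValueError("Index is timezone-naive and no timezone was given.")
        out.index = out.index.tz_localize(timezone)
    # Aware indexes (including freshly localized ones) end up in UTC.
    out.index = out.index.tz_convert("UTC")
    return out


# Made-up sample data with a naive hourly index.
naive = pd.DataFrame(
    {"load": [1.0, 2.0], "temp": [5.0, 6.0]},
    index=pd.to_datetime(["2024-06-01 00:00", "2024-06-01 01:00"]),
)
utc = normalize_to_utc(naive, timezone="Europe/Berlin", columns=["load"])
print(utc.index[0])
```

Passing `timezone=None` with the same naive frame would raise the second `ValueError`, while an already timezone-aware index would skip localization and only be converted.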
Period
data.data.Period(name, n_periods, column, input_range)
Class abstraction for the information required to encode a period.
Attributes
name
str
Name of the period (e.g., ‘hour’, ‘day’).
n_periods
int
Number of periods to encode (e.g., 24 for hours).
column
str
Name of the column in the DataFrame containing the period information.
input_range
Tuple[int, int]
Tuple of (min, max) values for the period (e.g., (0, 23) for hours).
Examples
>>> from spotforecast2_safe.data import Period
>>> period = Period(name="hour", n_periods=24, column="hour", input_range=(0, 23))
>>> period.name
'hour'
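This page does not say which encoding consumes a `Period`. As one illustration of how `column` and `input_range` could drive a periodic encoding, the sketch below applies a sine/cosine (unit circle) transform. The `PeriodSketch` dataclass is a local stand-in with the same four fields, `sin_cos_encode` is a hypothetical helper, and treating the cycle length as `max - min + 1` is an assumption; the library may use `n_periods` and a different encoding altogether.

```python
from dataclasses import dataclass
from typing import Tuple

import numpy as np
import pandas as pd


@dataclass
class PeriodSketch:
    """Local stand-in with the same four fields as the documented Period."""

    name: str
    n_periods: int
    column: str
    input_range: Tuple[int, int]


def sin_cos_encode(df: pd.DataFrame, period: PeriodSketch) -> pd.DataFrame:
    """Map a periodic column onto the unit circle (assumed encoding)."""
    low, high = period.input_range
    cycle_length = high - low + 1  # assumption: the range is inclusive
    angle = 2.0 * np.pi * (df[period.column] - low) / cycle_length
    return pd.DataFrame(
        {
            f"{period.name}_sin": np.sin(angle),
            f"{period.name}_cos": np.cos(angle),
        },
        index=df.index,
    )


hours = pd.DataFrame({"hour": [0, 6, 12, 23]})
encoded = sin_cos_encode(hours, PeriodSketch("hour", 24, "hour", (0, 23)))
print(encoded.round(3))
```

With this transform, hour 23 lands next to hour 0 on the circle, which is the property a periodic encoding is meant to preserve.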