data.fetch_data.fetch_data

data.fetch_data.fetch_data(
    filename=None,
    dataframe=None,
    columns=None,
    index_col=0,
    parse_dates=True,
    dayfirst=False,
    timezone='UTC',
)

Fetches a dataset from a CSV file, or processes an existing DataFrame, and returns it with a UTC timezone.

Parameters

Name Type Description Default
filename str or Path Absolute path of the CSV file containing the dataset (e.g., '/home/data/my_data.csv'). Required when dataframe is None. Use get_data_home() or get_package_data_home() to build the path, for example fetch_data(filename=get_data_home() / "my_data.csv"). None
dataframe pd.DataFrame A pandas DataFrame to process instead of reading a CSV file. If its index is naive, it is localized to timezone and then converted to UTC. Mutually exclusive with filename. None
columns list List of columns to include in the dataset. If None, all columns are included. An empty list raises a ValueError. None
index_col int Position of the column to use as the index. 0
parse_dates bool Whether to parse dates in the index column. True
dayfirst bool Whether the day comes first when parsing dates. False
timezone str Timezone to set for the datetime index. If a DataFrame with a naive index is provided, the index is localized to this timezone and then converted to UTC. 'UTC'
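
The timezone handling described for a naive index can be illustrated with plain pandas. This is a minimal sketch of the localize-then-convert step only, not the function's actual implementation; the sample data and the 'Europe/Berlin' timezone are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical sample data with a naive (timezone-unaware) datetime index.
idx = pd.to_datetime(
    ["2024-01-01 00:00", "2024-01-01 01:00", "2024-01-01 02:00"]
)
df = pd.DataFrame({"value": [1.0, 2.0, 3.0]}, index=idx)

# Step 1: interpret the naive timestamps as local time in the given timezone.
# Step 2: convert the now timezone-aware index to UTC.
df_utc = df.copy()
df_utc.index = df_utc.index.tz_localize("Europe/Berlin").tz_convert("UTC")

print(df_utc.index[0])
```

Berlin is one hour ahead of UTC in January, so the first timestamp moves back by one hour while the values stay unchanged.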

Returns

Name Type Description
pd.DataFrame The dataset, with its datetime index in the UTC timezone.

Raises

Name Type Description
ValueError If columns is an empty list, if both filename and dataframe are provided, if neither filename nor dataframe is provided, or if filename is not an absolute path.
FileNotFoundError If the CSV file does not exist.
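
The error conditions listed above can be sketched as a standalone check. The helper name check_source_args is hypothetical, and the order of the checks is an assumption; the library's own validation may differ in order and in message wording.

```python
from pathlib import Path


def check_source_args(filename=None, dataframe=None, columns=None):
    """Hypothetical helper mirroring the documented error conditions."""
    # An empty column list is rejected; None means "all columns".
    if columns is not None and len(columns) == 0:
        raise ValueError("columns must not be an empty list")
    # filename and dataframe are mutually exclusive ...
    if filename is not None and dataframe is not None:
        raise ValueError("provide either filename or dataframe, not both")
    # ... but exactly one of them is required.
    if filename is None and dataframe is None:
        raise ValueError("provide either filename or dataframe")
    if filename is not None:
        path = Path(filename)
        # Relative paths are rejected before the file system is consulted.
        if not path.is_absolute():
            raise ValueError(f"filename must be an absolute path: {path}")
        if not path.is_file():
            raise FileNotFoundError(f"CSV file not found: {path}")
```

Under these assumptions, a relative path such as "my_data.csv" raises ValueError, while an absolute path to a missing file raises FileNotFoundError.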

Examples

from spotforecast2_safe.data.fetch_data import fetch_data, get_package_data_home
# demo02.csv is included in the package datasets
path_demo = get_package_data_home() / "demo02.csv"
df = fetch_data(filename=path_demo)
df.head()
A B C D E F G H I J K
DateTime
1964-08-02 14:00:00+00:00 0.202969 8.255128 334.0 0 0.111049 -0.121741 1597.0 0.067896 0 NaN 0.0
1964-08-02 15:00:00+00:00 0.145975 7.542355 339.0 0 -0.003927 0.103541 1609.0 -0.093175 0 NaN 0.0
1964-08-02 16:00:00+00:00 0.094389 8.174336 344.0 0 0.043963 0.041291 1660.0 0.047823 0 NaN 0.0
1964-08-02 17:00:00+00:00 -0.202353 7.387896 341.0 0 0.067118 0.072999 1567.0 -0.051628 0 NaN 0.0
1964-08-02 18:00:00+00:00 -0.013810 7.581125 335.0 0 -0.138614 -0.006495 1467.0 0.016003 0 NaN 0.0