generic
GenericData
Bases: GenericFileDataset
A class for handling generic data.
This class inherits from base.GenericFileDataset and provides an interface for iterating over the samples in a data file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
filename | str | The name of the file containing the data. | required |
target | str | The name of the target column. | required |
n_features | int | The number of features in the dataset. | required |
n_samples | int | The number of samples in the dataset. | required |
converters | Dict[str, callable] | A dictionary of functions for converting column data. | required |
parse_dates | List[str] | A list of column names to parse as dates. | required |
directory | str | The directory where the file is located. | required |
task | str | The type of task. Default is base.REG for regression. | REG |
fraction | float | The fraction of the data to use. Default is 1.0 for all data. | 1.0 |
Returns:
Type | Description |
---|---|
Generator | An iterator over the data in the file. |
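To make the roles of `target`, `converters`, and `parse_dates` more concrete, here is a minimal standard-library sketch of a generator that yields `(x, y)` pairs from CSV text. It is not the spotriver implementation: the function name `iter_rows` and the inline sample data are hypothetical, and the sketch assumes that `parse_dates` maps column names to `strptime` format strings.

```python
import csv
import io
from datetime import datetime
from typing import Callable, Dict, Iterator, Tuple

# Hypothetical sample data; the values are made up for illustration.
SAMPLE_CSV = """Time,Consumption
2020-01-01 00:00:00+0000,1.5
2020-01-01 01:00:00+0000,2.25
"""


def iter_rows(
    text: str,
    target: str,
    converters: Dict[str, Callable[[str], object]],
    parse_dates: Dict[str, str],
) -> Iterator[Tuple[dict, object]]:
    """Yield (features, target) pairs from CSV text, one row at a time."""
    for row in csv.DictReader(io.StringIO(text)):
        # Apply the per-column converters (for example str -> float).
        for column, convert in converters.items():
            row[column] = convert(row[column])
        # Parse date columns with the given strptime format (assumed mapping).
        for column, fmt in parse_dates.items():
            row[column] = datetime.strptime(row[column], fmt)
        # Split the target value off from the remaining feature columns.
        y = row.pop(target)
        yield row, y


rows = list(
    iter_rows(
        SAMPLE_CSV,
        target="Consumption",
        converters={"Consumption": float},
        parse_dates={"Time": "%Y-%m-%d %H:%M:%S%z"},
    )
)
```

Each element of `rows` is a tuple of a feature dictionary (here only the parsed `Time` column) and the converted target value.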
Examples:
>>> from spotriver.data.generic import GenericData
>>> import importlib.resources as pkg_resources
>>> import spotriver.data as data
>>> inp_file = pkg_resources.files(data)
>>> csv_path = str(inp_file.resolve())
>>> dataset = GenericData(filename="UnivariateData.csv",
...                       directory=csv_path,
...                       target="Consumption",
...                       n_features=1,
...                       n_samples=51_706,
...                       converters={"Consumption": float},
...                       parse_dates={"Time": "%Y-%m-%d %H:%M:%S%z"})
>>> for x, y in dataset:
...     print(x, y)
...     break
{'Time': datetime.datetime(2016, 12, 31, 23, 0, tzinfo=datetime.timezone.utc)} 10951.217
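The loop with `break` reads only the first sample. Because the dataset is consumed by iteration, `itertools.islice` is a common way to take the first few pairs instead. The sketch below uses a hypothetical stand-in generator, `fake_dataset`, so that it runs without spotriver or the CSV file. The same `islice` call should presumably work on a `GenericData` instance, but that is an assumption that has not been verified against the library.

```python
from itertools import islice
from typing import Iterator, Tuple


def fake_dataset() -> Iterator[Tuple[dict, float]]:
    """Hypothetical stand-in that yields (x, y) pairs like a dataset would."""
    for hour in range(24):
        yield {"hour": hour}, float(hour) * 10.0


# Take the first three samples without consuming the rest of the stream.
first_three = list(islice(fake_dataset(), 3))
for x, y in first_three:
    print(x, y)
```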
Source code in spotriver/data/generic.py