repair
apply_penalty_NA(y, penalty_NA, sd=0.1, stop_on_zero_return=False)
Replaces NaN values in y with the penalty value penalty_NA, to which random noise with standard deviation sd is added, and issues a warning if necessary.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
y | ndarray | y array | required |
penalty_NA | float | penalty value to replace NaN values in y | required |
sd | float | standard deviation for the random noise added to penalty_NA. Default is 0.1. | 0.1 |
stop_on_zero_return | bool | whether to stop if the returned dimension is less than 1. Default is False. | False |
Returns:
Type | Description |
---|---|
ndarray | numpy.ndarray: y array with NaN values replaced by penalty value |
Examples:
>>> import numpy as np
>>> from spotpython.utils.repair import apply_penalty_NA
>>> y = np.array([1, np.nan, 2])
>>> y_cleaned = apply_penalty_NA(y, 0)
>>> print(y_cleaned)
[1. 0. 2.]
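The role of sd is easier to see in a standalone sketch. The function below is a hypothetical stand-in named replace_nan_with_penalty, not spotpython's implementation; it assumes the noise is zero-mean Gaussian and drawn independently for each NaN entry, which the description above does not confirm.

```python
import numpy as np


def replace_nan_with_penalty(y, penalty, sd=0.1, rng=None):
    """Hypothetical stand-in, not spotpython's apply_penalty_NA."""
    rng = np.random.default_rng() if rng is None else rng
    y = np.asarray(y, dtype=float).copy()  # leave the caller's array untouched
    mask = np.isnan(y)
    # Assumption: zero-mean Gaussian noise, one draw per NaN entry.
    y[mask] = penalty + rng.normal(0.0, sd, size=mask.sum())
    return y


y = np.array([1.0, np.nan, 2.0, np.nan])

# sd=0.0 switches the noise off, so every NaN becomes exactly the penalty.
exact = replace_nan_with_penalty(y, penalty=100.0, sd=0.0)

# sd>0 spreads the replacements around the penalty value.
noisy = replace_nan_with_penalty(
    y, penalty=100.0, sd=0.1, rng=np.random.default_rng(0)
)

print(exact)
print(noisy)
```

With noise, the replaced entries are no longer identical to each other, which can matter when a downstream model cannot handle duplicate function values.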
Source code in spotpython/utils/repair.py
remove_nan(X, y, stop_on_zero_return=False)
Removes the rows of X and y for which y contains NaN values. Issues a warning if the returned y array has fewer entries than the original y array, and raises a ValueError if the returned y array has fewer than one entry and stop_on_zero_return is True.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | ndarray | X array | required |
y | ndarray | y array | required |
stop_on_zero_return | bool | whether to stop if the returned dimension is less than 1. Default is False. | False |
Returns:
Type | Description |
---|---|
Tuple[ndarray, ndarray] | Tuple[numpy.ndarray, numpy.ndarray]: X and y arrays with rows containing NaN values in y removed. |
Examples:
>>> import numpy as np
>>> from spotpython.utils.repair import remove_nan
>>> X = np.array([[1, 2], [3, 4], [5, 6]])
>>> y = np.array([1, np.nan, 2])
>>> X_cleaned, y_cleaned = remove_nan(X, y)
>>> print(X_cleaned, y_cleaned)
[[1 2]
 [5 6]] [1. 2.]
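The effect of stop_on_zero_return only shows up when every entry of y is NaN. The sketch below illustrates that edge case with a hypothetical stand-in named drop_nan_rows; it follows the behaviour described above but is not spotpython's code, and the warning and error messages are invented for illustration.

```python
import warnings

import numpy as np


def drop_nan_rows(X, y, stop_on_zero_return=False):
    """Hypothetical stand-in, not spotpython's remove_nan."""
    keep = ~np.isnan(y)  # True for rows whose y value is usable
    X_kept, y_kept = X[keep], y[keep]
    removed = y.shape[0] - y_kept.shape[0]
    if removed > 0:
        warnings.warn(f"Removed {removed} row(s) with NaN in y.")
    if y_kept.shape[0] < 1 and stop_on_zero_return:
        raise ValueError("No rows left after removing NaN values.")
    return X_kept, y_kept


X = np.array([[1, 2], [3, 4]])
y_all_nan = np.array([np.nan, np.nan])

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # keep the demonstration output quiet
    # Default: empty arrays are returned and the caller has to cope with them.
    X_empty, y_empty = drop_nan_rows(X, y_all_nan)
    # Strict mode: the same input raises instead.
    try:
        drop_nan_rows(X, y_all_nan, stop_on_zero_return=True)
        raised = False
    except ValueError:
        raised = True

print(X_empty.shape, y_empty.shape, raised)
```

Raising early is the safer choice when later steps, such as fitting a surrogate model, would fail in a less obvious way on empty input.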
Source code in spotpython/utils/repair.py
repair_non_numeric(X, var_type)
Rounds the values of non-numeric variables to integers. A variable counts as non-numeric if its entry in var_type is neither "num" nor "float".
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | ndarray | X array | required |
var_type | list | list with type information | required |
Returns:
Type | Description |
---|---|
ndarray | numpy.ndarray: X array with non-numeric values rounded to integers |
Examples:
>>> import numpy as np
>>> from spotpython.utils.repair import repair_non_numeric
>>> X = np.array([[1.2, 2.3], [3.4, 4.5]])
>>> var_type = ["num", "factor"]
>>> repair_non_numeric(X, var_type)
array([[1.2, 2. ],
       [3.4, 4. ]])
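The column selection can be sketched with a boolean mask over var_type. The function round_non_numeric below is a hypothetical stand-in written for illustration, not spotpython's source; it assumes one var_type entry per column of X and uses np.around for the rounding, which may differ from the library's choice.

```python
import numpy as np


def round_non_numeric(X, var_type):
    """Hypothetical stand-in, not spotpython's repair_non_numeric."""
    # True for every column whose type is neither "num" nor "float".
    mask = np.isin(var_type, ["num", "float"], invert=True)
    X = np.array(X, dtype=float)  # copy, so the input is not modified
    X[:, mask] = np.around(X[:, mask])
    return X


X = np.array([[1.2, 2.3, 0.6], [3.4, 4.7, 1.4]])
var_type = ["num", "factor", "int"]

repaired = round_non_numeric(X, var_type)
print(repaired)
```

Only the "factor" and "int" columns change; the "num" column keeps its fractional values. Note that np.around rounds exact halves to the nearest even integer, so 4.5 becomes 4.0 rather than 5.0.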
Source code in spotpython/utils/repair.py