metrics

apk(actual, predicted, k=10)

Computes the average precision at k between two lists of items.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| actual | list | A list of elements that are to be predicted (order doesn't matter) | required |
| predicted | list | A list of predicted elements (order does matter) | required |
| k | int | The maximum number of predicted elements | 10 |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| score | float | The average precision at k over the input lists |

Source code in spotpython/utils/metrics.py, lines 31-59:
def apk(actual, predicted, k=10):
    """
    Computes the average precision at k.
    This function computes the average precision at k between two lists of
    items.

    Args:
        actual (list): A list of elements that are to be predicted (order doesn't matter)
        predicted (list): A list of predicted elements (order does matter)
        k (int): The maximum number of predicted elements

    Returns:
        score (float): The average precision at k over the input lists
    """
    if len(predicted) > k:
        predicted = predicted[:k]

    score = 0.0
    num_hits = 0.0

    for i, p in enumerate(predicted):
        if p in actual and p not in predicted[:i]:
            num_hits += 1.0
            score += num_hits / (i + 1.0)

    if not actual:
        return 0.0

    return score / min(len(actual), k)
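A quick, hedged illustration of how apk rewards early hits (this example is not part of the library's doctests; results are rounded to avoid float-repr noise):

>>> from spotpython.utils.metrics import apk
>>> round(apk([1, 2], [1, 3, 2], k=3), 4)  # hits at ranks 1 and 3
0.8333
>>> round(apk([1, 2], [3, 1, 2], k=3), 4)  # hits at ranks 2 and 3
0.5833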

calculate_xai_consistency(attributions)

Calculate the consistency between different XAI methods. Computes the pairwise correlation between different XAI methods’ attributions and returns their mean correlation as a measure of consistency. A higher value indicates greater agreement between different XAI methods.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| attributions | ndarray | Array of shape (n_methods, n_features) containing feature importance scores from different XAI methods. Each row represents a different XAI method's attributions, and each column represents a feature. | required |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| float | float | Mean correlation between XAI methods, ranging from -1 to 1 (1: perfect consistency; 0: no consistency; -1: perfect negative consistency between methods). |

Examples:

>>> import numpy as np
>>> # Three XAI methods' attributions for four features
>>> attributions = np.array([
...     [0.1, 0.2, 0.3, 0.4],  # Method 1
...     [0.2, 0.3, 0.4, 0.5],  # Method 2
...     [0.0, 0.1, 0.2, 0.3]   # Method 3
... ])
>>> consistency = calculate_xai_consistency(attributions)
>>> print(f"XAI Consistency: {consistency:.2f}")
Attribution Correlation Matrix:
[[ 1.    0.97  0.98]
 [ 0.97  1.    0.99]
 [ 0.98  0.99  1.  ]]
XAI Consistency: 0.98
Note

The correlation matrix is computed using numpy’s corrcoef function, which calculates Pearson correlation coefficients. Only the upper triangle of the correlation matrix is used to avoid counting correlations twice.

Source code in spotpython/utils/metrics.py, lines 201-250:
def calculate_xai_consistency(attributions) -> float:
    """Calculate the consistency between different XAI methods.
    Computes the pairwise correlation between different XAI methods' attributions
    and returns their mean correlation as a measure of consistency. A higher value
    indicates greater agreement between different XAI methods.

    Args:
        attributions (np.ndarray): Array of shape (n_methods, n_features) containing
            feature importance scores from different XAI methods. Each row represents
            a different XAI method's attributions, and each column represents a feature.

    Returns:
        float: Mean correlation between XAI methods, ranging from -1 to 1.
            - 1: Perfect consistency between methods
            - 0: No consistency between methods
            - -1: Perfect negative consistency between methods

    Examples:
        >>> import numpy as np
        >>> # Three XAI methods' attributions for four features
        >>> attributions = np.array([
        ...     [0.1, 0.2, 0.3, 0.4],  # Method 1
        ...     [0.2, 0.3, 0.4, 0.5],  # Method 2
        ...     [0.0, 0.1, 0.2, 0.3]   # Method 3
        ... ])
        >>> consistency = calculate_xai_consistency(attributions)
        >>> print(f"XAI Consistency: {consistency:.2f}")
        Attribution Correlation Matrix:
        [[ 1.    0.97  0.98]
         [ 0.97  1.    0.99]
         [ 0.98  0.99  1.  ]]
        XAI Consistency: 0.98

    Note:
        The correlation matrix is computed using numpy's corrcoef function, which
        calculates Pearson correlation coefficients. Only the upper triangle of
        the correlation matrix is used to avoid counting correlations twice.
    """
    global_attr_np = np.array(attributions)
    corr_matrix = np.corrcoef(global_attr_np)
    print("Attribution Correlation Matrix:")
    print(corr_matrix)

    # Calculate the mean of the upper triangle of the correlation matrix
    upper_triangle_indices = np.triu_indices_from(corr_matrix, k=1)
    upper_triangle_values = corr_matrix[upper_triangle_indices]
    result_xai = upper_triangle_values.mean()
    print("XAI Consistency (mean of upper triangle of correlation matrix):")
    print(result_xai)
    return result_xai
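To see which entries enter the mean, here is a minimal standalone sketch of the upper-triangle indexing step (hypothetical data, not from the library):

>>> import numpy as np
>>> m = np.array([[1.0, 0.8, 0.6],
...               [0.8, 1.0, 0.7],
...               [0.6, 0.7, 1.0]])
>>> m[np.triu_indices_from(m, k=1)]  # strictly-above-diagonal entries only
array([0.8, 0.6, 0.7])
>>> round(float(m[np.triu_indices_from(m, k=1)].mean()), 2)
0.7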

get_metric_sign(metric_name)

Returns the sign of a metric.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| metric_name | str | The name of the metric. Maximized metrics (sign -1): "accuracy_score", "cohen_kappa_score", "f1_score", "jaccard_score", "matthews_corrcoef", "precision_score", "recall_score", "roc_auc_score", "explained_variance_score", "r2_score", "d2_absolute_error_score", "d2_pinball_score", "d2_tweedie_score". Minimized metrics (sign +1): "hamming_loss", "hinge_loss", "zero_one_loss", "max_error", "mean_absolute_error", "mean_squared_error", "root_mean_squared_error", "mean_squared_log_error", "root_mean_squared_log_error", "median_absolute_error", "mean_poisson_deviance", "mean_gamma_deviance", "mean_absolute_percentage_error". | required |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| sign | int | The sign of the metric: -1 if the metric is maximized, +1 if it is minimized. |

Raises:

| Type | Description |
| --- | --- |
| ValueError | If the metric is not found. |

Examples:

>>> from spotpython.utils.metrics import get_metric_sign
>>> get_metric_sign("accuracy_score")
-1
>>> get_metric_sign("hamming_loss")
1
Source code in spotpython/utils/metrics.py, lines 133-198:
def get_metric_sign(metric_name):
    """Returns the sign of a metric.

    Args:
        metric_name (str):
            The name of the metric. Metrics that are maximized (sign -1):
                "accuracy_score", "cohen_kappa_score", "f1_score",
                "jaccard_score", "matthews_corrcoef", "precision_score",
                "recall_score", "roc_auc_score", "explained_variance_score",
                "r2_score", "d2_absolute_error_score", "d2_pinball_score",
                "d2_tweedie_score".
            Metrics that are minimized (sign +1):
                "hamming_loss", "hinge_loss", "zero_one_loss", "max_error",
                "mean_absolute_error", "mean_squared_error",
                "root_mean_squared_error", "mean_squared_log_error",
                "root_mean_squared_log_error", "median_absolute_error",
                "mean_poisson_deviance", "mean_gamma_deviance",
                "mean_absolute_percentage_error".

    Returns:
        sign (int): The sign of the metric: -1 if the metric is maximized, +1 if it is minimized.

    Raises:
        ValueError: If the metric is not found.

    Examples:
        >>> from spotpython.utils.metrics import get_metric_sign
        >>> get_metric_sign("accuracy_score")
        -1
        >>> get_metric_sign("hamming_loss")
        1

    """
    if metric_name in [
        "accuracy_score",
        "cohen_kappa_score",
        "f1_score",
        "jaccard_score",
        "matthews_corrcoef",
        "precision_score",
        "recall_score",
        "roc_auc_score",
        "explained_variance_score",
        "r2_score",
        "d2_absolute_error_score",
        "d2_pinball_score",
        "d2_tweedie_score",
    ]:
        return -1
    elif metric_name in [
        "hamming_loss",
        "hinge_loss",
        "zero_one_loss",
        "max_error",
        "mean_absolute_error",
        "mean_squared_error",
        "root_mean_squared_error",
        "mean_squared_log_error",
        "root_mean_squared_log_error",
        "median_absolute_error",
        "mean_poisson_deviance",
        "mean_gamma_deviance",
        "mean_absolute_percentage_error",
    ]:
        return +1
    else:
        raise ValueError(f"Metric '{metric_name}' not found.")
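A typical use of the sign is to convert any supported metric into a minimization objective. The following is a hedged sketch; the objective helper is hypothetical and not part of spotpython:

>>> from sklearn.metrics import accuracy_score
>>> from spotpython.utils.metrics import get_metric_sign
>>> def objective(y_true, y_pred):
...     # sign -1 turns a maximized metric into a loss to minimize;
...     # sign +1 would leave an already-minimized metric unchanged
...     return get_metric_sign("accuracy_score") * accuracy_score(y_true, y_pred)
>>> round(float(objective([0, 1, 1], [0, 1, 0])), 4)  # accuracy 2/3, negated
-0.6667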

mapk(actual, predicted, k=10)

Computes the mean average precision at k between two lists of lists of items.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| actual | list | A list of lists of elements that are to be predicted (order doesn't matter in the lists) | required |
| predicted | list | A list of lists of predicted elements (order matters in the lists) | required |
| k | int | The maximum number of predicted elements | 10 |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| score | float | The mean average precision at k over the input lists |

Source code in spotpython/utils/metrics.py, lines 62-78:
def mapk(actual, predicted, k=10):
    """
    Computes the mean average precision at k.
    This function computes the mean average precision at k between two lists
    of lists of items.

    Args:
        actual (list): A list of lists of elements that are to be predicted
            (order doesn't matter in the lists)
        predicted (list): A list of lists of predicted elements
            (order matters in the lists)
        k (int): The maximum number of predicted elements

    Returns:
        score (float): The mean average precision at k over the input lists
    """
    return np.mean([apk(a, p, k) for a, p in zip(actual, predicted)])
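A small illustrative call (not from the library's doctests): two "queries", the first with relevant items {1, 2} and the second with {3}:

>>> from spotpython.utils.metrics import mapk
>>> actual = [[1, 2], [3]]
>>> predicted = [[1, 3, 2], [1, 2, 3]]
>>> print(f"{mapk(actual, predicted, k=3):.4f}")
0.5833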

mapk_score(y_true, y_pred, k=3)

Wrapper for the mapk function using numpy arrays.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| y_true | np.array | Array of true values | required |
| y_pred | np.array | Array of predicted values | required |
| k | int | Number of predictions | 3 |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| score | float | Mean average precision at k |

Examples:

>>> y_true = np.array([0, 1, 2, 2])
>>> y_pred = np.array([[0.5, 0.2, 0.2],  # 0 is in top 2
...                    [0.3, 0.4, 0.2],  # 1 is in top 2
...                    [0.2, 0.4, 0.3],  # 2 is in top 2
...                    [0.7, 0.2, 0.1]]) # 2 isn't in top 2
>>> mapk_score(y_true, y_pred, k=1)
0.25
>>> mapk_score(y_true, y_pred, k=2)
0.375
>>> mapk_score(y_true, y_pred, k=3)
0.4583333333333333
>>> mapk_score(y_true, y_pred, k=4)
0.4583333333333333
Source code in spotpython/utils/metrics.py, lines 81-111:
def mapk_score(y_true, y_pred, k=3):
    """Wrapper for mapk func using numpy arrays

     Args:
            y_true (np.array): array of true values
            y_pred (np.array): array of predicted values
            k (int): number of predictions

    Returns:
            score (float): mean average precision at k

    Examples:
            >>> y_true = np.array([0, 1, 2, 2])
            >>> y_pred = np.array([[0.5, 0.2, 0.2],  # 0 is in top 2
                     [0.3, 0.4, 0.2],  # 1 is in top 2
                     [0.2, 0.4, 0.3],  # 2 is in top 2
                     [0.7, 0.2, 0.1]]) # 2 isn't in top 2
            >>> mapk_score(y_true, y_pred, k=1)
            0.25
            >>> mapk_score(y_true, y_pred, k=2)
            0.375
            >>> mapk_score(y_true, y_pred, k=3)
            0.4583333333333333
            >>> mapk_score(y_true, y_pred, k=4)
            0.4583333333333333
    """
    y_true = series_to_array(y_true)
    sorted_prediction_ids = np.argsort(-y_pred, axis=1)
    top_k_prediction_ids = sorted_prediction_ids[:, :k]
    score = mapk(y_true.reshape(-1, 1), top_k_prediction_ids, k=k)
    return score
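The heart of the wrapper is the argsort step that turns a row of class probabilities into ranked label ids; a minimal standalone sketch with hypothetical probabilities:

>>> import numpy as np
>>> y_pred = np.array([[0.2, 0.5, 0.3],
...                    [0.6, 0.1, 0.3]])
>>> np.argsort(-y_pred, axis=1)[:, :2]  # top-2 label ids per row
array([[1, 2],
       [0, 2]])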

mapk_scorer(estimator, X, y)

Scorer for mean average precision at k. Predicts class probabilities for X with the estimator and computes the mean average precision at k (k=3) against y.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| estimator | sklearn estimator | The estimator to be used for prediction. | required |
| X | array-like of shape (n_samples, n_features) | The input samples. | required |
| y | array-like of shape (n_samples,) | The target values. | required |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| score | float | The mean average precision at k over the input lists |

Source code in spotpython/utils/metrics.py, lines 114-130:
def mapk_scorer(estimator, X, y):
    """
    Scorer for mean average precision at k.
    This function predicts class probabilities for X with the estimator
    and computes the mean average precision at k (k=3) against y.

    Args:
        estimator (sklearn estimator): The estimator to be used for prediction.
        X (array-like of shape (n_samples, n_features)): The input samples.
        y (array-like of shape (n_samples,)): The target values.

    Returns:
        score (float): The mean average precision at k over the input lists
    """
    y_pred = estimator.predict_proba(X)
    score = mapk_score(y, y_pred, k=3)
    return score
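Since mapk_scorer follows sklearn's (estimator, X, y) scorer signature, it can be passed directly as scoring to model-selection utilities. A hedged sketch (dataset and classifier are illustrative, and this assumes series_to_array accepts plain numpy arrays):

>>> from sklearn.datasets import load_iris
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.model_selection import cross_val_score
>>> from spotpython.utils.metrics import mapk_scorer
>>> X, y = load_iris(return_X_y=True)
>>> scores = cross_val_score(LogisticRegression(max_iter=1000), X, y,
...                          scoring=mapk_scorer, cv=3)
>>> scores.shape
(3,)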