utils.stats.condition_index

utils.stats.condition_index(df)

Calculates the Condition Index for a DataFrame to assess multicollinearity.

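The condition index is computed from the eigenvalues of the covariance matrix of the standardized data. Under the usual definition, the index for each eigenvalue is the square root of the ratio of the largest eigenvalue to that eigenvalue, so the largest eigenvalue has an index of 1 and near-zero eigenvalues produce very large (or infinite) indices. High condition indices suggest multicollinearity; a common rule of thumb treats values above 30 as a sign of strong multicollinearity.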
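The computation described above can be sketched as follows. This is a minimal illustration and not the library's actual implementation: the helper name `condition_index_sketch`, the use of the sample standard deviation (`ddof=1`), the descending eigenvalue order, and the clipping of tiny negative eigenvalues are all assumptions made for this example.

```python
import numpy as np
import pandas as pd


def condition_index_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical re-implementation, for illustration only."""
    X = df.to_numpy(dtype=float)
    # Standardize each column (assumption: sample standard deviation, ddof=1).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Covariance of standardized data (equivalent to the correlation matrix).
    cov = np.cov(Z, rowvar=False)
    # eigvalsh returns eigenvalues of a symmetric matrix in ascending order;
    # reverse so the largest comes first (assumed ordering).
    eigenvalues = np.linalg.eigvalsh(cov)[::-1]
    # Round-off can yield tiny negative values for singular matrices.
    eigenvalues = np.clip(eigenvalues, 0.0, None)
    # Condition index: sqrt(largest eigenvalue / eigenvalue); zero -> inf.
    with np.errstate(divide="ignore"):
        ci = np.sqrt(eigenvalues.max() / eigenvalues)
    return pd.DataFrame(
        {
            "Index": range(len(eigenvalues)),
            "Eigenvalue": eigenvalues,
            "Condition Index": ci,
        }
    )


demo = pd.DataFrame(
    {"x1": [1.0, 2.0, 3.0, 4.0, 5.0], "x2": [2.0, 1.0, 4.0, 3.0, 5.0]}
)
result = condition_index_sketch(demo)
print(result)
```

Because the data are standardized, the eigenvalues in this sketch sum to the number of columns, and the row belonging to the largest eigenvalue always has a condition index of 1.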

Parameters

Name   Type               Description                                          Default
df     pandas.DataFrame   A DataFrame containing the independent variables.   required

Returns

Type               Description
pandas.DataFrame   A DataFrame with the following columns:
                   - 'Index': The index of the eigenvalue.
                   - 'Eigenvalue': The eigenvalue of the covariance matrix.
                   - 'Condition Index': The condition index for the eigenvalue.

Examples

>>> from spotpython.utils.stats import condition_index
>>> import pandas as pd
>>> data = pd.DataFrame({
...     'x1': [1, 2, 3, 4, 5],
...     'x2': [2, 4, 6, 8, 10],
...     'x3': [1, 3, 5, 7, 9]
... })
>>> condition_index(data)
   Index  Eigenvalue  Condition Index
0      0    1.140000         1.000000
1      1    0.000000              inf
2      2    0.002857        20.000000