utils.stats.condition_index
utils.stats.condition_index(df)
Calculates the Condition Index for a DataFrame to assess multicollinearity.
The Condition Index is computed based on the eigenvalues of the covariance matrix of the standardized data. High condition indices suggest potential multicollinearity issues.
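As a rough illustration of the computation described above, the following sketch standardizes the columns, takes the eigenvalues of the resulting covariance matrix, and applies the conventional definition of the condition index, sqrt(largest eigenvalue / eigenvalue). The helper name `condition_index_sketch`, the use of the sample standard deviation (`ddof=1`), and the descending sort order are assumptions made for illustration; the library's actual implementation may differ in these details.

```python
import numpy as np
import pandas as pd


def condition_index_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical re-implementation, for illustration only."""
    # Standardize each column to zero mean and unit sample variance (assumed ddof=1).
    z = (df - df.mean()) / df.std(ddof=1)
    # Covariance of the standardized data; under this assumption it equals
    # the correlation matrix of the original columns.
    cov = np.cov(z.to_numpy(), rowvar=False)
    # eigvalsh handles symmetric matrices and returns eigenvalues in ascending
    # order, so reverse them to list the largest first.
    eigenvalues = np.linalg.eigvalsh(cov)[::-1]
    # Clip tiny negative values caused by floating-point round-off.
    eigenvalues = np.clip(eigenvalues, 0.0, None)
    # A zero eigenvalue (perfect collinearity) yields an infinite condition index.
    with np.errstate(divide="ignore", invalid="ignore"):
        cond = np.sqrt(eigenvalues.max() / eigenvalues)
    return pd.DataFrame({
        "Index": range(len(eigenvalues)),
        "Eigenvalue": eigenvalues,
        "Condition Index": cond,
    })


# Two nearly collinear columns (made-up values).
demo = pd.DataFrame({
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "x2": [2.1, 3.9, 6.2, 7.8, 10.1],
})
result = condition_index_sketch(demo)
print(result)
```

Under these assumptions the largest eigenvalue always has a condition index of 1, and the index grows as the smallest eigenvalue approaches zero.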
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| df | pandas.DataFrame | A DataFrame containing the independent variables. | required |
Returns
| Name | Type | Description |
|---|---|---|
| | pandas.DataFrame | A DataFrame with the following columns: `Index` (the index of the eigenvalue), `Eigenvalue` (the eigenvalue of the covariance matrix), and `Condition Index` (the Condition Index for the eigenvalue). |
Examples
>>> from spotpython.utils.stats import condition_index
>>> import pandas as pd
>>> data = pd.DataFrame({
... 'x1': [1, 2, 3, 4, 5],
... 'x2': [2, 4, 6, 8, 10],
... 'x3': [1, 3, 5, 7, 9]
... })
>>> condition_index(data)
   Index  Eigenvalue  Condition Index
0      0    1.140000         1.000000
1      1    0.000000              inf
2      2    0.002857        20.000000
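
To act on the returned table, one might keep only the eigenvalues whose condition index exceeds a cutoff. The sketch below does not call the library; it builds a small DataFrame by hand using the documented column names. The eigenvalues, the helper `flag_high_condition`, and the cutoff of 30 (a commonly quoted rule of thumb, not something this function prescribes) are all assumptions for illustration.

```python
import pandas as pd


def flag_high_condition(result: pd.DataFrame, cutoff: float = 30.0) -> pd.DataFrame:
    """Hypothetical helper: return the rows whose condition index exceeds the cutoff."""
    return result[result["Condition Index"] > cutoff]


# Made-up eigenvalues, shaped like the documented return value.
example = pd.DataFrame({
    "Index": [0, 1, 2],
    "Eigenvalue": [2.5, 0.49, 0.001],
})
# Conventional definition: sqrt(largest eigenvalue / eigenvalue).
example["Condition Index"] = (example["Eigenvalue"].max() / example["Eigenvalue"]) ** 0.5

flagged = flag_high_condition(example)
print(flagged)
```

Here only the smallest eigenvalue is flagged, since its condition index is about 50 while the others stay below 3.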