utils.stats.vif
utils.stats.vif(X, sorted=True)
Calculates the Variance Inflation Factor (VIF) for each feature in a DataFrame.
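VIF quantifies multicollinearity among the independent variables of a regression model: for feature j it equals 1 / (1 − R_j²), where R_j² is the coefficient of determination obtained by regressing feature j on the remaining features. High VIF values indicate strong multicollinearity, which inflates the variance of the coefficient estimates and makes the model harder to interpret and less stable.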
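To make the definition concrete, the following is a minimal sketch of a VIF computation from first principles. It is not the implementation behind `utils.stats.vif`: the helper name `vif_sketch` is hypothetical, and the choice to include an intercept in each auxiliary regression is an assumption, so its numbers may differ from the library's. It also does not guard against perfectly collinear inputs, where R² reaches 1 and the ratio is undefined.

```python
import numpy as np
import pandas as pd


def vif_sketch(X: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: VIF_j = 1 / (1 - R_j^2) for each column j."""
    values = []
    for col in X.columns:
        y = X[col].to_numpy(dtype=float)
        others = X.drop(columns=col).to_numpy(dtype=float)
        # Regress column j on the remaining columns plus an intercept
        # (the intercept is an assumption of this sketch).
        A = np.column_stack([np.ones(len(X)), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        residual = y - A @ coef
        ss_res = float(residual @ residual)
        ss_tot = float(((y - y.mean()) ** 2).sum())
        r_squared = 1.0 - ss_res / ss_tot
        # No guard here: r_squared == 1 (perfect collinearity) would fail.
        values.append(1.0 / (1.0 - r_squared))
    return pd.DataFrame({"feature": list(X.columns), "VIF": values})


# Two features that are uncorrelated, so each VIF should be close to 1.
demo = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0], "b": [1.0, -1.0, -1.0, 1.0]})
result = vif_sketch(demo)
print(result)
```

A VIF of 1 is the lower bound and means the feature is not linearly explained by the others at all; values grow without limit as R² approaches 1.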
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| X | pandas.DataFrame | A DataFrame containing the independent variables. | required |
| sorted | bool | Whether to sort the output DataFrame by VIF values. | True |
Returns
| Name | Type | Description |
|---|---|---|
| | pandas.DataFrame | A DataFrame with two columns: “feature” (the name of the feature) and “VIF” (the Variance Inflation Factor for that feature). |
Examples
>>> from spotpython.utils.stats import vif
>>> import pandas as pd
>>> data = pd.DataFrame({
... 'x1': [1, 2, 3, 4, 5],
... 'x2': [2, 4, 6, 8, 10],
... 'x3': [1, 3, 5, 7, 9]
... })
>>> vif(data)
  feature          VIF
0      x1  1260.000000
1      x2     0.000000
2      x3   630.000000
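The columns in the example above are exactly linearly dependent (x2 = 2·x1 and x3 = 2·x1 − 1), so every R_j² is 1 and the VIF is undefined in exact arithmetic; a VIF below 1, such as the 0 reported for x2, cannot occur by definition, so the printed values are best read as floating-point artifacts rather than meaningful scores. A quick rank check, sketched here with plain NumPy and independent of `vif` itself, can flag such inputs before any VIF is computed. The variable names are illustrative only.

```python
import numpy as np

# Same values as in the example above.
x1 = np.array([1, 2, 3, 4, 5], dtype=float)
x2 = np.array([2, 4, 6, 8, 10], dtype=float)
x3 = np.array([1, 3, 5, 7, 9], dtype=float)

# Design matrix with an intercept column (an assumption of this sketch).
design = np.column_stack([np.ones_like(x1), x1, x2, x3])

# A rank below the number of columns means exact linear dependence.
rank = np.linalg.matrix_rank(design)
is_rank_deficient = rank < design.shape[1]

# Confirm the two exact relations between the columns.
exact_x2 = np.allclose(x2, 2 * x1)
exact_x3 = np.allclose(x3, 2 * x1 - 1)

print(rank, is_rank_deficient, exact_x2, exact_x3)
```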