utils.stats.vif

utils.stats.vif(X, sorted=True)

Calculates the Variance Inflation Factor (VIF) for each feature in a DataFrame.

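VIF quantifies how strongly each independent variable in a regression model is linearly related to the others. For feature j it is defined as VIF_j = 1 / (1 - R_j^2), where R_j^2 is the coefficient of determination obtained by regressing feature j on the remaining features. A value of 1 means no linear relationship with the other features. Large values (common rules of thumb use thresholds of 5 or 10) indicate strong multicollinearity, which inflates the variance of the coefficient estimates and makes the model harder to interpret and less stable.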
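The following is a minimal sketch of one common way to compute the quantity described above: regress each feature on all the others (with an intercept) and take 1 / (1 - R^2). The function name `vif_sketch` and the `demo` data are hypothetical and serve only as illustration. This is not necessarily how `spotpython.utils.stats.vif` is implemented, and its results may differ, for example if it handles the intercept differently.

```python
import numpy as np
import pandas as pd


def vif_sketch(X: pd.DataFrame) -> pd.DataFrame:
    """Illustrative VIF computation; an assumption, not the spotpython code."""
    values = X.to_numpy(dtype=float)
    n_rows, n_cols = values.shape
    rows = []
    for j in range(n_cols):
        y = values[:, j]
        # Regress column j on all other columns plus an intercept.
        others = np.delete(values, j, axis=1)
        design = np.column_stack([np.ones(n_rows), others])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        residual = y - design @ coef
        ss_res = float(residual @ residual)
        ss_tot = float(((y - y.mean()) ** 2).sum())
        r_squared = 1.0 - ss_res / ss_tot
        # VIF_j = 1 / (1 - R_j^2); unbounded as R_j^2 approaches 1.
        vif_j = np.inf if r_squared >= 1.0 else 1.0 / (1.0 - r_squared)
        rows.append({"feature": X.columns[j], "VIF": vif_j})
    return pd.DataFrame(rows)


# Hypothetical data: 'b' is close to 2 * 'a', 'c' is only weakly related.
demo = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "b": [2.1, 4.1, 5.8, 7.9, 10.2, 11.9],
    "c": [5.0, 1.0, 4.0, 2.0, 6.0, 3.0],
})
result = vif_sketch(demo)
print(result)
```

In this sketch the nearly collinear pair `a` and `b` should receive much larger values than `c`.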

Parameters

Name    Type              Description                                          Default
X       pandas.DataFrame  A DataFrame containing the independent variables.    required
sorted  bool              Whether to sort the output DataFrame by VIF values.  True

Returns

Type              Description
pandas.DataFrame  A DataFrame with two columns:
                  - "feature": The name of the feature.
                  - "VIF": The Variance Inflation Factor for the feature.
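Because the result is an ordinary DataFrame with "feature" and "VIF" columns, it can be filtered like any other. The sketch below builds a stand-in result by hand with made-up values (it does not call `vif`) and selects the features above a hypothetical threshold of 10.

```python
import pandas as pd

# Made-up values arranged in the documented result layout.
result = pd.DataFrame({
    "feature": ["a", "b", "c"],
    "VIF": [12.4, 11.8, 1.3],
})

threshold = 10.0  # hypothetical cut-off; choose one that suits the analysis
high_vif = result.loc[result["VIF"] > threshold, "feature"].tolist()
print(high_vif)  # ['a', 'b']
```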

Examples

>>> from spotpython.utils.stats import vif
>>> import pandas as pd
>>> data = pd.DataFrame({
...     'x1': [1, 2, 3, 4, 5],
...     'x2': [2, 4, 6, 8, 10],
...     'x3': [1, 3, 5, 7, 9]
... })
>>> vif(data)
   feature          VIF
0      x1  1260.000000
1      x2     0.000000
2      x3   630.000000
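
The columns in this example are exactly linearly dependent (x2 = 2 * x1 and x3 = 2 * x1 - 1), so R^2 equals 1 for each regression and the VIF is, in theory, unbounded. The printed numbers are therefore probably best read as numerical artifacts rather than meaningful magnitudes. A quick way to detect this situation before computing VIFs is to check the rank of the design matrix. The sketch below uses only NumPy and pandas and assumes an intercept column is part of the model.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    'x1': [1, 2, 3, 4, 5],
    'x2': [2, 4, 6, 8, 10],
    'x3': [1, 3, 5, 7, 9],
})

# Design matrix with an intercept column (an assumption of this sketch).
design = np.column_stack([np.ones(len(data)), data.to_numpy(dtype=float)])

rank = np.linalg.matrix_rank(design)
n_columns = design.shape[1]

# A rank below the number of columns means exact linear dependence.
print(f"rank {rank} of {n_columns} columns")
```

A rank lower than the number of columns signals that at least one column is an exact linear combination of the others.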