utils.stats.compute_coefficients_table

utils.stats.compute_coefficients_table(model, X_encoded, y, vif_table=None)

Compute a coefficients table containing the following columns for each predictor:

  1. Variable name
  2. Zero-order correlation
  3. Partial correlation
  4. Semipartial (part) correlation
  5. Tolerance (1 / VIF)
  6. VIF
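The quantities listed above have standard textbook definitions. The following minimal sketch illustrates those definitions with NumPy only; it is not the implementation used by compute_coefficients_table. The helper names residuals and correlation_summary are hypothetical, and the data are synthetic.

```python
import numpy as np


def residuals(target, predictors):
    """Residuals of an OLS fit of `target` on `predictors` plus an intercept."""
    design = np.column_stack([np.ones(len(target)), predictors])
    beta, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ beta


def correlation_summary(X, y, j):
    """Textbook statistics for predictor column j of X (X holds no intercept column)."""
    xj = X[:, j]
    others = np.delete(X, j, axis=1)

    # Remove the linear influence of the other predictors.
    xj_res = residuals(xj, others)
    y_res = residuals(y, others)

    # Zero-order: plain Pearson correlation between x_j and y.
    zero_order = np.corrcoef(xj, y)[0, 1]
    # Partial: both x_j and y are adjusted for the other predictors.
    partial = np.corrcoef(xj_res, y_res)[0, 1]
    # Semipartial (part): only x_j is adjusted for the other predictors.
    semipartial = np.corrcoef(xj_res, y)[0, 1]
    # Tolerance = 1 - R^2 of x_j regressed on the other predictors.
    tolerance = (xj_res @ xj_res) / np.sum((xj - xj.mean()) ** 2)
    vif = 1.0 / tolerance

    return {
        "zero_order": zero_order,
        "partial": partial,
        "semipartial": semipartial,
        "tolerance": tolerance,
        "vif": vif,
    }


# Synthetic data (an assumption made for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
X[:, 1] += 0.5 * X[:, 0]  # introduce mild correlation between predictors
y = 1.0 + 2.0 * X[:, 0] - 1.0 * X[:, 2] + rng.normal(scale=0.5, size=50)

stats = correlation_summary(X, y, 0)
```

Under these definitions the semipartial correlation is never larger in magnitude than the partial correlation, and tolerance and VIF are reciprocals of each other.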

Parameters

Name Type Description Default
model statsmodels.regression.linear_model.RegressionResultsWrapper A fitted OLS model from statsmodels. required
X_encoded pd.DataFrame The DataFrame used to fit the model, including the 'const' column. required
y pd.Series Dependent variable used in fitting the model. required
vif_table pd.DataFrame A DataFrame with columns ["feature", "VIF"] containing one row for each column in X_encoded (typically computed with statsmodels.stats.outliers_influence.variance_inflation_factor). None

Returns

Name Type Description
pd.DataFrame A DataFrame with the columns "Variable", "Zero-Order r", "Partial r", "Semipartial r", "Tolerance", and "VIF".

Examples

>>> from spotpython.utils.stats import compute_coefficients_table
>>> import pandas as pd
>>> import statsmodels.api as sm
>>> data = pd.DataFrame({
...     'x1': [1, 2, 3, 4, 5],
...     'x2': [2, 4, 6, 8, 10],
...     'x3': [1, 3, 5, 7, 9]
... })
>>> y = pd.Series([1, 2, 3, 4, 5])
>>> X = sm.add_constant(data)
>>> model = sm.OLS(y, X).fit()
>>> vif_table = pd.DataFrame({
...     'feature': ['x1', 'x2', 'x3'],
...     'VIF': [1, 2, 3]
... })
>>> compute_coefficients_table(model, data, y, vif_table)
   Variable  Zero-Order r  Partial r  Semipartial r  Tolerance  VIF
0       x1           0.0        0.0            0.0        1.0  1.0
1       x2           0.0        0.0            0.0        0.5  2.0
2       x3           0.0        0.0            0.0        0.333333  3.0