utils.stats.compute_coefficients_table
utils.stats.compute_coefficients_table(model, X_encoded, y, vif_table=None)
Compute a coefficients table containing:
- Variable name
- Zero-order correlation
- Partial correlation
- Semipartial (part) correlation
- Tolerance (1 / VIF)
- VIF
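The quantities listed above can be illustrated with a minimal NumPy sketch. It is not the library's implementation: the helper names `residuals` and `correlations_for` and the synthetic data are assumptions made for illustration. It uses the standard textbook definitions, with partial and semipartial correlations obtained from residuals after regressing on the remaining predictors.

```python
import numpy as np


def residuals(target, predictors):
    """Residuals of an OLS fit of `target` on `predictors` plus an intercept."""
    design = np.column_stack([np.ones(len(target)), predictors])
    beta, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ beta


def correlations_for(X, y, j):
    """Zero-order r, partial r, semipartial r, tolerance and VIF for column j.

    Hypothetical helper for illustration only.
    """
    xj = X[:, j]
    others = np.delete(X, j, axis=1)

    xj_res = residuals(xj, others)  # part of x_j not explained by the other predictors
    y_res = residuals(y, others)    # part of y not explained by the other predictors

    zero_order = np.corrcoef(xj, y)[0, 1]
    partial = np.corrcoef(xj_res, y_res)[0, 1]   # both sides residualized
    semipartial = np.corrcoef(xj_res, y)[0, 1]   # only x_j residualized

    # Tolerance = 1 - R_j^2, where R_j^2 comes from regressing x_j on the others.
    tolerance = (xj_res @ xj_res) / np.sum((xj - xj.mean()) ** 2)
    vif = 1.0 / tolerance
    return zero_order, partial, semipartial, tolerance, vif


# Synthetic data (assumed for this sketch): three predictors, two of them correlated.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
X[:, 1] += 0.5 * X[:, 0]
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=50)

for j in range(X.shape[1]):
    r0, pr, sr, tol, vif = correlations_for(X, y, j)
    print(f"x{j + 1}: r={r0:.3f} partial={pr:.3f} semipartial={sr:.3f} "
          f"tolerance={tol:.3f} VIF={vif:.3f}")
```

Under these definitions the semipartial correlation is never larger in magnitude than the partial correlation, and tolerance and VIF are reciprocals of each other.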
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| model | statsmodels.regression.linear_model.RegressionResultsWrapper | A fitted OLS model from statsmodels. | required |
| X_encoded | pd.DataFrame | The DataFrame used to fit the model, including ‘const’. | required |
| y | pd.Series | Dependent variable used in fitting the model. | required |
| vif_table | pd.DataFrame | A DataFrame with columns [“feature”, “VIF”] containing one row for each column in X_encoded (typically built with statsmodels.stats.outliers_influence.variance_inflation_factor). | None |
Returns
| Name | Type | Description |
|---|---|---|
| | pd.DataFrame | A DataFrame with the columns “Variable”, “Zero-Order r”, “Partial r”, “Semipartial r”, “Tolerance”, and “VIF”. |
Examples
>>> from spotpython.utils.stats import compute_coefficients_table
>>> import pandas as pd
>>> import statsmodels.api as sm
>>> data = pd.DataFrame({
... 'x1': [1, 2, 3, 4, 5],
... 'x2': [2, 4, 6, 8, 10],
... 'x3': [1, 3, 5, 7, 9]
... })
>>> y = pd.Series([1, 2, 3, 4, 5])
>>> X = sm.add_constant(data)
>>> model = sm.OLS(y, X).fit()
>>> vif_table = pd.DataFrame({
... 'feature': ['x1', 'x2', 'x3'],
... 'VIF': [1, 2, 3]
... })
>>> compute_coefficients_table(model, data, y, vif_table)
Variable Zero-Order r Partial r Semipartial r Tolerance VIF
0 x1 0.0 0.0 0.0 1.0 1.0
1 x2 0.0 0.0 0.0 0.5 2.0
2 x3 0.0 0.0 0.0 0.333333 3.0
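The predictors in this example are exact linear functions of one another (x2 = 2·x1 and x3 = 2·x1 − 1), so the VIF values passed in by hand are illustrative rather than derived from the data. A quick rank check, sketched here with NumPy only, makes the collinearity visible:

```python
import numpy as np

# Same predictor values as in the example above.
x1 = np.array([1, 2, 3, 4, 5], dtype=float)
x2 = np.array([2, 4, 6, 8, 10], dtype=float)
x3 = np.array([1, 3, 5, 7, 9], dtype=float)

# Design matrix with an intercept column, as produced by sm.add_constant.
design = np.column_stack([np.ones_like(x1), x1, x2, x3])

# A full-rank design would have rank 4; exact collinearity lowers it.
rank = np.linalg.matrix_rank(design)
print(f"columns: {design.shape[1]}, rank: {rank}")
```

With real data, a rank below the number of columns signals that at least one VIF is unbounded and that the individual coefficients are not uniquely determined.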