utils.stats.plot_coeff_vs_pvals_by_included

utils.stats.plot_coeff_vs_pvals_by_included(
    data,
    xlabels=None,
    xlim=(0, 1),
    xlab='P value',
    ylim=None,
    ylab=None,
    yscale_log=False,
    title=None,
    grid=True,
    ncol=2,
    show=True,
    y_scaler=1.1,
)

Generates a panel of scatter plots showing the effect estimates from all possible models against their p-values. Uses a dictionary generated by the fit_all_lm function. Each plot shows the effect estimates from all models that include a specific variable.
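To make "all models that include a specific variable" concrete, here is a minimal sketch of the per-panel selection step. It assumes that the `variables` column describes each model as a "+"-joined string of covariate names (a convention borrowed from the R package's output, not confirmed for this function). The helper `models_including` and the sample values are hypothetical and only illustrate the idea; they are not part of the library.

```python
import pandas as pd


def models_including(estimate: pd.DataFrame, variable: str, sep: str = "+") -> pd.DataFrame:
    """Return the rows whose model specification contains `variable`.

    Hypothetical helper: assumes `estimate["variables"]` holds strings
    such as "AL+AM" (an assumption, not the documented format).
    """
    # Split each specification into covariate names and test exact membership,
    # so that "A" would not accidentally match "AL".
    mask = estimate["variables"].apply(
        lambda spec: variable in [v.strip() for v in str(spec).split(sep)]
    )
    return estimate[mask]


# Illustrative values only.
estimate = pd.DataFrame({
    "variables": ["Crude", "AL", "AM", "AL+AM"],
    "estimate": [0.5, 0.6, 0.7, 0.8],
    "p": [0.01, 0.02, 0.03, 0.04],
})
xlist = ["AL", "AM"]

# One subset per variable; each subset would feed one panel
# (p-value on the x-axis, effect estimate on the y-axis).
panels = {var: models_including(estimate, var) for var in xlist}
```

Exact token matching is used here instead of substring matching so that variable names sharing a prefix do not end up in each other's panels.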

Parameters

| Name | Type | Description | Default |
|------|------|-------------|---------|
| data | dict | A dictionary, generated by the fit_all_lm function, containing the following keys: estimate (pd.DataFrame), a DataFrame containing the estimates; xlist (list), a list of variables; fun (str), the function name; family (str), the family of the model. | required |
| xlabels | list | A list of x-axis labels. | None |
| xlim | tuple | The x-axis limits. | (0, 1) |
| xlab | str | The x-axis label. | 'P value' |
| ylim | tuple | The y-axis limits. | None |
| ylab | str | The y-axis label. | None |
| yscale_log | bool | Whether to use a log10 scale for the y-axis. | False |
| title | str | The title of the plot. | None |
| grid | bool | Whether to display gridlines. | True |
| ncol | int | Number of columns in the plot grid. | 2 |
| show | bool | Whether to display the plot. | True |
| y_scaler | float | A scaling factor for the y-axis limits. The default of 1.1 gives 10% more than the maximum value. | 1.1 |
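The `ncol` and `y_scaler` parameters can be read as simple arithmetic. The sketch below assumes that the number of grid rows is the panel count divided by `ncol`, rounded up, and that the upper y-limit is the largest value multiplied by `y_scaler` when `ylim` is not given. Both helper functions are hypothetical and may differ from what the library does internally.

```python
import math


def grid_shape(n_panels: int, ncol: int = 2) -> tuple:
    """Rows and columns needed to hold `n_panels` plots (assumed layout rule)."""
    nrow = math.ceil(n_panels / ncol)
    return nrow, ncol


def scaled_upper_limit(values, y_scaler: float = 1.1) -> float:
    """Upper y-limit as the maximum value times `y_scaler` (assumed rule)."""
    return max(values) * y_scaler


# Five variables with the default ncol=2 would need three rows.
shape = grid_shape(5, ncol=2)

# With the default scaler, the limit sits 10% above the largest estimate.
upper = scaled_upper_limit([0.5, 0.6, 0.9])
```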

Returns

Name Type Description
None None

Notes

  • Based on the R package ‘allestimates’ by Zhiqiang Wang, see https://cran.r-project.org/package=allestimates

References

Wang, Z. (2007). Two Postestimation Commands for Assessing Confounding Effects in Epidemiological Studies. The Stata Journal, 7(2), 183-196. https://doi.org/10.1177/1536867X0700700203

Examples

import pandas as pd

# Dictionary in the shape produced by fit_all_lm
data = {
    "estimate": pd.DataFrame({
        "variables": ["Crude", "AL", "AM", "AN", "AO"],
        "estimate": [0.5, 0.6, 0.7, 0.8, 0.9],
        "conf_low": [0.1, 0.2, 0.3, 0.4, 0.5],
        "conf_high": [0.9, 1.0, 1.1, 1.2, 1.3],
        "p": [0.01, 0.02, 0.03, 0.04, 0.05],
        "aic": [100, 200, 300, 400, 500],
        "n": [10, 20, 30, 40, 50],
    }),
    "xlist": ["AL", "AM", "AN", "AO"],
    "fun": "all_lm",
}
plot_coeff_vs_pvals_by_included(data)