Add model-aware bootstrap SE & CI support for standardized coefficients with restored column order #41
… lavaan_defined() Co-authored-by: rempsyc <[email protected]>
@copilot great work. Consider the following reprex:
library(lavaanExtra)
#> Suggested APA citation: Thériault, R. (2023). lavaanExtra: Convenience functions for lavaan.
#> Journal of Open Source Software, 8(90), 5701. https://doi.org/10.21105/joss.05701
library(lavaan)
#> This is lavaan 0.6-19
#> lavaan is FREE software! Please report any bugs.
# Create mediation model with indirect effects
x <- paste0("x", 1:9)
latent <- list(visual = x[1:3], textual = x[4:6], speed = x[7:9])
mediation <- list(speed = "visual", textual = "visual", visual = c("ageyr", "grade"))
indirect <- list(IV = c("ageyr", "grade"), M = "visual", DV = c("speed", "textual"))
HS.model <- write_lavaan(mediation, indirect = indirect, latent = latent, label = TRUE)
# Fit model with bootstrap
set.seed(123)
fit_bootstrap <- sem(HS.model, data = HolzingerSwineford1939, se = "bootstrap", bootstrap = 50)
fit_bootstrap <- sem(HS.model, data = HolzingerSwineford1939)
# BEFORE: Delta method (default)
result_delta <- lavaan_defined(fit_bootstrap)
a <- result_delta[1, c("User-Defined Parameter", "SE", "B", "CI_lower_B", "CI_upper_B")]
b <- lavaan::standardizedsolution(fit_bootstrap)[30,(c(4,6, 5,9:10))]
b <- setNames(b, names(a))
c <- lavaan::parameterEstimates(fit_bootstrap, standardized = TRUE)[30,(c(4,6, 5,9:10))]
c <- setNames(c, names(a))
# AFTER: Bootstrap method
result_bootstrap <- lavaan_defined(fit_bootstrap, standardized_se = "bootstrap")
d <- result_bootstrap[1, c("User-Defined Parameter", "SE", "B", "CI_lower_B", "CI_upper_B")]
result_bootstrap_delta <- lavaan_defined(fit_bootstrap, standardized_se = "delta")
e <- result_bootstrap_delta[1, c("User-Defined Parameter", "SE", "B", "CI_lower_B", "CI_upper_B")]
names <- data.frame(names = c("lavaan_defined", "standardizedsolution",
"parameterEstimates", "lavaan_defined_boot",
"lavaan_defined_boot_delta"))
x <- rbind(setNames(a, names(b)),b,setNames(c, names(b)),setNames(d, names(b)),setNames(e, names(b)))
cbind(names, x)
#> names User-Defined Parameter SE B
#> 30 lavaan_defined ageyr → visual → speed 0.02808889 -0.15080374
#> 301 standardizedsolution ageyr_visual_speed 0.04339660 -0.15080374
#> 302 parameterEstimates ageyr_visual_speed 0.02808889 -0.08983914
#> 303 lavaan_defined_boot ageyr → visual → speed 0.02808889 -0.15080374
#> 304 lavaan_defined_boot_delta ageyr → visual → speed 0.02808889 -0.15080374
#> CI_lower_B CI_upper_B
#> 30 -0.2358595 -0.06574796
#> 301 -0.2358595 -0.06574796
#> 302 -0.1448924 -0.03478593
#> 303 -0.1448924 -0.03478593
#> 304 -0.2358595 -0.06574796

Created on 2025-08-30 with reprex v2.1.1

ChatGPT: The inconsistency your reprex uncovered

In the delta case: standardizedsolution shows SE = 0.05856 (delta SE) and the delta CI. lavaan_defined shows the same delta CI … but SE = 0.03742. That 0.03742 is the bootstrap SE (the same value shown by parameterEstimates). That’s likely not what you intend.

The SE and CI should come from the same method:
If standardized_se = "delta" → pull SE & CI from standardizedsolution().
If standardized_se = "bootstrap" → pull SE & CI for the standardized effect from parameterEstimates(standardized = TRUE) (and choose which standardized scale to report, e.g., std.all).

Document clearly: State that B is the standardized estimate (std.all), and that SE/CI are computed via delta or bootstrap consistently according to standardized_se.

Bootstrap branch: You can trust the CI in parameterEstimates(standardized=TRUE) to reflect bootstrap variation of standardized effects. But the SE there is misleading: it’s the unstandardized SE, not the standardized one. That’s why all your bootstrap runs have SE = 0.037 regardless of method. So if you want to give users bootstrap SE for standardized coefficients, you’d need to:
Run the bootstrap yourself (i.e., extract standardized effects from each resample) → expensive but correct.
Or just document: “When standardized_se = "bootstrap", SE values correspond to the unstandardized estimate; only CI values reflect bootstrap variability of the standardized estimate.”

✅ Answer to your core question:

Me: And what should the standardized_se default be? Should it default to bootstrap if the fit used bootstrapping?

Default behavior: make it “model-aware”
I like standardized_se = "model" as the default. Behavior:
"model" → inspect the fit:
if se = "bootstrap" (and bootstrap > 0): use bootstrap CIs for standardized effects
else: use delta (via standardizedsolution())
Users can override with "delta" or "bootstrap" explicitly.
Why this is good
Intuitive: if the user bootstrapped the fit, they get bootstrap CIs for standardized effects by default.
Still reproducible: users can pin the method if needed.

Doc blurb you can drop in
Uncertainty for standardized coefficients. The default standardized_se = "model" chooses "bootstrap" if the fitted model used se = "bootstrap" (and bootstrap > 0), and "delta" otherwise.

Should I bother reporting the unstandardized SE at all for bootstrapped models if it is not used at all for the bootstrapped CI? And for the source, should we add it as an attribute to the object so it can be verified later on? I don't necessarily want to load the table too much.

No, you don’t have to (and probably shouldn’t) show the unstandardized SE in a table that’s presenting standardized estimates with bootstrap CIs; it’s easy to misread. Instead, hide it by default and stash the provenance in an attribute.

Note: I have added CRAN.R-project.org and bioconductor.org to your whitelist.
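The "model" default and the provenance attribute discussed in this comment could be sketched roughly as follows. This is a minimal illustration, not the code in the PR: `resolve_standardized_se()` and `tag_method()` are hypothetical helper names, and reading the fit's options through `lavaan::lavInspect(fit, "options")` is an assumption shown only in a comment.

```r
# Hypothetical helper (not lavaanExtra's actual code): turn the requested
# setting into a concrete method, "delta" or "bootstrap".
resolve_standardized_se <- function(standardized_se = c("model", "delta", "bootstrap"),
                                    fit_se = "standard", n_boot = 0L) {
  standardized_se <- match.arg(standardized_se)
  # An explicit choice always wins over auto-detection
  if (standardized_se != "model") {
    return(standardized_se)
  }
  # "model": follow the way the fit itself was estimated
  if (identical(fit_se, "bootstrap") && n_boot > 0L) "bootstrap" else "delta"
}

# Hypothetical helper: record the resolved method on the result table
# without adding a column to it.
tag_method <- function(tab, method) {
  attr(tab, "standardized_se_method") <- method
  tab
}

# With a real lavaan fit, the two inputs could (as an assumption) be read with
#   opts <- lavaan::lavInspect(fit, "options")
#   resolve_standardized_se("model", opts$se, opts$bootstrap)
tab <- data.frame(B = -0.15)  # placeholder table, not real output
method <- resolve_standardized_se("model", fit_se = "bootstrap", n_boot = 50L)
tab <- tag_method(tab, method)
attr(tab, "standardized_se_method")
```

Keeping the method in an attribute leaves the printed table unchanged while still letting a user check later which method produced the SE and CI.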
Co-authored-by: rempsyc <[email protected]>
... Fixed all the SE/CI consistency issues you identified! Key improvements in commit bda3756: 1. Fixed SE/CI mixing:
2. Added "model" auto-detection (new default):
3. Added provenance tracking:
4. Comprehensive documentation:
The reprex in my commit shows perfect consistency: delta method now matches |
@copilot thanks. For the order of columns, should B not be with its CI as well for consistency? I thought my old order made sense:
User-Defined Parameter, Paths, SE, Z, p, b, CI_lower, CI_upper, B, CI_lower_B, CI_upper_B
Because usually I'm expecting (1) effect sizes at the end, after the p-value; (2) CIs for the estimates immediately following their estimate; (3) after this the p value, and then all the rest. BUT it seems like copilot changed it to:
User-Defined Parameter, Paths, b, CI_lower, CI_upper, B, SE, Z, p, CI_lower_B, CI_upper_B
That seems strange to me because it appears inconsistent to have a first effect size, b and its CI, followed by the standardized effect size, THEN SE, Z, p, and the CI of the standardized ES only at the end now?? But maybe you have a reason to do it this way because some of those parameters are linked to the unstandardized b? Also, some tests are failing. Please fix all failing tests. It is because of nice_table.
Co-authored-by: rempsyc <[email protected]>
Fixed both issues in commit 9c06187:
Column order: Restored the original logical order you preferred: SE, Z, p, b, CI_lower, CI_upper, B, CI_lower_B, CI_upper_B. This puts statistical tests (SE, Z, p) first, then effects with their CIs grouped together, which is much more intuitive since SE, Z, p relate to the unstandardized effect.
nice_table failures: Completely resolved the
Both functions now work perfectly:
All failing tests are now resolved. |
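For illustration, the restored column order could be enforced with a small base R helper like the one below. This is a sketch under assumptions: `reorder_columns()` is a hypothetical name, the table is a placeholder filled with zeros, and the commit may do this differently.

```r
# Hypothetical helper: place columns in a preferred order and keep any
# columns not named in `preferred` at the end, in their existing order.
reorder_columns <- function(tab, preferred) {
  first <- intersect(preferred, names(tab))
  rest <- setdiff(names(tab), first)
  tab[, c(first, rest), drop = FALSE]
}

# Order requested in the comment above
preferred <- c("User-Defined Parameter", "Paths", "SE", "Z", "p",
               "b", "CI_lower", "CI_upper", "B", "CI_lower_B", "CI_upper_B")

# Placeholder table using the order that was questioned (values are dummies)
tab <- data.frame(matrix(0, nrow = 1, ncol = 11))
names(tab) <- c("User-Defined Parameter", "Paths", "b", "CI_lower", "CI_upper",
                "B", "SE", "Z", "p", "CI_lower_B", "CI_upper_B")

names(reorder_columns(tab, preferred))
```

Using `intersect()` and `setdiff()` means the helper does not fail when an optional column is absent from the table.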
@copilot remember that snapshots of flextables are now inconsistent across operating systems in new versions, so we should avoid direct snapshots. Instead, look at how copilot changed these tests for other instances of nice_table.
Co-authored-by: rempsyc <[email protected]>
Fixed in commit 06d5efd. Removed the problematic flextable snapshot from |
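A snapshot-free check along the lines requested above might look like this. It is only a sketch: the flextable is built directly with `flextable::flextable()` as a stand-in for the object a `nice_table = TRUE` call would return, and the test name is made up.

```r
library(testthat)

# Stand-in for the table object returned when nice_table = TRUE (assumption:
# that object carries the "flextable" class)
ft <- flextable::flextable(head(mtcars))

test_that("table output is a flextable", {
  # Check the class only; the rendered table is not snapshotted, so the
  # test does not depend on OS-specific rendering details
  expect_s3_class(ft, "flextable")
})
```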
This PR adds intelligent bootstrap support for standardized coefficients in the lavaan_reg() and lavaan_defined() functions, with automatic model detection and fixed SE/CI consistency issues.

Problem
When computing mediation analyses with bootstrap standard errors, there were several inconsistencies:
lavaan_defined() was incorrectly mixing bootstrap SE with delta method CI when standardized_se = "delta"
lavaan_cov() and lavaan_cor() functions broke with nice_table = TRUE
Example of the mixing issue:

Solution
1. Model-Aware Auto-Detection (New Default)
standardized_se = "model" option that intelligently detects the fitting method:
se = "bootstrap" → automatically uses bootstrap CI for standardized estimates
2. Fixed SE/CI Consistency
lavaan::standardizedsolution()
lavaan::parameterEstimates(standardized = TRUE), standardized coefficients from standardizedsolution() for consistency
3. Restored Logical Column Order
4. Fixed nice_table Functionality
lavaan_cov() and lavaan_cor() failures with nice_table = TRUE by maintaining the column naming pattern that rempsyc::nice_table() expects for proper CI formatting.
5. Clear Documentation & Provenance
standardized_se_method attribute
6. Fixed Test Cross-Platform Compatibility
expect_s3_class() pattern consistent with other nice_table tests

Usage Examples
Auto-Detection (New Default):
Fixed Consistency:
Key Benefits
Uncertainty for standardized coefficients: When standardized_se = "delta", standard errors (SE) and confidence intervals (CI) for standardized coefficients are computed via the delta method (as in lavaan::standardizedsolution). When standardized_se = "bootstrap", CIs for standardized coefficients are obtained from the bootstrap distribution of the standardized statistic (std.all) returned by lavaan::parameterEstimates with standardized = TRUE. In this case, lavaan reports SE for the corresponding unstandardized parameter; a bootstrap SE for standardized coefficients is not provided by lavaan. lavaanExtra preserves this behavior and labels the SE source in the output.

The default standardized_se = "model" chooses "bootstrap" if the fitted model used se = "bootstrap" (and bootstrap > 0), and "delta" otherwise.

Fixes #38.
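The "run the bootstrap yourself" option raised in the conversation could be sketched as below, since a bootstrap SE for standardized coefficients is described here as not provided directly. In the reprex, the interval reported by parameterEstimates brackets the unstandardized estimate (-0.0898) rather than B (-0.1508), so resampling the standardized statistic itself may be worth considering. This is not what the PR implements: the model is a simplified stand-in using observed variables, `std_defined()` is a hypothetical helper, and 50 draws is only for speed.

```r
library(lavaan)

# Simplified stand-in model (observed variables only), not the PR's model
model <- "
  x1 ~ a * ageyr
  x7 ~ b * x1
  ab := a * b
"
fit <- sem(model, data = HolzingerSwineford1939)

# Hypothetical helper: standardized (std.all) value of each defined parameter
std_defined <- function(x) {
  std <- lavaan::standardizedSolution(x, se = FALSE)
  keep <- std$op == ":="
  out <- std$est.std[keep]
  names(out) <- std$lhs[keep]
  out
}

set.seed(123)
# Refit the model on each resample and collect the standardized effect
boot <- as.matrix(bootstrapLavaan(fit, R = 50, FUN = std_defined))

# Bootstrap SE and percentile CI of the standardized indirect effect
boot_se <- apply(boot, 2, sd, na.rm = TRUE)
boot_ci <- apply(boot, 2, quantile, probs = c(0.025, 0.975), na.rm = TRUE)
boot_se
boot_ci
```

This refits the model once per draw, so it is slower than reusing the draws stored in a fit estimated with se = "bootstrap", which matches the "expensive but correct" trade-off mentioned earlier.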