Changelog
Source: NEWS.md
mixqr 0.2.0
Three method extensions built on the core EM, each closing a documented gap in the mixture-of-quantile-regressions toolkit.
- Expectile and M-quantile component-loss families (family = "expectile" / "mquantile"). Asymmetric-least-squares (Newey & Powell 1987) and asymmetric-Huber (Breckling & Chambers 1988) component losses, fitted by IRLS through new registry engines. Expectile components are crossing-free in the asymmetry level by construction; M-quantile dials between the quantile and the expectile.
- Component-specific penalized variable selection (mixqr_pen()). A SCAD / adaptive-LASSO / LASSO / MCP penalty on the weighted check-loss M-step (the quantile analogue of Khalili & Chen 2007), with each component getting its own sparse support, a mixture BIC path for tuning, and component pruning. The inner solve reuses rqPen; selectedVars() reports the active set per component.
- Joint multi-quantile estimation with shared classification and non-crossing (mixqr_nc()). Fits a vector of quantile levels jointly with one latent classification shared across all levels (a coupled E-step), closing the two problems Wu & Yao (2016, sec. 5) leave open: cross-level classification ambiguity and within-component crossing (repaired by monotone rearrangement, Chernozhukov, Fernandez-Val & Galichon 2010). sim_mixqr_cross() provides a crossing-exhibiting design for demonstrations.
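The rearrangement step named in the mixqr_nc() item can be illustrated in base R. This is a minimal sketch rather than the package's implementation: rearrange_quantiles() is a hypothetical helper, the n x L matrix layout (rows are observations, columns are increasing quantile levels) is an assumption, and the toy fitted lines are made up. For each observation it sorts the fitted quantiles so that they are non-decreasing in tau.

```r
# Hypothetical helper (not part of mixqr): monotone rearrangement of fitted
# quantiles, in the spirit of Chernozhukov, Fernandez-Val & Galichon (2010).
# `qhat` is an n x L matrix; row i holds the fitted quantiles for observation i,
# columns correspond to increasing quantile levels tau_1 < ... < tau_L.
# Assumes no missing values.
rearrange_quantiles <- function(qhat) {
  qhat <- as.matrix(qhat)
  if (ncol(qhat) < 2L) return(qhat)      # a single level cannot cross
  out <- t(apply(qhat, 1, sort))         # sort each row in increasing order
  dimnames(out) <- dimnames(qhat)        # columns keep their tau labels
  out
}

# Toy fitted lines for one component (intercepts and slopes are made up).
tau  <- c(0.25, 0.50, 0.75)
x    <- c(0, 1, 2, 3)
qhat <- cbind(1 + 1.0 * x, 2 + 0.5 * x, 3 + 0.0 * x)
colnames(qhat) <- paste0("tau=", tau)

fixed <- rearrange_quantiles(qhat)

crossed_before <- any(apply(qhat,  1, diff) < 0)  # TRUE: the lines cross at x = 3
crossed_after  <- any(apply(fixed, 1, diff) < 0)  # FALSE after rearrangement
```

Rows that were already ordered are returned unchanged, so the repair only touches the region where crossing occurs.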
mixqr 0.1.1
First CRAN release.
Exported two extension-API building blocks, weighted_rq() and constrained_kde(), so companion packages (location-varying gating, non-crossing) can reuse the component and error-density machinery without forking the core.
Post-review refinements (correctness and performance) addressing two independent adversarial peer reviews:
- Constraint integrity (R1). The constrained KDE now preserves the tau-quantile = 0 constraint in every feasible case: the two-constant Hall-Presnell weights are used only when non-negative and well-conditioned, otherwise a per-point empirical-likelihood tilt (Hall & Presnell 1999) enforces the constraint (verified for tau in {0.05, 0.5, 0.9, 0.95}). Genuinely infeasible (one-sided) components are flagged via fit$diagnostics$constraint and a warning, never silently mis-calibrated.
- Faithful Algorithm 3.1 (R2). The stochastic-EM P-step now draws the mixing probabilities (rejection-sampled, eq. 3.4) and the error density (bootstrap), not only the regression coefficients.
- Calibrated standard errors (R3). Sparsity SEs are disclosed as classification-conditional (in summary()); under-supported components and rank-deficient weighted designs now warn; se_method / se_conditional recorded.
- kdEM performance (R4). The E-step uses O(n) grid interpolation and the grid is built by a binned/FFT KDE (stats::density), removing the O(n^2) cost – kdEM is now ~3x ALD (was ~220x), meeting the speed target.
- Bounded separability diagnostic (R6). mi_fraction is now a bounded trace ratio in [0, 1] (previously could return ~1e14 on imbalanced clusters).
- New responsibility-based overlap diagnostic (fit$diagnostics$overlap), independent of the stochastic-EM path.
- Real data (R5). Ships the engine dataset (Brinkman 1981 ethanol-combustion data, the Wu & Yao Fig. 5 example); the README example now uses it. Added a golden test reproducing the Wu & Yao Table 1 simulation means.
- Selection rigor (R7). mixqr_select() gains criterion = "cv" (K-fold cross-validated held-out predictive log-likelihood) that PENALISES complexity and works for either engine; AIC/BIC selection now emits the mixture-boundary caveat and the ALD likelihood is labelled a working likelihood.
- Slope-based identifiability (R8). Default label ordering is now by slope (aligned with Wu & Yao Thm 2.1’s distinct-slope condition), and the distinctness guard uses a scale-relative threshold.
- Robustness / UX (R10). Rank-deficient (collinear) designs now error clearly; added confint.mixqr() (Wald intervals).
- Calibrated standard errors. The sparsity variance now reads f(0) off a kernel density estimate of the component residuals (Wu & Yao 2016, p.166) rather than the ALD working density. A Monte-Carlo benchmark (inst/benchmarks/se_coverage.R) shows variance = "stochEM" now achieves ~95% (near-nominal) coverage for the regression coefficients, up from ~67-77%; the mixing-probability intervals reflect the documented finite-sample pi-bias.
- Diagnostics & docs. New mixqr() help sections on the Wu & Yao sec.6 semiparametric bias and on standard-error validity; predict(type = "quantile_byclass"); component-collapse and ALD non-monotonicity warnings; fit$total_iter (total EM iterations across starts). Removed the dead package URL.
- Documentation site. A full pkgdown website with a comprehensive applied tutorial (“A Tutorial on Mixtures of Quantile Regressions”) featuring publication-ready ggplot2 visualizations, a get-started vignette, and a validation & diagnostics article. Added inst/CITATION, author/affiliation metadata, and documented every exported method.
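Two of the items above rest on the same device: a kernel density estimate computed once on a grid with stats::density() and then read off by interpolation, both at every residual (the O(n) E-step evaluation) and at zero (the f(0) used by the sparsity variance). The sketch below shows that idea in base R only. It is an illustration under stated assumptions, not the package's code: the residuals are simulated, the KDE is the plain unconstrained one (no tau-quantile = 0 weighting), and the names res, kde, f_hat, dens_at_res and f0 are hypothetical.

```r
set.seed(1)
res <- rnorm(500)   # stand-in for one component's residuals (simulated)

# Kernel density estimate on a regular grid of 512 points; stats::density()
# uses linear binning plus an FFT, so the cost does not grow as n^2.
kde <- stats::density(res, n = 512)

# Piecewise-linear interpolant of the gridded density. Outside the grid the
# estimate is taken to be 0 (an assumption of this sketch).
f_hat <- stats::approxfun(kde$x, kde$y, yleft = 0, yright = 0)

# One interpolation pass gives the density at every residual,
# as an E-step would need.
dens_at_res <- f_hat(res)

# The same interpolant gives the density at zero, the quantity a
# sparsity-type standard error reads off the residual KDE.
f0 <- f_hat(0)
```

Because the grid is built once per iteration and each lookup is a constant-time interpolation, evaluating the density at all n residuals avoids the pairwise kernel sums of a direct KDE evaluation.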
mixqr 0.1.0
First release. The frequentist EM substrate (sub-project 01 of the QMM suite).
- mixqr() fits finite mixtures of tau-quantile regressions with two engines: "ald" (fast parametric asymmetric-Laplace mixture, genuine likelihood + AIC/BIC) and "kdEM" (Wu & Yao 2016 kernel-density EM with nonparametric component error densities, unequal or pooled), via a generic pluggable EM driver mixqr_em().
- Constrained KDE error densities with two-constant Hall & Presnell (1999) weights enforcing the tau-th quantile = 0 exactly.
- Multi-start estimation; label-switching constraint and identifiability guard (Wu & Yao Thm 2.1).
- Variance: sparsity standard errors (eq. 3.3) and the stochastic-EM multiple-imputation estimator (Algorithm 3.1, V_W + (1 + 1/B) V_B) with a cluster-separability diagnostic.
- mixqr_select() for component-count selection (AIC/BIC).
- S3 methods: print, summary, coef, vcov, logLik, AIC, BIC, predict, plot, fitted, residuals, nobs.
- Simulation generators sim_mixqr2() / sim_mixqr3() reproducing the Wu & Yao 2- and 3-component designs.
- Extensible engine contract (register_mixqr_engine()) and reserved diagnostics$crossing / diagnostics$class_stability slots, the integration channel for QMM sub-projects 03 (gating) and 04 (non-crossing).
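The variance formula V_W + (1 + 1/B) V_B in the list above can be made concrete with a few lines of base R. This is a hedged sketch, not the package's estimator: it assumes V_W is the average of the B within-draw covariance matrices and V_B is the sample covariance of the B draws, as in the standard multiple-imputation combining rule. The helper combine_mi() and its inputs are hypothetical, and the draws are simulated.

```r
# Hypothetical helper (not part of mixqr): combine B stochastic-EM draws.
#   est   : B x p matrix, row b holds the coefficient estimate from draw b
#   vcovs : list of B within-draw covariance matrices, each p x p
combine_mi <- function(est, vcovs) {
  B   <- nrow(est)
  V_W <- Reduce(`+`, vcovs) / B    # average within-draw covariance
  V_B <- stats::cov(est)           # between-draw covariance (divisor B - 1)
  list(coef = colMeans(est),
       vcov = V_W + (1 + 1 / B) * V_B)
}

# Simulated inputs: B = 20 draws of p = 2 coefficients.
set.seed(42)
B <- 20
p <- 2
est   <- matrix(rnorm(B * p, mean = c(1, -0.5), sd = 0.1),
                nrow = B, byrow = TRUE)
vcovs <- replicate(B, diag(0.05^2, p), simplify = FALSE)

out <- combine_mi(est, vcovs)
se  <- sqrt(diag(out$vcov))        # combined standard errors
```

The (1 + 1/B) factor inflates the between-draw term to account for using a finite number of draws, so the combined variance is never smaller than the average within-draw variance.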
Note: this v0.1 is pure R. Rcpp acceleration of the KDE/E-step hot loops is planned.