mixqr 0.2.0

Three method extensions built on the core EM, each closing a documented gap in the mixture-of-quantile-regressions toolkit.

  • Expectile and M-quantile component-loss families (family = "expectile" / "mquantile"). Asymmetric-least-squares (Newey & Powell 1987) and asymmetric-Huber (Breckling & Chambers 1988) component losses, fitted by IRLS through new registry engines. Expectile components are crossing-free in the asymmetry level by construction; M-quantile components interpolate between the quantile and the expectile.
  • Component-specific penalized variable selection (mixqr_pen()). A SCAD / adaptive-LASSO / LASSO / MCP penalty on the weighted check-loss M-step (the quantile analogue of Khalili & Chen 2007), with each component getting its own sparse support, a mixture BIC path for tuning, and component pruning. The inner solve reuses rqPen; selectedVars() reports the active set per component.
  • Joint multi-quantile estimation with shared classification and non-crossing (mixqr_nc()). Fits a vector of quantile levels jointly with one latent classification shared across all levels (a coupled E-step), closing the two problems Wu & Yao (2016, sec. 5) leave open: cross-level classification ambiguity and within-component crossing (repaired by monotone rearrangement, Chernozhukov, Fernandez-Val & Galichon 2010). sim_mixqr_cross() provides a crossing-exhibiting design for demonstrations.
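
The rearrangement step cited in the last item can be illustrated in isolation. The sketch below uses base R only and does not call mixqr_nc(); the matrix `q` and the helper `rearrange_quantiles()` are hypothetical names introduced for illustration. It assumes the fitted quantiles of one component are stored with one row per observation and one column per quantile level (at least two levels, in increasing order), and repairs crossing by sorting each row, which is the discrete form of the monotone rearrangement of Chernozhukov, Fernandez-Val & Galichon (2010).

```r
# Hypothetical helper: sort each row so fitted quantiles are non-decreasing in tau.
rearrange_quantiles <- function(q) {
  # apply() stacks the per-row results column-wise, so transpose back
  # to the original n x length(taus) layout.
  out <- t(apply(q, 1, sort))
  dimnames(out) <- dimnames(q)
  out
}

taus <- c(0.25, 0.50, 0.75)

# Toy fitted quantiles for three observations; rows 2 and 3 cross.
q <- matrix(c(1.0, 2.0, 3.0,
              2.5, 2.1, 2.8,
              0.0, 0.5, 0.4),
            nrow = 3, byrow = TRUE,
            dimnames = list(NULL, paste0("tau=", taus)))

crossed <- apply(q, 1, is.unsorted)   # which rows violate monotonicity in tau
q_nc <- rearrange_quantiles(q)        # crossing-free version of q
```

Rows that are already monotone pass through unchanged, so the repair only alters observations where crossing actually occurs.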

mixqr 0.1.1

First CRAN release.

  • Exported two extension-API building blocks, weighted_rq() and constrained_kde(), so companion packages (location-varying gating, non-crossing) can reuse the component and error-density machinery without forking the core.

  • Post-review refinements (correctness and performance) addressing two independent adversarial peer reviews:

  • Constraint integrity (R1). The constrained KDE now preserves the tau-quantile = 0 constraint in every feasible case: the two-constant Hall-Presnell weights are used only when non-negative and well-conditioned, otherwise a per-point empirical-likelihood tilt (Hall & Presnell 1999) enforces the constraint (verified for tau in {0.05, 0.5, 0.9, 0.95}). Genuinely infeasible (one-sided) components are flagged via fit$diagnostics$constraint and a warning, never silently mis-calibrated.

  • Faithful Algorithm 3.1 (R2). The stochastic-EM P-step now draws the mixing probabilities (rejection-sampled, eq. 3.4) and the error density (bootstrap), not only the regression coefficients.

  • Calibrated standard errors (R3). Sparsity SEs are now disclosed as classification-conditional in summary(); under-supported components and rank-deficient weighted designs trigger warnings; se_method and se_conditional are recorded.

  • kdEM performance (R4). The E-step uses O(n) grid interpolation and the grid is built by a binned/FFT KDE (stats::density), removing the O(n^2) cost; kdEM now takes ~3x the run time of the ALD engine (previously ~220x), meeting the speed target.

  • Bounded separability diagnostic (R6). mi_fraction is now a bounded trace ratio in [0, 1] (previously could return ~1e14 on imbalanced clusters).

  • New responsibility-based overlap diagnostic (fit$diagnostics$overlap), independent of the stochastic-EM path.

  • Real data (R5). Ships the engine dataset (Brinkman 1981 ethanol-combustion data, the Wu & Yao Fig. 5 example); the README example now uses it. Added a golden test reproducing the Wu & Yao Table 1 simulation means.

  • Selection rigor (R7). mixqr_select() gains criterion = "cv" (K-fold cross-validated held-out predictive log-likelihood), which penalises complexity and works for either engine; AIC/BIC selection now emits the mixture-boundary caveat and the ALD likelihood is labelled a working likelihood.

  • Slope-based identifiability (R8). Default label ordering is now by slope (aligned with Wu & Yao Thm 2.1’s distinct-slope condition), and the distinctness guard uses a scale-relative threshold.

  • Robustness / UX (R10). Rank-deficient (collinear) designs now error clearly; added confint.mixqr() (Wald intervals).

  • Calibrated standard errors. The sparsity variance now reads f(0) off a kernel density estimate of the component residuals (Wu & Yao 2016, p.166) rather than the ALD working density. A Monte-Carlo benchmark (inst/benchmarks/se_coverage.R) shows variance = "stochEM" now achieves ~95% (near-nominal) coverage for the regression coefficients, up from ~67-77%; the mixing-probability intervals reflect the documented finite-sample pi-bias.

  • Diagnostics & docs. New mixqr() help sections on the Wu & Yao sec.6 semiparametric bias and on standard-error validity; predict(type = "quantile_byclass"); component-collapse and ALD non-monotonicity warnings; fit$total_iter (total EM iterations across starts). Removed the dead package URL.

  • Documentation site. A full pkgdown website with a comprehensive applied tutorial (“A Tutorial on Mixtures of Quantile Regressions”) featuring publication-ready ggplot2 visualizations, a get-started vignette, and a validation & diagnostics article. Added inst/CITATION, author/affiliation metadata, and documented every exported method.
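
The kdEM performance item (R4) above describes replacing direct kernel sums with a binned grid plus interpolation. The sketch below shows that idea using only stats::density() and stats::approx(); it is an illustration under stated assumptions, not the package's E-step. The residual vector `r`, the grid size, and the comparison against a direct O(n^2) Gaussian-kernel sum are all choices made for this example.

```r
set.seed(1)
r <- rnorm(2000)   # stand-in for one component's residuals (assumption)

# Binned/FFT kernel density estimate on a fixed grid of 1024 points.
d <- density(r, bw = "nrd0", n = 1024)

# O(n) evaluation at the data points by linear interpolation on that grid.
f_grid <- approx(d$x, d$y, xout = r)$y

# Direct O(n^2) Gaussian-kernel evaluation with the same bandwidth, for comparison.
f_direct <- vapply(
  r,
  function(x) mean(dnorm((x - r) / d$bw)) / d$bw,
  numeric(1)
)

# Largest pointwise disagreement between the two evaluations.
max_abs_diff <- max(abs(f_grid - f_direct))
```

With a grid this fine relative to the bandwidth, the two evaluations should agree closely, while the interpolated version avoids the n-by-n kernel matrix.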

mixqr 0.1.0

First release. The frequentist EM substrate (sub-project 01 of the QMM suite).

  • mixqr() fits finite mixtures of tau-quantile regressions with two engines: "ald" (fast parametric asymmetric-Laplace mixture, genuine likelihood + AIC/BIC) and "kdEM" (Wu & Yao 2016 kernel-density EM with nonparametric component error densities, unequal or pooled), via a generic pluggable EM driver mixqr_em().
  • Constrained KDE error densities with two-constant Hall & Presnell (1999) weights enforcing the tau-th quantile = 0 exactly.
  • Multi-start estimation; label-switching constraint and identifiability guard (Wu & Yao Thm 2.1).
  • Variance: sparsity standard errors (eq. 3.3) and the stochastic-EM multiple-imputation estimator (Algorithm 3.1, V_W + (1 + 1/B) V_B) with a cluster-separability diagnostic.
  • mixqr_select() for component-count selection (AIC/BIC).
  • S3 methods: print, summary, coef, vcov, logLik, AIC, BIC, predict, plot, fitted, residuals, nobs.
  • Simulation generators sim_mixqr2() / sim_mixqr3() reproducing the Wu & Yao 2- and 3-component designs.
  • Extensible engine contract (register_mixqr_engine()) and reserved diagnostics$crossing / diagnostics$class_stability slots — the integration channel for QMM sub-projects 03 (gating) and 04 (non-crossing).
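
The stochastic-EM variance in the list above combines a within-draw and a between-draw component as V_W + (1 + 1/B) V_B. The sketch below shows that combination rule in base R on simulated inputs. `combine_mi()` and its arguments are hypothetical names, not part of the mixqr API, and the draws here are random numbers rather than output of Algorithm 3.1.

```r
# Hypothetical helper implementing V_W + (1 + 1/B) * V_B.
#   est:       B x p matrix, one row of coefficient estimates per draw
#   vcov_list: list of B within-draw p x p covariance matrices
combine_mi <- function(est, vcov_list) {
  B <- nrow(est)
  stopifnot(B >= 2, length(vcov_list) == B)
  V_W <- Reduce(`+`, vcov_list) / B   # average within-draw covariance
  V_B <- stats::cov(est)              # between-draw covariance (divisor B - 1)
  list(estimate = colMeans(est),
       vcov = V_W + (1 + 1 / B) * V_B)
}

set.seed(42)
B <- 50
# Simulated per-draw estimates for two coefficients (illustrative values only).
est <- cbind(b0 = rnorm(B, 1, 0.1), b1 = rnorm(B, -2, 0.2))
# Simulated within-draw covariances, identical across draws for simplicity.
vcov_list <- replicate(B, diag(c(0.05, 0.08)^2), simplify = FALSE)

out <- combine_mi(est, vcov_list)
se <- sqrt(diag(out$vcov))            # combined standard errors
```

The (1 + 1/B) factor inflates the between-draw term to account for using a finite number of draws, so the combined standard errors are never smaller than the within-draw ones.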

Note: v0.1 is pure R. Rcpp acceleration of the KDE/E-step hot loops is planned.