
Component-specific penalized variable selection for the mixqr() model: each latent component gets its own sparse slope vector, so a covariate can be active in one quantile-regression cluster and dropped in another. The weighted check-loss M-step gains a SCAD, adaptive-LASSO, LASSO, or MCP penalty (the intercept is left unpenalised); the penalty strength lambda is selected by a mixture BIC over a path, and components whose mixing weight falls below pi_min are pruned.
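To make the penalised objective concrete, the sketch below evaluates a responsibility-weighted check loss for a single component and adds an L1 penalty on the slopes only. The names check_loss and pen_objective are hypothetical helpers, not package functions, and the scaling of the penalty is an assumption; the package's M-step may differ in detail.

```r
# Check (pinball) loss: rho_tau(u) = u * (tau - 1{u < 0})
check_loss <- function(u, tau) u * (tau - (u < 0))

# Hypothetical per-component objective: responsibility-weighted check loss
# plus an L1 penalty on the slopes. beta[1] is the intercept and is not penalised.
pen_objective <- function(beta, x, y, resp, tau, lambda) {
  res <- drop(y - cbind(1, x) %*% beta)
  sum(resp * check_loss(res, tau)) + lambda * sum(abs(beta[-1]))
}

set.seed(1)
x <- matrix(rnorm(20), 10, 2)
y <- 1 + 2 * x[, 1] + rnorm(10)
resp <- runif(10)  # stand-in responsibilities for one component (assumed)
pen_objective(c(1, 2, 0), x, y, resp, tau = 0.5, lambda = 0.1)
```

Swapping the L1 term for a SCAD or MCP penalty changes only the last summand; the weighted check-loss part stays the same.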

Usage

mixqr_pen(
  formula,
  data,
  tau = 0.5,
  m = 2L,
  penalty = c("SCAD", "aLASSO", "LASSO", "MCP"),
  lambda = NULL,
  nlambda = 40L,
  a = 3.7,
  alasso_gamma = 1,
  penalty_factor = NULL,
  pi_min = 0.02,
  nstart = 10L,
  control = mixqr_control(),
  weights = NULL
)

Arguments

formula, data, tau, m, nstart, control, weights

As in mixqr().

penalty

One of "SCAD" (default), "aLASSO", "LASSO", or "MCP". "aLASSO" is implemented as a LASSO with data-driven penalty factors 1 / |b_pilot|^gamma, where b_pilot is taken from a responsibility-weighted unpenalised pilot fit.
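The adaptive-LASSO weighting can be illustrated with a few lines of arithmetic. The pilot slopes below are made-up values for one component, used only to show how the factors 1 / |b_pilot|^gamma behave; how the package handles pilot slopes that are exactly zero is not covered here.

```r
# Hypothetical pilot slopes for one component (intercept excluded)
b_pilot <- c(x1 = 2.1, x2 = 0.05, x3 = -0.4)
alasso_gamma <- 1

# Penalty factors 1 / |b_pilot|^gamma: slopes that are small in the pilot
# fit receive a large factor and are therefore shrunk harder.
pf <- 1 / abs(b_pilot)^alasso_gamma
pf
```

Here x2, whose pilot slope is close to zero, gets a factor of 20, while x1 gets a factor below 1.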

lambda

Optional fixed penalty strength; if supplied, selection is skipped. If NULL, a path of nlambda values is built and lambda is chosen by the mixture BIC.
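The documentation does not say how the path is spaced. A common choice in penalised regression, assumed here purely for illustration, is a log-spaced grid running from a large value down to a small fraction of it; lambda_max and the ratio 1e-3 are hypothetical.

```r
nlambda <- 40L
lambda_max <- 1                   # hypothetical upper end of the path
lambda_min <- 1e-3 * lambda_max   # hypothetical lower end

# Log-spaced, decreasing grid of candidate penalty strengths
lambda_path <- exp(seq(log(lambda_max), log(lambda_min), length.out = nlambda))
range(lambda_path)
```

Each value on such a grid would be fitted in turn and scored, with the best-scoring lambda reported in the result.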

nlambda

Path length when lambda is NULL. Default 40.

a

Concavity parameter for the SCAD and MCP penalties. Default 3.7.
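For reference, the textbook forms of the two concave penalties can be written as below, with a playing the role of the concavity parameter. scad_pen and mcp_pen are hypothetical helper names, and the package may parameterise or scale the penalties differently.

```r
# SCAD penalty (Fan and Li, 2001), evaluated elementwise on |b|
scad_pen <- function(b, lambda, a = 3.7) {
  t <- abs(b)
  ifelse(t <= lambda,
    lambda * t,
    ifelse(t <= a * lambda,
      (2 * a * lambda * t - t^2 - lambda^2) / (2 * (a - 1)),
      lambda^2 * (a + 1) / 2))
}

# MCP penalty (Zhang, 2010), evaluated elementwise on |b|
mcp_pen <- function(b, lambda, a = 3.7) {
  t <- abs(b)
  ifelse(t <= a * lambda, lambda * t - t^2 / (2 * a), a * lambda^2 / 2)
}

b <- c(0, 0.5, 2, 10)
cbind(b, scad = scad_pen(b, lambda = 1), mcp = mcp_pen(b, lambda = 1))
```

Both penalties behave like the LASSO near zero and flatten out for large slopes, so strong effects are shrunk less than under a plain L1 penalty.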

alasso_gamma

Adaptive-LASSO weight exponent (default 1).

penalty_factor

Optional length-p vector of per-slope penalty multipliers; a factor of 0 leaves that covariate unpenalised. NULL penalises all slopes equally.

pi_min

Components with mixing weight below this are pruned. Default 0.02.
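The pruning rule amounts to a threshold on the estimated mixing weights. The sketch below uses made-up weights and assumes the surviving weights are renormalised to sum to one, which the documentation does not state explicitly.

```r
pi_hat <- c(0.61, 0.38, 0.01)  # hypothetical estimated mixing weights
pi_min <- 0.02

# Keep components at or above the threshold, then renormalise (assumed)
keep <- pi_hat >= pi_min
pi_pruned <- pi_hat[keep] / sum(pi_hat[keep])
pi_pruned
```

With these values the third component is dropped and the fit continues with two components.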

Value

A "mixqr" object with an extra $selection slot (active covariates per component, chosen lambda, the BIC path, and df).

References

Khalili, A. and Chen, J. (2007). Variable selection in finite mixture of regression models. JASA 102, 1025–1038.

Sherwood, B., Li, S. and Maidman, A. (2025). rqPen. R Journal.

Examples

# \donttest{
set.seed(1)
n <- 250; p <- 8; x <- matrix(rnorm(n * p), n)
colnames(x) <- paste0("x", 1:p)
z <- rbinom(n, 1, 0.5)
y <- ifelse(z == 0, 1 + 2 * x[, 1], -1 + 2 * x[, 2]) + rnorm(n)
d <- data.frame(y = y, x)
fit <- mixqr_pen(y ~ ., data = d, tau = 0.5, m = 2, penalty = "SCAD")
selectedVars(fit)
#> $comp1
#> [1] "x2"
#> 
#> $comp2
#> [1] "x1"
#> 
# }