Predict iVarPro Gradients on New Data

Calculates case-specific iVarPro gradients on a new feature matrix by reusing the rule-level gradients stored in a previously computed iVarPro object and recomputing rule membership for the new cases.

# S3 method for class 'ivarpro'
predict(object,
  newdata = NULL,
  model = NULL,
  noise.na = NULL,
  path.store.membership = FALSE,
  save.data = TRUE,
  ...)

Arguments

object: An object returned by ivarpro (a numeric data.frame for univariate outcomes, or a list of such data.frames for multivariate outcomes). The object must contain the "ivarpro.path" attribute with stored rule metadata and rule-level gradients.
newdata: Optional data.frame of predictors for which to compute case-specific gradients. If supplied, gradients are computed for newdata using full membership (all eligible trees). If NULL (default), the function attempts a restore prediction for the training data, using OOB membership so that the returned gradients match the original ivarpro() output when possible.
model: Optional model override. If NULL (default), the model is taken from attr(object, "model"). The stored model can be either a varpro object or a randomForestSRC rfsrc grow object.
noise.na: Logical controlling how cells with no usable contributing rules are handled. If NULL (default), inherits the setting stored in attr(object, "ivarpro.path")$noise.na. If TRUE, such cells are set to NA; if FALSE, they are set to zero.
path.store.membership: Logical. If TRUE, store the rule membership indices (case IDs) for the prediction in attr(out, "ivarpro.path")$oobMembership. This enables ladder-based bands in plot.ivarpro and summaries via ivarpro_band() for the predicted gradients, at the cost of additional memory. Default is FALSE.
save.data: Logical. If TRUE (default) and newdata is supplied, save the predictor data used for prediction as attr(out, "data"), enabling downstream plotting functions (e.g., plot.ivarpro) to retrieve data automatically.
...: Additional arguments passed to randomForestSRC::predict.rfsrc().

Details

A previously computed iVarPro object contains (i) a set of retained VarPro rules identified by tree and node/branch IDs, (ii) a release-variable index for each rule, and (iii) an estimated rule-level gradient for each rule. For new cases, the function obtains terminal node membership in each tree using randomForestSRC::predict.rfsrc(membership = TRUE) and determines which stored rules apply to each case by matching the case's terminal node to the stored rule node/branch ID for that tree. The case-specific gradient for a given variable is then computed by aggregating the stored rule-level gradients over all rules that apply to the case and whose release variable is that variable.
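
The matching and aggregation steps described above can be illustrated with a small base R sketch. This is a toy illustration, not the package's implementation: the objects rules, membership, and out are hypothetical stand-ins for the stored rule metadata and the terminal node membership, and averaging is assumed as the aggregation rule purely for the sake of the example. The last step mimics the documented effect of noise.na = FALSE, where cells with no contributing rules are set to zero instead of NA.

```r
## Toy illustration only: these objects are hypothetical and are not
## part of the varpro API.

## Stored rules: one row per retained rule
rules <- data.frame(
  tree     = c(1, 1, 2, 2),              # tree the rule belongs to
  node     = c(3, 5, 2, 4),              # terminal node ID tied to the rule
  xvar     = c("x1", "x2", "x1", "x1"),  # release variable of the rule
  gradient = c(0.5, -1.0, 1.5, 2.0)      # stored rule-level gradient
)

## Terminal node membership of 3 new cases (rows) in 2 trees (columns)
membership <- cbind(tree1 = c(3, 5, 3), tree2 = c(2, 4, 9))

xvars <- c("x1", "x2")
out <- matrix(NA_real_, nrow(membership), length(xvars),
              dimnames = list(NULL, xvars))

for (i in seq_len(nrow(membership))) {
  ## a rule applies when case i lands in the rule's node in the rule's tree
  hit <- membership[i, rules$tree] == rules$node
  for (v in xvars) {
    g <- rules$gradient[hit & rules$xvar == v]
    ## ASSUMPTION: aggregate by averaging; cells with no rules stay NA
    if (length(g) > 0) out[i, v] <- mean(g)
  }
}
out

## Analogue of noise.na = FALSE: empty cells become zero instead of NA
out.zero <- out
out.zero[is.na(out.zero)] <- 0
out.zero
```

In this toy setup the third case falls into node 9 of the second tree, which no stored rule refers to, so only the rule from the first tree contributes to its x1 gradient.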

When newdata is not supplied, predict() attempts to return OOB case-specific gradients for the original training data. If the iVarPro object was created with membership information stored in its path (path.store.membership = TRUE, the default for ivarpro()), the restore prediction reproduces the original iVarPro matrix.

If save.data = TRUE and newdata is supplied, attr(out, "data") contains the predictor data used for prediction, so downstream functions such as plot.ivarpro can be called directly on the returned object. To obtain ladder bands for predicted gradients, set path.store.membership = TRUE when predicting.
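
As a hedged sketch of the two options just described, the following predicts on new data with membership stored and then inspects the attributes named above. It assumes the varpro and mlbench packages are installed and is skipped otherwise; make.friedman is a hypothetical helper introduced only for this illustration, and the sample sizes and ntree value are arbitrary small choices.

```r
## Sketch only: skipped when varpro or mlbench is not installed.
if (requireNamespace("varpro", quietly = TRUE) &&
    requireNamespace("mlbench", quietly = TRUE)) {

  library(varpro)
  set.seed(1)

  ## hypothetical helper: simulate Friedman #1 data with columns x1, x2, ...
  make.friedman <- function(n) {
    d <- mlbench::mlbench.friedman1(n, sd = 1)
    x <- data.frame(d$x)
    colnames(x) <- paste0("x", seq_len(ncol(x)))
    list(x = x, y = d$y)
  }
  tr <- make.friedman(250)
  te <- make.friedman(100)

  ## fit VarPro and iVarPro on the training data
  vp  <- varpro(y ~ ., data.frame(tr$x, y = tr$y), ntree = 50)
  imp <- ivarpro(vp)

  ## predict on new data, keeping membership and predictor data
  p.band <- predict(imp, newdata = te$x,
                    path.store.membership = TRUE,
                    save.data = TRUE)

  ## predictor data saved for downstream plotting
  dim(attr(p.band, "data"))

  ## prediction-time membership should now be present in the path
  is.null(attr(p.band, "ivarpro.path")$oobMembership)
}
```

Storing membership increases the memory footprint of the returned object, so it is likely worth enabling only when bands or band summaries are actually needed.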

Value

Returns an object with the same structure as object:

For univariate outcomes, a numeric data.frame of dimension $n_{test} \times p$ containing predicted case-specific gradients.
For multivariate outcomes, a named list of such data.frames, one per outcome coordinate.

The returned object includes an "ivarpro.path" attribute containing rule metadata and (optionally) prediction-time membership indices.

Examples

# \donttest{

## ------------------------------------------------------------
##
## Restore mode example (training OOB gradients are reproduced)
##
## ------------------------------------------------------------

data(peakVO2, package = "randomForestSRC")

## Fit VarPro and iVarPro
vp  <- varpro(Surv(ttodead, died) ~ ., peakVO2, ntree = 50)
imp <- ivarpro(vp)

## Restore prediction: should match the original iVarPro matrix
p.restore <- predict(imp)
all.equal(p.restore, imp, check.attributes = FALSE)

## Downstream plotting using the restored gradients
plot(p.restore, var = "peak.vo2",
     col.var = "interval", size.var = "y",
     data = attr(imp, "data"))


## ------------------------------------------------------------
##
## Synthetic example (Friedman #1) with prediction on new data
##
## ------------------------------------------------------------

if (requireNamespace("mlbench", quietly = TRUE)) {

  set.seed(123)

  ## training data
  tr <- mlbench::mlbench.friedman1(500, sd = 1)
  train <- data.frame(tr$x, y = tr$y)
  colnames(train)[1:ncol(tr$x)] <- paste0("x", 1:ncol(tr$x))

  ## ivarpro fit on training data
  vp <- varpro(y ~ ., train, ntree = 100)
  imp <- ivarpro(vp)

  ## test data
  te <- mlbench::mlbench.friedman1(1e4, sd = 1)
  test <- data.frame(te$x)
  colnames(test) <- paste0("x", 1:ncol(te$x))

  ## predicted gradients on test data
  p.test <- predict(imp, newdata = test)

  ## partial plot directly on predicted object
  plot(p.test, var = "x1", col.var = "x2")


  ## push x1 outside the original Friedman support [0, 1]
  ## a heuristic way to test out-of-distribution (OOD)
  test.ood <- test
  test.ood$x1 <- runif(nrow(test), min = -0.5, max = 1.5)

  ## predicted gradients on OOD test data
  p.test.ood <- predict(imp, newdata = test.ood)

  ## partial plot showing support for x1
  plot(p.test.ood, var = "x1", col.var = "x2",
       x.dist = c("density", "rug"))
  ## reference lines for the original training support of x1
  abline(v = c(0, 1), lty = 2)

}

# }
