Rank Features by Discriminative Power — rankFeatures • PhiSpace

Ranks features by their ability to discriminate between groups or to predict a continuous response, using a supervised learning method (PLS-DA, PLS, or DWD). Based on PhiSpace's correlatePhiSpace, but generalized to work with any numeric data rather than PhiSpace scores only.

rankFeatures(
  data,
  response,
  method = c("PLSDA", "PLS", "DWD"),
  source = c("reducedDim", "assay"),
  assay_name = "logcounts",
  reducedDim_name = "PhiSpace",
  ncomp = NULL,
  center = TRUE,
  scale = FALSE,
  dwd_params = list(),
  seed = NULL
)

Arguments

data: A matrix or data.frame where rows are observations and columns are features. Can also be a SpatialExperiment, SingleCellExperiment, or SummarizedExperiment object.
response: Either a character string naming a column in colData (when data is an experiment object), or a vector of response values with one entry per observation. For classification (PLSDA, DWD), the response should be a factor or character vector. For regression (PLS), it should be numeric.
method: Character string specifying the method: "PLSDA" for classification, "PLS" for regression, or "DWD" for binary classification. Default is "PLSDA".
source: Character string specifying data source when data is an experiment object: "assay" or "reducedDim". Default is "reducedDim".
assay_name: Character string specifying which assay to use when source = "assay". Default is "logcounts".
reducedDim_name: Character string specifying which reduced dimension to use when source = "reducedDim". Default is "PhiSpace".
ncomp: Integer specifying the number of components. If NULL (default), it is set automatically: to the number of classes for PLSDA, or to min(10, ncol(data)/2) for PLS. Not used for DWD.
center: Logical indicating whether to center features. Default is TRUE.
scale: Logical indicating whether to scale features. Default is FALSE.
dwd_params: List of parameters for DWD method (see Details).
seed: Integer seed for reproducibility. Default is NULL.
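
The default rule for ncomp described above can be sketched in base R. This is only an illustration of the documented rule, not the package's internal code: the helper name default_ncomp is hypothetical, and the documentation does not say whether ncol(data)/2 is rounded, so the sketch applies the formula literally.

```r
# Hypothetical helper mirroring the documented ncomp default.
# Not part of PhiSpace; the name and signature are assumptions.
default_ncomp <- function(X, response, method = c("PLSDA", "PLS", "DWD")) {
  method <- match.arg(method)
  switch(method,
    # PLSDA: one component per class in the response
    PLSDA = nlevels(factor(response)),
    # PLS: at most 10 components, and no more than half the feature count
    PLS = min(10, ncol(X) / 2),
    # DWD: ncomp is not used
    DWD = NULL
  )
}

X <- matrix(0, nrow = 6, ncol = 8)
default_ncomp(X, c("a", "b", "c", "a", "b", "c"), "PLSDA")
default_ncomp(X, rnorm(6), "PLS")
```

Passing an explicit integer to ncomp bypasses this default entirely.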

Value

A list with class "FeatureRanking" containing:

method: The method used
importance_scores: Matrix of feature importance scores
scores: Component or discriminant scores for observations
model: The fitted model object
feature_ranking: Data frame with features ranked by importance
response_summary: Summary of the response variable
parameters: List of parameters used

Details

Methods:

PLSDA: Partial Least Squares Discriminant Analysis for multi-class classification. Features ranked by coefficient magnitude.
PLS: Partial Least Squares regression for continuous response. Features ranked by coefficient magnitude.
DWD: Distance Weighted Discrimination for binary classification. Features ranked by discriminant weights.
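
To make "ranked by coefficient magnitude" concrete, here is a minimal base-R sketch that fits a one-component PLS regression by hand on simulated data and sorts features by the absolute value of their coefficients. The data, variable names, and the single-component fit are assumptions for illustration; rankFeatures may compute and aggregate coefficients differently.

```r
set.seed(1)
n <- 200
p <- 5
# Simulated data: rows are observations, columns are features
X <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("feat", 1:p)))
# The response depends strongly on feat2 and weakly on feat4
y <- 5 * X[, 2] - X[, 4] + rnorm(n, sd = 0.1)

# Center features and response (center = TRUE, scale = FALSE)
Xc <- scale(X, center = TRUE, scale = FALSE)
yc <- y - mean(y)

# One-component PLS: weight vector, component scores, y-loading
w <- drop(crossprod(Xc, yc))
w <- w / sqrt(sum(w^2))
t1 <- drop(Xc %*% w)
q <- sum(t1 * yc) / sum(t1^2)

# Regression coefficients on the original features
beta <- w * q

# Rank features by absolute coefficient, largest first
ranking <- data.frame(feature = names(beta), importance = abs(beta))
ranking <- ranking[order(-ranking$importance), ]
head(ranking)
```

Because features are centered but not scaled here, coefficient magnitudes depend on each feature's units; setting scale = TRUE in rankFeatures is the documented way to put features on a common scale first.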

DWD Parameters (in dwd_params list):

kernel: Kernel function (default: vanilladot())
qval: Exponent parameter q of the generalized DWD loss (default: 1)
lambda: Regularization parameter, given as a single value or a sequence of candidate values (default: auto-tuned)
cv_folds: Cross-validation folds (default: 5)
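
A dwd_params list might be assembled as in the sketch below. It assumes that entries left out of the list fall back to the defaults above; the merge with modifyList only illustrates that behaviour and is not the package's own code. The kernel entry is omitted so the sketch runs without kernlab, which provides vanilladot().

```r
# Documented defaults, restated here for illustration only
dwd_defaults <- list(qval = 1, cv_folds = 5)

# User overrides: a different exponent and an explicit lambda grid
user_params <- list(
  qval = 2,
  lambda = 10^seq(-3, 3, length.out = 20)
)

# Entries in user_params replace or extend the defaults
dwd_params <- modifyList(dwd_defaults, user_params)
str(dwd_params)
```

The resulting list would then be passed as rankFeatures(..., method = "DWD", dwd_params = dwd_params).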

Examples

if (FALSE) { # \dontrun{
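library(PhiSpace)
library(SummarizedExperiment)  # provides assay()

# Assumes `spe` is a SpatialExperiment with a "logcounts" assay and
# colData columns `cluster`, `spatial_score`, and `treatment`.

# With matrix input - PLSDA (transpose so that rows are observations)
expr_mat <- t(assay(spe, "logcounts"))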
result <- rankFeatures(expr_mat, response = spe$cluster, method = "PLSDA")
head(result$feature_ranking, 10)

# With experiment object - from assay
result <- rankFeatures(spe, response = "cluster", method = "PLSDA",
                       source = "assay", assay_name = "logcounts")

# From reduced dimensions (e.g., PhiSpace scores)
result <- rankFeatures(spe, response = "cluster", method = "PLSDA",
                       source = "reducedDim", reducedDim_name = "PhiSpace")

# PLS for continuous response
result <- rankFeatures(expr_mat, response = spe$spatial_score, method = "PLS")

# DWD for binary classification (response must have exactly two classes)
result <- rankFeatures(expr_mat, response = spe$treatment, method = "DWD")
} # }