Ranks features by their ability to discriminate between groups or to predict a continuous response, using a supervised learning method (PLS-DA, PLS, or DWD). Based on PhiSpace's correlatePhiSpace, but generalized to work with any numeric data, not only PhiSpace scores.

rankFeatures(
  data,
  response,
  method = c("PLSDA", "PLS", "DWD"),
  source = c("reducedDim", "assay"),
  assay_name = "logcounts",
  reducedDim_name = "PhiSpace",
  ncomp = NULL,
  center = TRUE,
  scale = FALSE,
  dwd_params = list(),
  seed = NULL
)

Arguments

data

A matrix or data.frame where rows are observations and columns are features. Can also be a SpatialExperiment, SingleCellExperiment, or SummarizedExperiment object.

response

Either a character string specifying a column name in colData (when data is an experiment object), or a vector of response values with one entry per observation. For classification (PLSDA, DWD), it should be a factor or character vector; DWD requires exactly two classes. For regression (PLS), it should be numeric.

method

Character string specifying the method: "PLSDA" for classification, "PLS" for regression, or "DWD" for binary classification. Default is "PLSDA".

source

Character string specifying data source when data is an experiment object: "assay" or "reducedDim". Default is "reducedDim".

assay_name

Character string specifying which assay to use when source = "assay". Default is "logcounts".

reducedDim_name

Character string specifying which reduced dimension to use when source = "reducedDim". Default is "PhiSpace".

ncomp

Integer specifying the number of components. If NULL (default), it is set automatically: the number of classes for PLSDA, or min(10, ncol(data)/2) for PLS. Not used for DWD.

center

Logical indicating whether to center features. Default is TRUE.

scale

Logical indicating whether to scale features. Default is FALSE.

dwd_params

Named list of parameters for the DWD method (see Details).

seed

Integer seed for reproducibility. Default is NULL.

Value

A list with class "FeatureRanking" containing:

method

The method used

importance_scores

Matrix of feature importance scores

scores

Component or discriminant scores for observations

model

The fitted model object

feature_ranking

Data frame with features ranked by importance

response_summary

Summary of the response variable

parameters

List of parameters used

Details

Methods:

  • PLSDA: Partial Least Squares Discriminant Analysis for multi-class classification. Features ranked by coefficient magnitude.

  • PLS: Partial Least Squares regression for continuous response. Features ranked by coefficient magnitude.

  • DWD: Distance Weighted Discrimination for binary classification. Features ranked by discriminant weights.
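
To illustrate what "ranked by coefficient magnitude" means, here is a minimal base-R sketch of a one-component PLS regression on simulated data. It is not the implementation used by rankFeatures; the data, the feature names f1 to f5, and the single-component simplification are all assumptions made for illustration.

```r
# Simulated data: 50 observations x 5 features (hypothetical names f1..f5).
set.seed(1)
X <- matrix(rnorm(50 * 5), nrow = 50,
            dimnames = list(NULL, paste0("f", 1:5)))
# The response is built to depend strongly on f2 and weakly on f4.
y <- 3 * X[, "f2"] - 1 * X[, "f4"] + rnorm(50, sd = 0.1)

# Center features and response (mirrors center = TRUE, scale = FALSE).
Xc <- scale(X, center = TRUE, scale = FALSE)
yc <- y - mean(y)

# One-component PLS1: weight vector, component scores, coefficients.
w <- crossprod(Xc, yc)                      # p x 1, proportional to cov(feature, y)
w <- w / sqrt(sum(w^2))                     # unit-norm weight vector
t_scores <- Xc %*% w                        # n x 1 component scores
q <- sum(t_scores * yc) / sum(t_scores^2)   # regression of y on the scores
beta <- drop(w * q)                         # coefficients for the centered features

# Rank features by absolute coefficient, largest first.
ranking <- data.frame(feature = colnames(X),
                      coefficient = beta,
                      importance = abs(beta))
ranking <- ranking[order(-ranking$importance), ]
rownames(ranking) <- NULL
print(ranking)
```

With more components or a multi-class response the coefficients form a matrix rather than a vector, so some per-feature summary of that matrix is needed before ranking; how rankFeatures does this is not specified here.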

DWD Parameters (in dwd_params list):

  • kernel: Kernel function (default: vanilladot())

  • qval: Exponent q of the generalized DWD loss; not a statistical q-value (default: 1)

  • lambda: Regularization parameter or sequence (default: auto-tuned)

  • cv_folds: Cross-validation folds (default: 5)
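
The parameters above are passed as a named list. The sketch below shows one way this might look on simulated two-group data. The data, the gene names, and the chosen values are hypothetical, and it is an assumption that entries left out of the list (here kernel and lambda) fall back to their defaults. The call is guarded so the snippet also runs in a session where rankFeatures is not loaded.

```r
# Simulated two-group data (hypothetical): 40 observations x 6 features.
set.seed(42)
expr_mat <- matrix(rnorm(40 * 6), nrow = 40,
                   dimnames = list(NULL, paste0("gene", 1:6)))
treatment <- factor(rep(c("control", "treated"), each = 20))

# Override two of the documented DWD settings. Assumption: unspecified
# entries (kernel, lambda) keep their defaults.
dwd_params <- list(qval = 1, cv_folds = 3)

# Only call rankFeatures() if it is available in the current session.
if (exists("rankFeatures", mode = "function")) {
  result <- rankFeatures(expr_mat, response = treatment, method = "DWD",
                         dwd_params = dwd_params, seed = 1)
  head(result$feature_ranking)
}
```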

Examples

if (FALSE) { # \dontrun{
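# Assumes `spe` is an existing SpatialExperiment with a "logcounts" assay and
# colData columns `cluster`, `spatial_score`, and `treatment`.
library(SummarizedExperiment)  # provides assay()

# With matrix input - PLSDA (transpose so that rows are observations)
expr_mat <- t(assay(spe, "logcounts"))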
result <- rankFeatures(expr_mat, response = spe$cluster, method = "PLSDA")
head(result$feature_ranking, 10)

# With experiment object - from assay
result <- rankFeatures(spe, response = "cluster", method = "PLSDA",
                       source = "assay", assay_name = "logcounts")

# From reduced dimensions (e.g., PhiSpace scores)
result <- rankFeatures(spe, response = "cluster", method = "PLSDA",
                       source = "reducedDim", reducedDim_name = "PhiSpace")

# PLS for continuous response
result <- rankFeatures(expr_mat, response = spe$spatial_score, method = "PLS")

# DWD for binary classification (response must have exactly two classes)
result <- rankFeatures(expr_mat, response = spe$treatment, method = "DWD")
} # }