PhiSpace annotates a query dataset, or a list of query datasets, given one or more annotated bulk or single-cell RNA-seq references. PhiSpace can simultaneously model multiple layers of cell phenotypes, e.g. cell type and disease condition.

PhiSpace(
  reference,
  query,
  phenotypes = NULL,
  response = NULL,
  PhiSpaceAssay = "rank",
  regMethod = c("PLS", "PCA"),
  ncomp = NULL,
  nfeat = NULL,
  selectedFeat = NULL,
  center = TRUE,
  scale = FALSE,
  DRinfo = FALSE,
  storeUnNorm = FALSE,
  updateRef = FALSE,
  assay2rank = NULL
)

Arguments

reference

The references. A SingleCellExperiment (SCE) object or a list of SCE objects. Each must contain an assay named by PhiSpaceAssay.

query

The queries. An SCE object or a list of SCE objects. Each must contain an assay named by PhiSpaceAssay.

phenotypes

Which phenotypes (e.g. "cell type") to predict. If NULL, response must be specified.

response

Named matrix. Rows correspond to cells (columns) in reference; columns correspond to phenotypes. If not NULL, response overrides phenotypes.

PhiSpaceAssay

Character. Name of the assay used to train the model.

regMethod

Character. Regression method: one of "PLS" and "PCA".

ncomp

Integer. Number of components. If NULL, the default is used, i.e. the total number of phenotypes.

nfeat

Integer. Number of features to choose to predict each phenotype. See details.

selectedFeat

Character. A vector of pre-selected features, provided as an alternative to nfeat. See details.

center

Logical. Whether to perform centering. See details.

scale

Logical. Whether to perform scaling. See details.

DRinfo

Logical. Whether to return dimension reduction information from PCA or PLS. Disabled by default to save memory.

storeUnNorm

Logical. Whether to store unnormalised raw PhiSpace scores. Default is FALSE.

updateRef

Logical. Whether to update the reference, i.e. store reference PhiSpace scores in the reference SCE object.

assay2rank

Which assay should be used for rank transform. If not specified, "rank" will be used.
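
The response argument described above is a named matrix with one row per reference cell and one column per phenotype. The following is a minimal base R sketch of how such a matrix might be assembled from two layers of labels. The toy labels, the helper indicator() and the 0/1 coding are assumptions made for illustration; this page does not state which numeric coding PhiSpace expects, so check the package before using such a matrix in practice.

```r
# Hypothetical toy annotations for six reference cells (not from PhiSpace).
cellType  <- c("T cell", "B cell", "T cell", "NK cell", "B cell", "T cell")
condition <- c("healthy", "healthy", "disease", "disease", "healthy", "disease")
cellNames <- paste0("cell", seq_along(cellType))

# Hypothetical helper: one 0/1 indicator column per level of a label vector.
# The 0/1 coding is an assumption, not PhiSpace's documented coding.
indicator <- function(labels) {
  lv <- sort(unique(labels))
  out <- outer(labels, lv, FUN = "==") * 1
  colnames(out) <- lv
  out
}

# Rows are cells, columns are phenotypes from both layers.
response <- cbind(indicator(cellType), indicator(condition))
rownames(response) <- cellNames

print(response)
```

Each row of the resulting matrix has one entry set per phenotype layer, which is how several layers (here cell type and condition) can sit side by side in a single response.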

Value

  • If updateRef = FALSE (default): An updated query SCE object with PhiSpace annotation results stored in reducedDim slot "PhiSpace";

  • If updateRef = TRUE: A list of updated reference and query SCE objects with PhiSpace annotation results stored in reducedDim slot "PhiSpace".

Details

  • Parameter tuning. By default, PhiSpace sets ncomp equal to the total number of phenotypes. For example, if the reference defines 20 cell types and 5 sample sources (e.g. in vivo and in vitro), ncomp will be set to 25. By default, PhiSpace does not perform any feature selection. However, feature selection is straightforward in PhiSpace: the partial least squares (PLS) model automatically ranks all features according to their contribution to predicting each phenotype. For example, if there are 25 phenotypes, PLS will return 25 rankings of all features. Based on these rankings, customised feature selection can be done in two ways. The user can either specify nfeat, i.e. the number of features (or markers) used to predict each phenotype, or specify selectedFeat, i.e. a subset of features used to predict all phenotypes (e.g. highly variable genes). If nfeat is provided, selectedFeat is automatically defined as the union of the nfeat markers of all phenotypes.

  • Center and scale. By default, PhiSpace only centers the data and does not scale it. This is because we often use more features than standard single-cell pipelines (e.g. Seurat selects 2,000 highly variable genes) for better prediction. As a result, we might include genes with small variance, and scaling would greatly inflate the expression levels of these genes.
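
The two points above can be illustrated with a small base R sketch that does not call PhiSpace itself. The first part mimics how selectedFeat could be formed as the union of the top nfeat features per phenotype; the score matrix impScores is random and hypothetical, and ranking by absolute score is an assumption rather than the documented PhiSpace criterion. The second part shows why scaling inflates a nearly constant gene, using two made-up genes.

```r
set.seed(1)

# Part 1: union of the top nfeat features per phenotype.
# Hypothetical importance scores: 10 features (rows) by 3 phenotypes (columns).
# In PhiSpace these would come from the fitted model; here they are random.
impScores <- matrix(rnorm(30), nrow = 10,
                    dimnames = list(paste0("gene", 1:10), c("A", "B", "C")))
nfeat <- 2

# Assumption: rank features by absolute score within each phenotype.
topPerPhenotype <- lapply(colnames(impScores), function(ph) {
  ord <- order(abs(impScores[, ph]), decreasing = TRUE)
  rownames(impScores)[ord[seq_len(nfeat)]]
})
names(topPerPhenotype) <- colnames(impScores)

# The selected feature set is the union over all phenotypes.
selectedFeat <- unique(unlist(topPerPhenotype))
print(selectedFeat)

# Part 2: effect of scaling on a low-variance gene.
# Two hypothetical genes: one highly variable, one nearly constant.
expr <- cbind(highVar = c(0, 5, 10, 15, 20),
              lowVar  = c(1.00, 1.01, 0.99, 1.02, 0.98))

centred <- scale(expr, center = TRUE, scale = FALSE)
scaled  <- scale(expr, center = TRUE, scale = TRUE)

# After centering only, the nearly constant gene keeps its small spread.
sdCentred <- apply(centred, 2, sd)
# After scaling, both genes have unit standard deviation, so the tiny
# fluctuations of the nearly constant gene are blown up to the same size.
sdScaled <- apply(scaled, 2, sd)

print(sdCentred)
print(sdScaled)
```

Because the per-phenotype top lists can overlap, the union contains between nfeat and nfeat times the number of phenotypes features.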

References

Mao, J., Deng, Y. and Lê Cao, K.-A. (2024). Φ-Space: Continuous phenotyping of single-cell multi-omics data. bioRxiv.