PhiSpace.Rd
PhiSpace annotates a query dataset, or a list of query datasets, given one or more annotated bulk or single-cell RNA-seq references. PhiSpace can simultaneously model multiple layers of cell phenotypes, e.g. cell type and disease condition.
PhiSpace(
  reference,
  query,
  phenotypes = NULL,
  response = NULL,
  PhiSpaceAssay = "rank",
  regMethod = c("PLS", "PCA"),
  ncomp = NULL,
  nfeat = NULL,
  selectedFeat = NULL,
  center = TRUE,
  scale = FALSE,
  DRinfo = FALSE,
  storeUnNorm = FALSE,
  updateRef = FALSE,
  assay2rank = NULL
)
reference: The references. A SingleCellExperiment (SCE) object or a list of SCE objects. Each must contain an assay named by PhiSpaceAssay.
query: The queries. An SCE object or a list of SCE objects. Each must contain an assay named by PhiSpaceAssay.
phenotypes: Which phenotypes (e.g. "cell type") to predict. If NULL, response must be specified.
response: Named matrix. Rows correspond to cells (i.e. the columns of reference); columns correspond to phenotypes. If not NULL, it overrides phenotypes.
PhiSpaceAssay: Character. Which assay to use for training.
regMethod: Character. Regression method: either "PLS" or "PCA".
ncomp: Integer. Number of components. If NULL, the default is used, i.e. the total number of phenotypes.
nfeat: Integer. Number of features selected to predict each phenotype. See details.
selectedFeat: Character. Alternatively, a vector of pre-selected features.
center: Logical. Whether to perform centering. See details.
scale: Logical. Whether to perform scaling. See details.
DRinfo: Logical. Whether to return dimension reduction information from PCA or PLS. Disabled by default to save memory.
storeUnNorm: Logical. Whether to store the unnormalised raw PhiSpace scores. Default is FALSE.
updateRef: Logical. Whether to update the reference, i.e. store the reference PhiSpace scores in the reference SCE object.
assay2rank: Which assay should be used for rank transform. If not specified, "rank" will be used.
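The response argument is described above as a named matrix with one row per reference cell and one column per phenotype. The following base R sketch shows one way such a matrix could be assembled from two phenotype layers. The toy labels, the helper oneHot() and the 0/1 indicator coding are assumptions made for illustration; the coding PhiSpace uses internally may differ.

```r
# Toy annotation for six reference cells (hypothetical labels).
cellAnno <- data.frame(
  cellType  = c("T cell", "B cell", "T cell", "NK cell", "B cell", "NK cell"),
  condition = c("healthy", "healthy", "disease", "disease", "healthy", "disease"),
  row.names = paste0("cell", 1:6)
)

# Hypothetical helper: one 0/1 indicator column per level of a label vector.
oneHot <- function(labels) {
  lv <- sort(unique(labels))
  sapply(lv, function(l) as.numeric(labels == l))
}

# Bind the indicator columns of both layers: rows are cells, columns are phenotypes.
response <- cbind(oneHot(cellAnno$cellType), oneHot(cellAnno$condition))
rownames(response) <- rownames(cellAnno)

dim(response)
colnames(response)
```

A matrix of this shape could then be supplied as response instead of naming annotation columns through phenotypes.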
If updateRef = FALSE (default): an updated query SCE object with PhiSpace annotation results stored in reducedDim slot "PhiSpace".
If updateRef = TRUE: a list of updated reference and query SCE objects with PhiSpace annotation results stored in reducedDim slot "PhiSpace".
Parameter tuning
By default, PhiSpace sets ncomp to the total number of phenotypes. For example, if the reference defines 20 cell types and 5 sample sources (e.g. in vivo and in vitro), ncomp is set to 25.
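The default for ncomp can be checked by counting the distinct labels in each phenotype layer. This is a minimal base R sketch on a simulated annotation table; the column names cellType and sampleSource are hypothetical, and the code only mirrors the counting rule described above rather than PhiSpace's own implementation.

```r
# Simulated annotation for 100 reference cells:
# 20 cell types and 5 sample sources (hypothetical labels).
anno <- data.frame(
  cellType     = rep(paste0("cellType", 1:20), times = 5),
  sampleSource = rep(paste0("source", 1:5), each = 20)
)

# One component per phenotype, summed over all phenotype layers.
nPhenotypes <- vapply(anno, function(x) length(unique(x)), integer(1))
ncompDefault <- sum(nPhenotypes)

nPhenotypes
ncompDefault  # 20 + 5 = 25
```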
By default, PhiSpace does not perform any feature selection. However, feature selection is straightforward in PhiSpace. The partial least squares (PLS) model automatically ranks all features according to their contribution to predicting each phenotype. For example, if there are 25 phenotypes, then PLS returns 25 rankings of all features. Based on these rankings, customised feature selection can be done in two ways. The user can either specify nfeat, i.e. the number of features (or markers) used to predict each phenotype, or specify selectedFeat, i.e. a subset of features used to predict all phenotypes (e.g. highly variable genes). If nfeat is provided, then selectedFeat is automatically defined as the union of the nfeat markers selected for each phenotype.
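The union step behind nfeat can be sketched in base R as follows. The importance scores are simulated, and ranking features by absolute score is an assumption made for illustration; in PhiSpace the rankings come from the fitted PLS model.

```r
set.seed(1)

# Simulated feature-by-phenotype importance scores (hypothetical values).
nFeatures <- 50
phenotypes <- c("T cell", "B cell", "NK cell")
impScores <- matrix(
  rnorm(nFeatures * length(phenotypes)),
  nrow = nFeatures,
  dimnames = list(paste0("gene", seq_len(nFeatures)), phenotypes)
)

nfeat <- 5

# For each phenotype, keep the nfeat features with the largest absolute score.
topPerPhenotype <- lapply(phenotypes, function(p) {
  ord <- order(abs(impScores[, p]), decreasing = TRUE)
  rownames(impScores)[ord[seq_len(nfeat)]]
})
names(topPerPhenotype) <- phenotypes

# selectedFeat is the union of the per-phenotype marker sets.
selectedFeat <- Reduce(union, topPerPhenotype)
length(selectedFeat)
```

Because markers can be shared between phenotypes, the union contains at least nfeat and at most nfeat times the number of phenotypes features.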
Center and scale
By default, PhiSpace only centers the data and does not scale it. This is because we often use more features than standard single-cell pipelines (e.g. Seurat selects 2,000 highly variable genes) for better prediction. As a result, genes with small variance may be included, and scaling would greatly inflate the expression levels of these genes.
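The effect of scaling on low-variance genes can be seen in a small simulation. This is a base R sketch using scale() on a toy cells-by-genes matrix with made-up values; it illustrates the reasoning above and is not PhiSpace's internal preprocessing.

```r
set.seed(1)
nCells <- 200

# Two simulated genes (columns): one informative, one nearly constant.
expr <- cbind(
  highVar = rnorm(nCells, mean = 5, sd = 2),
  lowVar  = rnorm(nCells, mean = 5, sd = 0.01)
)

# Centering only: each gene keeps its original spread.
centred <- scale(expr, center = TRUE, scale = FALSE)

# Centering and scaling: every gene is forced to unit standard deviation,
# so the tiny fluctuations of lowVar are blown up to the size of highVar.
scaled <- scale(expr, center = TRUE, scale = TRUE)

apply(centred, 2, sd)
apply(scaled, 2, sd)
```

After centering only, lowVar contributes almost nothing to the model, whereas after scaling it carries the same weight as highVar.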
Mao, J., Deng, Y. and Lê Cao, K.-A. (2024). Φ-Space: Continuous phenotyping of single-cell multi-omics data. bioRxiv.