Run t-SNE embedding with Rtsne method — getTSNE • singleCellTK

The t-distributed Stochastic Neighbor Embedding (t-SNE) algorithm is commonly used for 2D visualization of single-cell data. This function wraps the Rtsne function from the Rtsne package.

With this function, users can create a tSNE embedding directly from a raw count matrix; the necessary preprocessing (normalization, scaling, and dimension reduction) is automated. We still recommend using a PCA as input: the result will then be consistent with clustering based on the same PCA, and the computation will be much faster.

getTSNE(
  inSCE,
  useAssay = "logcounts",
  useReducedDim = NULL,
  useAltExp = NULL,
  reducedDimName = "TSNE",
  logNorm = FALSE,
  useFeatureSubset = NULL,
  nTop = 2000,
  center = TRUE,
  scale = TRUE,
  pca = TRUE,
  partialPCA = FALSE,
  initialDims = 25,
  theta = 0.5,
  perplexity = 30,
  nIterations = 1000,
  numThreads = 1,
  seed = NULL
)

Arguments

inSCE: Input SingleCellExperiment object.
useAssay: Assay to use for tSNE computation. If useAltExp is specified, useAssay has to exist in assays(altExp(inSCE, useAltExp)). Default "logcounts".
useReducedDim: The low-dimensional representation to use for tSNE computation. Default NULL.
useAltExp: The subset to use for tSNE computation, usually the one holding the selected variable features. Default NULL.
reducedDimName: Name under which to store the dimension reduction result. Default "TSNE".
logNorm: Whether the counts need to be log-normalized, via scaterlogNormCounts, before generating the tSNE. Ignored when using useReducedDim. Default FALSE.
useFeatureSubset: Subset of features to use for dimension reduction. Either a character string naming a rowData variable that stores the logical vector of HVG selection, or a vector that can subset the rows of inSCE. Default NULL.
nTop: Automatically detect this number of variable features to use for dimension reduction. Ignored when using useReducedDim or using useFeatureSubset. Default 2000.
center: Whether data should be centered before PCA is applied. Ignored when using useReducedDim. Default TRUE.
scale: Whether data should be scaled before PCA is applied. Ignored when using useReducedDim. Default TRUE.
pca: Whether an initial PCA step should be performed. Ignored when using useReducedDim. Default TRUE.
partialPCA: Whether truncated PCA should be used to calculate principal components (requires the irlba package). This is faster for large input matrices. Ignored when using useReducedDim. Default FALSE.
initialDims: Number of dimensions from PCA to use as input in tSNE. Default 25.
theta: Numeric value for speed/accuracy trade-off (increase for less accuracy), set to 0.0 for exact TSNE. Default 0.5.
perplexity: Perplexity parameter. Must satisfy 3 * perplexity < ncol(inSCE) - 1. Default 30. See Rtsne details for interpretation.
nIterations: Maximum number of iterations. Default 1000.
numThreads: Integer, number of threads to use with OpenMP. 0 corresponds to using all available cores. Default 1.
seed: Random seed for reproducibility of tSNE results. The default, NULL, uses the global seed currently set in the R environment.
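
The perplexity constraint above can be checked before calling getTSNE, which is useful for small datasets where the default of 30 may be too large. The following is a minimal base-R sketch; the helper names isValidPerplexity and maxPerplexity are hypothetical and are not part of singleCellTK or Rtsne.

```r
# Hypothetical helpers (not part of singleCellTK or Rtsne), written only to
# illustrate the documented constraint: 3 * perplexity < nCells - 1.

# TRUE when the given perplexity satisfies the constraint for nCells cells.
isValidPerplexity <- function(perplexity, nCells) {
  3 * perplexity < nCells - 1
}

# Largest whole-number perplexity that satisfies the constraint.
# 3 * p < nCells - 1 with integer p is equivalent to p <= (nCells - 2) / 3.
maxPerplexity <- function(nCells) {
  floor((nCells - 2) / 3)
}

isValidPerplexity(30, 92)  # 90 < 91, so the default is allowed
isValidPerplexity(30, 91)  # 90 < 90 is false, so the default is too large
maxPerplexity(100)         # 32, since 96 < 99 but 99 < 99 is false
```

In practice, nCells would be ncol(inSCE), and a smaller perplexity could be passed to getTSNE when the check fails.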

Value

A SingleCellExperiment object with tSNE computation updated in reducedDim(inSCE, reducedDimName).
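
As a rough illustration of where the result lives, the sketch below stores and retrieves a matrix in a reducedDim slot using the SingleCellExperiment package. The toy counts and the stand-in coordinate matrix are made up for the example and are not output of getTSNE; the column names are arbitrary assumptions.

```r
library(SingleCellExperiment)

# Toy object: 5 genes x 4 cells of made-up counts (illustration only).
counts <- matrix(1:20, nrow = 5,
                 dimnames = list(paste0("gene", 1:5), paste0("cell", 1:4)))
sce <- SingleCellExperiment(assays = list(counts = counts))

# Stand-in for an embedding: one row per cell, two columns.
# These values are NOT computed by t-SNE; they only mimic the shape.
standIn <- matrix(seq_len(8), ncol = 2,
                  dimnames = list(colnames(sce), c("dim1", "dim2")))
reducedDim(sce, "TSNE") <- standIn

# Retrieve the stored coordinates by the same name used for reducedDimName.
embedding <- reducedDim(sce, "TSNE")
dim(embedding)
reducedDimNames(sce)
```

After a real call to getTSNE, the same reducedDim(sce, "TSNE") accessor would return the computed coordinates, with one row per cell.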

Examples

data(scExample, package = "singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
# Run from raw counts
sce <- getTSNE(inSCE = sce, useAssay = "counts", logNorm = TRUE, nTop = 2000,
               scale = TRUE, pca = TRUE)
#> Tue Jun 28 22:03:13 2022 ... Computing Rtsne.
if (FALSE) {
# Run from PCA
sce <- scaterlogNormCounts(sce, "logcounts")
sce <- runModelGeneVar(sce)
sce <- scaterPCA(sce, useAssay = "logcounts",
                 useFeatureSubset = "HVG_modelGeneVar2000", scale = TRUE)
sce <- getTSNE(sce, useReducedDim = "PCA")
}