Uniform Manifold Approximation and Projection (UMAP) algorithm is commonly for 2D visualization of single-cell data. These functions wrap the scater calculateUMAP function.

Users can use runQuickUMAP to directly create UMAP embedding from raw count matrix, with necessary preprocessing including normalization, variable feature selection, scaling, dimension reduction all automated. Therefore, useReducedDim is disabled for runQuickUMAP.

In a complete analysis, we still recommend having dimension reduction such as PCA created beforehand and select proper numbers of dimensions for using runUMAP, so that the result can match with the clustering based on the same input PCA.

runUMAP(
  inSCE,
  useReducedDim = "PCA",
  useAssay = NULL,
  useAltExp = NULL,
  sample = NULL,
  reducedDimName = "UMAP",
  logNorm = TRUE,
  useFeatureSubset = NULL,
  nTop = 2000,
  scale = TRUE,
  pca = TRUE,
  initialDims = 25,
  nNeighbors = 30,
  nIterations = 200,
  alpha = 1,
  minDist = 0.01,
  spread = 1,
  seed = NULL,
  verbose = TRUE,
  BPPARAM = SerialParam()
)

runQuickUMAP(inSCE, useAssay = "counts", sample = "sample", ...)

getUMAP(
  inSCE,
  useReducedDim = "PCA",
  useAssay = NULL,
  useAltExp = NULL,
  sample = NULL,
  reducedDimName = "UMAP",
  logNorm = TRUE,
  useFeatureSubset = NULL,
  nTop = 2000,
  scale = TRUE,
  pca = TRUE,
  initialDims = 25,
  nNeighbors = 30,
  nIterations = 200,
  alpha = 1,
  minDist = 0.01,
  spread = 1,
  seed = NULL,
  BPPARAM = SerialParam()
)

Arguments

inSCE

Input SingleCellExperiment object.

useReducedDim

The low dimension representation to use for UMAP computation. If useAltExp is specified, useReducedDim has to exist in reducedDims(altExp(inSCE, useAltExp)). Default "PCA".

useAssay

Assay to use for UMAP computation. If useAltExp is specified, useAssay has to exist in assays(altExp(inSCE, useAltExp)). Ignored when using useReducedDim. Default NULL.

useAltExp

The subset to use for UMAP computation, usually for the selected variable features. Default NULL.

sample

Character vector. Indicates which sample each cell belongs to. If given a single character, will take the annotation from colData. Default NULL.

reducedDimName

A name to store the results of the UMAP embedding coordinates obtained from this method. Default "UMAP".

logNorm

Whether the counts will need to be log-normalized prior to generating the UMAP via scaterlogNormCounts. Ignored when using useReducedDim. Default TRUE.

useFeatureSubset

Subset of feature to use for dimension reduction. A character string indicating a rowData variable that stores the logical vector of HVG selection, or a vector that can subset the rows of inSCE. Default NULL.

nTop

Automatically detect this number of variable features to use for dimension reduction. Ignored when using useReducedDim or using useFeatureSubset. Default 2000.

scale

Whether useAssay matrix will need to be standardized. Default TRUE.

pca

Logical. Whether to perform dimension reduction with PCA before UMAP. Ignored when using useReducedDim. Default TRUE.

initialDims

Number of dimensions from PCA to use as input in UMAP. Default 25.

nNeighbors

The size of local neighborhood used for manifold approximation. Larger values result in more global views of the manifold, while smaller values result in more local data being preserved. Default 30. See calculateUMAP for more information.

nIterations

The number of iterations performed during layout optimization. Default is 200.

alpha

The initial value of "learning rate" of layout optimization. Default is 1.

minDist

The effective minimum distance between embedded points. Smaller values will result in a more clustered/clumped embedding where nearby points on the manifold are drawn closer together, while larger values will result on a more even dispersal of points. Default 0.01. See calculateUMAP for more information.

spread

The effective scale of embedded points. In combination with minDist, this determines how clustered/clumped the embedded points are. Default 1. See calculateUMAP for more information.

seed

Random seed for reproducibility of UMAP results. Default NULL will use global seed in use by the R environment.

verbose

Logical. Whether to print log messages. Default TRUE.

BPPARAM

A BiocParallelParam object specifying whether the PCA should be parallelized.

...

Parameters passed to runUMAP

Value

A SingleCellExperiment object with UMAP computation updated in reducedDim(inSCE, reducedDimName).

Examples

data(scExample, package = "singleCellTK")
sce <- subsetSCECols(sce, colData = "type != 'EmptyDroplet'")
# Run from raw counts
sce <- runQuickUMAP(sce)
#> Sat Mar 18 10:31:36 2023 ... Computing Scater UMAP for sample 'pbmc_4k'.
#> Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
#> Also defined by ‘spam’
#> Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
#> Also defined by ‘spam’
#> Found more than one class "dist" in cache; using the first, from namespace 'BiocGenerics'
#> Also defined by ‘spam’
plotDimRed(sce, "UMAP")