Uniform Manifold Approximation and Projection (UMAP) dimension reduction for celda sce object

Embeds cells in two dimensions using UMAP based on a celda model. For celda_C sce objects, PCA on the normalized counts is used to reduce the number of features before applying UMAP. For celda_CG sce objects, UMAP is run on module probabilities instead of PCA components, which likewise reduces the number of features. Module probabilities are square-root transformed before applying UMAP.

celdaUmap(
  sce,
  useAssay = "counts",
  altExpName = "featureSubset",
  maxCells = NULL,
  minClusterSize = 100,
  modules = NULL,
  seed = 12345,
  nNeighbors = 30,
  minDist = 0.75,
  spread = 1,
  pca = TRUE,
  initialDims = 50,
  normalize = "proportion",
  scaleFactor = NULL,
  transformationFun = sqrt,
  cores = 1,
  ...
)

# S4 method for SingleCellExperiment
celdaUmap(
  sce,
  useAssay = "counts",
  altExpName = "featureSubset",
  maxCells = NULL,
  minClusterSize = 100,
  modules = NULL,
  seed = 12345,
  nNeighbors = 30,
  minDist = 0.75,
  spread = 1,
  pca = TRUE,
  initialDims = 50,
  normalize = "proportion",
  scaleFactor = NULL,
  transformationFun = sqrt,
  cores = 1,
  ...
)

Arguments

sce	A SingleCellExperiment object returned by celda_C, celda_G, or celda_CG.
useAssay	A string specifying which assay slot to use. Default "counts".
altExpName	The name for the altExp slot to use. Default "featureSubset".
maxCells	Integer. Maximum number of cells to plot. Cells will be randomly subsampled if `ncol(sce) > maxCells`. Larger numbers of cells require more memory. If NULL, no subsampling will be performed. Default NULL.
minClusterSize	Integer. Do not subsample cell clusters below this threshold. Default 100.
modules	Integer vector. Determines which feature modules to use for UMAP. If NULL, all modules will be used. Default NULL.
seed	Integer. Passed to with_seed. For reproducibility, a default value of 12345 is used. If NULL, no calls to with_seed are made.
nNeighbors	The size of local neighborhood used for manifold approximation. Larger values result in more global views of the manifold, while smaller values result in more local data being preserved. Default 30. See umap for more information.
minDist	The effective minimum distance between embedded points. Smaller values will result in a more clustered/clumped embedding where nearby points on the manifold are drawn closer together, while larger values will result in a more even dispersal of points. Default 0.75. See umap for more information.
spread	The effective scale of embedded points. In combination with `minDist`, this determines how clustered/clumped the embedded points are. Default 1. See umap for more information.
pca	Logical. Whether to perform dimensionality reduction with PCA before UMAP. Only applies to celda_C `sce` objects. Default TRUE.
initialDims	Integer. Number of dimensions from PCA to use as input in UMAP. Default 50. Only works for celda_C `sce` objects.
normalize	Character. Passed to normalizeCounts in the normalization step. Divides counts by the library sizes for each cell. One of 'proportion', 'cpm', 'median', or 'mean'. 'proportion' uses the total counts for each cell as the library size. 'cpm' divides the library size of each cell by one million to produce counts per million. 'median' divides the library size of each cell by the median library size across all cells. 'mean' divides the library size of each cell by the mean library size across all cells. Default "proportion".
scaleFactor	Numeric. Sets the scale factor for cell-level normalization. Each cell is multiplied by this scale factor after its library size has been adjusted according to `normalize`. Default `NULL`, which means no scale factor is applied.
transformationFun	Function. Applies a transformation such as `sqrt`, `log`, `log2`, `log10`, or `log1p`. If `NULL`, no transformation will be applied. Occurs after applying normalization and scale factor. Default `sqrt`.
cores	Number of threads to use. Default 1.
...	Additional parameters to pass to umap.
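
The order in which `normalize`, `scaleFactor`, and `transformationFun` are applied can be illustrated with a small base R sketch. This is not celda's implementation: the toy matrix and the scale factor of 10000 are made up for illustration, and only the 'proportion' option is shown.

```r
# Toy count matrix: 4 features (rows) x 3 cells (columns).
# All values here are hypothetical and chosen only for illustration.
counts <- matrix(
  c(1, 3, 0, 6,
    2, 2, 4, 2,
    0, 5, 5, 0),
  nrow = 4,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:3))
)

# Step 1 ('proportion'): divide each cell's counts by its library size,
# i.e. the total counts in that column.
libSize <- colSums(counts)
prop <- sweep(counts, 2, libSize, "/")

# Step 2 (scaleFactor): multiply every normalized value by a constant.
# 10000 is an assumed value; the documented default is NULL (no scaling).
scaleFactor <- 10000
scaled <- prop * scaleFactor

# Step 3 (transformationFun): apply the transformation last.
transformed <- sqrt(scaled)
```

After step 1 every column of `prop` sums to 1, so the scale factor in step 2 sets the common total per cell before the transformation is applied.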

Value

sce with UMAP coordinates (columns "celda_UMAP1" & "celda_UMAP2") added to reducedDim(sce, "celda_UMAP").

Examples

data(sceCeldaCG)
umapRes <- celdaUmap(sceCeldaCG)
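
The following sketch extends the example above by setting a few of the documented arguments and then retrieving the coordinates. The values `nNeighbors = 15` and `minDist = 0.5` are arbitrary choices for illustration. The Value section states that the coordinates are stored under "celda_UMAP"; as a precaution, the lookup below also checks the altExp named by `altExpName`, which is an assumption about where the result may be stored rather than documented behavior.

```r
library(celda)
library(SingleCellExperiment)

data(sceCeldaCG)

# Non-default neighborhood size and minimum distance (arbitrary values),
# with a fixed seed for reproducibility.
umapRes <- celdaUmap(sceCeldaCG,
  nNeighbors = 15,
  minDist = 0.5,
  seed = 12345)

# Look for the embedding on the main object first; if it is not there,
# fall back to the altExp (assumed location, see the note above).
if ("celda_UMAP" %in% reducedDimNames(umapRes)) {
  coords <- reducedDim(umapRes, "celda_UMAP")
} else {
  coords <- reducedDim(altExp(umapRes, "featureSubset"), "celda_UMAP")
}

# Two columns of UMAP coordinates, one row per cell.
head(coords)
```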