Embeds cells in two dimensions using UMAP based on a celda model. For celda_C sce objects, PCA on the normalized counts is used to reduce the number of features before applying UMAP. For celda_CG sce objects, UMAP is run on module probabilities, which reduces the number of features without the need for PCA. Module probabilities are square-root transformed before applying UMAP.

celdaUmap(
  sce,
  useAssay = "counts",
  altExpName = "featureSubset",
  maxCells = NULL,
  minClusterSize = 100,
  modules = NULL,
  seed = 12345,
  nNeighbors = 30,
  minDist = 0.75,
  spread = 1,
  pca = TRUE,
  initialDims = 50,
  normalize = "proportion",
  scaleFactor = NULL,
  transformationFun = sqrt,
  cores = 1,
  ...
)

# S4 method for SingleCellExperiment
celdaUmap(
  sce,
  useAssay = "counts",
  altExpName = "featureSubset",
  maxCells = NULL,
  minClusterSize = 100,
  modules = NULL,
  seed = 12345,
  nNeighbors = 30,
  minDist = 0.75,
  spread = 1,
  pca = TRUE,
  initialDims = 50,
  normalize = "proportion",
  scaleFactor = NULL,
  transformationFun = sqrt,
  cores = 1,
  ...
)

Arguments

sce

A SingleCellExperiment object returned by celda_C, celda_G, or celda_CG.

useAssay

A string specifying which assay slot to use. Default "counts".

altExpName

The name for the altExp slot to use. Default "featureSubset".

maxCells

Integer. Maximum number of cells to plot. Cells will be randomly subsampled if ncol(sce) > maxCells. Larger numbers of cells require more memory. If NULL, no subsampling will be performed. Default NULL.

minClusterSize

Integer. Do not subsample cell clusters below this threshold. Default 100.
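The interaction between maxCells and minClusterSize can be sketched in base R. This is only an illustration of the behavior described above, not celda's internal code: subsampleCells is a hypothetical helper, z is an assumed vector of cluster labels, and the proportional allocation of removals is an assumption.

```r
# Hypothetical helper (not part of celda): choose which cells to keep so that
# at most maxCells remain and no cluster is pushed below minClusterSize.
subsampleCells <- function(z, maxCells, minClusterSize = 100) {
  nRemove <- length(z) - maxCells
  keep <- rep(TRUE, length(z))
  if (nRemove <= 0) {
    return(keep)  # nothing to subsample
  }

  sizes <- table(z)
  # Cells each cluster can give up without dropping below the threshold
  spare <- pmax(as.integer(sizes) - minClusterSize, 0L)
  if (sum(spare) < nRemove) {
    stop("Cannot reach maxCells without shrinking a cluster below minClusterSize")
  }

  # Spread removals across clusters in proportion to their spare cells
  nDrop <- floor(spare * nRemove / sum(spare))
  # Hand out removals lost to rounding, one at a time
  leftover <- nRemove - sum(nDrop)
  while (leftover > 0) {
    i <- which.max(spare - nDrop)
    nDrop[i] <- nDrop[i] + 1
    leftover <- leftover - 1
  }

  # Randomly drop the allotted number of cells from each cluster
  for (i in which(nDrop > 0)) {
    idx <- which(z == names(sizes)[i])
    keep[idx[sample.int(length(idx), nDrop[i])]] <- FALSE
  }
  keep
}

set.seed(12345)
z <- rep(1:3, times = c(500, 300, 50))  # toy cluster labels
keep <- subsampleCells(z, maxCells = 400, minClusterSize = 100)
table(z[keep])
```

In this toy run the cluster with 50 cells is already below the threshold, so it is left intact and all removals come from the two larger clusters.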

modules

Integer vector. Determines which feature modules to use for UMAP. If NULL, all modules will be used. Default NULL.

seed

Integer. Passed to with_seed. For reproducibility, a default value of 12345 is used. If NULL, no calls to with_seed are made.

nNeighbors

The size of local neighborhood used for manifold approximation. Larger values result in more global views of the manifold, while smaller values result in more local data being preserved. Default 30. See umap for more information.

minDist

The effective minimum distance between embedded points. Smaller values will result in a more clustered/clumped embedding where nearby points on the manifold are drawn closer together, while larger values will result in a more even dispersal of points. Default 0.75. See umap for more information.

spread

The effective scale of embedded points. In combination with minDist, this determines how clustered/clumped the embedded points are. Default 1. See umap for more information.

pca

Logical. Whether to perform dimensionality reduction with PCA before UMAP. Only works for celda_C sce objects. Default TRUE.

initialDims

Integer. Number of dimensions from PCA to use as input in UMAP. Default 50. Only works for celda_C sce objects.
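As a rough picture of what pca and initialDims control, the sketch below reduces a toy normalized matrix to a fixed number of principal components per cell. It uses base R's prcomp purely for illustration; the PCA routine celda uses internally may differ, and normCounts and cellEmbedding are made-up names.

```r
set.seed(12345)

# Toy data: 200 features (rows) x 60 cells (columns)
rawCounts <- matrix(rpois(200 * 60, lambda = 5), nrow = 200)

# Proportion-normalize each cell, then square-root transform
normCounts <- sqrt(sweep(rawCounts, 2, colSums(rawCounts), "/"))

# Keep the first initialDims principal components.
# prcomp expects observations (cells) in rows, hence the transpose.
initialDims <- 10
pcs <- prcomp(t(normCounts), rank. = initialDims)

# One row per cell, one column per retained component
cellEmbedding <- pcs$x
dim(cellEmbedding)
```

A matrix like cellEmbedding, with cells in rows and a reduced set of columns, is the kind of input a UMAP implementation would then embed in two dimensions.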

normalize

Character. Passed to normalizeCounts in the normalization step. Divides counts by the library sizes for each cell. One of 'proportion', 'cpm', 'median', or 'mean'. 'proportion' uses the total counts for each cell as the library size. 'cpm' divides the library size of each cell by one million to produce counts per million. 'median' divides the library size of each cell by the median library size across all cells. 'mean' divides the library size of each cell by the mean library size across all cells. Default "proportion".

scaleFactor

Numeric. Sets the scale factor for cell-level normalization. Each cell is multiplied by this scale factor after its library size has been adjusted in normalize. Default NULL, which means no scale factor is applied.

transformationFun

Function. Applies a transformation such as 'sqrt', 'log', 'log2', 'log10', or 'log1p'. If NULL, no transformation will be applied. Occurs after applying normalization and scale factor. Default sqrt.
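The order of operations described for normalize, scaleFactor, and transformationFun can be illustrated on a toy matrix in base R. This is a minimal sketch of the 'proportion' case only and does not call normalizeCounts; the matrix and variable names are invented for the example.

```r
# Toy count matrix: 3 features (rows) x 2 cells (columns)
counts <- matrix(
  c(1, 3, 0,
    4, 4, 8),
  nrow = 3,
  dimnames = list(paste0("gene", 1:3), c("cellA", "cellB"))
)

# Step 1, normalize = "proportion": divide each cell's counts by its total
libSize <- colSums(counts)
props <- sweep(counts, 2, libSize, "/")

# Step 2, scaleFactor: NULL means no scaling is applied
scaleFactor <- NULL
if (!is.null(scaleFactor)) {
  props <- props * scaleFactor
}

# Step 3, transformationFun: applied last; NULL would skip this step
transformationFun <- sqrt
transformed <- if (is.null(transformationFun)) props else transformationFun(props)

transformed
```

After step 1 every column sums to 1, so with the square-root transformation each cell's squared values also sum to 1.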

cores

Number of threads to use. Default 1.

...

Additional parameters to pass to umap.

Value

sce with UMAP coordinates (columns "celda_UMAP1" & "celda_UMAP2") added to reducedDim(sce, "celda_UMAP").

Examples

data(sceCeldaCG)
umapRes <- celdaUmap(sceCeldaCG)