“celda” stands for “CEllular Latent Dirichlet Allocation”. It is a suite of Bayesian hierarchical models and supporting functions for clustering genes and cells in count data generated by single-cell RNA-seq platforms. The approach extends the Latent Dirichlet Allocation (LDA) topic modeling framework that has been popular in text mining applications. The package also includes a method called DecontX, which can be used to estimate and remove contamination in single-cell genomic data.

Installation Instructions

To install the latest stable release of celda from Bioconductor (requires R version >= 3.6):

if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("celda")
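
To confirm that the stated R version requirement is met and that the package is available afterwards, a quick check along these lines may help. It is only a sketch using base R functions and makes no assumption about which celda version gets installed:

```r
# Stop early if the running R is older than the stated minimum (3.6)
stopifnot(getRversion() >= "3.6.0")

# Report the installed celda version (raises an error if celda is not installed)
packageVersion("celda")
```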

The latest stable version of celda can be installed from GitHub using devtools:
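
A minimal sketch of that command, assuming devtools may not be installed yet and that the default branch of the campbio/celda repository holds the stable release:

```r
# Install devtools first if it is missing
if (!requireNamespace("devtools", quietly = TRUE)) {
    install.packages("devtools")
}

# Install celda from the default branch of the campbio/celda GitHub repository
devtools::install_github("campbio/celda")
```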


The development version of celda can also be installed from GitHub using devtools:
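
A sketch of the development install, assuming development work lives on a branch named `devel`. That branch name is an assumption, so check the repository's branch list and substitute the actual name:

```r
# "devel" is an assumed branch name, not confirmed here;
# `ref` selects the branch, tag, or commit to install from.
devtools::install_github("campbio/celda", ref = "devel")
```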


NOTE: For macOS users, devtools::install_github() requires libgit2 to be installed. It can be installed via Homebrew:

brew install libgit2

Also, if you receive installation errors when Rcpp is being installed and compiled, try following the steps outlined here to solve the issue:


If you are running R 4.0.0 or later on macOS Catalina and see the error 'wchar.h' file not found, you can try the method in this link:


NOTE: If you are installing celda from RStudio and get the error "could not find tools necessary to compile a package", you can try this:

options(buildtools.check = function(action) TRUE)

Vignettes and examples

To build the vignettes for Celda and DecontX during installation from GitHub, use the following command:

devtools::install_github("campbio/celda", build_vignettes = TRUE)

Note that building the vignettes may add 5-10 minutes to the installation. The Celda and DecontX vignettes can then be accessed via the following commands:
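
One way to locate and open them, assuming the vignettes were built during installation, is base R's vignette tooling. The sketch below uses only functions from the `utils` package and does not assume specific vignette names:

```r
# List the vignettes installed with celda
# (the listing is empty if the vignettes were not built)
vignette(package = "celda")

# Open an HTML index of the celda vignettes in the default browser
browseVignettes("celda")
```

Once the listing shows the vignette names, an individual vignette can be opened with `vignette("<name>", package = "celda")`, where `<name>` is one of the listed entries.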