“celda” stands for “CEllular Latent Dirichlet Allocation”. It is a suite of Bayesian hierarchical models and supporting functions for clustering genes and cells in count data generated by single-cell RNA-seq platforms. The models extend the Latent Dirichlet Allocation (LDA) topic modeling framework, which has been popular in text mining applications. The package also includes a method called decontX, which can be used to estimate and remove contamination in single-cell genomic data.


To install the latest stable release of celda from Bioconductor (requires R version >= 3.6):

if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("celda")
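
The R version requirement stated above can be checked before installing. The following is a minimal sketch using base R only; the variable name `min_version` is illustrative and not part of celda.

```r
# Minimum R version stated in the installation note above
min_version <- "3.6.0"

# getRversion() returns the running R version as a comparable object
if (getRversion() < min_version) {
  stop("celda from Bioconductor requires R >= ", min_version,
       "; found ", as.character(getRversion()))
} else {
  message("R ", as.character(getRversion()), " meets the minimum requirement")
}
```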

The latest stable version of celda can be installed from GitHub using devtools:


The development version of celda can also be installed from GitHub using devtools:
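
The two GitHub installs described above might look like the following sketch. The repository path `campbio/celda` and the branch name `devel` are assumptions and do not come from this document, so verify both against the project's GitHub page before running.

```r
# Install devtools first if it is not already available
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}

# Stable version from the default branch
# (assumed repository path: campbio/celda)
devtools::install_github("campbio/celda")

# Development version (assumed branch name: devel)
devtools::install_github("campbio/celda@devel")
```

Only one of the two `install_github()` calls is needed; running both leaves whichever was installed last.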


NOTE: For macOS users, devtools::install_github() requires libgit2, which can be installed via Homebrew:

brew install libgit2


  • If you receive installation errors while Rcpp is being installed and compiled, try following the steps outlined here to solve the issue.
  • If you are running R 4.0.0 or later on macOS Catalina and you see the error 'wchar.h' file not found, you can try the method in this link:
  • If you are trying to install celda using RStudio and get the error "could not find tools necessary to compile a package", you can try running this before the install command:
options(buildtools.check = function(action) TRUE)
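
After any of the installation routes above, the result can be checked from the R console. The following is a minimal sketch using base R only; the variable name `installed` is illustrative.

```r
# TRUE if the celda package can be found, without attaching it
installed <- requireNamespace("celda", quietly = TRUE)

if (installed) {
  # packageVersion() reports the version of the installed package
  message("celda ", as.character(packageVersion("celda")), " is installed")
} else {
  message("celda is not installed; retry one of the installation steps above")
}
```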