Despite their many advantages, single-cell technologies can suffer from contamination arising from multiple sources. High levels of contamination can hinder important analyses such as clustering, marker identification, and differential expression. The team previously developed decontX, a tool that estimates and removes contamination from ambient RNA in single-cell RNA-seq data.
This project will develop and test approaches for estimating contamination in other single-cell data modalities, such as single-cell ATAC-seq (scATAC-seq) and antibody-derived tag (ADT) data. The team will extend its Bayesian model to jointly estimate contamination across data types in multi-modal datasets. The project will also extend software for benchmarking and comparing decontamination tools.
The benchmarking framework will provide curated datasets and implement evaluation metrics for assessing the performance of decontamination tools. Finally, the team will deploy a Graphical User Interface (GUI) on a web server using R/Shiny so that non-computational users can easily run and evaluate decontamination tools on their own data. This work will enhance the ability of researchers to gain meaningful insights from single-cell data, even when contamination levels are high.