The ComBat-Seq batch adjustment approach assumes that batch effects represent non-biological but systematic shifts in the mean or variability of genomic features for all samples within a processing batch. It uses either parametric or non-parametric empirical Bayes frameworks for adjusting data for batch effects.

runComBatSeq(
  inSCE,
  useAssay = "counts",
  batch = "batch",
  covariates = NULL,
  bioCond = NULL,
  useSVA = FALSE,
  assayName = "ComBatSeq",
  shrink = FALSE,
  shrinkDisp = FALSE,
  nGene = NULL
)

Arguments

inSCE

SingleCellExperiment inherited object. Required.

useAssay

A single character indicating the name of the assay requiring batch correction. Default "counts".

batch

A single character indicating a field in colData that annotates the batches. Default "batch".

covariates

A character vector indicating the fields in colData that annotate other covariates, such as the cell types. Default NULL.

bioCond

A single character indicating a field in colData that annotates the biological conditions. Default NULL.

useSVA

A logical scalar. Whether to estimate surrogate variables and use them as an empirical control. Default FALSE.

assayName

A single character. The name for the corrected assay, which will be saved to the assay slot of the returned object. Default "ComBatSeq".

shrink

A logical scalar. Whether to apply shrinkage on parameter estimation. Default FALSE.

shrinkDisp

A logical scalar. Whether to apply shrinkage on dispersion. Default FALSE.

nGene

An integer. Number of random genes to use in empirical Bayes estimation. Only used when shrink is set to TRUE. Default NULL.

Value

The input SingleCellExperiment object with assay(inSCE, assayName) updated.

Details

The arguments covariates and useSVA should be chosen according to what is known about the cell types. When the cell type information is known, it is recommended to pass the cell type annotation to covariates. If the cell types are unknown but expected to be balanced, it is recommended to run with the default settings, although informative covariates can still be useful. If the cell types are unknown and expected to be unbalanced, it is recommended to set useSVA to TRUE.

Examples

data('sceBatches', package = 'singleCellTK')
sceBatches <- sample(sceBatches, 40)
# Cell type known
sceBatches <- runComBatSeq(sceBatches, "counts", "batch",
                           covariates = "cell_type",
                           assayName = "ComBat_cell_seq")
#> Found 2 batches
#> Using null model in ComBat-seq.
#> Adjusting for 1 covariate(s) or covariate level(s)
#> Estimating dispersions
#> Fitting the GLM model
#> Shrinkage off - using GLM estimates for parameters
#> Adjusting the data
# Cell type unknown but balanced
#sceBatches <- runComBatSeq(sceBatches, "counts", "batch",
#                           assayName = "ComBat_seq")
# Cell type unknown and unbalanced
#sceBatches <- runComBatSeq(sceBatches, "counts", "batch",
#                           useSVA = TRUE,
#                           assayName = "ComBat_sva_seq")
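# Inspecting the result: a minimal sketch, assuming the "Cell type known"
# call above has been run, so that the corrected counts are stored under
# the assay name "ComBat_cell_seq" alongside the original "counts" assay.
# assayNames() and assay() are accessors from the SummarizedExperiment package.
SummarizedExperiment::assayNames(sceBatches)
corrected <- SummarizedExperiment::assay(sceBatches, "ComBat_cell_seq")
# The corrected matrix is expected to have the same dimensions as the input assay.
dim(corrected)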