A wrapper function for addPerCellQC. Calculates general quality control metrics for each cell in the count matrix.

runPerCellQC(
  inSCE,
  useAssay = "counts",
  mitoGeneLocation = "rownames",
  mitoRef = c(NULL, "human", "mouse"),
  mitoIDType = c("ensembl", "symbol", "entrez", "ensemblTranscriptID"),
  mitoPrefix = "MT-",
  mitoID = NULL,
  collectionName = NULL,
  geneSetList = NULL,
  geneSetListLocation = "rownames",
  geneSetCollection = NULL,
  percent_top = c(50, 100, 200, 500),
  use_altexps = FALSE,
  flatten = TRUE,
  detectionLimit = 0,
  BPPARAM = BiocParallel::SerialParam()
)

Arguments

inSCE

A SingleCellExperiment object.

useAssay

A string specifying which assay in the SCE to use. Default "counts".

mitoGeneLocation

Character. Describes the location within inSCE where the gene identifiers in the mitochondrial gene sets are located. If set to "rownames", the features will be searched for among rownames(inSCE). This can also be set to one of the column names of rowData(inSCE), in which case the gene identifiers will be mapped to that column in the rowData of inSCE. See featureIndex for more information. If this parameter is set to NULL, no mitochondrial metrics will be calculated. Default "rownames".

mitoRef

Character. The species used to extract mitochondrial gene IDs from the built-in mitochondrial gene set in SCTK. Available species options are "human" and "mouse". Default "human".

mitoIDType

Character. Type of mitochondrial gene ID. SCTK supports "symbol", "entrez", "ensembl" and "ensemblTranscriptID". It is used with mitoRef to extract mitochondrial genes from the built-in mitochondrial gene set in SCTK. Default "ensembl".

mitoPrefix

Character. The prefix used to identify mitochondrial genes in either rownames(inSCE) or the column of rowData(inSCE) specified by mitoGeneLocation. This parameter is usually used to extract mitochondrial genes by gene symbol. For example, mitoPrefix = "^MT-" can be used to detect mitochondrial gene symbols such as "MT-ND4". Note that case is ignored, so "mt-" will still match "MT-ND4". Default "^MT-".

mitoID

Character. A vector of mitochondrial genes to be quantified. Default NULL.

collectionName

Character. Name of a GeneSetCollection obtained by using one of the importGeneSet* functions. Default NULL.

geneSetList

List of gene sets to be quantified. The genes in the assays will be matched to the genes in the list based on geneSetListLocation. Default NULL.

geneSetListLocation

Character or numeric vector. If set to 'rownames', then the genes in geneSetList will be looked up in rownames(inSCE). If another character is supplied, then genes will be looked up in the column names of rowData(inSCE). A character vector with the same length as geneSetList can be supplied if the IDs for different gene sets are found in different places, including a mixture of 'rownames' and rowData(inSCE). An integer or integer vector can be supplied to denote the column index in rowData(inSCE). Default 'rownames'.
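
As a hedged illustration of geneSetList together with geneSetListLocation, the sketch below matches a gene set against a rowData column instead of the row names. It assumes the scExample data set used in the Examples section, and that its rowData column feature_name holds gene symbols; the variable name mitoSymbols is introduced here only for illustration.

```r
library(singleCellTK)
data(scExample, package = "singleCellTK")

# Collect mitochondrial gene symbols from the 'feature_name' column of rowData
# (assumption: this column stores gene symbols, as in the Examples section).
mitoSymbols <- grep("^MT-", rowData(sce)$feature_name, value = TRUE)
geneSet <- list("Mito" = mitoSymbols)

# Tell runPerCellQC to look up the gene set members in that same column
# rather than in rownames(sce).
sce <- runPerCellQC(
  sce,
  geneSetList = geneSet,
  geneSetListLocation = "feature_name"
)
```

Because the identifiers in geneSet are symbols here, pointing geneSetListLocation at the column that stores symbols keeps the lookup consistent.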

geneSetCollection

Class of GeneSetCollection from package GSEABase. The location of the gene IDs in inSCE should be in the description slot of each gene set and should follow the same notation as geneSetListLocation. The function getGmt can be used to read in gene sets from a GMT file. If reading a GMT file, the second column for each gene set should be the description denoting the location of the gene IDs in inSCE. These gene sets will be included with those from geneSetList if both parameters are provided.

percent_top

An integer vector. Each element is treated as a number of top genes to compute the percentage of library size occupied by the most highly expressed genes in each cell. Default c(50, 100, 200, 500).
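
To make the meaning of percent_top concrete, here is a minimal base R sketch on a made-up count matrix. It is not the code used by addPerCellQC; the helper percentTop, the toy values, and the scaled-down cutoffs in topN are assumptions chosen for illustration.

```r
# Toy count matrix: 6 genes (rows) x 2 cells (columns); values are made up.
counts <- matrix(
  c(50, 30, 10, 5, 3, 2,
    10, 10, 10, 10, 10, 50),
  nrow = 6,
  dimnames = list(paste0("gene", 1:6), c("cellA", "cellB"))
)

# Hypothetical helper: percentage of a cell's library size taken up by
# its n most highly expressed genes.
percentTop <- function(x, n) {
  100 * sum(head(sort(x, decreasing = TRUE), n)) / sum(x)
}

# Scaled-down stand-in for the default c(50, 100, 200, 500).
topN <- c(1, 2, 3)
pct <- sapply(topN, function(n) apply(counts, 2, percentTop, n = n))
colnames(pct) <- paste0("top_", topN)
pct
```

In this toy matrix, cellA concentrates 80% of its counts in its top two genes while cellB reaches only 60%, so higher values indicate a library dominated by fewer genes.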

use_altexps

Logical scalar indicating whether QC statistics should be computed for alternative Experiments in inSCE (altExps(inSCE)). If TRUE, statistics are computed for all alternative experiments. Alternatively, an integer or character vector specifying the alternative Experiments to use to compute QC statistics. Alternatively NULL, in which case alternative experiments are not used. Default FALSE.

flatten

Logical scalar indicating whether the nested DataFrame-class in the output should be flattened. Default TRUE.

detectionLimit

A numeric scalar specifying the lower detection limit for expression. Default 0.
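
The effect of detectionLimit can be sketched in base R. The snippet below is an illustration rather than the function's implementation, and it assumes a feature counts as detected when its value is strictly greater than the limit; the toy matrix is made up.

```r
# Toy count matrix: 4 genes (rows) x 2 cells (columns); values are made up.
counts <- matrix(
  c(0, 1, 5, 0,
    2, 0, 0, 3),
  nrow = 4,
  dimnames = list(paste0("gene", 1:4), c("cellA", "cellB"))
)

# Number of detected features per cell at the default limit of 0
# (assumption: "detected" means strictly above the limit).
detectedAt0 <- colSums(counts > 0)

# Raising the limit to 1 drops features with a count of exactly 1.
detectedAt1 <- colSums(counts > 1)
```

Here cellA goes from two detected features to one when the limit is raised, while cellB keeps both of its detected features.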

BPPARAM

A BiocParallelParam object specifying whether the QC calculations should be parallelized. Default BiocParallel::SerialParam().

Value

A SingleCellExperiment object with cell QC metrics added to the colData slot.

Details

This function supports several ways to specify mitochondrial genes and quantify their expression in cells. mitoGeneLocation is required for all methods and points to the location within the inSCE object that stores the mitochondrial gene IDs or symbols. Mitochondrial genes can be specified in the following ways:

  • A combination of mitoRef and mitoIDType parameters can be used to load pre-built mitochondrial gene sets stored in the SCTK package. These parameters are used in the importMitoGeneSet function.

  • The mitoPrefix parameter can be used to search for features matching a particular pattern. The default pattern is "MT-" at the beginning of the ID.

  • The mitoID parameter can be used to directly supply a vector of mitochondrial gene IDs or names. Only features that exactly match items in this vector will be included in the mitochondrial gene set.
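
The difference between prefix matching (mitoPrefix) and exact matching (mitoID) can be illustrated with a small base R sketch. This is not SCTK's internal code; the identifier vector ids and the use of grepl and %in% are assumptions chosen to mirror the behavior described above, including case-insensitive prefix matching.

```r
# Made-up feature identifiers for illustration.
ids <- c("MT-ND4", "mt-Co1", "ACTB", "MTOR", "GAPDH")

# Prefix-style selection: case is ignored, so "mt-Co1" matches "^MT-",
# while "MTOR" does not because it lacks the hyphen.
prefixHits <- ids[grepl("^MT-", ids, ignore.case = TRUE)]

# Exact-style selection: only identifiers that appear verbatim in the
# supplied vector are kept, so "mt-Co1" is not matched by "MT-CO1".
exactHits <- ids[ids %in% c("MT-ND4", "MT-CO1")]
```

Prefix matching is convenient when the symbols follow a naming convention, whereas an explicit vector gives full control when they do not.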

See also

addPerCellQC, plotRunPerCellQCResults, runCellQC

Examples

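library(singleCellTK)
data(scExample, package = "singleCellTK")
# Find mitochondrial genes by symbol and collect their row names into a gene set
mito.ix <- grep("^MT-", rowData(sce)$feature_name)
geneSet <- list("Mito" = rownames(sce)[mito.ix])
# Compute per-cell QC metrics, including metrics for the "Mito" gene set
sce <- runPerCellQC(sce, geneSetList = geneSet)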
#> Sat Mar 18 10:31:07 2023 ... Running 'perCellQCMetrics'
#> Sat Mar 18 10:31:07 2023 ...... Attempting to find mitochondrial genes by identifying features in 'rownames' that match mitochondrial genes from reference 'human' and ID type 'ensembl'.