A wrapper function for addPerCellQC. Calculates general quality control metrics for each cell in the count matrix.
runPerCellQC(
inSCE,
useAssay = "counts",
mitoGeneLocation = "rownames",
mitoRef = c(NULL, "human", "mouse"),
mitoIDType = c("ensembl", "symbol", "entrez", "ensemblTranscriptID"),
mitoPrefix = "MT-",
mitoID = NULL,
collectionName = NULL,
geneSetList = NULL,
geneSetListLocation = "rownames",
geneSetCollection = NULL,
percent_top = c(50, 100, 200, 500),
use_altexps = FALSE,
flatten = TRUE,
detectionLimit = 0,
BPPARAM = BiocParallel::SerialParam()
)
A SingleCellExperiment object.
A string specifying which assay in the SCE to use. Default "counts".
Character. Describes the location within inSCE where the gene identifiers in the mitochondrial gene sets should be located. If set to "rownames", the features will be searched for among rownames(inSCE). This can also be set to one of the column names of rowData(inSCE), in which case the gene identifiers will be mapped to that column in the rowData of inSCE. See featureIndex for more information. If this parameter is set to NULL, no mitochondrial metrics will be calculated. Default "rownames".
Character. The species whose built-in mitochondrial gene set in SCTK is used to obtain mitochondrial gene IDs. Available species options are "human" and "mouse". Default "human".
Character. Type of mitochondrial gene ID. SCTK supports "symbol", "entrez", "ensembl" and "ensemblTranscriptID". It is used together with mitoRef to extract mitochondrial genes from the built-in mitochondrial gene set in SCTK. Default "ensembl".
Character. The prefix used to identify mitochondrial genes in either rownames(inSCE) or the column of rowData(inSCE) specified by mitoGeneLocation. This parameter is usually used to extract mitochondrial genes by gene symbol. For example, mitoPrefix = "^MT-" can be used to detect mitochondrial gene symbols such as "MT-ND4". Note that case is ignored, so "mt-" will still match "MT-ND4". Default "^MT-".
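As a rough illustration of the prefix matching described for mitoPrefix, the base R sketch below applies an anchored, case-insensitive pattern to a made-up vector of symbols. It is not the internal code of runPerCellQC; the vector `ids` and the use of `grepl` are assumptions chosen for illustration.

```r
# Hypothetical feature symbols, for illustration only.
ids <- c("MT-ND4", "mt-Co1", "ACTB", "GAPDH", "ATMT-1")

# "^" anchors the pattern to the start of the ID; ignore.case = TRUE makes
# "mt-" and "MT-" equivalent.
is_mito <- grepl("^MT-", ids, ignore.case = TRUE)

# "ATMT-1" contains "MT-" but not at the start, so it is not selected.
mito_ids <- ids[is_mito]
print(mito_ids)
```

Anchoring with "^" matters when symbols elsewhere in the feature list happen to contain the same characters in the middle of their names.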
Character. A vector of mitochondrial genes to be quantified. Default NULL.
Character. Name of a GeneSetCollection obtained by using one of the importGeneSet* functions. Default NULL.
List of gene sets to be quantified. The genes in the assays will be matched to the genes in the list based on geneSetListLocation. Default NULL.
Character or numeric vector. If set to 'rownames', the genes in geneSetList will be looked up in rownames(inSCE). If another character string is supplied, the genes will be looked up in the column of rowData(inSCE) with that name. A character vector with the same length as geneSetList can be supplied if the IDs for different gene sets are found in different places, including a mixture of 'rownames' and columns of rowData(inSCE). An integer or integer vector can be supplied to denote the column index in rowData(inSCE). Default 'rownames'.
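The lookup rules described for geneSetListLocation can be sketched in base R as follows. This is a simplified stand-in, not the package's implementation: `row_data` plays the role of rowData(inSCE), and the helper `ids_at_location` and all IDs are hypothetical.

```r
# Toy stand-in for rowData(inSCE); all IDs and symbols are made up.
row_data <- data.frame(
  feature_name = c("GeneA", "GeneB", "GeneC", "GeneD"),
  row.names = c("ID0001", "ID0002", "ID0003", "ID0004"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the identifiers that a location refers to.
# "rownames" means the row names; anything else is treated as a column
# name or an integer column index.
ids_at_location <- function(row_data, location) {
  if (identical(location, "rownames")) {
    return(rownames(row_data))
  }
  as.character(row_data[[location]])
}

# Two gene sets whose IDs live in different places.
gene_sets <- list(SetA = c("ID0001", "ID0003"), SetB = c("GeneB", "GeneD"))
locations <- c("rownames", "feature_name")  # one location per gene set

# Resolve each gene set to row indices at its own location.
hits <- Map(
  function(genes, loc) which(ids_at_location(row_data, loc) %in% genes),
  gene_sets, locations
)
print(hits)
```

Supplying one location per gene set, as in `locations` here, corresponds to the mixed case described above where some sets use row names and others use a rowData column.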
An object of class GeneSetCollection from the GSEABase package. The location of the gene IDs in inSCE should be in the description slot of each gene set and should follow the same notation as geneSetListLocation. The function getGmt can be used to read in gene sets from a GMT file. If reading a GMT file, the second column for each gene set should be the description denoting the location of the gene IDs in inSCE. These gene sets will be included with those from geneSetList if both parameters are provided.
An integer vector. Each element is treated as a number of top genes to compute the percentage of library size occupied by the most highly expressed genes in each cell. Default c(50, 100, 200, 500).
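To make the percent_top definition concrete, the base R sketch below computes the share of one cell's library size taken up by its top n genes. The counts and the helper `percent_top_one_cell` are made up for illustration, and the helper assumes n does not exceed the number of genes; it is not the routine used by the package.

```r
# Made-up counts for one cell across six genes (library size = 100).
counts <- c(50, 20, 15, 10, 4, 1)

# Hypothetical helper: percentage of the cell's total counts contributed
# by its n most highly expressed genes.
percent_top_one_cell <- function(x, n) {
  stopifnot(n <= length(x))
  top <- sort(x, decreasing = TRUE)[seq_len(n)]
  100 * sum(top) / sum(x)
}

# One value per requested "top n", analogous to one element of percent_top.
top_pct <- sapply(c(1, 2, 3), function(n) percent_top_one_cell(counts, n))
print(top_pct)
```

A cell whose top few genes account for most of its counts has low library complexity, which is why these percentages are commonly inspected as a QC metric.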
Logical scalar indicating whether QC statistics should be computed for alternative Experiments in inSCE (altExps(inSCE)). If TRUE, statistics are computed for all alternative experiments. Alternatively, an integer or character vector specifying the alternative Experiments to use to compute QC statistics. Alternatively NULL, in which case alternative experiments are not used. Default FALSE.
Logical scalar indicating whether the nested DataFrame-class in the output should be flattened. Default TRUE.
A numeric scalar specifying the lower detection limit for expression. Default 0.
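A minimal base R sketch of how a detection limit affects the number of detected features in a cell is shown below. It assumes that a feature counts as detected when its value is strictly above the limit; the counts and the helper `n_detected` are hypothetical and only illustrate the idea.

```r
# Made-up counts for one cell across five genes.
counts <- c(0, 1, 3, 0, 7)

# Hypothetical helper: number of features with expression strictly above
# the detection limit (assumption: "above" means ">", not ">=").
n_detected <- function(x, detection_limit = 0) {
  sum(x > detection_limit)
}

detected_default <- n_detected(counts)      # limit 0: any non-zero count
detected_strict  <- n_detected(counts, 1)   # limit 1: counts of 1 are ignored
print(c(detected_default, detected_strict))
```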
A BiocParallelParam object specifying whether the QC calculations should be parallelized. Default BiocParallel::SerialParam().
A SingleCellExperiment object with cell QC metrics added to the colData slot.
This function allows multiple ways to import mitochondrial genes and quantify their expression in cells. mitoGeneLocation is required for all methods and points to the location within the inSCE object that stores the mitochondrial gene IDs or symbols. The various ways mitochondrial genes can be specified are:
A combination of the mitoRef and mitoIDType parameters can be used to load pre-built mitochondrial gene sets stored in the SCTK package. These parameters are used in the importMitoGeneSet function.
The mitoPrefix parameter can be used to search for features matching a particular pattern. The default pattern is an "MT-" at the beginning of the ID.
The mitoID parameter can be used to directly supply a vector of mitochondrial gene IDs or names. Only features that exactly match items in this vector will be included in the mitochondrial gene set.
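Two of the approaches above might be called as in the sketch below, which uses the example dataset shipped with the package. It assumes that rownames(sce) in scExample hold Ensembl IDs and that rowData(sce)$feature_name holds gene symbols, as the package example suggests. How the options interact when more than one of them is set is not described here, so treat this as an illustration of the arguments rather than a statement of precedence.

```r
library(singleCellTK)
data(scExample, package = "singleCellTK")  # loads the example object `sce`

# Built-in reference: human Ensembl IDs looked up in rownames(sce).
sceRef <- runPerCellQC(
  sce,
  mitoRef = "human",
  mitoIDType = "ensembl",
  mitoGeneLocation = "rownames"
)

# Explicit IDs: find symbols starting with "MT-", then pass the matching
# row names through mitoID.
mitoIx <- grep("^MT-", SummarizedExperiment::rowData(sce)$feature_name)
sceID <- runPerCellQC(
  sce,
  mitoID = rownames(sce)[mitoIx],
  mitoGeneLocation = "rownames"
)

# The QC metrics are added to the colData slot of the returned object.
head(SummarizedExperiment::colData(sceRef))
```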
addPerCellQC, plotRunPerCellQCResults, runCellQC
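library(singleCellTK)
data(scExample, package = "singleCellTK")
# Find mitochondrial genes by symbol, then use their row names as a gene set
mito.ix <- grep("^MT-", rowData(sce)$feature_name)
geneSet <- list("Mito" = rownames(sce)[mito.ix])
sce <- runPerCellQC(sce, geneSetList = geneSet)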
#> Sat Mar 18 10:31:07 2023 ... Running 'perCellQCMetrics'
#> Sat Mar 18 10:31:07 2023 ...... Attempting to find mitochondrial genes by identifying features in 'rownames' that match mitochondrial genes from reference 'human' and ID type 'ensembl'.