Gets a list of MSigDB gene sets stores it in the metadata of the SingleCellExperiment object. These gene sets can be used in downstream quality control and analysis functions in singleCellTK.
importGeneSetsFromMSigDB(
inSCE,
categoryIDs,
species = "Homo sapiens",
mapping = c("gene_symbol", "human_gene_symbol", "entrez_gene"),
by = "rownames",
verbose = TRUE,
noMatchError = TRUE
)
Input SingleCellExperiment object.
Character vector containing the MSigDB gene set ids.
The column ID
in the table returned by getMSigDBTable()
shows
the list of possible gene set IDs that can be obtained.
Character. Species available can be found using the function
msigdbr_show_species
. Default "Homo sapiens"
.
Character. One of "gene_symbol", "human_gene_symbol", or
"entrez_gene". Gene identifiers to be used for MSigDB gene sets. IDs
denoted by the by
parameter must be either in gene symbol or
Entrez gene id format to match IDs from MSigDB.
Character. Describes the
location within inSCE
where the gene identifiers in
the MSigDB gene sets should be mapped. If set to "rownames"
then the
features will be searched for among rownames(inSCE)
. This can also be
set to one of the column names of rowData(inSCE)
in which case the
gene identifies will be mapped to that column in the rowData
of inSCE
. See featureIndex for more information.
Default "rownames"
.
Boolean. Whether to display progress. Default TRUE
.
Boolean. Show an error if a collection does not have
any matching features. Default TRUE
.
A SingleCellExperiment object
with gene set from collectionName
output stored to the
metadata slot.
The gene identifiers in gene sets from MSigDB will be retrieved
using the msigdbr
package. They will be mapped to the IDs in
inSCE
using the by
parameter and
stored in a GeneSetCollection object from package
GSEABase. This object is stored in
metadata(inSCE)$sctk$genesets
, which can be accessed in downstream
analysis functions such as runCellQC.
importGeneSetsFromList for importing from lists, importGeneSetsFromGMT for importing from GMT files, and GeneSetCollection objects.
data(scExample)
sce <- importGeneSetsFromMSigDB(inSCE = sce,
categoryIDs = "H",
species = "Homo sapiens",
mapping = "gene_symbol",
by = "feature_name")
#> Sat Mar 18 10:28:15 2023 .. Importing 'H' gene sets (n = 50)
#> Sat Mar 18 10:28:15 2023 .... Completed 50 gene sets for 1
#> Sat Mar 18 10:28:15 2023 .. Matching gene sets to 'feature_name'