Gets a list of MSigDB gene sets stores it in the metadata of the SingleCellExperiment object. These gene sets can be used in downstream quality control and analysis functions in singleCellTK.

importGeneSetsFromMSigDB(
  inSCE,
  categoryIDs,
  species = "Homo sapiens",
  mapping = c("gene_symbol", "human_gene_symbol", "entrez_gene"),
  by = "rownames",
  verbose = TRUE,
  noMatchError = TRUE
)

Arguments

inSCE

Input SingleCellExperiment object.

categoryIDs

Character vector containing the MSigDB gene set ids. The column ID in the table returned by getMSigDBTable() shows the list of possible gene set IDs that can be obtained.

species

Character. Species available can be found using the function msigdbr_show_species. Default "Homo sapiens".

mapping

Character. One of "gene_symbol", "human_gene_symbol", or "entrez_gene". Gene identifiers to be used for MSigDB gene sets. IDs denoted by the by parameter must be either in gene symbol or Entrez gene id format to match IDs from MSigDB.

by

Character. Describes the location within inSCE where the gene identifiers in the MSigDB gene sets should be mapped. If set to "rownames" then the features will be searched for among rownames(inSCE). This can also be set to one of the column names of rowData(inSCE) in which case the gene identifies will be mapped to that column in the rowData of inSCE. See featureIndex for more information. Default "rownames".

verbose

Boolean. Whether to display progress. Default TRUE.

noMatchError

Boolean. Show an error if a collection does not have any matching features. Default TRUE.

Value

A SingleCellExperiment object with gene set from collectionName output stored to the metadata slot.

Details

The gene identifiers in gene sets from MSigDB will be retrieved using the msigdbr package. They will be mapped to the IDs in inSCE using the by parameter and stored in a GeneSetCollection object from package GSEABase. This object is stored in metadata(inSCE)$sctk$genesets, which can be accessed in downstream analysis functions such as runCellQC.

See also

importGeneSetsFromList for importing from lists, importGeneSetsFromGMT for importing from GMT files, and GeneSetCollection objects.

Author

Joshua D. Campbell

Examples

data(scExample)
sce <- importGeneSetsFromMSigDB(inSCE = sce,
                                categoryIDs = "H",
                                species = "Homo sapiens",
                                mapping = "gene_symbol",
                                by = "feature_name")
#> Sat Mar 18 10:28:15 2023 .. Importing 'H' gene sets (n = 50)
#> Sat Mar 18 10:28:15 2023 .... Completed 50 gene sets for 1
#> Sat Mar 18 10:28:15 2023 .. Matching gene sets to 'feature_name'