Converts a list of gene sets stored in a GMT file into a GeneSetCollection and stores it in the metadata of the SingleCellExperiment object. These gene sets can be used in downstream quality control and analysis functions in singleCellTK.
importGeneSetsFromGMT(
  inSCE,
  file,
  collectionName = "GeneSetCollection",
  by = "rownames",
  sep = "\t",
  noMatchError = TRUE
)
Input SingleCellExperiment object.
Character. Path to GMT file. See getGmt for more information on reading GMT files.
Character. Name of collection to add gene sets to. If this collection already exists in inSCE, then these gene sets will be added to that collection. Any gene sets within the collection with the same name will be overwritten. Default "GeneSetCollection".
Character, character vector, or NULL. Describes the location within inSCE where the gene identifiers in the GMT file should be mapped. If set to "rownames", then the features will be searched for among rownames(inSCE). This can also be set to one of the column names of rowData(inSCE), in which case the gene identifiers will be mapped to that column in the rowData of inSCE. by can be a vector the same length as the number of gene sets in the GMT file, and the elements of the vector can point to different locations within inSCE. Finally, by can be NULL. In this case, the location of the gene identifiers in inSCE should be saved in the description (2nd column) of the GMT file. See featureIndex for more information. Default "rownames".
Character. Delimiter of the GMT file. Default "\t".
Boolean. Show an error if a collection does not have any matching features. Default TRUE.
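To make the by options more concrete, the sketch below mimics the lookup using base R only. The features data frame is a hypothetical stand-in for rowData(inSCE), and match() is used purely for illustration; this is not singleCellTK's actual implementation.

```r
# Hypothetical stand-in for rowData(inSCE): row names hold Ensembl IDs and
# a "feature_name" column holds gene symbols (illustrative values only).
features <- data.frame(
  feature_name = c("MT-CO1", "MT-ND1", "ACTB"),
  row.names = c("ENSG00000198804", "ENSG00000198888", "ENSG00000075624"),
  stringsAsFactors = FALSE
)

# Gene identifiers as they might appear in one gene set of a GMT file.
geneSet <- c("MT-CO1", "MT-ND1", "MT-ND2")

# Analogue of by = "rownames": search among the row names.
idxRownames <- match(geneSet, rownames(features))

# Analogue of by = "feature_name": search within that column instead.
idxColumn <- match(geneSet, features[["feature_name"]])

# Number of identifiers found in each location.
nRownames <- sum(!is.na(idxRownames))
nColumn <- sum(!is.na(idxColumn))
```

Here the symbols are found only in the feature_name column. Searching the row names finds nothing, which is the kind of situation noMatchError = TRUE is described as reporting.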
A SingleCellExperiment object with the gene sets from collectionName stored in the metadata slot.
The gene identifiers in gene sets in the GMT file will be mapped to the rownames of inSCE using the by parameter and stored in a GeneSetCollection object from package GSEABase. This object is stored in metadata(inSCE)$sctk$genesets, which can be accessed in downstream analysis functions such as runCellQC.
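For readers unfamiliar with the GMT layout, the following base R sketch writes a small tab-delimited file and splits each line into its set name, description (2nd column), and gene identifiers. The set names, genes, and description values are made up for illustration, and the parsing shown here is a simplified assumption, not the reader used by the package.

```r
# Write a tiny two-set GMT file to a temporary location (illustrative data).
gmtPath <- tempfile(fileext = ".gmt")
writeLines(
  c(
    paste("MitoSubset", "feature_name", "MT-CO1", "MT-ND1", sep = "\t"),
    paste("Housekeeping", "rownames", "ACTB", "GAPDH", "B2M", sep = "\t")
  ),
  gmtPath
)

# Each line holds: set name, description, then the gene identifiers.
fields <- strsplit(readLines(gmtPath), split = "\t", fixed = TRUE)
setLabels <- vapply(fields, function(x) x[1], character(1))
locations <- vapply(fields, function(x) x[2], character(1))
genes <- lapply(fields, function(x) x[-(1:2)])
names(genes) <- setLabels
names(locations) <- setLabels

# With by = NULL, the description column is where the location of the
# identifiers within inSCE is expected to be recorded.
locations
```

In this sketch, each gene set carries its own location in the description column, which is the information by = NULL relies on.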
importGeneSetsFromList for importing from lists, importGeneSetsFromCollection for importing from GeneSetCollection objects, and importGeneSetsFromMSigDB for importing MSigDB gene sets.
data(scExample)
# GMT file containing gene symbols for a subset of human mitochondrial genes
gmt <- system.file("extdata/mito_subset.gmt", package = "singleCellTK")
# "feature_name" is the second column in the GMT file, so the ids will
# be mapped using this column in the 'rowData' of 'sce'. This
# could also be accomplished by setting by = "feature_name" in the
# function call.
sce <- importGeneSetsFromGMT(inSCE = sce, file = gmt, by = NULL)
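As a hedged follow-up, the sketch below uses the alternative mentioned in the comments, by = "feature_name", and then inspects the metadata location named in the details. It assumes singleCellTK and its bundled example data are installed; the internal structure of the stored object is not described here, so it is only printed, not unpacked.

```r
library(singleCellTK)

data(scExample)
gmt <- system.file("extdata/mito_subset.gmt", package = "singleCellTK")

# Map identifiers through the "feature_name" column of rowData(sce)
# instead of reading the location from the GMT description column.
sce <- importGeneSetsFromGMT(inSCE = sce, file = gmt, by = "feature_name")

# Gene sets are documented as being stored under this metadata entry.
geneSets <- S4Vectors::metadata(sce)$sctk$genesets
print(geneSets)
```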