R/importAnnData.R
importAnnData.Rd
This function reads in one or more Python AnnData files in the .h5ad format and returns a single SingleCellExperiment object containing all the AnnData samples by concatenating their counts matrices and related information slots.
importAnnData(
  sampleDirs = NULL,
  sampleNames = NULL,
  delayedArray = FALSE,
  class = c("Matrix", "matrix"),
  rowNamesDedup = TRUE
)
Folder containing the .h5ad file. Can be one of:
Default: the current working directory.
Full path to the directory containing the .h5ad file.
E.g. sampleDirs = '/path/to/sample'
A vector of folder paths for the samples to import.
E.g. sampleDirs = c('/path/to/sample1', '/path/to/sample2', '/path/to/sample3')
importAnnData will return a single SCE object containing all the samples,
with the sample name appended to each colname in colData.
The prefix/name of the .h5ad file without the .h5ad extension,
e.g. if 'sample.h5ad' is the filename, pass sampleNames = 'sample'.
Can be one of:
Default sample.
A vector of samples to import. The length of the vector must equal the length of the sampleDirs vector.
E.g. sampleNames = c('sample1', 'sample2', 'sample3')
importAnnData will return a single SCE object containing all the samples,
with the sample name appended to each colname in colData.
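The pairing of sampleDirs and sampleNames described above can be checked before calling the importer. The following is a minimal sketch in base R; the helper expectedH5adPaths is hypothetical (it is not part of singleCellTK), and it assumes each file is located at <sampleDir>/<sampleName>.h5ad, as the argument descriptions suggest.

```r
# Hypothetical helper, not part of singleCellTK: builds the .h5ad paths
# implied by pairing each entry of sampleDirs with the matching sampleNames
# entry. Assumes the file sits at <sampleDir>/<sampleName>.h5ad.
expectedH5adPaths <- function(sampleDirs, sampleNames) {
  if (length(sampleDirs) != length(sampleNames)) {
    stop("sampleDirs and sampleNames must have the same length")
  }
  file.path(sampleDirs, paste0(sampleNames, ".h5ad"))
}

# Placeholder paths taken from the argument descriptions
sampleDirs <- c("/path/to/sample1", "/path/to/sample2")
sampleNames <- c("sample1", "sample2")

paths <- expectedH5adPaths(sampleDirs, sampleNames)
print(paths)
```

Calling file.exists(paths) on real directories would then show which of the expected files are present before the import is attempted.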
Boolean. Whether to read the expression matrix as a DelayedArray object. Default FALSE.
Character. The class of the expression matrix stored in the SCE
object. Can be one of "Matrix" (as returned by the
readMM function), or "matrix" (as returned by the
matrix function). Default "Matrix".
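To illustrate the difference between the two storage classes, here is a small sketch using the Matrix package (shipped with standard R installations). It does not call importAnnData; the toy matrix and variable names are assumptions made only for illustration.

```r
# Toy expression matrix that is mostly zeros, as is typical for scRNA-seq counts
dense <- matrix(0, nrow = 100, ncol = 100)
dense[1, 1] <- 5

# Sparse representation from the Matrix package (the "Matrix" option)
sparse <- Matrix::Matrix(dense, sparse = TRUE)

# Converting back yields a base R matrix (the "matrix" option)
denseAgain <- as.matrix(sparse)

# The sparse form stores only the non-zero entries, so it is usually smaller
print(object.size(sparse))
print(object.size(dense))
```

For large data sets the sparse class generally uses far less memory, which is a plausible reason for "Matrix" being the default.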
Boolean. Whether to deduplicate rownames. Default TRUE.
A SingleCellExperiment object.
importAnnData converts scRNA-seq data in the AnnData format to a SingleCellExperiment object.
The .X slot in AnnData is transposed to the features x cells
format and becomes the 'counts' matrix in the assay slot. The .var AnnData slot becomes the SCE rowData
and the .obs AnnData slot becomes the SCE colData. Multidimensional data in the .obsm AnnData slot is
ported over to the SCE reducedDims slot. Additionally, unstructured data in the .uns AnnData slot is
available through the SCE metadata slot.
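The slot mapping described above can be sketched with toy data. This example does not call importAnnData or read any .h5ad file; the objects X, obs, var, and the "X_pca" name are stand-ins invented for illustration. The SingleCellExperiment part runs only if that Bioconductor package is installed.

```r
# Toy stand-in for AnnData's .X: rows are cells, columns are features
X <- matrix(c(1, 0, 3, 2, 5, 0), nrow = 2,
            dimnames = list(c("cell1", "cell2"),
                            c("geneA", "geneB", "geneC")))

# Transpose to the features x cells layout used for the 'counts' assay
counts <- t(X)
print(dim(counts))

if (requireNamespace("SingleCellExperiment", quietly = TRUE)) {
  # Stand-ins for .obs (per-cell) and .var (per-feature) annotations
  obs <- S4Vectors::DataFrame(cluster = c("a", "b"), row.names = rownames(X))
  var <- S4Vectors::DataFrame(symbol = colnames(X), row.names = colnames(X))
  # Stand-in for one .obsm entry: one row per cell
  pca <- matrix(c(0.1, 0.2, 0.3, 0.4), nrow = 2,
                dimnames = list(rownames(X), c("PC1", "PC2")))

  sce <- SingleCellExperiment::SingleCellExperiment(
    assays = list(counts = counts),
    rowData = var,
    colData = obs,
    reducedDims = list(X_pca = pca),
    metadata = list(note = "stand-in for .uns")
  )

  # Accessors corresponding to each mapped slot
  print(SummarizedExperiment::assay(sce, "counts"))
  print(SummarizedExperiment::rowData(sce))
  print(SummarizedExperiment::colData(sce))
  print(SingleCellExperiment::reducedDim(sce, "X_pca"))
  print(S4Vectors::metadata(sce))
}
```

The same accessors should apply to the object returned by importAnnData, although the names of the reduced dimensions and metadata entries depend on what the .h5ad file contains.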
There are currently 2 known minor issues:
The anndata Python module depends on another Python module, h5py, to read HDF5 format files.
If there are errors reading the .h5ad files, such as "ValueError: invalid shape in fixed-type tuple.",
the user will need to downgrade h5py by running pip3 install --user h5py==2.9.0
Additionally, there might be errors in converting some Python objects in the unstructured data slots.
There are no known R solutions at present. Refer to https://github.com/rstudio/reticulate/issues/209
file.path <- system.file("extdata/annData_pbmc_3k", package = "singleCellTK")
if (FALSE) {
  sce <- importAnnData(sampleDirs = file.path,
                       sampleNames = 'pbmc3k_20by20')
}