This function reads in one or more Python AnnData files in the .h5ad format and returns a single SingleCellExperiment object containing all the AnnData samples by concatenating their counts matrices and related information slots.

importAnnData(
  sampleDirs = NULL,
  sampleNames = NULL,
  delayedArray = FALSE,
  class = c("Matrix", "matrix"),
  rowNamesDedup = TRUE
)

Arguments

sampleDirs

Folder containing the .h5ad file. Can be one of -

  • Default current working directory.

  • Full path to the directory containing the .h5ad file. E.g. sampleDirs = '/path/to/sample'

  • A vector of folder paths for the samples to import. E.g. sampleDirs = c('/path/to/sample1', '/path/to/sample2', '/path/to/sample3'). importAnnData will return a single SCE object containing all the samples, with the sample name appended to each colname in colData.

sampleNames

The prefix/name of the .h5ad file without the .h5ad extension e.g. if 'sample.h5ad' is the filename, pass sampleNames = 'sample'. Can be one of -

  • Default 'sample'.

  • A vector of samples to import. The length of this vector must equal the length of the sampleDirs vector. E.g. sampleNames = c('sample1', 'sample2', 'sample3'). importAnnData will return a single SCE object containing all the samples, with the sample name appended to each colname in colData.
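
The pairing of sampleDirs and sampleNames can be sketched in base R as follows. This is only an illustration inferred from the argument descriptions above (each folder holds one file named after its sample); the variable h5adPaths is hypothetical and the function's internal file lookup may differ.

```r
# Hypothetical inputs: one folder and one file prefix per sample
sampleDirs  <- c("/path/to/sample1", "/path/to/sample2")
sampleNames <- c("sample1", "sample2")

# The two vectors must have the same length, one name per folder
stopifnot(length(sampleDirs) == length(sampleNames))

# Assumed layout: <sampleDir>/<sampleName>.h5ad
h5adPaths <- file.path(sampleDirs, paste0(sampleNames, ".h5ad"))
print(h5adPaths)
```

Because the vectors are matched element by element, the order of sampleNames must follow the order of sampleDirs.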

delayedArray

Boolean. Whether to read the expression matrix as a DelayedArray object. Default FALSE.

class

Character. The class of the expression matrix stored in the SCE object. Can be one of "Matrix" (as returned by readMM function), or "matrix" (as returned by matrix function). Default "Matrix".

rowNamesDedup

Boolean. Whether to deduplicate rownames. Default TRUE.

Value

A SingleCellExperiment object.

Details

importAnnData converts scRNA-seq data in the AnnData format to a SingleCellExperiment object. The .X slot in AnnData is transposed to the features x cells format and becomes the 'counts' matrix in the assay slot. The .var AnnData slot becomes the SCE rowData and the .obs AnnData slot becomes the SCE colData. Multidimensional data in the .obsm AnnData slot is ported over to the SCE reducedDims slot. Additionally, unstructured data in the .uns AnnData slot is available through the SCE metadata slot.

There are currently 2 known minor issues -

  • The anndata Python module depends on another Python module, h5py, to read HDF5 format files. If there are errors reading the .h5ad files, such as "ValueError: invalid shape in fixed-type tuple.", the user will need to downgrade h5py by running pip3 install --user h5py==2.9.0

  • There might be errors in converting some Python objects in the unstructured data slots. There are no known R solutions at present. Refer to https://github.com/rstudio/reticulate/issues/209
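
The slot mapping described in Details can be illustrated with a toy object built by hand. This is a minimal sketch, not the function's implementation: the objects X, obs, var, obsm and uns are made-up stand-ins for the corresponding AnnData slots, and only the SingleCellExperiment constructor and accessors are real API.

```r
library(SingleCellExperiment)

# Stand-in for AnnData .X: cells x features (2 cells, 3 genes)
X <- matrix(c(1, 0, 3,
              0, 2, 5),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("cell1", "cell2"),
                            c("geneA", "geneB", "geneC")))

# Stand-ins for .obs (per cell), .var (per feature), .obsm and .uns
obs  <- DataFrame(cluster = c("a", "b"), row.names = rownames(X))
var  <- DataFrame(symbol = colnames(X), row.names = colnames(X))
obsm <- list(X_pca = matrix(c(0.1, 0.2, 0.3, 0.4), nrow = 2,
                            dimnames = list(rownames(X), c("PC1", "PC2"))))
uns  <- list(note = "toy example")

# Transpose to features x cells and place each piece in its SCE slot
sce <- SingleCellExperiment(
  assays      = list(counts = t(X)),  # .X    -> 'counts' assay
  rowData     = var,                  # .var  -> rowData
  colData     = obs,                  # .obs  -> colData
  reducedDims = obsm,                 # .obsm -> reducedDims
  metadata    = uns                   # .uns  -> metadata
)

dim(counts(sce))       # features x cells
reducedDimNames(sce)
metadata(sce)$note
```

After the transpose, rows are features and columns are cells, so rowData describes genes and colData describes cells.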

Examples

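# Locate the example AnnData folder shipped with singleCellTK
# (named annDataDir to avoid masking base::file.path)
annDataDir <- system.file("extdata/annData_pbmc_3k", package = "singleCellTK")
if (FALSE) {
  # Not run: reading .h5ad files relies on the Python anndata module
  sce <- importAnnData(sampleDirs = annDataDir,
                       sampleNames = 'pbmc3k_20by20')
}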