SingleCellTK (SCTK) is equipped with the generic heatmap plotting
functionality, which helps visualizing the feature expression matrices,
with flexible methods of annotation and grouping on either cells or
features. Our base function is a wrapper of ComplexHeatmap
[1].
To view detailed instructions on how to use heatmap, please select
‘Interactive Analysis’ for using heatmap in shiny application or
‘Console Analysis’ for using it on R console from the tabs below:
Entry of the Panel
The UI for plotting a generic heatmap is constructed with 5 sections: data input, data subsetting, annotation adding, parameter setting, and finally, plotting.
Basic Workflow
The data used for plotting a heatmap can only be a feature expression
matrix, technically called assay
in R language. This will
be set at the selection input “Assay to Plot”.
There is a second selection called “Import from analysis” which is used for the fast setting of plotting a differential expression (DE) analysis or marker detection specific style. The detail of this feature will be introduced later.
Similar to the condition definition in the DE analysis panel, here SCTK also adopts a data table to do a filtering and select the cells and features, in order to maintain the maximum flexibility. By default, all the cell annotation available will be displayed as shown in the screenshot above, as well as the feature annotation.
Generally, to make a selection, users will first find the annotations which can define the cells or features of interests, apply the filter, and then press “Add all filtered” to finish the selection. When there is no applicable annotation, we also allow manually entering a list of cell/feature identifiers to specify the contents to be plotted.
To make the operation cleaner, user can choose to only display the
annotation classes of interests at the multi-selection input
“Columns to display”. The outcome will look like the
screenshot above. When the annotation of interests is categorical (in
terms of R language, a character
or logical
vector or a factor
), filters can be applied by selecting
the categories of interest, at the boxes under the variable names.
The screenshots above demonstrates how to filer by limiting a range
to a continuous data (in R, a numeric
or
integer
vector).
Finally, if there is no applicable annotation to use, SCTK provides a third tab that allows direct pasting a list of unique cell/feature identifiers. As shown above.
The identifiers pasted to the text box should be able to uniquely
identify the cells or features, and they have to be able to be found in
the background information (i.e. the background
SingleCellExperiment
(SCE) object). However, they do not
necessarily have to be the default identifiers. That is to say,
identifiers that can be found in existing annotation are also
acceptable. To accomplish this, users only need to select what class of
annotation they intend to use at the selection input “Match
input cell/feature identifiers by”. Note that the option “Row
Names” indicates the default identifier when no annotation is
available.
For example, usually in a “cellRanger” style data, the default feature identifiers would usually be “Ensembl IDs”, while “gene symbols” will be embedded in the feature annotation. When users only have a list of genes symbol, they can directly use this list, and select the annotation for “gene symbol”, instead of going for some other external tools to transfer the “symbols” into “IDs”.
In this sub-panel, users can attach the annotations to be shown in the heatmap legend, and changes in colors are allowed. As mentioned previously, there are two types of values - categorical and continuous.
For categorical information, the color setting will look like the screenshot above. Each category will be displayed with a color input. Users can click on the color box to change the color for a specific category if needed.
Note that when categorical annotation is wanted but too many categories are detected, SCTK will not display any color selection but set it with default color scheme, in order to avoid a UI “overflow”.
While for a continuous value, users will still be asked if the value is categorical or continuous. The reason is that, technically, sometimes there are annotations generated in a continuous numeric data type but indicating categorical information, such as clusters labeled with integers. If users select “Categorical”, the similar color selections will be displayed as in the previous screenshot (Refer to the previous folded paragraph). If “Continuous” is selected, a different color selection UI will be shown, as the screenshot above. Here users will be asked to set two colors for a color gradient.
This is the last sub-panel for the generic heatmap setting before users can generate a plot.
After going over all the settings shown in the screenshots of the previous sections, users can confirm to generate a heatmap by clicking on the button “Plot Heatmap”. A heatmap in the screenshot above will be generated at the bottom region of the UI.
Additional Usage
The documentation above is for the general usage of SCTK Heatmap page. While heatmap is an important visualization method for many types of analysis, SCTK allows importing some types analysis results. Currently, DE analysis and marker detection results are supported. Here, we will present an example of importing the DE analysis result.
First, users need to go back to the top of the Heatmap panel, and
find “Import from analysis” selection input. On
selecting "Differential Expression"
, the session will scan
the background data and show a list of the analysis information for
users to choose from. After selecting the analysis of interests, click
“Import”.
The import functionality will automatically fill most of the options afterwards, including making up some temporary DE specific annotations and subset the data accordingly.
Note that the features automatically selected are all of the ones
stored to the result. In other words, this is affected by the parameters
used when running the DE analysis. However, sometimes users might not
want to plot all of them but apply some further filters instead. For
example, using features with the absolute value of Log2FC greater than
1
usually produces cleaner figure, than setting that bound
to 0.25
. To apply this filter in the UI, users need to
click on the input box below the column title in the table. If the
column is for categorical information, then just click on the categories
of interests; else if it is numeric, users can either manually drag the
sliders or type a “formula” into the input box. The formula follows the
rule: {low}...{high}
, where one of the two bounds can be
omitted for a one-end range. For example if users want to apply a filter
described above for Log2FC values, they need to first click
“Clear selection”, type 1...
, click
“Add all filtered”, then type ...-1
, and
click “Add all filtered” again.
Other heatmap settings will also be automatically filled for a DE specific heatmap.
To present the usage of plotSCEHeatmap()
, we would like
to use a small example provided with SCTK.
library(singleCellTK)
data("scExample") # This imports SCE object "sce"
sce
## class: SingleCellExperiment
## dim: 200 390
## metadata(0):
## assays(1): counts
## rownames(200): ENSG00000251562 ENSG00000205542 ... ENSG00000204472
## ENSG00000133872
## rowData names(2): feature_ID feature_name
## colnames(390): pbmc_4k_GACCAATTCCTAGAAC-1 pbmc_4k_AAATGCCGTTTCGCTC-1
## ... pbmc_4k_CTGAAGTTCCACTCCA-1 pbmc_4k_CATCGAACAGCCAGAA-1
## colData names(4): cell_barcode column_name sample type
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
“Raw” plotting
The minimum setting for plotSCEHeatmap()
is the input
SCE object and the data matrix to plot (default
"logcounts"
). In this way, all cells and features will be
presented while no annotation or legend (except the main color scheme)
will be shown.
plotSCEHeatmap(sce, useAssay = "counts")
Subsetting
SCTK allows relatively flexible approaches to select the cells/features to plot.
The basic way to subset the heatmap is to directly use an index
vector that can subset the input SCE object to featureIndex
and cellIndex
, including numeric
, and
logical
vectors, which are widely used, and
character
vector containing the row/col names. Of course,
user can directly use a subsetted SCE object as input.
# Make up random downsampling numeric vector
featureSubset <- sample(nrow(sce), 50)
cellSubset <- sample(ncol(sce), 50)
plotSCEHeatmap(inSCE = sce, useAssay = "counts", featureIndex = featureSubset, cellIndex = cellSubset)
rowData
/colData
In a more complex situation, where users might only have a set of
identifiers which are not inside the row/col names (i.e. unable to
directly subset the SCE object), we provide another approach. The
subset, in this situation, can be accessed via specifying a vector that
contains the identifiers users have, to featureIndexBy
or
cellIndexBy
. This specification allows directly giving one
column name of rowData
or colData
.
subsetFeatureName <- sample(rowData(sce)$feature_name, 50)
subsetCellBarcode <- sample(sce$cell_barcode, 50)
plotSCEHeatmap(inSCE = sce, useAssay = "counts", featureIndex = subsetFeatureName, featureIndexBy = "feature_name", cellIndex = subsetCellBarcode, cellIndexBy = "cell_barcode")
Adding Annotations
As introduced before, we allow directly using column names of
rowData
or colData
to attach color bar
annotations. To make use of this functionality, pass a
character
vector to rowDataName
or
colDataName
.
# Make up arbitrary annotation,
rowRandLabel <- c(rep('aa', 100), rep('bb', 100))
rowData(sce)$randLabel <- rowRandLabel
colRandLabel <- c(rep('cc', 195), rep('dd', 195))
colData(sce)$randLabel <- colRandLabel
plotSCEHeatmap(inSCE = sce, useAssay = "counts", featureIndex = featureSubset, cellIndex = cellSubset, rowDataName = "randLabel", colDataName = c("type", "randLabel"))
Fully customized annotation is also supported, though it can be
complexed for users. For the labeling, it is more recommanded to insert
the information into rowData
or colData
and
then make use. For coloring, information should be passed to
featureAnnotationColor
or cellAnnotationColor
.
The argument must be a list
object with names matching the
annotation classes (such as "randLabel"
and
"type"
); each inner object under a name must be a named
vector, with colors as the values and existing categories as the names.
The working instance looks like this:
## [1] "For row annotation"
## $randLabel
## aa bb
## "#FF4D4D" "#4DFFFF"
## [1] "For column annotation"
## $sample
## pbmc_4k
## "#FF4D4D"
##
## $type
## Singlet Doublet EmptyDroplet
## "#4DFFFF" "#FFC04D" "#4D4DFF"
##
## $randLabel
## cc dd
## "#FFFF4D" "#BA4DFF"
Others
1. Grouping/Splitting In some cases, it might be
better to do a “semi-heatmap” (i.e. split the rows/columns first and
cluster them within each group) to visualize some expression pattern,
such as evaluating the differential expression. For this need, use
rowSplitBy
or colSplitBy
, and the arguments
must be a character
vector that is a subset of the
specified annotation.
plotSCEHeatmap(inSCE = sce, useAssay = "counts", featureIndex = featureSubset, cellIndex = cellSubset, rowDataName = "randLabel", colDataName = c("type", "randLabel"), rowSplitBy = "randLabel", colSplitBy = "type")
2. Cell/Feature Labeling Text labels of features or
cells can be added via rowLabel
or colLabel
.
Use TRUE
or FALSE
to specify whether to show
the rownames
or colnames
of the subsetted SCE
object. Additionally, giving a single string of a column name of
rowData
or colData
can enable the labeling of
the annotation. Furthermore, users can directly throw a character vector
to the parameter, with the same length of either the full SCE object or
the subsetted.
3. Dendrograms The dendrograms for features or cells
can be removed by passing FALSE
to rowDend
or
colDend
.
4. Row/Column titles The row title
("Genes"
) and column title ("Cells"
) can be
changed or removed by passing a string or NULL
to
rowTitle
or colTitle
, respectively.
plotSCEHeatmap(inSCE = sce, useAssay = "counts", featureIndex = featureSubset, cellIndex = cellSubset, rowLabel = "feature_name", colLabel = as.character(1:50), colDend = FALSE, rowTitle = "Downsampled features")
There are still some parameters not mentioned here, but they are not
frequently used. Please refer to ?plotSCEHeatmap
as well as
?ComplexHeatmap::Heatmap
.