
Differential module eigengene (DME) analysis
Source:vignettes/differential_MEs.Rmd
differential_MEs.Rmd
Compiled: 27-02-2024
Source: vignettes/differential_MEs.Rmd
In this tutorial, we demonstrate how to perform differential
expression analysis for our co-expression network modules. We refer to
this analysis as “differential module eigengene” (DME) analysis, which
reveals modules that are up- or down-regulated in specified groups of
cells. For this analysis we take advantage of Seurat’s
FindMarkers
function to perform differential testing, so we
have access to all of the same options.
First we load the single-cell dataset processed through the basics tutorial and the required R libraries for this tutorial.
# single-cell analysis package
library(Seurat)
# plotting and data science packages
library(tidyverse)
library(cowplot)
library(patchwork)
library(ggrepel)
# co-expression network analysis packages:
library(WGCNA)
library(hdWGCNA)
# using the cowplot theme for ggplot
theme_set(theme_cowplot())
# set random seed for reproducibility
set.seed(12345)
# re-load the Zhou et al snRNA-seq dataset processed with hdWGCNA
seurat_obj <- readRDS('Zhou_2020_hdWGCNA.rds')
DME analysis comparing two groups
Here we discuss how to perform DME testing between two different
groups. We use the hdWGCNA function FindDMEs
, the syntax of
which is similar to the Seurat function FindMarkers
. Here
we use the Mann-Whitney U test, also known as the Wilcoxon test, to
compare two groups, but other tests can be specified with the
test.use
parameter.
Since the tutorial dataset only contains control brain samples, we
will use sex to define our two groups. FindDMEs
requires a
list of barcodes for group1 and for group2. Further, we are only going
to compare cells from the INH cluster since that is the group that we
performed network analysis on.
group1 <- seurat_obj@meta.data %>% subset(cell_type == 'INH' & msex == 0) %>% rownames
group2 <- seurat_obj@meta.data %>% subset(cell_type == 'INH' & msex != 0) %>% rownames
head(group1)
[1] "TCTTCGGCAAGACGTG-11" "ATTATCCGTTGATTCG-11" "CCACCTAGTCCAAGTT-11"
[4] "AGCGTATGTTAAGAAC-11" "GTAACTGTCCAGATCA-11" "CGATTGATCTTTACAC-11"
Next, we run the FindDMEs
function.
DMEs <- FindDMEs(
seurat_obj,
barcodes1 = group1,
barcodes2 = group2,
test.use='wilcox',
wgcna_name='INH'
)
head(DMEs)
p_val avg_log2FC pct.1 pct.2 p_val_adj module
INH-M18 4.714924e-23 0.40637874 0.894 0.729 8.486863e-22 INH-M18
INH-M16 5.257311e-08 -0.18257946 0.850 0.935 9.463160e-07 INH-M16
INH-M2 7.565615e-05 -0.18746938 0.661 0.738 1.361811e-03 INH-M2
INH-M15 1.496899e-03 -0.06603535 0.969 0.977 2.694417e-02 INH-M15
INH-M10 1.513458e-02 -0.07774661 0.975 0.980 2.724224e-01 INH-M10
INH-M17 2.034644e-02 0.24035946 0.589 0.557 3.662360e-01 INH-M17
We can visualize the results using the hdWGNCA functions
PlotDMEsLollipop
or PlotDMEsVolcano
. First we
make a lollipop plot to visualize the DME results.
PlotDMEsLollipop(
seurat_obj,
DMEs,
wgcna_name='INH',
pvalue = "p_val_adj"
)
This plot shows the fold-change for each of the modules, and the size of each dot corresponds to the number of genes in that module. An “X” is placed over each point that does not reach statistical significance.
For PlotDMEsLollipop
, if we have other self-defined
columns in the DMEs data frame, we can use group.by
parameter to supply the column names and comparison arguments for one
comparison group or a list of comparison groups for plotting. For
example:
PlotDMEsLollipop(
seurat_obj,
DMEs,
wgcna_name='INH',
group.by = "Comparisons",
comparison = c("group1_vs_control", "group2_vs_control"),
pvalue = "p_val_adj"
)
Alternatively, we can use PlotDMEsVolcano
to make a
volcano plot to show the effect size and the significance level
together.
PlotDMEsVolcano(
seurat_obj,
DMEs,
wgcna_name = 'INH'
)
Looping through multiple clusters
We can run DME analysis in a loop to perform the comparison in
multiple groups. Here we will loop through the 5 INH subclusters and run
FindDMEs
in each cluster.
# list of clusters to loop through
clusters <- c("INH1 VIP+", "INH2 PVALB+", "INH3 SST+", 'INH4 LAMP5+', "INH5 PVALB+")
# set up an empty dataframe for the DMEs
DMEs <- data.frame()
# loop through the clusters
for(cur_cluster in clusters){
# identify barcodes for group1 and group2 in eadh cluster
group1 <- seurat_obj@meta.data %>% subset(annotation == cur_cluster & msex == 0) %>% rownames
group2 <- seurat_obj@meta.data %>% subset(annotation == cur_cluster & msex != 0) %>% rownames
# run the DME test
cur_DMEs <- FindDMEs(
seurat_obj,
barcodes1 = group1,
barcodes2 = group2,
test.use='wilcox',
pseudocount.use=0.01, # we can also change the pseudocount with this param
wgcna_name = 'INH'
)
# add the cluster info to the table
cur_DMEs$cluster <- cur_cluster
# append the table
DMEs <- rbind(DMEs, cur_DMEs)
}
Here we will use ggplot2 to make a heatmap showing the DME effect sizes. Since we are just using ggplot2 here you can easily customize this heatmap to suit your needs.
# get the modules table:
modules <- GetModules(seurat_obj)
mods <- levels(modules$module); mods <- mods[mods != 'grey']
# make a copy of the DME table for plotting
plot_df <- DMEs
# set the factor level for the modules so they plot in the right order:
plot_df$module <- factor(as.character(plot_df$module), levels=mods)
# set a min/max threshold for plotting
maxval <- 0.5; minval <- -0.5
plot_df$avg_log2FC <- ifelse(plot_df$avg_log2FC > maxval, maxval, plot_df$avg_log2FC)
plot_df$avg_log2FC <- ifelse(plot_df$avg_log2FC < minval, minval, plot_df$avg_log2FC)
# add significance levels
plot_df$Significance <- gtools::stars.pval(plot_df$p_val_adj)
# change the text color to make it easier to see
plot_df$textcolor <- ifelse(plot_df$avg_log2FC > 0.2, 'black', 'white')
# make the heatmap with geom_tile
p <- plot_df %>%
ggplot(aes(y=cluster, x=module, fill=avg_log2FC)) +
geom_tile()
# add the significance levels
p <- p +
geom_text(label=plot_df$Significance, color=plot_df$textcolor)
# customize the color and theme of the plot
p <- p +
scale_fill_gradient2(low='purple', mid='black', high='yellow') +
RotatedAxis() +
theme(
panel.border = element_rect(fill=NA, color='black', size=1),
axis.line.x = element_blank(),
axis.line.y = element_blank(),
plot.margin=margin(0,0,0,0)
) + xlab('') + ylab('') +
coord_equal()
p
One-versus-all DME analysis
Similar to the Seurat function FindAllMarkers
, we can
perform a one-versus-all DME test using the function
FindAllDMEs
when specifying a column to group cells. Here
we will group.by
each cell type for the one-versus-all
test.
group.by = 'cell_type'
DMEs_all <- FindAllDMEs(
seurat_obj,
group.by = 'cell_type',
wgcna_name = 'INH'
)
head(DMEs_all)
The output looks similar to FindDMEs
, but there is an
extra column called group
containing the information for
each cell grouping.
p_val avg_log2FC pct.1 pct.2 p_val_adj module group
EX.INH-M1 0 -10.313763 0.003 0.621 0 INH-M1 EX
EX.INH-M2 0 1.007115 0.670 0.445 0 INH-M2 EX
EX.INH-M3 0 2.641423 0.729 0.205 0 INH-M3 EX
EX.INH-M4 0 1.978553 0.910 0.246 0 INH-M4 EX
EX.INH-M5 0 2.447565 0.865 0.185 0 INH-M5 EX
EX.INH-M6 0 2.448315 0.877 0.209 0 INH-M6 EX
Now we can plot the results with PlotDMEsVolcano
p <- PlotDMEsVolcano(
seurat_obj,
DMEs_all,
wgcna_name = 'INH',
plot_labels=FALSE,
show_cutoff=FALSE
)
# facet wrap by each cell type
p + facet_wrap(~group, ncol=3)
Using ModuleScores instead of MEs
We can also perform differential analysis using module expression
scores instead of module eigengenes. These must first be calculated
using the function ModuleExprScore
, as shown below.
seurat_obj <- ModuleExprScore(
seurat_obj,
n_genes = 25,
method='UCell'
)
Next we can perform differential analysis using these module scores
by specifying features = 'ModuleScores'
in the
FindDMEs
function.
group1 <- seurat_obj@meta.data %>% subset(cell_type == 'INH' & msex == 0) %>% rownames
group2 <- seurat_obj@meta.data %>% subset(cell_type == 'INH' & msex != 0) %>% rownames
DMEs_scores <- FindDMEs(
seurat_obj,
features = 'ModuleScores',
barcodes1 = group1,
barcodes2 = group2,
test.use='wilcox',
wgcna_name='tutorial'
)
p <- PlotDMEsLollipop(
seurat_obj,
DMEs,
wgcna_name = 'tutorial',
pvalue = "p_val_adj"
)
p