This function takes a Seurat object and constructs averaged 'metacells' based on neighboring cells in provided groupings, such as cluster or cell type.
Usage
MetacellsByGroups(
seurat_obj,
group.by = c("seurat_clusters"),
ident.group = "seurat_clusters",
k = 25,
reduction = "pca",
dims = NULL,
assay = NULL,
slot = "counts",
layer = "counts",
mode = "average",
cells.use = NULL,
min_cells = 100,
max_shared = 15,
target_metacells = 1000,
max_iter = 5000,
verbose = FALSE,
wgcna_name = NULL
)
Arguments
- seurat_obj
A Seurat object
- group.by
A character vector of Seurat metadata column names representing groups for which metacells will be computed.
- k
Number of nearest neighbors to aggregate. Default = 25
- reduction
A dimensionality reduction stored in the Seurat object. Default = 'pca'
- dims
A vector representing the dimensions of the reduction to use, given either as dimension names or as indices. Default = NULL, which includes all dims.
- assay
Assay to extract data for aggregation. Default = 'RNA'
- slot
Slot to extract data for aggregation. Default = 'counts'. Slot is used with Seurat v4 instead of layer.
- layer
Layer to extract data for aggregation. Default = 'counts'. Layer is used with Seurat v5 instead of slot.
- mode
Determines how the gene expression profile of each metacell is computed from its constituent single cells. Options are "average" or "sum".
- min_cells
the minimum number of cells in a particular grouping to construct metacells
- max_shared
the maximum number of cells to be shared across two metacells
- target_metacells
the maximum target number of metacells to construct
- max_iter
the maximum number of iterations in the metacells bootstrapping loop
- verbose
logical indicating whether to print additional information
- wgcna_name
name of the WGCNA experiment
- name
A string appended to resulting metacells. Default = 'agg'
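The difference between the two mode options can be seen on a toy matrix. The following is a minimal base R sketch, not the package's code: the counts matrix and the helper aggregate_cells are hypothetical and exist only for illustration.

```r
# Toy counts matrix: 3 genes (rows) x 4 cells (columns), filled column by column.
counts <- matrix(
  c(0, 2, 4,
    2, 0, 4,
    4, 2, 0,
    2, 4, 0),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), paste0("cell", 1:4))
)

# Hypothetical helper (not part of the package): collapse the chosen cells
# into a single metacell expression profile.
aggregate_cells <- function(mat, cells, mode = c("average", "sum")) {
  mode <- match.arg(mode)
  sub <- mat[, cells, drop = FALSE]
  if (mode == "average") rowMeans(sub) else rowSums(sub)
}

members <- c("cell1", "cell2", "cell3")
avg_profile <- aggregate_cells(counts, members, mode = "average")
sum_profile <- aggregate_cells(counts, members, mode = "sum")
```

Averaging keeps the metacell on the same scale as a single cell, while summing produces larger values that grow with the number of aggregated cells.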
Details
MetacellsByGroups merges transcriptomically similar cells into "metacells". Given a dimensionality-reduced representation of the input dataset, this algorithm first uses KNN to identify similar cells. A bootstrapped sampling procedure is then used to group together similar cells until convergence is reached. Importantly, this procedure is done in a context-specific manner based on the provided group.by parameters. Typically this means that metacells will be constructed separately for each biological replicate, cell type or cell state, disease condition, etc. The metacell representation is considerably less sparse than the original single-cell dataset, which is preferable for co-expression network analysis or other analyses that rely on correlations.
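The general idea of combining KNN with repeated sampling can be sketched in base R. This is a simplified illustration under stated assumptions, not the function's actual implementation: the random embedding stands in for the PCA coordinates of the cells in one group, the acceptance rule based on max_shared is an assumed reading of that argument, and the loop simply stops at target_metacells or max_iter because the exact convergence criterion is not described here.

```r
set.seed(1)

# Toy embedding: 60 cells x 5 dimensions (stand-in for PCA coordinates
# of the cells belonging to a single group).
n_cells <- 60
embedding <- matrix(rnorm(n_cells * 5), nrow = n_cells)
rownames(embedding) <- paste0("cell", seq_len(n_cells))

k <- 10                # neighbors aggregated per metacell
max_shared <- 3        # assumed rule: max cells two metacells may share
target_metacells <- 8  # stop once this many metacells are accepted
max_iter <- 500        # upper bound on sampling attempts

# Step 1: k nearest neighbors from Euclidean distances.
# Each cell is its own nearest neighbor (distance 0).
dist_mat <- as.matrix(dist(embedding))
knn <- t(apply(dist_mat, 1, function(d) order(d)[seq_len(k)]))

# Step 2: repeatedly sample a seed cell and accept its neighborhood
# only if it overlaps every accepted neighborhood by <= max_shared cells.
chosen <- integer(0)
iter <- 0
while (length(chosen) < target_metacells && iter < max_iter) {
  iter <- iter + 1
  candidates <- setdiff(seq_len(n_cells), chosen)
  candidate <- candidates[sample.int(length(candidates), 1)]
  overlaps <- vapply(
    chosen,
    function(s) length(intersect(knn[s, ], knn[candidate, ])),
    integer(1)
  )
  if (all(overlaps <= max_shared)) chosen <- c(chosen, candidate)
}

# One row per accepted metacell, holding the indices of its member cells.
metacell_members <- knn[chosen, , drop = FALSE]
```

In this sketch a smaller max_shared yields more distinct metacells but fewer of them, since more sampled neighborhoods are rejected. Running the same procedure separately on the cells of each group would mirror the role of group.by.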