A wrapper function to compute chromatin-state–based scores from scATAC-seq data stored in a Seurat object. This function orchestrates the full pipeline: (1) annotates peaks with ChromHMM states (or accepts pre-annotated peaks), (2) filters peaks (stoplist, nonstandard chromosomes, and low-coverage filtering), (3) constructs and normalizes a chromatin state count/fraction matrix per cell, (4) computes an erosion score (active vs. repressive balance), and (5) computes an entropy-based plasticity score.
Usage
RunChromatic(
seurat_obj,
chromHMM_states,
peaks_gr = NULL,
stoplist = NULL,
remove_nonstandard_chromosomes = TRUE,
min_overlap = 50,
min_overlap_frac = 0.25,
filter_features = TRUE,
skip_annotation = FALSE,
min_cells = 100,
min_counts = 100,
state_signs = NULL,
covariates = NULL,
state_col = "name",
active_patterns = c("TssA", "TssFlnk", "Tx", "EnhA", "EnhG", "EnhWk"),
repressive_patterns = c("ReprPC", "Quies", "Het"),
pseudocount = 0.5,
z_group_by = NULL,
z_group_name = NULL,
assay = "ATAC"
)
Arguments
- seurat_obj
Seurat object containing scATAC-seq data. Must have peaks as features (rows) and cells as columns in the assay specified.
- chromHMM_states
A GRanges object containing ChromHMM state annotations. Should have a column with state labels specified by state_col.
- peaks_gr
(Optional) A GRanges object of peaks. If NULL (default), peaks are extracted from the input Seurat object. If provided, it is used directly.
- stoplist
(Optional) A GRanges object of regions to exclude (e.g. blacklisted regions). If provided, peaks overlapping these regions are removed.
- remove_nonstandard_chromosomes
Logical; if TRUE (default), non-standard chromosomes (e.g. scaffolds) will be removed from both peaks and chromHMM annotations.
- min_overlap
Numeric; minimum overlap width (bp) required for a peak to be assigned to a ChromHMM state. Default = 50.
- min_overlap_frac
Numeric or NULL; minimum fraction of a peak's length that must overlap a ChromHMM state for assignment. Default = 0.25.
- filter_features
Logical; if TRUE (default), peaks are filtered to exclude features with low coverage using ExcludeUncommonPeaks.
- skip_annotation
Logical; if TRUE, skips the ChromHMM annotation step and assumes that the provided peaks_gr (or the peaks from seurat_obj) already contain an annotation column. Default = FALSE.
- min_cells
Integer; minimum number of cells required for a peak to be kept (passed to ExcludeUncommonPeaks). Default = 100.
- min_counts
Integer; minimum total counts required for a peak to be kept (passed to ExcludeUncommonPeaks). Default = 100.
- state_signs
Named vector indicating the “sign” (active/repressive) of each chromatin state. If NULL (default), it is generated automatically from chromHMM_states using ChromatinStateSigns() and the provided patterns.
- covariates
(Optional) Character vector of covariate column names from seurat_obj@meta.data to regress out from the scores (e.g. TSS.enrichment, nCount_ATAC).
- state_col
Character; name of the metadata column in chromHMM_states containing the state label. Default = "name".
- active_patterns
Character vector of regex patterns used to identify active states (passed to ChromatinStateSigns()). Default includes "TssA", "TssFlnk", "Tx", "EnhA", "EnhG", "EnhWk".
- repressive_patterns
Character vector of regex patterns used to identify repressive states (passed to ChromatinStateSigns()). Default includes "ReprPC", "Quies", "Het".
- pseudocount
Numeric; pseudocount to add before calculating fractions (used by scoring functions). Default = 0.5.
- z_group_by
(Optional) Column name in seurat_obj@meta.data specifying a group to use as reference for baseline Z-scoring. If NULL, Z-scores are computed across all cells.
- z_group_name
(Optional) Name of the group (within z_group_by) to use as reference for baseline Z-scoring.
- assay
Character; name of the Seurat assay containing scATAC data. Default = 'ATAC'.
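The active_patterns and repressive_patterns arguments are regular expressions, so a pattern such as "Tx" also matches labels like "TxWk". The following is a minimal base-R sketch of how such patterns could be turned into a signed vector. The helper sketch_state_signs is hypothetical and written only for illustration; it is not ChromatinStateSigns(), whose exact behavior (for example, how unmatched or doubly matched states are handled) is not described here. The state labels are example inputs.

```r
# Hypothetical helper (not part of the package): label each state as
# active (+1), repressive (-1), or unassigned (0) by regex matching.
# Assumption: if a label matches both pattern sets, active takes precedence.
sketch_state_signs <- function(states, active_patterns, repressive_patterns) {
  is_active <- grepl(paste(active_patterns, collapse = "|"), states)
  is_repressive <- grepl(paste(repressive_patterns, collapse = "|"), states)
  signs <- ifelse(is_active, 1, ifelse(is_repressive, -1, 0))
  names(signs) <- states
  signs
}

# Example state labels (illustrative input only)
states <- c("TssA", "TssBiv", "EnhA1", "TxWk", "ReprPCWk", "Quies")

signs <- sketch_state_signs(
  states,
  active_patterns = c("TssA", "TssFlnk", "Tx", "EnhA", "EnhG", "EnhWk"),
  repressive_patterns = c("ReprPC", "Quies", "Het")
)
print(signs)
```

In this sketch, a label such as "TssBiv" matches neither default pattern set and is left at 0. If bivalent or other states should count toward one side, passing an explicit state_signs vector is one way to make that choice visible.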
Value
A list containing:
- peaks_gr
GenomicRanges object of the filtered and annotated peaks.
- state_counts
Matrix of raw chromatin-state counts per cell.
- state_frac
Matrix of state fractions per cell.
- state_CLR
Matrix of CLR-normalized state values.
- state_z
Matrix of Z-scored CLR values (optionally baseline-scaled).
- scores
Data frame containing erosion and entropy scores per cell (optionally covariate-regressed).
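A typical call and the handling of the returned list might look like the sketch below. The wrapper run_chromatic_example is hypothetical, and the metadata column "condition" and group level "control" are placeholder names that would need to exist in seurat_obj@meta.data. The covariate names are the examples given in the argument description. Only the argument names and list elements documented on this page are used.

```r
# Hypothetical wrapper for illustration only. `seurat_obj` and
# `chromHMM_states` must be supplied by the caller; "condition" and
# "control" are placeholder metadata names, not package defaults.
run_chromatic_example <- function(seurat_obj, chromHMM_states, stoplist = NULL) {
  res <- RunChromatic(
    seurat_obj,
    chromHMM_states,
    stoplist = stoplist,
    covariates = c("TSS.enrichment", "nCount_ATAC"),
    z_group_by = "condition",
    z_group_name = "control",
    assay = "ATAC"
  )

  # Pick out some of the elements documented under Value
  list(
    scores = res$scores,         # per-cell erosion and entropy scores
    state_frac = res$state_frac, # state fractions per cell
    state_z = res$state_z,       # Z-scored CLR values
    n_peaks = length(res$peaks_gr)
  )
}
```

Defining the call inside a function keeps the sketch loadable without data; calling it requires the package, a Seurat object with the named assay, and a ChromHMM GRanges object.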
Details
The workflow consists of:
Peak annotation with ChromHMM states (unless skip_annotation = TRUE).
Filtering peaks using stoplists, standard chromosomes, and minimum overlap rules.
Feature-level filtering for low-coverage peaks.
Construction of a cell-by-state matrix and normalization via fractions, CLR, and Z-scores (with optional reference group scaling).
Calculation of the erosion score (balance between active and repressive states).
Calculation of the entropy score (plasticity of chromatin states per cell).
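The normalization and scoring steps above can be illustrated with a small base-R sketch on a toy cell-by-state matrix. Everything here is an assumption made for illustration: the counts are made up, the matrix orientation (cells in rows) is chosen for convenience, and the two score formulas (a signed difference of mean CLR values, and Shannon entropy of the state fractions) are plausible stand-ins rather than the package's confirmed definitions. The package may use different formulas or a different sign convention for the erosion score.

```r
# Toy cell-by-state count matrix (3 cells x 4 states); values are made up.
counts <- matrix(
  c(40, 30, 10, 20,
    10, 10, 10, 10,
     5,  5, 60, 30),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("cell", 1:3), c("TssA", "EnhA", "ReprPC", "Quies"))
)
state_signs <- c(TssA = 1, EnhA = 1, ReprPC = -1, Quies = -1)
pseudocount <- 0.5

# Fractions after adding a pseudocount (each row sums to 1)
padded <- counts + pseudocount
frac <- padded / rowSums(padded)

# Centered log-ratio: log fraction minus the per-cell mean log fraction
clr <- log(frac) - rowMeans(log(frac))

# Z-score each state across all cells
z_all <- scale(clr)

# Z-score against a reference group (here cell1 and cell2, chosen arbitrarily)
ref <- c("cell1", "cell2")
z_ref <- scale(
  clr,
  center = colMeans(clr[ref, ]),
  scale = apply(clr[ref, ], 2, sd)
)

# Assumed balance score: mean CLR of active states minus mean CLR of
# repressive states (sign convention is an assumption)
balance <- rowMeans(clr[, state_signs > 0, drop = FALSE]) -
  rowMeans(clr[, state_signs < 0, drop = FALSE])

# Assumed plasticity score: Shannon entropy of the state fractions
entropy <- -rowSums(frac * log(frac))

print(round(cbind(balance, entropy), 3))
```

In this toy example, the cell with uniform counts has a balance of zero and the maximum possible entropy, log(4), while cells dominated by active or repressive states move away from zero in opposite directions.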