A wrapper function to compute chromatin-state–based scores from scATAC-seq data stored in a Seurat object. This function orchestrates the full pipeline: (1) annotates peaks with ChromHMM states (or accepts pre-annotated peaks), (2) filters peaks (stoplist, nonstandard chromosomes, and low-coverage filtering), (3) constructs and normalizes a chromatin state count/fraction matrix per cell, (4) computes an erosion score (active vs. repressive balance), and (5) computes an entropy-based plasticity score.

Usage

RunChromatic(
  seurat_obj,
  chromHMM_states,
  peaks_gr = NULL,
  stoplist = NULL,
  remove_nonstandard_chromosomes = TRUE,
  min_overlap = 50,
  min_overlap_frac = 0.25,
  filter_features = TRUE,
  skip_annotation = FALSE,
  min_cells = 100,
  min_counts = 100,
  state_signs = NULL,
  covariates = NULL,
  state_col = "name",
  active_patterns = c("TssA", "TssFlnk", "Tx", "EnhA", "EnhG", "EnhWk"),
  repressive_patterns = c("ReprPC", "Quies", "Het"),
  pseudocount = 0.5,
  z_group_by = NULL,
  z_group_name = NULL,
  assay = "ATAC"
)

Arguments

seurat_obj

Seurat object containing scATAC-seq data. Must have peaks as features (rows) and cells as columns in the assay specified.

chromHMM_states

A GRanges object containing ChromHMM state annotations. Should have a column with state labels specified by state_col.

peaks_gr

(Optional) A GRanges object of peaks. If NULL (default), peaks are extracted from the input Seurat object. If provided, these peaks are used directly instead.

stoplist

(Optional) A GRanges object of regions to exclude (e.g. blacklisted regions). If provided, peaks overlapping these regions will be removed.

remove_nonstandard_chromosomes

Logical; if TRUE (default), non-standard chromosomes (e.g. scaffolds) will be removed from both peaks and chromHMM annotations.

min_overlap

Numeric; minimum overlap width (bp) required for a peak to be assigned to a ChromHMM state. Default = 50.

min_overlap_frac

Numeric or NULL; minimum fraction of a peak's length that must overlap a ChromHMM state for assignment. Default = 0.25.

filter_features

Logical; if TRUE (default), peaks are filtered to exclude features with low coverage using ExcludeUncommonPeaks.

skip_annotation

Logical; if TRUE, skips the ChromHMM annotation step and assumes that the provided peaks_gr (or the peaks from seurat_obj) already contain an annotation column. Default = FALSE.

min_cells

Integer; minimum number of cells required for a peak to be kept (passed to ExcludeUncommonPeaks). Default = 100.

min_counts

Integer; minimum total counts required for a peak to be kept (passed to ExcludeUncommonPeaks). Default = 100.

state_signs

Named vector indicating the “sign” (active/repressive) of each chromatin state. If NULL (default), will be generated automatically from chromHMM_states using ChromatinStateSigns() and the provided patterns.

covariates

(Optional) Character vector of covariate column names from seurat_obj@meta.data to regress out from the scores (e.g. TSS.enrichment, nCount_ATAC).

state_col

Character; name of the metadata column in chromHMM_states containing the state label. Default = "name".

active_patterns

Character vector of regex patterns used to identify active states (passed to ChromatinStateSigns()). Default includes "TssA", "TssFlnk", "Tx", "EnhA", "EnhG", "EnhWk".

repressive_patterns

Character vector of regex patterns used to identify repressive states (passed to ChromatinStateSigns()). Default includes "ReprPC", "Quies", "Het".

pseudocount

Numeric; pseudocount to add before calculating fractions (used by scoring functions). Default = 0.5.

z_group_by

(Optional) Column name in seurat_obj@meta.data specifying a group to use as reference for baseline Z-scoring. If NULL, Z-scores are computed across all cells.

z_group_name

(Optional) Name of the group (within z_group_by) to use as reference for baseline Z-scoring.

assay

Character; name of the Seurat assay containing scATAC data. Default = "ATAC".
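To make the roles of pseudocount, z_group_by, and z_group_name more concrete, the following base-R sketch walks a toy count matrix through fractions, a centered log-ratio (CLR) transform, and Z-scoring against a reference group. It is an illustration under stated assumptions, not the package's implementation: the toy matrix, the cells-as-rows orientation, the group labels, and the exact point at which the pseudocount is added are all assumptions made for this example.

```r
# Toy cell-by-state count matrix (assumed orientation: cells in rows).
state_counts <- matrix(
  c(40, 10, 50,
    20, 30, 50,
    10, 60, 30,
    25, 25, 50),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("cell", 1:4), c("TssA", "EnhA", "Quies"))
)

# Assumption: the pseudocount is added to the counts before fractions are taken.
pseudocount <- 0.5
smoothed <- state_counts + pseudocount
state_frac <- smoothed / rowSums(smoothed)

# CLR: log fraction minus the per-cell mean log fraction.
log_frac <- log(state_frac)
state_clr <- log_frac - rowMeans(log_frac)

# Baseline Z-scoring: centre and scale each state using only the reference
# group (here the hypothetical label "ctrl" plays the role of z_group_name).
group <- c("ctrl", "ctrl", "treated", "treated")
ref <- state_clr[group == "ctrl", , drop = FALSE]
centered <- sweep(state_clr, 2, colMeans(ref), "-")
state_z <- sweep(centered, 2, apply(ref, 2, sd), "/")
```

With this construction the reference cells have mean zero in every state, so Z-scores of the remaining cells read as deviations from the reference baseline.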

Value

A list containing:

peaks_gr

GRanges object of the filtered and annotated peaks.

state_counts

Matrix of raw chromatin-state counts per cell.

state_frac

Matrix of state fractions per cell.

state_CLR

Matrix of CLR-normalized state values.

state_z

Matrix of Z-scored CLR values (optionally baseline-scaled).

scores

Data frame containing erosion and entropy scores per cell (optionally covariate-regressed).
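One common way to regress covariates out of a per-cell score is to fit a linear model and keep the residuals. The sketch below shows that idea in base R. Whether RunChromatic uses exactly this model is not stated here, so treat it as an assumption; the toy values and the column names score and score_adj are hypothetical.

```r
# Toy per-cell metadata; values are invented for illustration only.
meta <- data.frame(
  score          = c(0.2, 0.5, 0.1, 0.9, 0.7, 0.4),
  nCount_ATAC    = c(1200, 3400, 900, 5100, 4300, 2500),
  TSS.enrichment = c(4.1, 5.3, 3.8, 6.0, 5.1, 4.9)
)

# Covariates named as they would be passed via the `covariates` argument.
covariates <- c("nCount_ATAC", "TSS.enrichment")

# Build score ~ nCount_ATAC + TSS.enrichment and keep the residuals,
# i.e. the part of the score not explained linearly by the covariates.
fit <- lm(reformulate(covariates, response = "score"), data = meta)
meta$score_adj <- residuals(fit)
```

Residuals from an ordinary least-squares fit with an intercept have mean zero and are uncorrelated with each covariate, which is the property that makes them a reasonable "adjusted" score.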

Details

The workflow consists of:

  1. Peak annotation with ChromHMM states (unless skip_annotation=TRUE).

  2. Filtering of peaks by removing stoplist overlaps, restricting to standard chromosomes, and applying the minimum-overlap rules.

  3. Feature-level filtering for low-coverage peaks.

  4. Construction of a cell-by-state matrix and normalization via fractions, CLR, and Z-scores (with optional reference group scaling).

  5. Calculation of the erosion score (balance between active and repressive states).

  6. Calculation of the entropy score (plasticity of chromatin states per cell).
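The last two steps can be illustrated with a small base-R sketch. It assigns signs to states by regex matching, then computes an active-versus-repressive balance and a normalized Shannon entropy per cell. Everything here is an assumption made for illustration: the helper sketch_state_signs is hypothetical and is not ChromatinStateSigns(), the +1/-1/0 coding and the handling of unmatched states are guesses, and the package's erosion score may use a different formula or the opposite sign from the log-ratio shown.

```r
# Hypothetical helper: +1 for active, -1 for repressive, 0 if unmatched.
# If a label matched both pattern sets, the repressive assignment would win.
sketch_state_signs <- function(states, active_patterns, repressive_patterns) {
  matches_any <- function(patterns) {
    grepl(paste(patterns, collapse = "|"), states)
  }
  signs <- integer(length(states))
  signs[matches_any(active_patterns)] <- 1L
  signs[matches_any(repressive_patterns)] <- -1L
  names(signs) <- states
  signs
}

signs <- sketch_state_signs(
  states = c("TssA", "TxWk", "EnhA1", "Het", "TssBiv", "ReprPCWk", "Quies"),
  active_patterns = c("TssA", "TssFlnk", "Tx", "EnhA", "EnhG", "EnhWk"),
  repressive_patterns = c("ReprPC", "Quies", "Het")
)

# Toy per-cell state fractions (cells in rows; values invented).
state_frac <- rbind(
  cell1 = c(TssA = 0.50, EnhA1 = 0.30, ReprPCWk = 0.10, Quies = 0.10),
  cell2 = c(TssA = 0.25, EnhA1 = 0.25, ReprPCWk = 0.25, Quies = 0.25)
)
s <- signs[colnames(state_frac)]

# Assumed balance: log2 ratio of total active to total repressive fraction.
active <- rowSums(state_frac[, s > 0, drop = FALSE])
repressive <- rowSums(state_frac[, s < 0, drop = FALSE])
balance <- log2(active / repressive)

# Shannon entropy of each cell's state distribution, scaled to [0, 1].
shannon <- function(p) {
  p <- p[p > 0]
  -sum(p * log(p))
}
entropy <- apply(state_frac, 1, shannon) / log(ncol(state_frac))
```

In this toy example the bivalent label "TssBiv" matches neither pattern set, which shows why supplying state_signs explicitly can be useful for annotations with unusual state names. The cell with a uniform state distribution reaches the maximum normalized entropy, while the cell dominated by active states has a positive balance and lower entropy.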

Examples

if (FALSE) { # \dontrun{
output <- RunChromatic(
  seurat_obj = seurat_obj,
  chromHMM_states = chromHMM_states
)
} # }