Filters out peaks that are rarely observed across cells or have low total counts in a peak-by-cell matrix. This is a simple feature selection step prior to downstream analyses (e.g., entropy or erosion scores).

Usage

ExcludeUncommonPeaks(
  peaks_gr,
  peaks_mat,
  min_cells = 100,
  min_counts = 100,
  verbose = TRUE
)

Arguments

peaks_gr

A GRanges object containing the genomic coordinates of peaks. Its ranges should correspond one-to-one to the rows (peaks) of peaks_mat.

peaks_mat

A numeric matrix of peak counts (rows = peaks, columns = cells).

min_cells

Integer; minimum number of cells in which a peak must be detected (nonzero counts) to be retained. Default = 100.

min_counts

Integer; minimum total counts across all cells for a peak to be retained. Default = 100.

verbose

Logical; if TRUE, print a message summarizing how many peaks were retained and how many were removed. Default = TRUE.
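Because min_cells and min_counts are absolute thresholds, a value of 100 can remove every peak from a dataset with fewer than 100 cells. Before settling on thresholds, it can help to count how many peaks would survive a few candidate values. The following is a minimal base-R sketch on a hypothetical toy matrix; the names candidate_min_cells and surviving are illustrative and not part of the package.

```r
# Hypothetical toy peak-by-cell matrix: 4 peaks (rows) x 5 cells (columns).
peaks_mat <- matrix(
  c(0, 0, 0, 0, 9,
    1, 1, 1, 0, 0,
    2, 3, 0, 4, 1,
    5, 5, 5, 5, 5),
  nrow = 4, byrow = TRUE
)

# Number of cells in which each peak has a nonzero count.
n_cells_detected <- rowSums(peaks_mat > 0)

# For each candidate threshold, count the peaks detected in at least that many cells.
candidate_min_cells <- c(1, 3, 5)
surviving <- vapply(
  candidate_min_cells,
  function(k) sum(n_cells_detected >= k),
  integer(1)
)
names(surviving) <- paste0("min_cells=", candidate_min_cells)
surviving
```

The same approach works for min_counts by replacing n_cells_detected with rowSums(peaks_mat).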

Value

A named list with:

  • peaks_gr — the filtered GRanges object of peaks.

  • peaks_mat — the filtered peak-by-cell count matrix.

Details

The function applies two filters:

  1. A peak must be present (nonzero) in at least min_cells cells.

  2. A peak must have at least min_counts total counts across all cells.

Peaks passing both filters are retained in the output. This step helps reduce noise and memory usage in large peak-by-cell matrices.
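The two filters described above can be sketched in base R as follows. This is an illustration of the documented behavior on a hypothetical toy matrix, not the package's actual implementation; the variable names keep and filtered_mat are assumptions. It also assumes that the GRanges object would be subset with the same logical index (peaks_gr[keep]) so that ranges and rows stay aligned.

```r
# Hypothetical toy peak-by-cell matrix: 4 peaks (rows) x 5 cells (columns).
peaks_mat <- matrix(
  c(0, 0, 0, 0, 9,   # peak1: detected in 1 cell, 9 total counts
    1, 1, 1, 0, 0,   # peak2: detected in 3 cells, 3 total counts
    2, 3, 0, 4, 1,   # peak3: detected in 4 cells, 10 total counts
    0, 0, 0, 0, 0),  # peak4: never detected
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("peak", 1:4), paste0("cell", 1:5))
)

# Small thresholds chosen to suit the toy data.
min_cells  <- 3
min_counts <- 5

# Filter 1: number of cells with a nonzero count for each peak.
n_cells_detected <- rowSums(peaks_mat > 0)

# Filter 2: total counts across all cells for each peak.
total_counts <- rowSums(peaks_mat)

# A peak is kept only if it passes both filters.
keep <- n_cells_detected >= min_cells & total_counts >= min_counts

# drop = FALSE keeps the result a matrix even if a single peak survives.
filtered_mat <- peaks_mat[keep, , drop = FALSE]
rownames(filtered_mat)
```

In this toy case only peak3 passes both filters: peak1 fails the cell filter, peak2 fails the count filter, and peak4 fails both.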

Examples

if (FALSE) { # \dontrun{
filtered <- ExcludeUncommonPeaks(
  peaks_gr = peaks_gr,
  peaks_mat = peaks_mat,
  min_cells = 200,
  min_counts = 500,
  verbose = TRUE
)
filtered$peaks_gr
filtered$peaks_mat
} # }