Clarification on high Nfail counts in modkit pileup output despite valid coverage #577

Hi modkit team,

I am analyzing CpG methylation using modkit pileup on Oxford Nanopore data
(mapped BAMs, reference-guided).

Across multiple samples, most of the reads covering a CpG are counted in the
Nfail column, even at sites where valid_coverage ≥ 3.

Some details:
- Genome: Plasmodium falciparum 3D7
- Command used (example):

  modkit pileup \
    --reference PlasmoDB-64_Pfalciparum3D7_Genome.fasta \
    --modified-bases C \
    --combine-mods \
    input.bam output.bed

- In the resulting bedMethyl files:
  - valid_coverage is often high (≥3 for many CpGs)
  - count_modified and count_canonical are low
  - most reads appear to be classified as Nfail
  - fraction of methylated CpGs among covered sites is very low

This behavior is consistent across samples and across different coverage cutoffs.
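
To put a number on this, here is a minimal Python sketch that tabulates the share of failed calls across sites passing a coverage cutoff. It assumes the standard 18-column bedMethyl layout (valid coverage in column 10, Nfail in column 16) and that valid coverage excludes failed calls, which is how I read the modkit documentation. The function name `fail_summary` and the default cutoff are illustrative only.

```python
# Assumed 0-based column positions in an 18-column modkit bedMethyl row.
VALID_COV = 9   # valid_coverage (Nvalid_cov)
N_FAIL = 15     # count_fail (Nfail)


def fail_summary(lines, min_valid_cov=3):
    """Summarize failed calls over sites with at least min_valid_cov valid calls.

    Returns (sites_kept, total_fail, total_valid, fail_share), where
    fail_share = total_fail / (total_fail + total_valid).
    """
    sites = total_fail = total_valid = 0
    for line in lines:
        # Split on any whitespace so tab- and space-delimited rows both parse.
        fields = line.split()
        if len(fields) < 18 or not fields[VALID_COV].isdigit():
            continue  # skip blank lines and an optional header row
        valid = int(fields[VALID_COV])
        if valid < min_valid_cov:
            continue
        sites += 1
        total_valid += valid
        total_fail += int(fields[N_FAIL])
    denom = total_fail + total_valid
    fail_share = total_fail / denom if denom else 0.0
    return sites, total_fail, total_valid, fail_share
```

It can be called on an open file, for example `with open("output.bed") as fh: print(fail_summary(fh))`.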

My questions:
1. Is a high Nfail count expected when CpG methylation is low-level or called with low confidence?
2. Which filters most commonly cause reads to be classified as Nfail?
   (e.g. modification probability threshold, basecall quality, alignment flags, context mismatch)
3. Is there a recommended way to summarize or interpret datasets where
   valid coverage exists but most reads fail confidence filters?
4. Would adjusting parameters like min-mod-prob be appropriate to explore this further?
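
For question 3, one possible summary (a sketch, not a modkit recommendation) is a pooled methylation level: total modified calls divided by total valid calls over all sites passing the cutoff, rather than an average of per-site percentages. The column positions again assume the standard 18-column bedMethyl layout, and `pooled_methylation` is a hypothetical helper name.

```python
import math

# Assumed 0-based column positions in an 18-column modkit bedMethyl row.
VALID_COV = 9   # valid_coverage (Nvalid_cov)
N_MOD = 11      # count_modified (Nmod)


def pooled_methylation(lines, min_valid_cov=3):
    """Return (sites_kept, pooled_fraction) over sites with enough valid calls.

    pooled_fraction = sum(count_modified) / sum(valid_coverage); it is NaN
    when no site passes the cutoff.
    """
    sites = total_mod = total_valid = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 18 or not fields[VALID_COV].isdigit():
            continue  # skip blank lines and an optional header row
        valid = int(fields[VALID_COV])
        if valid < min_valid_cov:
            continue
        sites += 1
        total_valid += valid
        total_mod += int(fields[N_MOD])
    pooled = total_mod / total_valid if total_valid else math.nan
    return sites, pooled
```

Pooling weights each site by its number of valid calls, so sites with only a few passing reads do not dominate the estimate.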

The behavior seems biologically plausible for this organism, but I would like
to confirm that my interpretation of Nfail is correct.

Thanks for the great tool, and for any clarification!

