Hello,
I am passing a pre-processed BAM file to `snap.pp.make_fragment_file`, and all duplicate-flagged reads are silently dropped before SnapATAC2's own dedup pass. As a result:
- `frac_duplicates` in the returned stats is always 0.0
- Column 5 (`count`) in the output fragments file is always 1 for every row
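
The column 5 symptom can be checked with a short script. This is only a minimal sketch, assuming a tab-separated fragments file (optionally gzipped) with the count in the fifth column; the function name and the file path are placeholders, not part of SnapATAC2.

```python
import gzip
from collections import Counter


def count_column_histogram(path):
    """Tally the values found in column 5 (count) of a fragments file."""
    # Fragments files are often gzipped; pick the opener from the extension.
    opener = gzip.open if str(path).endswith(".gz") else open
    histogram = Counter()
    with opener(path, "rt") as handle:
        for line in handle:
            # Skip blank lines and comment/header lines.
            if not line.strip() or line.startswith("#"):
                continue
            fields = line.rstrip("\n").split("\t")
            # Skip rows that do not have a fifth column.
            if len(fields) < 5:
                continue
            histogram[int(fields[4])] += 1
    return histogram


# Hypothetical usage (the path is a placeholder):
# print(count_column_histogram("fragments.tsv.gz"))
```

If the only key in the returned counter is 1, every fragment was written with a count of 1, which matches the behaviour described above.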
What can I do to retain the duplicates as well, not only the primary reads? Is there a parameter I missed, or a recommended workflow for users whose BAMs come from a pipeline that already runs MarkDuplicates? I'd like column 5 and `frac_duplicates` to reflect the real per-fragment read counts.
Environment
- SnapATAC2 2.9.0, Python 3.11
- Long-read single-end scATAC BAM file (ONT), Picard MarkDuplicates with `--BARCODE_TAG CB`