Feature request – add support for CSI (*.csi) BAM indices #24

@jantusan

Hello,

Thank you for developing BAMscale; it has become my go-to tool for generating bigWigs. While processing a large wheat ChIP-seq dataset, I ran into a limitation that I hope can be addressed (or perhaps you already have a workaround):

Summary

When a BAM file has only a CSI index (needed for large chromosomes or many contigs), BAMscale fails with an error saying it cannot find the *.bai index.

Steps to reproduce

samtools index -c sample.bam    # creates sample.bam.csi
BAMscale scale --bam sample.bam --binsize 10
# ERROR: cannot find sample.bam.bai

Expected behaviour

  • Automatically load sample.bam.csi, or
  • Allow specifying the index path (e.g. --index sample.bam.csi).
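
To illustrate the first option, here is a minimal shell sketch of the kind of lookup I have in mind. The function name find_bam_index and the search order (CSI first, then the two common BAI names) are my own assumptions for illustration; this is not BAMscale's or HTSlib's actual logic.

```shell
# Hypothetical helper (not part of BAMscale): print the first index file
# found next to a BAM, trying CSI before BAI. The search order is an
# assumption made for this sketch.
find_bam_index() {
    bam="$1"
    for idx in "$bam.csi" "$bam.bai" "${bam%.bam}.bai"; do
        if [ -f "$idx" ]; then
            printf '%s\n' "$idx"
            return 0
        fi
    done
    # No index of either kind was found.
    return 1
}
```

Calling find_bam_index sample.bam would print sample.bam.csi when that file exists, and return a non-zero status when no index is present.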

Feasibility notes (from HTSlib docs)

  • HTSlib loads BAI or CSI transparently via sam_index_load() after opening with hts_open/sam_open. See the sam.h API docs.
  • Explicit index path is supported in HTSlib ≥1.10 using the ##idx## syntax (e.g. sample.bam##idx##/path/to/sample.bam.csi) or via hts_idx_load2(fn, fnidx). See the 1.10 release notes and hts.h.
  • Background: the BAI format cannot index positions beyond 2^29 bp (512 Mbp), so genomes with longer chromosomes, such as wheat, need CSI.
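
In case it helps with testing, here is a rough sketch for checking whether a given BAM is affected by that limit. The function name contigs_over_bai_limit is hypothetical, and the threshold of 536870912 (2^29) reflects my understanding of the BAI limit; it reads a SAM header on standard input and lists the @SQ entries whose LN exceeds that value.

```shell
# Hypothetical helper: read a SAM header on stdin and print the name and
# length of every contig longer than 2^29 bp (assumed BAI limit).
contigs_over_bai_limit() {
    awk -F'\t' -v limit=536870912 '
        $1 == "@SQ" {
            name = ""; lenstr = ""
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^SN:/) name = substr($i, 4)
                if ($i ~ /^LN:/) lenstr = substr($i, 4)
            }
            # Compare numerically, but print the length as written in the header.
            if (lenstr + 0 > limit) print name "\t" lenstr
        }'
}
```

Typical usage would be samtools view -H sample.bam | contigs_over_bai_limit; any output means the file needs a CSI index.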

Environment

  • BAMscale v0.0.9
  • samtools 1.21

Reference

Thanks for considering this.
