Skip to content

Alter code to use pysam's bam indexing to speed up parsing. #2

@MikeDacre

Description

@MikeDacre

Per @carloartieri:

pysam's value comes from it being able to handle indexed BAM files. You can restrict the analysis directly to reads overlapping SNPs (if you want) and then use the pysam.read object to extract the variant calls directly and much more rapidly.

Possible issue:

  • indexed bams are sorted by coordinate, and mate pairs are not guaranteed to be adjacent in the file. Is it possible to split the file for multiplexing despite this?

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions