Make sure we throw out chimeric reads #14

@petercombs

There are at least some reads in my data set that cover multiple SNPs, and those SNPs disagree as to the parental origin of the read. It's rare, which suggests it's probably a sequencing error, but we ought to deal with it properly.

Probably the cleanest thing to do is just toss any read that is ambiguous, but one could imagine going with the consensus when a read covers 3 or more SNPs.

Also, I'm not sure whether this should be a separate issue, but if there's a sequencing error that matches neither the annotated reference nor the alternate allele, that should probably be tossed as well.
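
To make the two rules above concrete, here is a rough sketch of what the per-read decision could look like. It is not taken from the existing code: `assign_read`, its arguments, and the `min_consensus_snps` parameter are hypothetical names, and it assumes that the bases a read shows at annotated SNP positions have already been extracted into a dict. It also assumes that "tossed" means discarding the whole read, not just the offending SNP.

```python
from collections import Counter


def assign_read(observed, snps, min_consensus_snps=None):
    """Assign a read to 'ref' or 'alt', or return None to discard it.

    observed: dict mapping position -> base seen in the read (hypothetical input).
    snps: dict mapping position -> (ref_allele, alt_allele).
    min_consensus_snps: if None, any read with conflicting SNPs is discarded;
        otherwise a strict majority is accepted when the read covers at least
        this many annotated SNPs.
    """
    votes = Counter()
    for pos, base in observed.items():
        if pos not in snps:
            continue  # not an annotated SNP position
        ref, alt = snps[pos]
        if base == ref:
            votes["ref"] += 1
        elif base == alt:
            votes["alt"] += 1
        else:
            # Neither annotated allele: assumed here to invalidate the read.
            return None

    if not votes:
        return None  # read covers no annotated SNPs
    if len(votes) == 1:
        return next(iter(votes))  # all SNPs agree

    # SNPs disagree about parental origin.
    total = sum(votes.values())
    if min_consensus_snps is not None and total >= min_consensus_snps:
        (top, n_top), (_, n_second) = votes.most_common(2)
        if n_top > n_second:
            return top
    return None
```

With the default `min_consensus_snps=None` this is the strict "toss anything ambiguous" behaviour; passing `min_consensus_snps=3` would let a 2-vs-1 read go with the majority, which may or may not be desirable.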

[Image: screen shot 2016-05-09 at 12 47 48 pm]

In the attached screenshot, red reads are melanogaster, blue reads are simulans, and grey reads are at least somewhat ambiguous—there's one read with both mel and sim SNPs, and another with an unannotated allele.
