General question about usage #20

@HLHsieh

Hi Mark and Bida,

Thank you for your prompt response regarding the generation of motif sets. I have opened a new issue to ask some additional questions about usage.

  1. According to the help manual, vamos includes four subcommands:
vamos --contig [-b in.bam] [-r vntrs_region_motifs.bed] [-o output.vcf] [-s sample_name] [-t threads]
vamos --read [-b in.bam] [-r vntrs_region_motifs.bed] [-o output.vcf] [-s sample_name] [-t threads] [-p phase_flank]
vamos --somatic [-b in.bam] [-r vntrs_region_motifs.bed] [-o output.vcf] [-s sample_name] [-t threads] [-p phase_flank]
vamos -m [verison of efficient motif set]

So far, the documentation covers only --contig and --read. Is --somatic intended for detecting somatic instability? I would greatly appreciate more details on the purpose and use cases of the --somatic and -m subcommands.

  2. Mark mentioned that the tool partitions reads before annotating tandem repeats. Does vamos provide read-level information for each haplotype? If so, is it possible to examine the supporting reads for each allele? This would help in refining the motif set and improving the annotation by rerunning with additional or more appropriate motifs.

  3. I applied vamos to several mock samples, but the genotyping results did not match the expected values.

For example, for a mock sample with 16 repeat units (RU), I ran the following command: vamos --read -b C9ORF72_1.sorted.bam -r C9ORF72.tsv -s C9ORF72_1 -o C9ORF72_1.vcf

The corresponding C9ORF72.tsv file was:

chr9	27573528	27573546	GGCCCC

However, the VCF output was:

#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	C9ORF72_1
chr9	27573528	.	N	<VNTR>	.	PASS	END=27573546;RU=GGCCCC;SVTYPE=VNTR;ALTANNO_H1=0-0-0-0-0-0;LEN_H1=6;	GT	1/1

In this case, the reported length (LEN_H1=6) is inconsistent with the expected repeat count of 16 units.
I am wondering if this discrepancy could be due to an issue with my input or parameter settings. Could you please advise whether there is anything I might need to modify or check?
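
For reference, here is a minimal Python sketch of how I cross-checked the fields in the record above. The helper names (parse_info, summarize_record) are hypothetical and not part of vamos. It assumes that RU is a comma-separated motif list, that ALTANNO_H1 is a dash-separated list of indices into RU, and that POS and END are carried over from the BED-style input, so the region span is taken as END minus POS.

```python
def parse_info(info: str) -> dict:
    """Split a VCF INFO string into a key -> value dict (flags map to True)."""
    fields = {}
    for item in info.strip().split(";"):
        if not item:
            continue  # skip the empty piece left by a trailing ';'
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def summarize_record(vcf_line: str) -> dict:
    """Summarize one VNTR record (hypothetical helper, not a vamos API)."""
    cols = vcf_line.rstrip("\n").split("\t")
    info = parse_info(cols[7])
    # Assumption: RU lists motifs separated by commas.
    motifs = info["RU"].split(",")
    # Assumption: ALTANNO_H1 holds dash-separated indices into the RU list.
    annotation = [int(i) for i in info["ALTANNO_H1"].split("-")]
    start, end = int(cols[1]), int(info["END"])
    return {
        "region_span_bp": end - start,
        "annotated_units": len(annotation),
        "reported_len": int(info["LEN_H1"]),
        "annotated_bp": sum(len(motifs[i]) for i in annotation),
    }


record = (
    "chr9\t27573528\t.\tN\t<VNTR>\t.\tPASS\t"
    "END=27573546;RU=GGCCCC;SVTYPE=VNTR;ALTANNO_H1=0-0-0-0-0-0;LEN_H1=6;"
    "\tGT\t1/1"
)
summary = summarize_record(record)
print(summary)
```

Under these assumptions, LEN_H1 equals the number of entries in ALTANNO_H1, so it appears to count motif units rather than base pairs. The region given in C9ORF72.tsv spans 18 bp, which corresponds to three copies of the 6 bp motif.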

  4. In the VCF output, I noticed that the ID column always shows a dot (.). Is there any way to customize or assign a specific identifier to each VNTR record in the output?

  5. As mentioned earlier, the tool is currently limited to diploid genotyping. Does this imply that mosaicism detection is also not supported under the current framework? If so, are there any recommended strategies or alternative approaches for detecting or visualizing mosaic repeat structures using vamos?

I would greatly appreciate any insights or suggestions regarding the questions above.

Best,
Hsin
