Hi Mark and Bida,
Thank you for your prompt response regarding the generation of motif sets. I have opened a new issue to ask some additional questions about usage.
- According to the help manual, vamos includes four subcommands:
vamos --contig [-b in.bam] [-r vntrs_region_motifs.bed] [-o output.vcf] [-s sample_name] [-t threads]
vamos --read [-b in.bam] [-r vntrs_region_motifs.bed] [-o output.vcf] [-s sample_name] [-t threads] [-p phase_flank]
vamos --somatic [-b in.bam] [-r vntrs_region_motifs.bed] [-o output.vcf] [-s sample_name] [-t threads] [-p phase_flank]
vamos -m [verison of efficient motif set]
So far, only --contig and --read are covered in the documentation. Is --somatic intended for detecting somatic instability? I would greatly appreciate more details on the purpose and use cases of the --somatic and -m options.
- Does vamos provide read-level information for each haplotype? Mark mentioned that the tool partitions reads before annotating tandem repeats. If so, is it possible to examine the reads supporting each allele? This would help me refine the motif set and improve the annotation by rerunning with additional or more appropriate motifs.
- I applied vamos to several mock samples, but the genotyping results did not match the expected values.
For example, for a mock sample with 16 repeat units (RU), I ran the following command:
vamos --read -b C9ORF72_1.sorted.bam -r C9ORF72.tsv -s C9ORF72_1 -o C9ORF72_1.vcf
The corresponding C9ORF72.tsv file was:
chr9 27573528 27573546 GGCCCC
However, the VCF output was:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT C9ORF72_1
chr9 27573528 . N <VNTR> . PASS END=27573546;RU=GGCCCC;SVTYPE=VNTR;ALTANNO_H1=0-0-0-0-0-0;LEN_H1=6; GT 1/1
In this case, the reported length (LEN_H1=6) is inconsistent with the expected repeat count of 16 units.
I am wondering if this discrepancy could be due to an issue with my input or parameter settings. Could you please advise whether there is anything I might need to modify or check?
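For reference, the comparison above can be sketched in Python as follows. This is only a rough sketch of how I read the record, not vamos code. It assumes that the VCF columns are tab-separated, that RU is a comma-separated motif list, that ALTANNO_H1 is a hyphen-separated list of 0-based indices into RU (as it appears in my output), and that LEN_H1 counts motif copies. The helper name parse_info and the variable expected_units are my own.

```python
def parse_info(info: str) -> dict:
    """Split a VCF INFO string into a key -> value dict (flags map to True)."""
    fields = {}
    for item in info.strip().split(";"):
        if not item:  # skip the empty piece left by a trailing ';'
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


# The record from my output, written with tabs (assumed column separator).
record = (
    "chr9\t27573528\t.\tN\t<VNTR>\t.\tPASS\t"
    "END=27573546;RU=GGCCCC;SVTYPE=VNTR;ALTANNO_H1=0-0-0-0-0-0;LEN_H1=6;\t"
    "GT\t1/1"
)

columns = record.split("\t")
info = parse_info(columns[7])  # INFO is the 8th VCF column

motifs = info["RU"].split(",")  # assumed: comma-separated motif list
annotation = [int(i) for i in info["ALTANNO_H1"].split("-")]  # assumed: indices into RU
decoded = [motifs[i] for i in annotation]  # motif string for each annotated copy
reported_units = int(info["LEN_H1"])

expected_units = 16  # repeat count designed into the mock sample

print("decoded annotation:", "-".join(decoded))
print("reported units:", reported_units, "expected units:", expected_units)
print("matches expectation:", reported_units == expected_units)
```

Under these assumptions, the annotation decodes to six copies of GGCCCC, which agrees with LEN_H1=6 but not with the 16 units in the mock sample.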
- In the VCF output, I noticed that the ID column always shows a dot (.). Is there any way to customize or assign a specific identifier to each VNTR record in the output?
- As mentioned earlier, the tool is currently limited to diploid genotyping. Does this mean that mosaicism detection is also unsupported under the current framework? If so, are there any recommended strategies or alternative approaches for detecting or visualizing mosaic repeat structures with vamos?
I would greatly appreciate any insights or suggestions regarding the questions above.
Best,
Hsin