[maker-devel] Annotation of a new variant within a species

Carson Holt carsonhh at gmail.com
Fri Aug 3 14:16:30 MDT 2018


> 1. Regarding transcripts data, how should I use transcripts from other variants of the same species? Namely, should I use the est or the altest parameter? What is the actual difference in behavior?

Use est=. The altest option is for distant relationships (so distant that the nucleotides won’t match but the amino acids still do). It translates all transcripts to amino acids in six reading frames before alignment, which is computationally very expensive and more prone to spurious alignments. Different strains of the same species will still match in nucleotide space, so est= is the right choice for them.
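As a rough sketch, the relevant lines of maker_opts.ctl might look like the fragment below. The file names are made up for illustration; only the option names (genome=, est=, altest=) come from MAKER itself.

```
# maker_opts.ctl (fragment; file names are hypothetical)
genome=new_strain_assembly.fasta     # the assembly being annotated
est=other_strains_transcripts.fasta  # transcripts from other strains of the same species
altest=                              # left empty; only for transcripts from distantly related species
```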


> 2. Is there a way to incorporate gene models (in gff format) from the reference annotation? I expect high similarity in my assembled variants, but not identity in terms of content and coordinates, so neither pred_gff nor model_gff sound like what I need, as far as I understand.

model_gff is what you want if you always want to keep a model, and pred_gff is what you want if you only want to keep models supported by evidence. But regardless of which you choose, the GFF3 must be in the same coordinate space as what you are annotating. So you will have to lift the genes over onto the new assembly and make a new GFF3. You can do that with a separate MAKER run where you provide the gene models to est= as fasta files (i.e. their transcript sequences), set est2genome=1, and add the option est_forward=1 (it won’t already be in the control file, so you have to add the line yourself). It’s not perfect, but it will produce a GFF3 with gene models based entirely on alignment of the old models. You can then give that GFF3 to model_gff or pred_gff in future runs.
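The two-step procedure could be sketched roughly as follows. All file names are placeholders, and the fragments assume the reference gene models have already been exported as transcript fasta; only the option names are taken from the description above.

```
# Run 1: lift-over run, maker_opts.ctl (fragment; file names are hypothetical)
genome=new_strain_assembly.fasta     # the new assembly
est=reference_gene_models.fasta      # transcripts of the reference gene models
est2genome=1                         # build gene models directly from the est alignments
est_forward=1                        # not in the default control file; add this line by hand

# Run 2: the real annotation run, using the GFF3 produced by run 1
genome=new_strain_assembly.fasta
model_gff=lifted_over_models.gff3    # always keep these models
#pred_gff=lifted_over_models.gff3    # alternative: keep only models supported by evidence
est2genome=0
```

The GFF3 from run 1 would typically be collected from the run's datastore (for example with MAKER's gff3_merge script and the master datastore index log) before being passed to the second run.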

—Carson





