[maker-devel] Maker decision of genes when having a long transcript over several proteins

Carson Holt carsonhh at gmail.com
Mon Sep 30 11:00:57 MDT 2019


> I am using Maker for gene annotation, and I split the GFF into separate tracks to visualize a genomic region in a genome browser. Both Augustus and SNAP predict four short genes in this region. There is one long EST hit and three short protein hits. Maker annotates it as one long gene; I am wondering whether this is driven by the EST hit? If annotating three or four short genes is the “correct” way, are there any settings I should change instead of removing the EST evidence?

MAKER just provides hints to the gene predictors (i.e. an EST or protein alignment increases the probability that an exon/intron exists where indicated); the predictor then decides whether the hint is workable or not. Using Apollo, you can manually curate the places where you think incorrect evidence tipped the scales in favor of an incorrect model. You can also try EVM post-processing, where you can specify different weights to give to specific evidence types.
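
As a rough illustration of the EVM weighting idea, here is a minimal Python sketch that builds the text of an EVM weights file (tab-separated: evidence class, source label, weight). The source labels and the weight values are assumptions for illustration only: the labels must match column 2 of whatever GFF3 files you actually pass to EVM, and the numbers are not recommended settings.

```python
# Sketch only: the source labels and weights below are hypothetical.
# Labels must match column 2 of the GFF3 files supplied to EVM.
WEIGHTS = [
    ("ABINITIO_PREDICTION", "augustus", 1),
    ("ABINITIO_PREDICTION", "snap", 1),
    ("PROTEIN", "protein2genome", 5),
    ("TRANSCRIPT", "est2genome", 2),
]


def format_weights(rows):
    """Return the weights as tab-separated lines: class, source, weight."""
    return "".join(f"{cls}\t{src}\t{weight}\n" for cls, src, weight in rows)


if __name__ == "__main__":
    # Redirect this output to a file and pass that file to EVM as its weights file.
    print(format_weights(WEIGHTS), end="")
```

In this sketch the protein alignments are weighted above the EST alignments, so a single long EST hit would be less likely to outvote several short protein hits. Whether that is appropriate depends on how much you trust each evidence set.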



> I also examined another genomic region where Augustus and SNAP predict a gene; there is no EST hit, but there is one protein hit. In this case, Maker does not annotate a gene in this region. Is it because a protein hit is not enough to annotate a region and an EST hit is always required?


The protein hit may not be in the same reading frame as the gene predictions (in which case it does not count as evidence support), or it may cover too small a fraction of the original protein, in which case it may be filtered out as a spurious low-complexity alignment.
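
To check these two possibilities by hand for a given hit, something like the following Python sketch may help. It is not MAKER's internal logic, only an illustration. The function names are hypothetical, the frame check assumes forward-strand features with 1-based genomic starts and GFF-style phases, and what counts as "too small" a fraction is left for you to judge.

```python
def aligned_fraction(hit_start, hit_end, protein_length):
    """Fraction of the protein covered by the alignment.

    hit_start and hit_end are 1-based, inclusive positions on the protein.
    """
    if protein_length <= 0:
        raise ValueError("protein_length must be positive")
    return (hit_end - hit_start + 1) / protein_length


def same_frame(cds_start, cds_phase, hit_start, hit_phase=0):
    """True if a forward-strand CDS segment and a protein hit share a frame.

    The phase is the number of bases to skip to reach the first complete
    codon, so (start + phase) % 3 identifies the frame of each feature.
    """
    return (cds_start + cds_phase) % 3 == (hit_start + hit_phase) % 3


if __name__ == "__main__":
    # A hit covering residues 1-60 of a 400-residue protein.
    print(aligned_fraction(1, 60, 400))
    # CDS starting at 1000 versus hits starting at 1003 and 1004.
    print(same_frame(1000, 0, 1003), same_frame(1000, 0, 1004))
```

A hit that fails the frame check against every overlapping prediction, or that covers only a small part of its protein, is a candidate for the two explanations above.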


—Carson







