[maker-devel] Maker protein match & tandem similar genes
Carson Holt
carsonhh at gmail.com
Wed Aug 23 12:21:44 MDT 2017
> Thanks Carson, I appreciate your insights. It has been interesting to learn about the whole genome annotation process. It makes me realize that it is really not a solved area, but I’m glad that Maker exists and is easy enough to use for someone who isn’t an expert in it. Is there somewhere I could put your response on the Maker documentation wiki?
We don’t have an externally editable wiki, but the mailing list is archived on Google Groups and is searchable.
> As I was mentioning earlier in the thread, the ab initio predictor (Augustus) was making subtle errors (the splice donor site being ~12 nt downstream of the supported position), despite being trained (I trained through BUSCO, for ease) and having an aligned transcript “hint” which had the correct structure. I believe the Maker configuration was correct. Beyond troubleshooting the Augustus training, which seems a bit complicated, and doing manual curation / fixing of the gene models (which seems to be a band-aid over my potentially misconfigured Augustus training?), going with a purely est2genome=1 approach seems to be a nice way to do it. Better, in my opinion, to have a known unknown (obvious errors, fragmented genes that are supported by transcript evidence) than unknown unknowns (subtle errors in exon-exon junctions from Augustus).
Related to this, I just got off a conference call where we were looking at Augustus behavior. A student did an experiment where they introduced early stop codons into 100 genes, then let Augustus predict again. 80% of the time Augustus altered splicing patterns to try to jump over the stop codon, 11% of the time it truncated the transcript, and 9% of the time it refused to call anything. So when you see splicing errors, it is usually because something is affecting the ORF somewhere, and Augustus alters splicing to extend the ORF and get the maximum scoring bonus by capturing downstream parts of features and hints.
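For anyone who wants to try a similar perturbation on their own gene models, here is a minimal Python sketch of the "introduce an early stop codon" step. It is only an illustration, not the student's actual code: the function names (`introduce_premature_stop`, `first_stop_index`) and the example sequence are made up for this sketch, and it assumes you already have the spliced CDS as a plain, in-frame string. Mapping the edit back to genomic coordinates and re-running Augustus are not shown.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def introduce_premature_stop(cds: str, codon_index: int, stop: str = "TAA") -> str:
    """Return a copy of `cds` with the codon at `codon_index` (0-based)
    replaced by a stop codon. Hypothetical helper for illustration only."""
    cds = cds.upper()
    if stop not in STOP_CODONS:
        raise ValueError(f"{stop!r} is not a standard stop codon")
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    n_codons = len(cds) // 3
    # Leave the start codon and the original terminal codon untouched.
    if not 1 <= codon_index < n_codons - 1:
        raise ValueError("codon_index must point at an internal codon")
    start = codon_index * 3
    return cds[:start] + stop + cds[start + 3:]


def first_stop_index(cds: str):
    """Return the 0-based index of the first in-frame stop codon, or None."""
    cds = cds.upper()
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


# Toy CDS (made up): ATG GCC AAA GGG TTT TAA
original = "ATGGCCAAAGGGTTTTAA"
mutated = introduce_premature_stop(original, codon_index=2)
```

In the toy example the ORF originally ends at its last codon, and after the edit the first in-frame stop sits at the third codon. A predictor that then shifts a splice site around that position would be showing the kind of behavior described above.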
> A quick question: Could you confirm / deny that Maker doesn’t annotate non-coding RNA genes? E.g. I’ve picked up some rRNAs and ncRNAs in my de novo transcriptome, but my understanding is that est2genome and the ab initio approach require that an ORF be present, hence no non-coding RNA genes (beyond the tRNAs and whatnot that can be specifically included).
MAKER only annotates tRNAs and whatever snoscan annotates. It does not annotate any other non-coding features.
—Carson