[maker-devel] prokaryotic genome annotation

Carson Holt carsonhh at gmail.com
Sun Jan 31 13:43:21 MST 2016


MAKER doesn’t support alternate genetic codes (codon tables) yet, so there is currently no way to make it stop treating TGA as a stop codon.

—Carson
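
For context on the question quoted below: Spiroplasma uses NCBI translation table 4, in which TGA encodes tryptophan instead of acting as a stop codon. The following is a minimal, self-contained Python sketch (not part of MAKER, and not something MAKER can be configured to do) showing how reading TGA as a stop breaks one open reading frame into fragments. The sequence `toy_gene` and all function names are made up for illustration.

```python
from itertools import product

# Codons enumerated in TCAG order (TTT, TTC, TTA, TTG, TCT, ...), paired with
# the amino acids of the standard code; '*' marks a stop codon.
BASES = "TCAG"
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def codon_table(tga_is_trp=False):
    """Return a codon -> amino acid dict.

    With tga_is_trp=True, TGA is read as tryptophan (W), which is the only
    amino-acid difference between the standard code and translation table 4.
    """
    codons = ("".join(c) for c in product(BASES, repeat=3))
    table = dict(zip(codons, STANDARD_AAS))
    if tga_is_trp:
        table["TGA"] = "W"
    return table


def translate(seq, table):
    """Translate seq in frame 0; unknown codons become 'X'."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(table.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


def split_at_stops(protein):
    """Split a translation into the peptide fragments between stop codons."""
    return [piece for piece in protein.split("*") if piece]


# Hypothetical toy sequence with two in-frame TGA codons.
toy_gene = "ATGAAATGAGGTTGATTTTAA"

standard = translate(toy_gene, codon_table())
table4 = translate(toy_gene, codon_table(tga_is_trp=True))

print(standard, split_at_stops(standard))  # MK*G*F* ['MK', 'G', 'F']
print(table4, split_at_stops(table4))      # MKWGWF* ['MKWGWF']
```

Under the standard stops the toy sequence yields three short fragments, while reading TGA as tryptophan yields one continuous peptide, which matches the "premature stop codons and fragmented genes" symptom described in the quoted message.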


> On Jan 29, 2016, at 3:12 AM, Panos Sapou <sapuizait at gmail.com> wrote:
> 
> Dear all
> 
> I am trying to annotate a new Spiroplasma strain, and I would like to know if there is a way to change the stop codons so that 'tga' is not treated as a stop,
> 
> because otherwise I get too many premature stop codons and fragmented genes that are not real.
> 
> Best
> Panos
> 
> On 27 January 2016 at 14:14, Panos Sapou <sapuizait at gmail.com> wrote:
> Dear all
> 
> I recently started using MAKER to annotate my prokaryotic genomes. Even though I managed to get some nice results, I would like to check with you whether what I did was right, and also ask a couple of questions about the procedure.
> 
> I apologize in advance if I ask something silly; I am a newbie in bioinformatics and some of my questions may be very basic.
> 
> 
> I only have DNA sequences available; I have no ESTs and no proteins.
> 
> 1) I started by using the protein2genome option with the UniRef50 database as the reference. Then I generated a merged GFF file (following a procedure similar to the one in the MAKER tutorial).
> 
> 2) I used GeneMarkS and created a model with the gmsn.pl command, using the assembled contigs of my bacteria as input.
> 
> 3) After finishing the two steps above, I ran MAKER again, using the GFF file from step 1 as input in the "#-----Re-annotation Using MAKER Derived GFF3" section (maker_gff=input.gff),
> and I also set
> protein_pass=1
> Is that correct? Do you think it helps?
> In the "#-----Gene Prediction" section I used the hmm.mod file generated in step 2.
> 
> My questions:
> Does the above sound correct?
> 
> It is my understanding that GeneMark is the only gene predictor I can use for prokaryotic genomes. Is that correct?
> 
> When I run MAKER the second time (step 3), should I set protein2genome=1 or 0? Or is it enough to have the GFF file from step 1 in the re-annotation options, because the protein2genome-based prediction has already been done?
> 
> Also, if I use the GFF file from step 1, will it make any difference if I set protein2genome=1 and use an additional, different database? (I was wondering whether it would improve the results.)
> 
> Finally, regarding the choice of database: would you advise me to use UniRef or the proteomes of closely related bacteria? (I have downloaded approximately 100 proteomes of closely related bacteria and combined them into a single FASTA file.)
> 
> Thank you in advance,
> and once again I apologize if what I am asking is pretty basic; I just wanted to make sure.
> 
> 
> Best
> Panos
> 
> 