[maker-devel] prokaryotic genome annotation
Carson Holt
carsonhh at gmail.com
Sun Jan 31 13:43:21 MST 2016
MAKER doesn’t support alternate codon usage yet.
—Carson
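
For context on the question quoted below: Spiroplasma genomes are read with NCBI translation table 4, in which TGA encodes tryptophan rather than a stop, so a tool that assumes the standard bacterial table 11 will truncate genes at every in-frame TGA. The following is a minimal Python sketch of that effect, not part of MAKER. The names TABLE_11, TABLE_4, translate and orf_protein, and the toy sequence, are hypothetical and exist only for this illustration.

```python
from itertools import product

# Codons in NCBI order (bases T, C, A, G at each position).
BASES = "TCAG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]

# NCBI translation table 11 (bacterial, archaeal and plant plastid code).
TABLE_11 = dict(zip(
    CODONS,
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
))

# NCBI translation table 4 (Mycoplasma/Spiroplasma): identical amino acid
# assignments except that TGA codes for tryptophan (W) instead of stop.
TABLE_4 = dict(TABLE_11, TGA="W")


def translate(seq, table):
    """Translate seq in frame 0; unknown codons become 'X'."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(table.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


def orf_protein(seq, table):
    """Return the peptide from the first codon up to the first stop."""
    return translate(seq, table).split("*")[0]


# Toy sequence (made up for illustration) with an in-frame TGA.
toy = "ATGAAATGAGGTTTATAA"
print(orf_protein(toy, TABLE_11))  # truncated at the TGA
print(orf_protein(toy, TABLE_4))   # reads through the TGA as W
```

Under table 11 the toy gene ends at the in-frame TGA, while under table 4 it continues to the TAA, which matches the premature stops and fragmented genes described in the quoted message.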
> On Jan 29, 2016, at 3:12 AM, Panos Sapou <sapuizait at gmail.com> wrote:
>
> Dear all
>
> I am trying to annotate a new Spiroplasma strain and I would like to know if there is a way to change the stop codons (that is, to not treat 'tga' as a stop codon),
>
> because otherwise I get too many premature stop codons and fragmented genes that are not real.
>
> Best
> Panos
>
> On 27 January 2016 at 14:14, Panos Sapou <sapuizait at gmail.com> wrote:
> Dear all
>
> I recently started using MAKER to annotate my prokaryotic genomes. Although I managed to get some nice results, I would like to check with you whether what I did was right, and also ask a couple of questions about the procedure.
>
> I apologize in advance if I ask something silly; I am a newbie in bioinformatics and may ask very basic things.
>
>
> I only have DNA sequences available; I have no ESTs and no proteins from this organism.
>
> 1) I started by using the protein2genome option with the UniRef50 database as reference. Then I generated a merged GFF file (following a procedure similar to the one in the MAKER tutorial).
>
> 2) I used GeneMarkS and created a model with the gmsn.pl command, using the assembled contigs of my bacteria as input.
>
> 3) After finishing the above 2 steps I ran MAKER again, using the GFF file from step 1 as input: #-------Re-annotation using maker derived GFF3: maker_gff=input.gff
> and I also set
> protein_pass=1
> Is that correct? Do you think it helps?
> In the #-----gene prediction section I used the hmm.mod file generated in step 2.
>
> My questions:
> Does the above sound correct?
>
> It is my understanding that I can only use GeneMark for prokaryotic genomes. Is that correct?
>
> When I run MAKER the second time (step 3), should I set protein2genome=1 or 0? Or is having the GFF file (from step 1) in the re-annotation options enough, since the protein2genome-based prediction has already been done?
>
> Also, if I use a GFF file (from step 1), will it make any difference if I set protein2genome=1 and use an additional, different database? (I was wondering whether it would improve the results.)
>
> Finally, regarding the choice of database: would you advise me to use UniRef or the proteomes of closely related bacteria? (I have downloaded approximately 100 proteomes of closely related bacteria and combined them into a single FASTA file.)
>
> Thank you in advance,
> and once again I apologize if what I am asking is pretty basic; I just wanted to make sure.
>
>
> Best
> Panos
>
>