<div dir="ltr"><div><div><div><div>Dear all<br><br></div>I am trying to annotate a new Spiroplasma strain and I would like to know if there is a way to change the set of stop codons (so that 'tga' is not treated as a stop)<br><br></div>because otherwise I get too many premature stop codons and fragmented genes that are not real<br><br></div>Best<br></div>Panos<br></div><div class="gmail_extra"><br><div class="gmail_quote">On 27 January 2016 at 14:14, Panos Sapou <span dir="ltr">&lt;<a href="mailto:sapuizait@gmail.com" target="_blank">sapuizait@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div><div><div><div><div><div><div><div><div><div><div><div><div><div><div><div>Dear all<br><br></div>I recently started using MAKER for the annotation of my prokaryotic genomes, and even though I managed to get some nice results, I would like to check with you whether what I did was right and also ask a couple of questions about the procedure<br><br></div>I also apologize in advance if I ask something silly, since I am a newbie in bioinformatics and I might ask very basic stuff<br><br><br></div>I only have DNA sequences available; I have no ESTs and no proteins<br><br>1) I started by using the protein2genome option, with the UniRef50 database as reference. Then I generated a merged GFF file (following a procedure similar to the one in the MAKER tutorial)<br><br></div>2) I used GeneMarkS and created a model by running the <a href="http://gmsn.pl" target="_blank">gmsn.pl</a> command with the assembled contigs of my bacteria as input<br><br></div>3) After finishing the above 2 steps, I ran MAKER again using as input the GFF file from step 1: #-------Re-annotation using maker derived GFF3: maker_gff=input.gff<br></div>and I also set<br></div>protein_pass=1<br></div>Is that correct? 
Do you think it helps?<br></div>and in the #-----gene prediction section I used the hmm.mod file generated in step 2 <br><br></div>My questions:<br></div>Does the above sound correct?<br><br></div><div>It is my understanding that I can only use GeneMark for prokaryotic genomes; is that correct?<br></div><div><br></div>When I run MAKER the second time (step 3), should I set protein2genome=1 or 0? Or is it enough to have the GFF file (from step 1) in the re-annotation options, since the prediction based on protein2genome has therefore already been done?<br><br></div>Also, if I use a GFF file (from step 1), will it make any difference if I set protein2genome=1 and use an extra (different) database? (I was wondering whether it would improve the results.)<br><br></div>Finally, regarding the choice of database: would you advise me to use UniRef or the proteomes of closely related bacteria? (I have downloaded approximately 100 proteomes of closely related bacteria and combined them into a single FASTA file.)<br></div><div><br></div>Thank you in advance <br>and once again I apologize if what I am asking is pretty basic; I just wanted to make sure...<br><br><br></div>Best<span class="HOEnZb"><font color="#888888"><br></font></span></div><span class="HOEnZb"><font color="#888888">Panos<font face="Tahoma" size="2" color="black"><span style="font-size:10pt" dir="ltr"></span></font><br><div><div><div><div><br></div></div></div></div></font></span></div>

</blockquote></div><br></div>