[maker-devel] iterative Maker2

Fri Dec 12 07:10:46 MST 2014

Hi all,

I am a relatively new user of Maker2, and I'm looking for advice on running many iterations of the same dataset in Maker2.

I have a relatively small genome (~124 MB) from a wasp that is assembled into ~1,500 scaffolds. I have run several iterations of Maker2 by re-generating .hmms in SNAP and feeding them into the next round, and my gene predictions keep increasing (in number and in size). The only thing that changes at each round is the .hmm.
The evidence that I provide is:

- de novo assembled ESTs from a different strain of the same species (70,000 contigs; I am currently working on improving this assembly in the hope that it will be helpful here)

- 610 proteins extracted from the genome scaffolds using CEGMA and HaMSTr

For my 1st iteration, I used the Nasonia .hmm from SNAP, with the est2genome/protein2genome option turned on.

For the 2nd, 3rd and 4th rounds I have used .hmms generated from the previous round, all without the est2genome/protein2genome option. All other files are the same as in the original run.

As I understand it, after the second round, nothing should change in Maker2. But the differences are obvious between runs. Some entirely new exons are annotated. For example, just counting "exon" in the .gff file gives me 73,000 after the third iteration and 96,000 after the fourth! In fact, the biggest leap in this number is between the third and fourth rounds. I can also see that many features are longer when I look at the files in Geneious.
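
A plain text count of "exon" matches the word anywhere on a line, so it may be worth tallying features by the source and type columns instead and comparing those tallies between rounds. The following is only a minimal sketch, assuming a standard nine-column, tab-delimited GFF3 file; the function name `count_features` and the sample records (including the source names in them) are made up for illustration and are not taken from the actual runs.

```python
from collections import Counter

def count_features(gff_lines):
    """Tally GFF3 records by (source, type), i.e. columns 2 and 3."""
    counts = Counter()
    for line in gff_lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # anything after this directive is sequence, not annotation
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and other directives
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a well-formed feature line
        counts[(cols[1], cols[2])] += 1
    return counts

# Hypothetical records, only to show the shape of the input.
example = [
    "##gff-version 3",
    "scaffold_1\tmaker\tgene\t100\t900\t.\t+\t.\tID=g1",
    "scaffold_1\tmaker\tmRNA\t100\t900\t.\t+\t.\tID=g1-mRNA-1;Parent=g1",
    "scaffold_1\tmaker\texon\t100\t300\t.\t+\t.\tID=g1-exon-1;Parent=g1-mRNA-1",
    "scaffold_1\tmaker\texon\t500\t900\t.\t+\t.\tID=g1-exon-2;Parent=g1-mRNA-1",
    "scaffold_1\tsnap_masked\tmatch_part\t100\t300\t.\t+\t.\tID=hit1-exon;Parent=hit1",
    "##FASTA",
    ">scaffold_1",
    "ACGTACGT",
]

counts = count_features(example)
for (source, ftype), n in sorted(counts.items()):
    print(source, ftype, n)
```

For a real file, the same function could be called as `count_features(open(path))` on the .gff from each round, where `path` is whatever file that round produced. Comparing the per-source tallies between the third and fourth rounds would show whether the increase sits in the final annotations or in other feature lines that also mention "exon".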

Is this sort of change possible after the second round of Maker2? Is there something I have done wrong in my runs, or am I understanding this output incorrectly?

Thank you,
Alice
