[maker-devel] different exons predicted in different maker rounds

roscito roscito at mpi-cbg.de
Mon Oct 5 06:06:57 MDT 2015


Dear all,

First of all, I'd like to thank everyone in this forum for all the tips and comments on the best strategies for running MAKER; they have been really helpful so far.
However, I still don't fully understand the behaviour of MAKER when it is run iteratively and I compare the predictions from each round. Let me explain:

My input data are the following:
- the repeat-masked genome of a vertebrate (~2Gb);
- mRNA data for this species mapped to the genome with tophat2 and assembled into transcripts with cufflinks;
- proteins from closely related species, mapped to the reference genome with exonerate (global alignment), in GFF3 format.

For the first round of MAKER, I provided both the cufflinks transcripts and the exonerate-mapped proteins, with est2genome=1 and protein2genome=1. From the MAKER output, I generated the SNAP .hmm file (following the instructions at http://gmod.org/wiki/MAKER_Tutorial) and provided it as input to the second round of MAKER.
For this second round I again provided the cufflinks transcripts and the exonerate-mapped proteins, but switched both est2genome and protein2genome to 0. Once it finished, I generated the SNAP .hmm file once more and provided it to the third and final round of MAKER, along with the cufflinks transcripts and exonerate-mapped proteins, again with est2genome=0 and protein2genome=0.

As a sort of sanity check, I then ran a fourth round of MAKER with the SNAP .hmm file from round 3, the cufflinks transcripts and exonerate-mapped proteins, est2genome=0 and protein2genome=0, this time also specifying alt_splice=1.
For all the rounds, I also specified single_exon=1.
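
To make the differences between the rounds easier to see, the settings described above can be summarised in a small Python sketch. It is only an illustration: est2genome, protein2genome, alt_splice and single_exon are the options mentioned above, snaphmm is the maker_opts.ctl key for the SNAP model, the .hmm file names are hypothetical placeholders, and alt_splice=0 for rounds 1 to 3 is an assumption (it was only set explicitly in round 4).

```python
# Per-round MAKER settings as described in this message.
# Assumptions: the .hmm file names are hypothetical, each one is taken to be
# trained on the previous round's output, and alt_splice=0 in rounds 1-3.
ROUNDS = {
    1: {"est2genome": 1, "protein2genome": 1, "snaphmm": "",
        "alt_splice": 0, "single_exon": 1},
    2: {"est2genome": 0, "protein2genome": 0, "snaphmm": "round1.hmm",
        "alt_splice": 0, "single_exon": 1},
    3: {"est2genome": 0, "protein2genome": 0, "snaphmm": "round2.hmm",
        "alt_splice": 0, "single_exon": 1},
    4: {"est2genome": 0, "protein2genome": 0, "snaphmm": "round3.hmm",
        "alt_splice": 1, "single_exon": 1},
}


def render_opts(round_no):
    """Return the settings of one round as key=value lines."""
    return "\n".join(f"{key}={value}" for key, value in ROUNDS[round_no].items())


def changed_options(round_a, round_b):
    """Return {option: (value_in_a, value_in_b)} for options that differ."""
    a, b = ROUNDS[round_a], ROUNDS[round_b]
    return {key: (a[key], b[key]) for key in a if a[key] != b[key]}


if __name__ == "__main__":
    for round_no in sorted(ROUNDS):
        print(f"# round {round_no}")
        print(render_opts(round_no))
    print(changed_options(3, 4))
```

Under these assumptions, the only settings that change between rounds 3 and 4 are the SNAP model and alt_splice.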


I loaded the gene predictions from each round, plus the cufflinks transcripts and the exonerate-mapped proteins, into a genome browser to inspect the output visually. I saw a few strange cases where MAKER doesn't seem to use the protein/mRNA evidence for the gene predictions, and I would greatly appreciate any feedback or ideas on what I could possibly be doing wrong. Here are a few screenshots so you know what I'm talking about:

In this first example (example1.png), MAKER misses a conserved exon for which there is both protein and mRNA evidence, and I only get the exon 'back' if I specify alt_splice.



In this second example (example2.png), MAKER completely ignores many exons, all of them conserved across vertebrates and supported by protein/mRNA evidence.



In the third example (example3.png), there is no prediction from round 1, the prediction from round 2 matches the protein/mRNA evidence, and then in rounds 3 and 4 an extra exon appears.




(I hope you'll be able to see the images above.)
As I said, I would greatly appreciate any feedback on these strange cases. Perhaps I'm missing some parameter(s)?
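
In case the screenshots do not come through, cases like these could also be listed by comparing the exon coordinates of two rounds directly. The following is a minimal Python sketch, assuming the predictions of each round are available as GFF3 text with exon features; the toy records, the scaffold name and the Parent ID are made up for illustration and are not taken from my data.

```python
def exon_set(gff3_text):
    """Collect (seqid, start, end, strand) for every exon feature in GFF3 text."""
    exons = set()
    for line in gff3_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.split("\t")
        if len(cols) < 9 or cols[2] != "exon":
            continue  # also skips any embedded FASTA lines (fewer than 9 columns)
        exons.add((cols[0], int(cols[3]), int(cols[4]), cols[6]))
    return exons


def compare_rounds(gff_a, gff_b):
    """Return exons found only in round A and exons found only in round B."""
    a, b = exon_set(gff_a), exon_set(gff_b)
    return sorted(a - b), sorted(b - a)


def _row(seqid, start, end, strand="+"):
    # Build one hypothetical tab-separated GFF3 exon record.
    return "\t".join([seqid, "maker", "exon", str(start), str(end),
                      ".", strand, ".", "Parent=toy-mRNA-1"])


# Toy data: round 2 has a middle exon that round 3 lacks,
# and round 3 has an extra exon at the end.
ROUND2 = "\n".join(["##gff-version 3", _row("scaffold1", 100, 200),
                    _row("scaffold1", 300, 400), _row("scaffold1", 500, 600)])
ROUND3 = "\n".join(["##gff-version 3", _row("scaffold1", 100, 200),
                    _row("scaffold1", 500, 600), _row("scaffold1", 700, 800)])

if __name__ == "__main__":
    only_round2, only_round3 = compare_rounds(ROUND2, ROUND3)
    print("only in round 2:", only_round2)
    print("only in round 3:", only_round3)
```

This only compares exact exon boundaries, so an exon whose start or end shifts by a single base is reported as different in both rounds.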

Thanks a lot.
All the best,
Juliana
-------------- next part --------------
A non-text attachment was scrubbed...
Name: example1.png
Type: image/png
Size: 53178 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20151005/901af781/attachment-0006.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: example2.png
Type: image/png
Size: 55134 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20151005/901af781/attachment-0007.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: example3.png
Type: image/png
Size: 67598 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20151005/901af781/attachment-0008.png>

