[maker-devel] Fewer gene models output with a superset of EST evidence
Bob Zimmermann
robert.zimmermann at univie.ac.at
Thu Oct 19 09:25:08 MDT 2017
Hi Maker Developers,
I have been experimenting with several data sets as input to annotate our newly reassembled genome. We have three RNA-seq datasets, each assembled into de novo transcripts using Trinity. These are input into the MAKER pipeline along with protein evidence. What is strange is that when I run MAKER with the de novo transcripts from a single set, I obtain more MAKER transcripts than when I run with the combined set (1619 vs 1450 on one chromosome), and they are longer (median transcript length 1619 vs 1450, IQR 872-2160 vs 667-2026). If the additional evidence were joining transcripts, I would expect the combined run to produce fewer but longer models; instead its models are both fewer and shorter, so merging does not seem to explain the difference.
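For concreteness, here is a minimal sketch of how such counts and length summaries could be computed from a GFF3 file. It is only an illustration under stated assumptions: transcript length is taken as the summed length of `exon` features sharing a `Parent` attribute (not the genomic span of the mRNA), and the function names `transcript_lengths` and `summarize` are hypothetical, not part of MAKER. If the GFF3 also carries exon-typed evidence features, the lines would need to be filtered by source first.

```python
import statistics
from collections import defaultdict


def transcript_lengths(gff3_lines):
    """Return {transcript_id: summed exon length} from GFF3 lines.

    Assumption: each exon line names its transcript(s) in the Parent attribute.
    """
    lengths = defaultdict(int)
    for line in gff3_lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence section follows; no more features
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue
        start, end = int(cols[3]), int(cols[4])
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        # An exon shared by several isoforms lists comma-separated parents.
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                lengths[parent] += end - start + 1  # GFF3 coordinates are inclusive
    return dict(lengths)


def summarize(lengths):
    """Count, median and quartiles of transcript lengths (needs >= 2 transcripts)."""
    values = sorted(lengths.values())
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {"count": len(values), "median": median, "q1": q1, "q3": q3}
```

Running `summarize(transcript_lengths(open(path)))` on the GFF3 from each run, restricted to the same chromosome, would give numbers directly comparable to the ones quoted above. Note that the quartile values depend on the interpolation method, so small differences from other tools are expected.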
Therefore I’m trying to understand the algorithm. From what I understand, if MAKER finds evidence for an ab initio prediction and the internal splice junctions agree, the prediction is considered for improvement. Why, then, if my combined set is a strict superset of the single set, do I get more transcripts with the single set?
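The superset premise itself can be checked with a small script. The following is a minimal sketch, assuming both evidence sets are plain FASTA files and that comparing sequences (rather than headers, which may be renamed when files are merged) is the appropriate test; the helper names are hypothetical.

```python
def read_fasta_sequences(lines):
    """Return the set of upper-cased sequences found in FASTA lines."""
    seqs, chunks = set(), []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if chunks:
                seqs.add("".join(chunks).upper())
            chunks = []  # start collecting the next record
        elif line:
            chunks.append(line)
    if chunks:  # flush the final record
        seqs.add("".join(chunks).upper())
    return seqs


def is_strict_superset(combined, single):
    """True if every sequence in `single` is in `combined`, and `combined` has more."""
    return single < combined
```

For example, `is_strict_superset(read_fasta_sequences(open(combined_path)), read_fasta_sequences(open(single_path)))` should return `True` if the combined set was built by adding transcripts to the single set without altering them.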
Thanks for your help!
Best,
Bob
—
Department of Molecular Evolution and Development
Universität Wien
Althanstraße 14 (UZA I), Zimmer 2.019
1090 Vienna
Austria
+43 1 427757002