[maker-devel] How sensitive is MAKER to redundant/partial transcripts?

Carson Holt carsonhh at gmail.com
Thu Jul 12 14:38:33 MDT 2018


MAKER will automatically collapse redundant evidence. The only thing you may need to worry about with too many datasets is background transcription: the more datasets you add, the more spurious assemblies you will get from it (if you sequence deep enough, everything is transcribed at some level). You should also look at the results in a browser like Apollo. You may find that some datasets are noisier than others, and it would be beneficial to drop them, especially if they are redundant. So always do a visual review of the results.
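If you do want to simply concatenate the assemblies, a minimal Python sketch along these lines may help. It is not part of MAKER; the function names, the `__` separator and the file paths are all hypothetical. It prefixes every transcript ID with the name of the file it came from, so that a noisy dataset can be recognised during visual review and dropped, and it skips sequences that are exact duplicates of one already written. It does not detect reverse complements or partial overlaps, which are left to MAKER's own handling of redundant evidence.

```python
import hashlib
from pathlib import Path


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def merge_transcripts(fasta_paths, out_path):
    """Concatenate FASTA files, tagging each ID with its source file name.

    Exact duplicate sequences (case-insensitive) are written only once.
    Returns the number of transcripts written.
    """
    seen = set()
    kept = 0
    with open(out_path, "w") as out:
        for path in fasta_paths:
            tag = Path(path).stem  # hypothetical convention: file name = dataset name
            for header, seq in read_fasta(path):
                digest = hashlib.sha256(seq.upper().encode()).hexdigest()
                if digest in seen:
                    continue  # identical sequence already written
                seen.add(digest)
                seq_id = header.split()[0]  # drop the free-text description
                out.write(f">{tag}__{seq_id}\n{seq}\n")
                kept += 1
    return kept
```

For example, `merge_transcripts(["studyA.fasta", "studyB.fasta"], "all_transcripts.fasta")` (hypothetical file names) would write IDs such as `studyA__TRINITY_DN1_c0_g1_i1`, which makes it easy to see which study a suspicious alignment came from.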

—Carson



> On Jul 4, 2018, at 6:28 AM, Lior Glick <liorglck at gmail.com> wrote:
> 
> Dear MAKER users,
> 
> I am new to MAKER and would like your advice.
> I am planning to annotate multiple genomes of tomato variants and wild relatives. To this end, I have been working on generating a diverse transcript data set to be used as input for MAKER (along with protein sequences and the 'official' tomato annotation). My transcript set was generated by collecting multiple available RNA-Seq results from SRA, covering diverse variants, conditions and tissues, and assembling them into transcripts using Trinity. My goal is to have a data set as diverse and broad as possible.
> Now I have ~30 FASTA files of transcripts, originating from different studies. Of course, many of the transcripts are redundant and/or partial. I am exploring ways to merge the multiple data sets into a non-redundant one, while also stitching partial transcripts into longer ones based on overlaps.
> However, this turns out to be not so trivial, and I am wondering whether it is really necessary in order to get a good annotation. Maybe I can just concatenate all my transcriptome assembly results, and MAKER will handle redundant and partial transcripts?
> Can someone clarify how this works, and try to assess whether an annotation based on a merged data set should be superior to one based on data that didn't undergo such a process? If someone has actual experience with such data, that would be really helpful, but any advice would be highly appreciated.
> 
> Thanks a lot and best regards,
> Lior




