<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div class="">Michael, can you answer the second question? (Michael wrote the protocol, so I CC’d him.)</div><div class=""><br class=""></div><div class="">With respect to the first question: expression level is not necessarily relevant to the annotation process (so no, MAKER does not look at read coverage). Instead, we use the transcript assemblies to identify introns via splice-aware alignment (yes, it is the introns and not the exons that we care about). Trinity has a nice option called jaccard_clip, which avoids false merging of neighboring transcripts (this mostly occurs in fungi, where UTRs can overlap). Merged transcripts cause extra introns to be assigned as hints, and can also lead to overextension of UTRs during the final polishing steps. The jaccard_clip option is the main reason we recommend Trinity. If StringTie has a similar option, then it can be used as well.</div><div class=""><br class=""></div><div class="">Thanks,</div><div class="">Carson</div><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><br class=""><div><blockquote type="cite" class=""><div class="">On May 4, 2017, at 12:37 AM, Salim Bougouffa &lt;<a href="mailto:mjfi2sb3@gmail.com" class="">mjfi2sb3@gmail.com</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">Hi,<div class=""><br class=""></div><div class="">I am attempting to annotate a plant genome. I have a couple of questions:</div><div class=""><br class=""></div><div class=""><b class="">1) RNA-seq assembly</b></div><div class="">a) I assembled my RNA-seq data using Trinity and StringTie. The two produce drastically different numbers. When I compare the two assemblies for each sample using TransRate, StringTie produces a higher score for most of the assemblies. 
I see in all of the threads that you recommend Trinity, but doesn't Trinity produce way too many transcripts (even after chucking out the "bad" ones using TransRate)?</div><div class="">b) During hint creation, does MAKER take into account that different transcripts have different read coverage (expression levels)? In other words, should I filter out transcripts that have low read coverage?</div><div class=""><br class=""></div><div class=""><b class="">2) Repeat Masking </b></div><div class="">I am following the advanced repeat library construction tutorial (<a href="http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/Repeat_Library_Construction-Advanced" class="">http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/Repeat_Library_Construction-Advanced</a>). The initial steps find 15 sequences for the LTR and 159 for MITE. But when I get to the perl DIR_CRL/CRL_Step4.pl step, both output files (Inner_Seq_For_BLAST.fasta, lLTRs_Seq_For_BLAST.fasta) are empty.</div><div class=""><br class=""></div><div class="">a) Are these numbers normal? I was expecting a lot more than 16 for the LTR. </div><div class="">b) I don't get any errors when I run CRL_Step4.pl, yet there is no output. What's going on?!</div><div class=""><br class=""></div><div class="">Many thanks,</div><div class="">/SB</div></div>
</div></blockquote></div><br class=""></div></body></html>