<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; color: rgb(0, 0, 0); font-size: 14px; font-family: Calibri, sans-serif; "><div>Yes. Barry gave a good overview. The correct_est_fusion option basically clips the UTR when two neighboring genes overlap only in the UTR (so you still get both gene models). Since the primary effect of falsely merged mRNA-seq reads is an overly long UTR, this tends to fix many cases. Of course, avoiding merging the mRNA-seq reads in the first place also works. So using Trinity's extra options to control that, together with the correct_est_fusion option in MAKER, is probably the way to go.</div><div><br></div><div>I think you can lower pred_flank to 100, but below that you might start to get weird behavior from the gene predictors (they need some upstream and downstream sequence or the HMMs don't work well).</div><div><br></div><div>Thanks,</div><div>Carson</div><div><br></div><div><br></div><span id="OLK_SRC_BODY_SECTION"><div style="font-family:Calibri; font-size:11pt; text-align:left; color:black; BORDER-BOTTOM: medium none; BORDER-LEFT: medium none; PADDING-BOTTOM: 0in; PADDING-LEFT: 0in; PADDING-RIGHT: 0in; BORDER-TOP: #b5c4df 1pt solid; BORDER-RIGHT: medium none; PADDING-TOP: 3pt"><span style="font-weight:bold">From: </span> Barry Moore <<a href="mailto:barry.moore@genetics.utah.edu">barry.moore@genetics.utah.edu</a>><br><span style="font-weight:bold">Date: </span> Tuesday, 21 May, 2013 7:54 PM<br><span style="font-weight:bold">To: </span> <<a href="mailto:Sean.Li@csiro.au">Sean.Li@csiro.au</a>><br><span style="font-weight:bold">Cc: </span> <<a href="mailto:maker-devel@yandell-lab.org">maker-devel@yandell-lab.org</a>><br><span style="font-weight:bold">Subject: </span> Re: [maker-devel] Fused gene problem, improvement in the Maker 2.27?<br></div><div><br></div><div><base href="x-msg://337/"><div style="word-wrap: break-word; -webkit-nbsp-mode: space; 
-webkit-line-break: after-white-space; ">Hi Sean,<div><br></div><div>I think you want to be careful with dropping the pred_flank parameter too low. This controls how much flanking sequence (for a given cluster of evidence) MAKER will pass to the gene predictor. Some (maybe all?) of the gene predictors have an initial state in their HMM for intergenic sequence and if you do not have some intergenic sequence for them to consider first they can't transition to their next state. The correct_est_fusion option can help (at the cost of losing some UTR annotations) - Carson will likely give you a better description of the intricacies of the correct_est_fusion.</div><div><br></div><div>Don't know how you are assembling your RNASeq, but there is an option in Trinity - I forget the name - that will instruct Trinity to be more restrictive in merging neighboring clusters of reads into a longer transcript and this can help as well.</div><div><br></div><div>B</div><div><br></div><div><div><div>On May 21, 2013, at 1:36 AM, <<a href="mailto:Sean.Li@csiro.au">Sean.Li@csiro.au</a>></div><div> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><span class="Apple-style-span" style="border-collapse: separate; font-family: Helvetica; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; -webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; "><div lang="EN-AU" link="blue" vlink="purple"><div class="WordSection1" style="page: WordSection1; "><div style="margin-top: 0cm; margin-right: 0cm; margin-left: 0cm; margin-bottom: 0.0001pt; font-size: 10.5pt; font-family: Consolas; "><span style="font-size: 11pt; color: rgb(31, 73, 125); font-family: 
Calibri, sans-serif; ">Hi Carson,<o:p></o:p></span></div><div style="margin-top: 0cm; margin-right: 0cm; margin-left: 0cm; margin-bottom: 0.0001pt; font-size: 10.5pt; font-family: Consolas; "><span style="font-size: 11pt; color: rgb(31, 73, 125); font-family: Calibri, sans-serif; "><o:p> </o:p></span></div><div style="margin-top: 0cm; margin-right: 0cm; margin-left: 0cm; margin-bottom: 0.0001pt; font-size: 10.5pt; font-family: Consolas; "><span style="font-size: 11pt; color: rgb(31, 73, 125); font-family: Calibri, sans-serif; ">We are currently working on the annotation of the Helicoverpa genome. Maker has been chosen as the preliminary tool for the task. Checking the annotation results from maker 2.10, we saw that some loci have a fusion problem: two separate neighbouring genes are fused together and reported by maker as a single candidate. If we look at the output of each individual de novo algorithm, e.g. augustus or snap, the predictions are correct. We are also using an RNA-Seq assembly from cufflinks and some protein evidence from closely related insects. <o:p></o:p></span></div><div style="margin-top: 0cm; margin-right: 0cm; margin-left: 0cm; margin-bottom: 0.0001pt; font-size: 10.5pt; font-family: Consolas; "><span style="font-size: 11pt; color: rgb(31, 73, 125); font-family: Calibri, sans-serif; "><o:p> </o:p></span></div><div style="margin-top: 0cm; margin-right: 0cm; margin-left: 0cm; margin-bottom: 0.0001pt; font-size: 10.5pt; font-family: Consolas; "><span style="font-size: 11pt; color: rgb(31, 73, 125); font-family: Calibri, sans-serif; ">We noticed that the parameters “pred_flank” in maker v2.10 and “correct_est_fusion” in maker v2.27 might help maker decide whether or not to merge models. 
If possible, can you please explain what these two parameters do with the predicted genes, RNA-Seq and protein evidence?<o:p></o:p></span></div><div style="margin-top: 0cm; margin-right: 0cm; margin-left: 0cm; margin-bottom: 0.0001pt; font-size: 10.5pt; font-family: Consolas; "><span style="font-size: 11pt; color: rgb(31, 73, 125); font-family: Calibri, sans-serif; "><o:p> </o:p></span></div><div style="margin-top: 0cm; margin-right: 0cm; margin-left: 0cm; margin-bottom: 0.0001pt; font-size: 10.5pt; font-family: Consolas; "><span style="font-size: 11pt; color: rgb(31, 73, 125); font-family: Calibri, sans-serif; ">Also, our current plan is to install maker 2.27, train the algorithms to predict UTRs, enlarge the protein evidence datasets and input our previous annotations as model_gff. We are facing a critical question: how can we most effectively fix the gene fusion problem? 1) set pred_flank lower than 100? 2) turn correct_est_fusion on? 3) anything else? 
<o:p></o:p></span></div><div style="margin-top: 0cm; margin-right: 0cm; margin-left: 0cm; margin-bottom: 0.0001pt; font-size: 10.5pt; font-family: Consolas; "><span style="font-size: 11pt; color: rgb(31, 73, 125); font-family: Calibri, sans-serif; "><o:p> </o:p></span></div><div style="margin-top: 0cm; margin-right: 0cm; margin-left: 0cm; margin-bottom: 0.0001pt; font-size: 10.5pt; font-family: Consolas; "><span style="font-size: 11pt; color: rgb(31, 73, 125); font-family: Calibri, sans-serif; ">Thank you.<o:p></o:p></span></div><div style="margin-top: 0cm; margin-right: 0cm; margin-left: 0cm; margin-bottom: 0.0001pt; font-size: 10.5pt; font-family: Consolas; "><span style="font-size: 11pt; color: rgb(31, 73, 125); font-family: Calibri, sans-serif; "><o:p> </o:p></span></div><div style="margin-top: 0cm; margin-right: 0cm; margin-left: 0cm; margin-bottom: 0.0001pt; font-size: 10.5pt; font-family: Consolas; "><span style="font-size: 11pt; color: rgb(31, 73, 125); font-family: Calibri, sans-serif; ">With best regards,<o:p></o:p></span></div><div style="margin-top: 0cm; margin-right: 0cm; margin-left: 0cm; margin-bottom: 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif; "><span style="color: rgb(31, 73, 125); font-family: Calibri, sans-serif; ">Xi (Sean) Li, Ph. 
D.<br><br>Bioinformatics Analyst, Bioinformatics Core,<br>CSIRO Mathematics, Informatics and Statistics<br>Phone:<span class="Apple-converted-space"> </span><a href="tel:%2B61%202%206216%207138" target="_blank" style="color: blue; text-decoration: underline; ">+61 2 6216 7138</a><br>Address: GPO Box 664, Canberra, ACT 2601</span><span style="color: rgb(31, 73, 125); font-family: Calibri, sans-serif; "><o:p></o:p></span></div><div style="margin-top: 0cm; margin-right: 0cm; margin-left: 0cm; margin-bottom: 0.0001pt; font-size: 10.5pt; font-family: Consolas; "><span style="font-size: 11pt; color: rgb(31, 73, 125); font-family: Calibri, sans-serif; "><o:p> </o:p></span></div><div style="margin-top: 0cm; margin-right: 0cm; margin-left: 0cm; margin-bottom: 0.0001pt; font-size: 10.5pt; font-family: Consolas; "><span style="font-size: 11pt; color: rgb(31, 73, 125); font-family: Calibri, sans-serif; "><o:p> </o:p></span></div><div style="margin-top: 0cm; margin-right: 0cm; margin-left: 0cm; margin-bottom: 0.0001pt; font-size: 10.5pt; font-family: Consolas; "><span style="font-size: 11pt; color: rgb(31, 73, 125); font-family: Calibri, sans-serif; "><o:p> </o:p></span></div><div style="margin-top: 0cm; margin-right: 0cm; margin-left: 0cm; margin-bottom: 0.0001pt; font-size: 10.5pt; font-family: Consolas; "><o:p> </o:p></div><div style="margin-top: 0cm; margin-right: 0cm; margin-left: 0cm; margin-bottom: 0.0001pt; font-size: 12pt; font-family: 'Times New Roman', serif; "><span style="font-size: 11pt; color: rgb(31, 73, 125); font-family: Calibri, sans-serif; "><o:p> </o:p></span></div></div>_______________________________________________<br>maker-devel mailing list<br><a href="mailto:maker-devel@box290.bluehost.com" style="color: blue; text-decoration: underline; ">maker-devel@box290.bluehost.com</a><br><a href="http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org" style="color: blue; text-decoration: underline; 
">http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org</a><br></div></span></blockquote></div><br><div><span class="Apple-style-span" style="border-collapse: separate; color: rgb(0, 0, 0); font-family: Helvetica; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; -webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; "><div><span class="Apple-style-span" style="font-family: Arial; font-size: 12px; "><div>Barry Moore</div><div>Research Scientist</div><div>Dept. of Human Genetics</div><div>University of Utah</div><div>Salt Lake City, UT 84112</div><div>--------------------------------------------</div><div>(801) 585-3543</div><div><br class="khtml-block-placeholder"></div></span></div><div><br></div></span><br class="Apple-interchange-newline"></div><br></div></div></div>_______________________________________________
</span></body></html>