<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; color: rgb(0, 0, 0); font-size: 14px; font-family: Calibri, sans-serif;"><div>You are probably getting the wrong proteins from your scripts because you are not taking the 5' and 3' UTRs in the transcript into account.</div><div><br></div><div>For example:</div><div>>snap_masked-contig-processed-gene-0.2-mRNA-1 transcript offset:261 AED:0.25 eAED:0.25 QI:261|0.4|0.83|0.83|0.8|0.83|6|22|240 </div><div><br></div><div>Here the 5' UTR is 261 bp long and the 3' UTR is 22 bp long (the first and eighth fields of the QI tag). Both have to be trimmed before translating the transcript into a protein. Once they are trimmed, you can use frame 0 for the translation.</div><div><br></div><div>The fasta_tool script that comes with MAKER can be used to quickly trim the UTRs.</div><div><br></div><div>Example:</div><div>fasta_tool maker_transcripts.fasta --trim_maker_utr</div><div><br></div><div>Then you can try your other scripts again.</div><div><br></div><div>Thanks,</div><div>Carson</div><div><br></div><div><br></div><span id="OLK_SRC_BODY_SECTION"><div style="font-family:Calibri; font-size:11pt; text-align:left; color:black; BORDER-BOTTOM: medium none; BORDER-LEFT: medium none; PADDING-BOTTOM: 0in; PADDING-LEFT: 0in; PADDING-RIGHT: 0in; BORDER-TOP: #b5c4df 1pt solid; BORDER-RIGHT: medium none; PADDING-TOP: 3pt"><span style="font-weight:bold">From: </span> Diana Garnica Moreno <<a href="mailto:diana.garnica@anu.edu.au">diana.garnica@anu.edu.au</a>><br><span style="font-weight:bold">Date: </span> Monday, March 24, 2014 at 5:11 PM<br><span style="font-weight:bold">To: </span> "<a href="mailto:maker-devel@yandell-lab.org">maker-devel@yandell-lab.org</a>" <<a href="mailto:maker-devel@yandell-lab.org">maker-devel@yandell-lab.org</a>><br><span style="font-weight:bold">Subject: </span> [maker-devel] Problem extracting fasta from a GFF file generated with MAKER<br></div><div><br></div><div><div><div style="font-size:12pt;color:#000000;background-color:#FFFFFF;font-family:Calibri,Arial,Helvetica,sans-serif;"><p>Hi there,<br></p><p><br></p><p>We recently assembled a fungal genome using MAKER and we got the gene models and the corresponding transcripts, predicted proteins and GFF files. However, the predicted proteins do not have the stop codon included, so I do not know which proteins are complete
and which ones are incomplete at the 3' end. To solve that, I have used different programs to extract the FASTA sequences of the CDSs from the GFF file and the genome sequence. The problem is that with the tools I have tested I get the right sequence for some
of the proteins and wrong sequences for others (with multiple stop codons, for example). I am not sure why this happens, and since it happens with different tools (different Python scripts and even gffread from Cufflinks), I do not know where the problem is. Could
you please give me some advice on how to extract the right sequences with the stop codons included?<br></p><p><br></p><p>Thanks!<br></p><p><br></p><p>Diana<br></p></div></div></div>
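<div>To make the UTR trimming described in the reply concrete, here is a minimal Python sketch. It assumes that the first and eighth fields of the QI tag hold the 5' and 3' UTR lengths, as in the example header, and that the region left between the UTRs ends with the stop codon. The function names and the toy sequence are hypothetical and are not part of MAKER; fasta_tool is still the tool shipped for this job.</div>
<pre>
```python
import re
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

QI_PATTERN = re.compile(r"QI:(\S+)")


def utr_lengths(header):
    """Return (five_prime, three_prime) UTR lengths read from the QI tag.

    Assumption: field 1 is the 5' UTR length and field 8 the 3' UTR length.
    """
    match = QI_PATTERN.search(header)
    if match is None:
        raise ValueError("no QI tag found in header")
    fields = match.group(1).split("|")
    return int(fields[0]), int(fields[7])


def trim_utrs(header, transcript):
    """Cut both UTRs off a transcript, leaving the coding region."""
    five, three = utr_lengths(header)
    return transcript[five:len(transcript) - three]


def translate(cds):
    """Translate in frame 0; unknown codons become X, stop codons become *."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# Hypothetical toy record: 4 bp 5' UTR, 12 bp coding region, 2 bp 3' UTR.
toy_header = "toy-mRNA-1 transcript offset:4 QI:4|0|0|0|0|0|1|2|3"
toy_transcript = "GGCC" + "ATGGCTTGGTAA" + "TT"
toy_protein = translate(trim_utrs(toy_header, toy_transcript))
print(toy_protein)  # MAW*
```
</pre>
<div>Under these assumptions, a translation that ends in * and has no internal * is complete at the 3' end, while one with several * characters usually means the UTRs were not trimmed or the wrong frame was used.</div>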
</span></body></html>