<html>

<head>

<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">

</head>

<body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; color: rgb(0, 0, 0); font-size: 14px; font-family: Calibri, sans-serif; ">

<div>SNAP appears to be making small ab initio calls all over the place.  You may need to retrain it, or drop it from your analysis entirely and use only Augustus.</div>

<div><br>

</div>

<div>Try running with protein2genome and then training SNAP on those results.  Also make sure your repeat masking is sufficient; if it is not, you may be getting too many SNAP predictions.</div>
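<div><br>
</div>
<div>For example, a first-pass maker_opts.ctl for building the SNAP training set might look roughly like the fragment below. This is only a sketch: it assumes the standard option names from the MAKER control file, and the protein file name is a placeholder, not something taken from your run.</div>
<pre>
#-----first pass: gene models taken directly from protein alignments
protein=related_proteins.fasta  # placeholder file name
snaphmm=                        # left empty: SNAP is not run in this pass
protein2genome=1                # infer gene models directly from protein alignments
</pre>
<div>After retraining SNAP on those models, you would set protein2genome back to 0 and point snaphmm at the new HMM file for the next pass.</div>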

<div><br>

</div>

<div>Thanks,</div>

<div>Carson</div>

<div><br>

</div>


<span id="OLK_SRC_BODY_SECTION">

<div style="font-family:Calibri; font-size:11pt; text-align:left; color:black; BORDER-BOTTOM: medium none; BORDER-LEFT: medium none; PADDING-BOTTOM: 0in; PADDING-LEFT: 0in; PADDING-RIGHT: 0in; BORDER-TOP: #b5c4df 1pt solid; BORDER-RIGHT: medium none; PADDING-TOP: 3pt">

<span style="font-weight:bold">From: </span>Sivaranjani Namasivayam &lt;<a href="mailto:ranjani@uga.edu">ranjani@uga.edu</a>&gt;<br>

<span style="font-weight:bold">Date: </span>Tuesday, October 1, 2013 5:01 PM<br>

<span style="font-weight:bold">To: </span>"<a href="mailto:maker-devel-bounces@yandell-lab.org">maker-devel-bounces@yandell-lab.org</a>" &lt;<a href="mailto:maker-devel-bounces@yandell-lab.org">maker-devel-bounces@yandell-lab.org</a>&gt;<br>

<span style="font-weight:bold">Subject: </span>Incorrect gene model, antisense transcripts<br>

</div>

<div><br>

</div>

<div dir="ltr"><style id="owaParaStyle" type="text/css">P {margin-top:0;margin-bottom:0;}</style>

<div ocsi="0" fpstyle="1">

<div style="direction: ltr;font-family: Tahoma;color: #000000;font-size: 10pt;">Hello,<br>

<br>

I am working on annotating a genome. I have RNAseq data assembled with Cufflinks and Trinity, and I am using MAKER to predict gene models. Below is the procedure I used:<br>

<br>

- For the first round of prediction I provided the following: EST evidence: assembled RNAseq; protein evidence: proteins from a related organism; gene predictor: Augustus trained on a related organism.<br>

- The number of genes I got was far lower than expected (I expect at least 10K genes and got ~4K).<br>

- I used the gene models from the first round of annotation to train SNAP (default parameters) and provided the resulting HMM for the second round of annotation (Augustus was also used). I passed all other data unchanged in the GFF3, except the gene models. I also included some manually curated genes in the EST evidence.<br>

<br>

- The number of genes was much higher, ~11K, but many seem incorrect and appear to have been generated from the SNAP models. I am attaching a screenshot from Apollo showing this.<br>

<br>

Would you have any suggestions on why this might be happening and how I can fix it?<br>

I am essentially looking for gene models for my assembled RNAseq. Can I achieve this by setting est2genome to 1? Would I be losing or misrepresenting any data if I do this?<br>

<br>

Also, I recently noticed that my assembled RNAseq data contains antisense transcripts. In some cases MAKER seems to predict a coding region for these antisense transcripts (although the predicted coding region is much smaller than that of the sense gene model). Is there a way to tell MAKER not to predict gene models on both strands at the same locus?<br>

<br>

Appreciate your help.<br>

<br>

Thanks,<br>

Ranjani<br>

<br>


</div>

</div>

</div>

</span>

</body>

</html>