[maker-devel] Unexpected results with correct_est_fusion

Carson Holt carsonhh at gmail.com
Wed Aug 28 07:09:06 MDT 2013


Could you pick one contig where the number of genes shifts dramatically and
upload that contig's fasta, together with your control files and any evidence
datasets used, to one of our servers (I'm going to send you connection
details in a separate e-mail)?  I can then run with and without
correct_est_fusion to see if there is anything unexpected going on.
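
A minimal sketch of one way to find such a contig, assuming both runs were
exported as GFF3 and that gene models appear as rows whose third column is
"gene". The function names and the optional source filter are illustrative
and not part of MAKER:

```python
from collections import Counter


def count_genes_per_contig(gff_lines, source=None):
    """Count rows of type 'gene' per sequence ID in GFF3 lines.

    Assumption: gene models are rows whose third column is 'gene'.
    If `source` is given, only rows with that second column are counted.
    """
    counts = Counter()
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence follows; no more feature rows
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, pragmas, and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 3 or cols[2] != "gene":
            continue
        if source is not None and cols[1] != source:
            continue
        counts[cols[0]] += 1
    return counts


def rank_contigs_by_shift(counts_a, counts_b):
    """Return (contig, count_a, count_b) rows, largest absolute difference first."""
    contigs = set(counts_a) | set(counts_b)
    rows = [(c, counts_a.get(c, 0), counts_b.get(c, 0)) for c in contigs]
    # Sort by descending difference, then by contig name for stable output.
    return sorted(rows, key=lambda r: (-abs(r[1] - r[2]), r[0]))
```

Calling count_genes_per_contig on the open GFF3 file from each run and
passing both results to rank_contigs_by_shift puts the most affected contig
first.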

--Carson



From:  Benjamin Rubin <brubin at fieldmuseum.org>
Date:  Tuesday, August 27, 2013 10:59 AM
To:  Carson Holt <carsonhh at gmail.com>
Cc:  <maker-devel at yandell-lab.org>
Subject:  Re: [maker-devel] Unexpected results with correct_est_fusion

Hi Carson,

I increased pred_flank to 200 and reran MAKER with correct_est_fusion, but I
still only get ~5,000 genes (5,082 instead of the 5,020 with pred_flank at
100). This is using only the first round with SNAP and Augustus trained on
the CEGMA genes. Is there anything else that I might be doing wrong? I have
attached my control file in case that could be useful.
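
For side-by-side runs that differ only in these two settings, a small helper
like the one below could rewrite the control file. It is only a sketch: it
assumes options are written one per line as key=value followed by an optional
#comment, and set_ctl_options is a hypothetical helper, not a MAKER utility.

```python
import re


def set_ctl_options(ctl_text, updates):
    """Return control-file text with the given options set to new values.

    Assumption: each option sits on its own line as `key=value #comment`.
    Trailing comments are preserved; unknown keys raise KeyError.
    """
    remaining = dict(updates)
    out = []
    for line in ctl_text.splitlines():
        m = re.match(r"^(\w+)=([^#]*)(#.*)?$", line)
        if m and m.group(1) in remaining:
            key = m.group(1)
            comment = m.group(3) or ""
            value = str(remaining.pop(key))
            line = f"{key}={value} {comment}".rstrip()
        out.append(line)
    if remaining:
        raise KeyError(f"options not found: {sorted(remaining)}")
    return "\n".join(out) + "\n"
```

For example, set_ctl_options(text, {"pred_flank": 200, "correct_est_fusion": 1})
would give the configuration described above, and changing only
correct_est_fusion to 0 would give the comparison run.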

Thanks for the help!
Ben


On Mon, Aug 26, 2013 at 2:00 PM, Carson Holt <carsonhh at gmail.com> wrote:
> The correct_est_fusion option just clips UTR on overlapping genes.  I suspect
> the real problem is setting pred_flank too low.  If your lead-in sequence to a
> gene is too short, ab initio predictors won't call it.  So you are probably
> getting empty reports from SNAP/Augustus for the hint-based predictions.  Try
> increasing pred_flank to at least 150.  Setting pred_flank too low will also
> limit how far MAKER will walk out along the edges of initial alignments during
> the polishing step (exonerate).  So setting it too low may also be causing you
> to lose some EST and protein alignments.
> 
> --Carson
> 
> 
> From:  Benjamin Rubin <brubin at fieldmuseum.org>
> Date:  Monday, August 26, 2013 2:20 PM
> To:  <maker-devel at yandell-lab.org>
> Subject:  [maker-devel] Unexpected results with correct_est_fusion
> 
> Hello developers,
> 
> I am using MAKER 2.28 to annotate an ant genome. I provide protein sequence
> evidence from all seven of the other sequenced ant genomes and a de novo
> assembled transcriptome as EST evidence. I assembled the transcriptome using
> Trinity with the jaccard_clip option turned on to reduce gene fusions. Despite
> using this set of hopefully non-fused ESTs, I still have substantial fusion
> problems with the final annotation. Therefore, I reduced pred_flank to 100 and
> turned on correct_est_fusion. However, correct_est_fusion leads to the
> prediction of a much smaller number of genes (~5,000 instead of ~14,000). I am
> initially training both SNAP and Augustus using CEGMA genes and then
> retraining based on the first round of annotation. Both rounds of annotation
> yield the same low number (~5,000) of genes. It may be worth mentioning
> that the number of exons is also far lower when using correct_est_fusion
> (~26,000 instead of ~90,000).
> 
> Is this the expected behavior of correct_est_fusion? I was surprised that it
> reduced the predicted number of genes by such a large margin. I am concerned
> that I am using it incorrectly. Do you have any other suggestions for reducing
> gene merging?
> 
> Thanks,
> Ben
> 
> -- 
> _____________________________________________________
> Benjamin ER Rubin
> PhD Candidate
> Committee on Evolutionary Biology
> University of Chicago
> http://www.moreaulab.org/Benjamin_Rubin.html
> 
> Division of Insects
> Zoology Department
> Field Museum of Natural History
> 1400 South Lake Shore Drive
> Chicago, IL 60605
> USA
> Office: (312) 665-7776


