[maker-devel] Unexpected results with correct_est_fusion

Benjamin Rubin brubin at fieldmuseum.org
Tue Aug 27 08:59:50 MDT 2013


Hi Carson,

I increased pred_flank to 200 and reran MAKER with correct_est_fusion, but
I still get only ~5,000 genes (5,082, versus 5,020 with pred_flank at
100). This is from the first round only, with SNAP and Augustus trained
on the CEGMA genes. Is there anything else that I might be doing wrong? I
have attached my control file in case it is useful.
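
For anyone wanting to compare runs the same way, here is a minimal sketch of
how gene and exon totals like the ones above can be tallied from a GFF3 file.
It assumes the final annotations carry "maker" in the source column (column 2)
and use the feature types "gene" and "exon" in column 3; the function name
count_features and the file name in the usage comment are hypothetical, not
part of MAKER. Exons are counted per row, so an exon listed under several
transcripts is counted more than once.

```python
from collections import Counter


def count_features(gff_lines, source="maker"):
    """Tally GFF3 feature types (column 3) for rows whose source (column 2) matches.

    Pass source=None to count rows from every source.
    """
    counts = Counter()
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence section follows; no more feature rows
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a well-formed 9-column GFF3 row
        if source is None or cols[1] == source:
            counts[cols[2]] += 1
    return counts


# Usage (the file name is an assumption for illustration):
# with open("genome.all.gff") as handle:
#     counts = count_features(handle)
# print("genes:", counts["gene"], "exons:", counts["exon"])
```

Running this on the GFF3 output of each run makes it easy to see whether a
change such as raising pred_flank moves the gene and exon counts.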

Thanks for the help!
Ben


On Mon, Aug 26, 2013 at 2:00 PM, Carson Holt <carsonhh at gmail.com> wrote:

> The correct_est_fusion option just clips UTR on overlapping genes. I
> suspect the real problem is setting pred_flank too low. If the lead-in
> sequence to a gene is too short, ab initio predictors won't call it, so
> you are probably getting empty reports from SNAP/Augustus for the
> hint-based predictions. Try increasing pred_flank to at least 150. Setting
> pred_flank too low will also limit how far MAKER will walk out along the
> edges of the initial alignments during the polishing step (exonerate), so
> it may also be causing you to lose some EST and protein alignments.
>
> --Carson
>
>
> From: Benjamin Rubin <brubin at fieldmuseum.org>
> Date: Monday, August 26, 2013 2:20 PM
> To: <maker-devel at yandell-lab.org>
> Subject: [maker-devel] Unexpected results with correct_est_fusion
>
> Hello developers,
>
> I am using MAKER 2.28 to annotate an ant genome. I provide protein
> sequence evidence from all seven of the other sequenced ant genomes and a
> *de novo* assembled transcriptome as EST evidence. I assembled the
> transcriptome using Trinity with the jaccard_clip option turned on to
> reduce gene fusions. Despite using this set of hopefully non-fused ESTs, I
> still have substantial fusion problems with the final annotation.
> Therefore, I reduced pred_flank to 100 and turned on correct_est_fusion.
> However, correct_est_fusion leads to the prediction of a much smaller
> number of genes (~5,000 instead of ~14,000). I am initially training both
> SNAP and Augustus using CEGMA genes and then retraining based on the first
> round of annotation. Both rounds of annotation yield the same low number
> (~5,000) of genes. It may also be worth mentioning that the number of exons
> is also far lower when using correct_est_fusion (~26,000 instead of
> ~90,000).
>
> Is this the expected behavior of correct_est_fusion? I was surprised that
> it reduced the predicted number of genes by such a large margin. I am
> concerned that I am using it incorrectly. Do you have any other suggestions
> for reducing gene merging?
>
> Thanks,
> Ben
>
> --
> _____________________________________________________
> Benjamin ER Rubin
> PhD Candidate
> Committee on Evolutionary Biology
> University of Chicago
> http://www.moreaulab.org/Benjamin_Rubin.html
>
> Division of Insects
> Zoology Department
> Field Museum of Natural History
> 1400 South Lake Shore Drive
> Chicago, IL 60605
> USA
> Office: (312) 665-7776


-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_opts.ctl
Type: application/octet-stream
Size: 4811 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20130827/1a773f5e/attachment-0003.obj>