[maker-devel] question regarding alternate splicing annotation
Walter Eckalbar
weckalba at asu.edu
Mon Jun 4 12:23:18 MDT 2012
Hi Carson,
Thanks for the quick reply. I have already trained SNAP and Augustus based
on the de novo assembly from the same RNA-seq data used to generate the
cufflinks annotations. Those SNAP and Augustus predictions were part of my
initial annotation, along with the de novo assembled transcripts, previous
reference annotations, and related species protein alignments. I had
broken this up because of run time issues, and thought it might speed
things along. I will switch gears to add in the SNAP and Augustus ab
initio predictions.
As you might infer, this transcript data could come in a great many formats
(i.e., outputs from cufflinks for each sample, cuffmerge, tophat, trinity,
or raw). Do you have any suggestions for what might be a good balance
between speed and completeness? I also cannot for the life of me get Maker
installed on our cluster, but I do have Augustus and SNAP installed there.
I have a massive amount of RNA-seq data I'm trying to incorporate, so I'm
confident plenty of alternative splicing could be found, but I'm hitting
time issues due to the scale (i.e., a four-day wall-time limit on the
cluster and only an 8-core machine in house).
I've tried other programs to do this, but obviously cufflinks gives you way
too much, and I'm finding EVM, while fast, is too happy to shorten gene
models based on partial transcript evidence, which requires way more manual
correction than we are capable of doing.
I'll start with just letting Maker run SNAP and Augustus, and let you know
how it goes.
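A minimal sketch of the maker_opts.ctl settings this implies, assuming the
standard MAKER 2 control-file option names. The file names and the Augustus
species name are placeholders, not real paths:

```
#-----EST Evidence
est_gff=cufflinks_transcripts.gff3   # placeholder: cufflinks output as GFF3
#-----Gene Prediction
snaphmm=my_species.hmm               # placeholder: SNAP HMM trained earlier
augustus_species=my_species          # placeholder: trained Augustus species
model_gff=current_annotations.gff3   # placeholder: existing gene models
#-----MAKER Behavior Options
alt_splice=1                         # look for alternative splice forms
```

With snaphmm and augustus_species set, Maker runs the ab initio predictors
itself instead of relying on GFF3 pass-through alone.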
Walter
On 4 June 2012 11:02, Carson Holt <carsonhh at gmail.com> wrote:
> Using GFF3 pass-through options alone won't allow for the alternate splice
> prediction to work. You have to also allow gene predictors like SNAP and
> Augustus to run. MAKER uses mutually exclusive EST data to produce
> separate hint files in some cases that can produce alternate splice forms
> from the ab initio predictors. The EST evidence must generally be very
> long or it will not produce alternate forms. These alternate splice
> models can then compete against your existing gene models based on scoring
> statistics MAKER produces and potentially replace them. This may not be
> what you want though. The alternate splice prediction works better de novo
> than for re-annotation.
>
> The alternate splicing option still needs more work, but I would
> appreciate any feedback.
>
> Thanks,
> Carson
>
>
>
>
> From: Walter Eckalbar <weckalba at asu.edu>
> Date: Monday, 4 June, 2012 1:41 PM
> To: <maker-devel at yandell-lab.org>
> Subject: [maker-devel] question regarding alternate splicing annotation
>
> Hi Maker developers,
>
> I am trying to expand on some current annotations that are already quite
> good, but only predict protein coding sequence and one isoform per gene, to
> add UTRs and alternative splice forms from cufflinks data. To do this I
> put the current annotations in both the model_gff and using the gff_field,
> plus the cufflinks gff3 for the ests (as I noticed was suggested in a
> previous email). I've left everything else as default, except changing
> alt_splice=1. I am watching the progress of the *.gff.ann files, but I'm
> not noticing alternate splicing being added, while UTRs are being picked up
> (exons being added, etc.). This is a vertebrate genome, so run times are
> fairly long and I just wanted to double check that I wasn't missing
> something. Will Maker go back through a second step to annotate
> alternative splicing? Or should I be trying something a little different?
>
> Thanks,
>
> Walter