[maker-devel] MAKER
Keilwagen, Jens
jens.keilwagen at julius-kuehn.de
Fri Sep 22 14:15:23 MDT 2017
Hi Carson,
Thanks a lot for the information.
Just to be sure that I understand you correctly: is it impossible to obtain MAKER results based on RNA-seq and homology that differ from purely homology-based MAKER results?
Could you confirm that?
Thanks a lot and best regards, Jens
> -----Original Message-----
> From: Carson Hinton Holt [mailto:carson.holt at genetics.utah.edu]
> Sent: Friday, 22 September 2017 22:04
> To: Keilwagen, Jens
> Cc: Maker Mailing List
> Subject: Re: MAKER
>
> MAKER won’t produce est2genome results for est_gff. This is partially
> because est2genome results are only used for training gene predictors.
> So you are essentially just getting protein2genome results from your
> runs. Once you get a gene predictor trained you will see a difference,
> as it will use the intron/exon structure of alignments as hints to
> improve gene predictor performance.
>
> —Carson
>
>
> > On Sep 21, 2017, at 1:57 AM, Keilwagen, Jens <jens.keilwagen at julius-kuehn.de> wrote:
> >
> > Hi Carson,
> >
> > I have tried the proposed options for a small example (yeast).
> >
> > I had
> > - proteins (fasta) from another yeast and
> > - transcript annotation (gff) from cufflinks and StringTie
> >
> > I'd like to compare the maker results for
> > - proteins and StringTie
> > Vs.
> > - proteins and cufflinks
> >
> > I used the default options, except:
> > genome=<genome fasta>
> >
> > protein=<protein fasta>
> > est_gff=<transcript gff>
> >
> > est2genome=1
> > protein2genome=1
> >
> > (An example is attached.)
> >
> > Then I ran maker:
> >
> > maker -RM_off -c 24
> > find . -type f -name '*.gff' -exec cat {} + | grep maker > filtered-maker-prediction.gff
> >
> > (The run seems to be okay. There were no FAILED, ... in the log. Cf.
> > attachment)
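
The filtering step above can also be done on the GFF3 source column instead of with a plain text match. This is a minimal sketch, not part of MAKER: the function names are hypothetical, and it assumes the final models carry the source "maker" in column 2. A plain grep for "maker" would also keep any line that merely contains that string, for example a source named "repeatmasker" or an attribute value.

```python
from pathlib import Path
from typing import Iterable, Iterator


def maker_features(lines: Iterable[str], source: str = "maker") -> Iterator[str]:
    """Yield GFF3 feature lines whose source column (column 2) equals `source`."""
    for line in lines:
        if line.startswith("##FASTA"):
            break  # everything after this directive is sequence, not features
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) == 9 and cols[1] == source:
            yield line if line.endswith("\n") else line + "\n"


def collect(root: str, out_path: str) -> int:
    """Write matching lines from every *.gff under `root`; return how many were written."""
    out = Path(out_path).resolve()
    count = 0
    with open(out, "w") as fh:
        for gff in sorted(Path(root).rglob("*.gff")):
            if gff.resolve() == out:
                continue  # never read the file we are writing
            with open(gff) as src:
                for feature in maker_features(src):
                    fh.write(feature)
                    count += 1
    return count
```

A call such as collect(".", "filtered-maker-prediction.gff") would then replace the find/grep pipeline for one run directory.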
> >
> > Each maker run was started in a separate subdirectory.
> > However, I realized that both maker runs yielded almost the same
> > result (just one minor difference). This made me curious.
> > As far as I understood the files, I received the (filtered?)
> > exonerate predictions for the proteins (from the other yeast). Is this
> > correct? Why did I not receive any predictions (purely) based on the
> > RNA-seq data? Did I do something wrong?
> >
> > I'm looking forward to your reply.
> >
> > Best regards, Jens
> >
> >
> >> -----Original Message-----
> >> From: Carson Hinton Holt [mailto:carson.holt at genetics.utah.edu]
> >> Sent: Tuesday, 19 September 2017 23:37
> >> To: Keilwagen, Jens
> >> Subject: Re: MAKER
> >>
> >> MAKER cannot use the BAM directly, but you can use something like
> >> stringtie or trinity to assemble a transcript fasta that can be
> >> given to the est= option.
> >>
> >> Ab initio gene prediction is only enabled if you specify an hmm or
> >> species file to use. If all you want is homology based annotation,
> >> you can try the est2genome and protein2genome options. Note the
> >> final models may be partial if the alignments do not cover the gene
> >> end to end.
> >>
> >> —Carson
> >>
> >>
> >>
> >>> On Sep 18, 2017, at 4:02 AM, Keilwagen, Jens <jens.keilwagen at julius-kuehn.de> wrote:
> >>>
> >>> Hi Carson,
> >>>
> >>> thanks a lot for your last email.
> >>>
> >>> I was asked to do homology-based gene prediction using RNA-seq, and
> >>> Maker was proposed as one option.
> >>> Hence I'd like to ask how to do that in the best possible way.
> >>> I have mapped RNA-seq data (SAM/BAM) and a fasta of proteins from a
> >>> related species. How can I integrate the RNA-seq data?
> >>>
> >>> Is it possible to deactivate ab-initio gene prediction by Augustus
> >>> or SNAP?
> >>>
> >>> Thanks a lot in advance.
> >>>
> >>> Best regards, Jens
> >>>
> >>>> -----Original Message-----
> >>>> From: Carson Holt [mailto:carson.holt at genetics.utah.edu]
> >>>> Sent: Thursday, 18 February 2016 19:03
> >>>> To: Keilwagen, Jens
> >>>> Cc: Mark Yandell
> >>>> Subject: Re: MAKER
> >>>>
> >>>> GeMoMa sounds like an interesting tool. If it produces GFF3, you
> >>>> could give the GFF3 results to the pred_gff= option in MAKER
> >>>> (comma separated lists accepted). The GFF3 file of predictions
> >>>> must be in the same coordinate space as the assembly being
> >>>> annotated (genome= option).
> >>>> Whatever you give to pred_gff will be treated as raw predictions
> >>>> by MAKER and will only be accepted as a final model if there are
> >>>> evidence alignments (protein/EST) that support the model, and if
> >>>> there are multiple alternate models at the same locus, only the
> >>>> model that is best supported by the protein/transcript evidence
> >>>> is kept.
> >>>>
> >>>> You can also set the keep_preds=1 option when using pred_gff. This
> >>>> will cause even raw predictions with no evidence support to be
> >>>> maintained. In the event of multiple models with no evidence
> >>>> support, the model best matching the consensus of alternate models
> >>>> will be maintained.
> >>>>
> >>>> Alternatively you can use the model_gff= option (comma separated
> >>>> list ok) to input the GFF3 file. model_gff features are given
> >>>> higher confidence than pred_gff. At least one model will always be
> >>>> kept regardless of evidence support (same rules as pred_gff
> >>>> selection for which model to keep when there are multiple). But
> >>>> model_gff will also affect how evidence clusters are determined
> >>>> compared to pred_gff (model_gff features are allowed to merge
> >>>> bridging evidence clusters).
> >>>> MAKER will also go to extra lengths to pull forward existing names
> >>>> and other data in the GFF3 for model_gff features.
> >>>>
> >>>> If you do not have GFF3 files in the right coordinate space, but
> >>>> do have protein fasta or transcript fasta for the GeMoMa
> >>>> predictions, you can supply these to the protein= and transcript=
> >>>> options in MAKER together with est2genome=1 or protein2genome=1.
> >>>> This will cause MAKER to place the models using exonerate. You
> >>>> would probably also need to add est_forward=1 to the control files
> >>>> to have MAKER try and derive model names from the name of evidence
> >>>> alignments they were derived from if you go this route.
> >>>>
> >>>> You can also try treating the GFF3 predictions as hints to
> >>>> traditional ab initio gene finders like SNAP or Augustus by giving
> >>>> them to the est_gff= or protein_gff= options (i.e. make GeMoMa
> >>>> predictions inform the behavior of predictors like SNAP and
> >>>> Augustus). Might be interesting. You would have to alter results
> >>>> to be match/match_part GFF3 features to give them to the est_gff
> >>>> or protein_gff options.
> >>>>
> >>>> Let me know if you have any more questions, and I’ll do my best to
> >>>> help.
> >>>>
> >>>> Thanks,
> >>>> Carson
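
The conversion to match/match_part features mentioned above could look roughly like this. It is a minimal sketch under stated assumptions, not a GeMoMa or MAKER tool: it assumes the input GFF3 has mRNA features with CDS children (the feature types in real GeMoMa output may differ), the function names and the generated part IDs are hypothetical, and the thread does not say whether MAKER expects further attributes such as Target.

```python
from typing import Dict, Iterable, Iterator, Sequence


def parse_attrs(field: str) -> Dict[str, str]:
    """Split a GFF3 attribute column (key=value;key=value) into a dict."""
    attrs = {}
    for item in field.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def to_match_features(
    lines: Iterable[str],
    parent_types: Sequence[str] = ("mRNA",),  # assumption: transcript-level type
    child_types: Sequence[str] = ("CDS",),    # assumption: segment-level type
) -> Iterator[str]:
    """Rewrite transcript features as match and their segments as match_part."""
    part_counter: Dict[str, int] = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue
        attrs = parse_attrs(cols[8])
        if cols[2] in parent_types and "ID" in attrs:
            cols[2] = "match"
            cols[8] = "ID={0};Name={0}".format(attrs["ID"])
        elif cols[2] in child_types and "Parent" in attrs:
            parent = attrs["Parent"]  # assumption: exactly one parent per segment
            part_counter[parent] = part_counter.get(parent, 0) + 1
            cols[2] = "match_part"
            cols[8] = "ID={0}:part{1};Parent={0}".format(parent, part_counter[parent])
        else:
            continue  # gene-level and other features are dropped
        cols[7] = "."  # phase is not meaningful for alignment-style features
        yield "\t".join(cols) + "\n"
```

The converted lines, one file per reference organism, could then be written out and passed to est_gff= or protein_gff= as described in the message above.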
> >>>>
> >>>>
> >>>>
> >>>>> On Feb 18, 2016, at 10:22 AM, Mark Yandell <myandell at genetics.utah.edu> wrote:
> >>>>>
> >>>>>
> >>>>> Mark Yandell
> >>>>> Professor of Human Genetics
> >>>>> H.A. & Edna Benning Presidential Endowed Chair Co-director USTAR
> >>>>> Center for Genetic Discovery Eccles Institute of Human Genetics
> >>>>> University of Utah
> >>>>> 15 North 2030 East, Room 2100
> >>>>> Salt Lake City, UT 84112-5330
> >>>>> ph:801-587-7707
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 2/18/16, 8:34 AM, "Keilwagen, Jens" <jens.keilwagen at jki.bund.de> wrote:
> >>>>>
> >>>>>> Dear Prof. Yandell,
> >>>>>>
> >>>>>> we have published a homology-based gene prediction program today:
> >>>>>> https://nar.oxfordjournals.org/content/early/2016/02/17/nar.gkw092
> >>>>>> and I'd like to ask how we can use MAKER to combine predictions
> >>>>>> of GeMoMa using different reference organisms, i.e. we try to
> >>>>>> predict the genes of a target organism (e.g. wheat) using the
> >>>>>> annotated genes of other reference organisms (e.g. grasses).
> >>>>>> GeMoMa returns for each reference organism a GFF with the
> >>>>>> predicted gene models in the target organism.
> >>>>>>
> >>>>>> It would be great if you or someone from your team could give us
> >>>>>> some hints or point us to the correct paragraph in the
> >>>>>> documentation.
> >>>>>>
> >>>>>> Thanks a lot and best regards, Jens
> >>>>>>
> >>>>>> ---
> >>>>>>
> >>>>>> Dr. Jens Keilwagen
> >>>>>>
> >>>>>> Julius Kühn-Institut (JKI) - Federal Research Centre for
> >>>>>> Cultivated Plants
> >>>>>> Institute for Biosafety in Plant Biotechnology
> >>>>>>
> >>>>>> Erwin-Baur-Straße 27
> >>>>>> 06484 Quedlinburg
> >>>>>> Germany
> >>>>>>
> >>>>>> Phone: ++49 (0)3946 47 510
> >>>>>> EMail: jens.keilwagen at jki.bund.de
> >>>>>>
> >>>>>>
> >>>>>
> >>>
> >
> > <maker_opts.ctl><slurm-278767.out>