[maker-devel] (no subject)

Panos Ioannidis panos.ioannidis at gmail.com
Mon Jul 14 01:20:50 MDT 2014


Daniel, thanks for the info.

Regarding (3), the only reason I'm thinking of running the BLAST searches
separately is that I'm currently unable to run Maker on our cluster because
of a problem in the Perl "forks" library. It looks like there isn't much I
can do about it: I tried Perlbrew, but it fails when I try to install Perl
versions <5.18 (the problem in forks occurs on 5.18 and later versions).
Our admin also tried changing the code in the forks.pm file as per
Carson's suggestion in another thread, but that didn't work either. As a
result, I'm running Maker on my workstation (really slow) until a solution
is found, and since BLAST is a time-consuming step I was thinking of running
it separately.
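
For reference, the conversion step discussed in (3) could look roughly like
the sketch below. It is only an illustration under several assumptions: the
BLAST run is a blastx search (genome contigs as query, proteins as database)
written in tabular format (-outfmt 6), each HSP is emitted as its own
protein_match feature with a single match_part child, and the function name
blastx_tab_to_gff3 is made up for this example. Whether Maker needs HSPs
grouped per hit, or extra attributes, should be checked against GFF3 that
Maker itself produces.

```python
def blastx_tab_to_gff3(lines, source="blastx"):
    """Turn blastx tabular HSPs (-outfmt 6) into GFF3 lines.

    Assumed column order (the BLAST+ default for -outfmt 6):
    qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore
    """
    out = ["##gff-version 3"]
    n = 0
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        f = line.split("\t")
        if len(f) < 12:
            continue  # not a complete tabular record
        contig, protein = f[0], f[1]
        qstart, qend, sstart, send = (int(x) for x in f[6:10])
        bitscore = f[11]
        # blastx reports minus-strand hits with qstart > qend;
        # GFF3 requires start <= end plus an explicit strand column.
        strand = "+" if qstart <= qend else "-"
        start, end = min(qstart, qend), max(qstart, qend)
        n += 1
        hit_id = f"hit{n}"
        # NOTE: protein IDs containing ';', '=', ',' or spaces would need
        # URL-escaping to be valid in GFF3 attributes; not handled here.
        parent = [contig, source, "protein_match", str(start), str(end),
                  bitscore, strand, ".", f"ID={hit_id};Name={protein}"]
        child = [contig, source, "match_part", str(start), str(end),
                 bitscore, strand, ".",
                 f"ID={hit_id}.hsp1;Parent={hit_id};"
                 f"Target={protein} {sstart} {send}"]
        out.append("\t".join(parent))
        out.append("\t".join(child))
    return out


# Example: one minus-strand HSP on a hypothetical contig.
example = ["contig1\tprotA\t90.0\t50\t5\t0\t300\t151\t1\t50\t1e-20\t100"]
gff_lines = blastx_tab_to_gff3(example)
print("\n".join(gff_lines))
```

The resulting lines would be saved to a file and that file given to the
protein_gff parameter in maker_opts.ctl.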


On Fri, Jul 11, 2014 at 4:08 PM, Daniel Ence <dence at genetics.utah.edu>
wrote:

>  Hi Panos,
>
>  1) You'll only use est2genome and protein2genome for creating models
> that will be used for training the ab-initio predictors (like SNAP).
> Sometimes that means one run of MAKER for training; sometimes that means
> two runs of MAKER. You usually don't gain any accuracy after the second
> round of training. It's ok to use both EST and protein data for this
> training step.
>
>  2) If you're using both ESTs and protein sequence to train your
> ab-initio predictors, then both est2genome and protein2genome should be set
> to 1.
>
>  3) If you want to pass BLAST results to MAKER, you'll need to pass those
> results as GFF3. However, MAKER will install and run BLAST for you, and it
> does a good job of keeping track of all those results and making them
> accessible to you in the end, so it's going to be a lot of work to do those
> BLAST searches on your own outside of MAKER. I strongly suggest that you
> use the BLAST internal to MAKER.
>
>  Daniel Ence
> Graduate Student
> Eccles Institute of Human Genetics
> University of Utah
> 15 North 2030 East, Room 2100
> Salt Lake City, UT 84112-5330
>   ------------------------------
> *From:* maker-devel [maker-devel-bounces at yandell-lab.org] on behalf of
> Panos Ioannidis [panos.ioannidis at gmail.com]
> *Sent:* Friday, July 11, 2014 5:56 AM
> *To:* maker-devel
> *Subject:* [maker-devel] (no subject)
>
>   I got back to my annotations this past week and have a couple of
> questions!
>
>  1) Since my organism isn't closely related to any other that's already
> sequenced, I will have to run maker twice (according to the tutorial). So
> for the first run I see that some people use only the ESTs and some others
> use ESTs and a protein database (CEGMA, Uniref50, Swiss-Prot, etc). I guess
> that the ESTs will give better models, but for the cases where genes aren't
> covered by an EST, it's okay to have a protein database to detect them as
> well. Am I right? What do you think?
>
>  2) In case I use both ESTs and a protein database, how should I set the
> est2genome and protein2genome parameters in the maker_opts.ctl file?
> Should they both be set to "1"?
>
>  3) I've been thinking of running the BLAST searches separately and
> giving the results directly to Maker. I guess that in this case, I'll have
> to first convert the BLAST output to a GFF3 file and give it to the
> protein_gff parameter, right?
>
>  Thanks,
> Panos
>