[maker-devel] Maker_opts.ctl

Nguyen, Anh-Dao (NIH/NHGRI) [C] nguyenan at mail.nih.gov
Wed Jul 16 12:30:10 MDT 2014


Thanks Daniel for your quick response.

I did not use another organism's parameter file when running Augustus.
I created the parameter file for the genome following their instructions.
There were multiple steps to train and run Augustus (creating gene
structures for training AUGUSTUS with CEGMA, which produces the parameter
file; creating hints for AUGUSTUS from EST/cDNA sequences; incorporating
Illumina RNA-Seq into AUGUSTUS with GSNAP; etc.).
As I mentioned, I ran Augustus separately because Augustus had not been
trained for that genome (no parameter file existed). Otherwise I would
have run Augustus inside MAKER.
 
You suggested using the rm_gff option to specify the RepeatMasker output
(I will convert the output to .gff3 files). Can I supply two RepeatMasker
.gff3 files, separated by a comma?

Anh-Dao
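
The thread does not settle whether rm_gff accepts a comma-separated list.
One way to sidestep the question is to merge the two RepeatMasker GFF3
files into a single file first. The following is a minimal Python sketch
of that idea, not part of MAKER. The function name and the file names in
the comment are hypothetical, and the sketch does not rename feature IDs,
so IDs that collide between the two files would need separate handling.

```python
def merge_gff3(paths, out_path):
    """Concatenate GFF3 files into one file with a single version header.

    Comment/directive lines and blank lines are dropped, and reading of
    each input stops at a ##FASTA directive so embedded sequence is not
    copied. Feature lines are written unchanged.
    """
    with open(out_path, "w") as out:
        out.write("##gff-version 3\n")
        for path in paths:
            with open(path) as handle:
                for line in handle:
                    if line.startswith("##FASTA"):
                        break  # sequence section: nothing more to copy
                    if line.startswith("#") or not line.strip():
                        continue  # skip headers, comments, blank lines
                    out.write(line if line.endswith("\n") else line + "\n")


# Example call (hypothetical file names):
# merge_gff3(["rm_lib.gff3", "rm_species.gff3"], "rm_merged.gff3")
```

The merged file could then be given as the single value of rm_gff.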


On 7/16/14 2:13 PM, "Daniel Ence" <dence at genetics.utah.edu> wrote:

>Hi Anh-Dao, 
>
>In the maker_opts.ctl file, there are options for est and protein
>evidence. You'll put all of your fasta est files together in a
>comma-separated list in the "est" option, and all of your fasta protein
>files in a comma-separated list in the "protein" option.
>
>You'll specify the SNAP and Genemark files in their respective options in
>the control file and pass the augustus and fgenesh predictions in the
>"pred_gff" option.
>
>If you have the RepeatMasker output in gff3 format you can give it to
>maker with the "rm_gff" option.
>
>If you've converted the cufflinks output to gff3, you can give it to
>maker with the "est_gff" option. I'm pretty sure Trinity only gives fasta
>output, so you would put that in the "est" option, along with all the
>other est fasta files.
>
>If Augustus isn't trained for your particular organism, then you can use
>another organism that augustus is already trained for. The list of
>species that augustus has parameter files for is in the README.txt that
>came with Augustus. I really recommend that you run Augustus from inside
>maker, because then you get all the benefits of maker passing ext-based
>hints to augustus at runtime, which can really improve Augustus'
>predictive ability.
>
>When you ran the augustus gene prediction separately, did you use another
>organism's parameter file?
>
>Thanks,
>Daniel
>
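
The option mapping described above can be sketched in Python as a small
helper that prints the corresponding control-file lines. This is only an
illustration under stated assumptions: every file name is a hypothetical
placeholder, and "snaphmm" and "gmhmm" are the SNAP and GeneMark option
names as they appear in MAKER's control file rather than names given in
this thread. rm_gff is shown with a single file because the thread does
not say whether it accepts a list.

```python
def ctl_lines(options):
    """Render {option: [paths]} as 'option=path1,path2' control-file lines."""
    return [f"{key}={','.join(paths)}" for key, paths in options.items()]


# Hypothetical file names standing in for the data listed in the thread.
options = {
    "est": ["ests.fasta", "race.fasta", "trinity.fasta"],
    "protein": ["proteins.fasta", "related_proteins.fasta"],
    "est_gff": ["cufflinks.gff3"],
    "rm_gff": ["repeatmasker.gff3"],
    "snaphmm": ["snap.hmm"],
    "gmhmm": ["es.mod"],
    "pred_gff": ["augustus.gff3", "fgenesh.gff3"],
}

for line in ctl_lines(options):
    print(line)
```

The printed lines are meant to be copied by hand over the matching
entries in maker_opts.ctl, not appended to the file.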
>
>On Jul 16, 2014, at 11:15 AM, Nguyen, Anh-Dao (NIH/NHGRI) [C]
><nguyenan at mail.nih.gov> wrote:
>
>> Hi,
>> 
>> I would like to conduct a genome annotation and have the following data:
>> - Two separate RepeatMasker outputs (using -lib and -species options)
>> - ESTs and RACE (fasta)
>> - proteins (fasta)
>> - proteins of related organisms (fasta)
>> - SNAP's .hmm file (ran CEGMA, then used cegma2zff.pl to convert to ZFF
>>format, etc. )
>> - GeneMark's .hmm file (es.mod file from running gm_es.pl)
>> - FGENESH++ and Augustus gene predictions. I wrote scripts to convert
>>the outputs to .gff3 files. I ran the Augustus gene prediction
>>separately because Augustus has never been trained for this genome.
>> - Cufflinks and Trinity from RNA-Seq
>> 
>> Could you please let me know how I can specify these as parameters in
>>the maker_opts.ctl file?
>> Or do you have other suggestions for redoing any of the data listed above?
>> 
>> Thanks.
>> Anh-Dao
>> 
>
