[maker-devel] Maker_opts.ctl

Carson Holt carsonhh at gmail.com
Wed Jul 16 12:43:33 MDT 2014


You can use comma-separated lists.

--Carson
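
A minimal sketch of how the options discussed in this thread might look in
maker_opts.ctl. It is not taken from the original messages: every file name
and the species name below are hypothetical placeholders, and it assumes (per
the reply above) that each of these options accepts a comma-separated list.

```
#-----hypothetical maker_opts.ctl fragment (all file and species names are placeholders)
est=ests.fasta,race.fasta,trinity.fasta #EST, RACE, and Trinity transcripts (fasta)
protein=proteins.fasta,related_proteins.fasta #protein evidence (fasta)
est_gff=cufflinks.gff3 #Cufflinks output already converted to GFF3
rm_gff=rm_lib.gff3,rm_species.gff3 #two RepeatMasker runs as a comma-separated list
snaphmm=my_genome.hmm #SNAP HMM file
gmhmm=es.mod #GeneMark HMM file
pred_gff=fgenesh.gff3,augustus.gff3 #external gene predictions in GFF3
augustus_species=my_species #only when running Augustus inside MAKER; name of a folder under Augustus' config/species
```

If Augustus is run inside MAKER through augustus_species, the separately
generated Augustus GFF3 would presumably be left out of pred_gff.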


On 7/16/14, 12:42 PM, "Daniel Ence" <dence at genetics.utah.edu> wrote:

>Hi Anh-Dao, so as I understand it, the process of training and running
>augustus will create a set of “param” files that Augustus can use later
>on. If that’s true, then you can just copy those files to the
>“config/species” folder of your augustus installation and then augustus
>(when you call it from inside maker) can use those parameters when it
>runs. 
>
>Did you end up with a gff3 file or with files like “exon_prob”,
>“utr_probs” from augustus? Or did you have both?
>
>I’m pretty sure that you can’t use a comma-separated list for the rm_gff.
>You could concatenate the two files and then pass the one file to maker,
>but you also might need to have it sorted by genomic location. Carson
>could confirm that for me.
>
>~Daniel
>
>
>On Jul 16, 2014, at 12:30 PM, Nguyen, Anh-Dao (NIH/NHGRI) [C]
><nguyenan at mail.nih.gov> wrote:
>
>> Thanks Daniel for your quick response.
>> 
>> I did not use another organism's parameter file when running Augustus.
>> I created the parameter file for the genome following their instructions.
>> There were multiple steps to train and run Augustus (Creating gene
>> structures for training AUGUSTUS with CEGMA => parameter file will be
>> created; Creating Hints for AUGUSTUS from ESTs/cDNA sequences;
>> Incorporating Illumina RNAseq into AUGUSTUS with GSNAP, etc.)
>> As I mentioned, I ran Augustus separately because Augustus has not
>> been trained on that genome (no parameter file exists). Otherwise I
>> would run Augustus inside MAKER.
>> 
>> You suggested using the rm_gff option to specify RepeatMasker output (sure,
>> I will convert them to .gff3 formatted files). Can I submit two RM .gff3
>> files, separated by a comma?
>> 
>> Anh-Dao
>> 
>> 
>> On 7/16/14 2:13 PM, "Daniel Ence" <dence at genetics.utah.edu> wrote:
>> 
>>> Hi Anh-Dao, 
>>> 
>>> In the maker_opts.ctl file, there are options for est and protein
>>> evidence. You’ll put all of your fasta est files together in a
>>> comma-separated list in the “est” option, and all of your fasta protein
>>> files in a comma-separated list for the “protein” option.
>>> 
>>> You’ll specify the SNAP and Genemark files in their respective options
>>> in the control file and pass the augustus and fgenesh predictions in the
>>> “pred_gff” option.
>>> 
>>> If you have the RepeatMasker output in gff3 format you can give it to
>>> maker with the “rm_gff” option.
>>> 
>>> If you’ve converted the cufflinks output to gff3, you can give it to
>>> maker with the “est_gff” option. I’m pretty sure Trinity only gives
>>> fasta output, so you would put that in the “est” option, along with all
>>> the other est fasta files.
>>> 
>>> If Augustus isn’t trained for your particular organism, then you can
>>> use another organism that augustus is already trained for. The list of
>>> species that augustus has parameter files for is in the README.txt that
>>> came with Augustus. I really recommend that you run Augustus from
>>> inside maker, because then you get all the benefits of maker passing
>>> EST-based hints to augustus at runtime, which can really improve
>>> Augustus’ predictive ability.
>>> 
>>> When you ran the augustus gene prediction separately, did you use
>>> another organism’s parameter file?
>>> 
>>> Thanks,
>>> Daniel
>>> 
>>> 
>>> On Jul 16, 2014, at 11:15 AM, Nguyen, Anh-Dao (NIH/NHGRI) [C]
>>> <nguyenan at mail.nih.gov> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> I would like to conduct a genome annotation and have the following
>>>> data:
>>>> - Two separate RepeatMasker outputs (using -lib and -species options)
>>>> - ESTs and RACE (fasta)
>>>> - proteins (fasta)
>>>> - proteins of related organisms (fasta)
>>>> - SNAP's .hmm file (ran CEGMA, then used cegma2zff.pl to convert to
>>>> ZFF format, etc.)
>>>> - GeneMark's .hmm file (es.mod file from running gm_es.pl)
>>>> - FGENESH++ and Augustus gene predictions. I wrote scripts to convert
>>>> the outputs to .gff3 files. I ran Augustus gene prediction separately
>>>> because Augustus has never been trained on this genome.
>>>> - Cufflinks and Trinity from RNA-Seq
>>>> 
>>>> Could you please let me know how I can specify parameters in the
>>>> maker_opts.ctl file?
>>>> Or do you have other suggestions for redoing any of the data listed above?
>>>> 
>>>> Thanks.
>>>> Anh-Dao
>>>> 
>>>> _______________________________________________
>>>> maker-devel mailing list
>>>> maker-devel at box290.bluehost.com
>>>> 
>>>>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>> 
>> 
>
>
