[maker-devel] Maker_opts.ctl

Carson Holt carsonhh at gmail.com
Wed Jul 16 13:32:02 MDT 2014


Setting model_org=all uses the whole of RepBase, or you can set it to
'metazoa' as in your previous run.  Then provide the RepeatModeler file to
rmlib=
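
As a rough sketch, the repeat masking entries in maker_opts.ctl might then
look like this. The library path is a hypothetical placeholder; point it at
the FASTA library that RepeatModeler produced for your genome:

    model_org=all #or metazoa, as in the previous run
    rmlib=/path/to/repeatmodeler_library.fa #hypothetical path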

--Carson



On 7/16/14, 1:28 PM, "Nguyen, Anh-Dao (NIH/NHGRI) [C]"
<nguyenan at mail.nih.gov> wrote:

>By default, model_org=all. Can I use the de novo repeat library predicted
>by RepeatModeler for the rmlib option?
>
>Anh-Dao
>
>
>
>On 7/16/14 3:17 PM, "Carson Holt" <carsonhh at gmail.com> wrote:
>
>>No.  You can provide both to MAKER. The options are model_org= and
>>rmlib=. By letting MAKER handle repeat masking it will differentiate
>>repeat types and use soft masking for some and hard masking for others.
>>This increases sensitivity of evidence alignments while still
>>maintaining specificity.
>>
>>--Carson
>>
>>
>>
>>On 7/16/14, 1:07 PM, "Nguyen, Anh-Dao (NIH/NHGRI) [C]"
>><nguyenan at mail.nih.gov> wrote:
>>
>>>I will run Augustus and FGENESH++ inside of MAKER using the parameter
>>>files for Augustus.
>>>I could also run RepeatMasker inside of MAKER. However, I ran RM using
>>>two options: -lib (de novo) and -species (known). I got ~45% repeats
>>>with the de novo library and ~4% repeats with the known library. As I
>>>understand it, RM inside of MAKER uses only the RepBase repeat library
>>>and the RepeatRunner protein database.
>>>
>>>Anh-Dao
>>>
>>>
>>>On 7/16/14 2:36 PM, "Carson Holt" <carsonhh at gmail.com> wrote:
>>>
>>>>When you ran Augustus separately, it should have created the
>>>>parameters needed to run it.  Now you should be able to run it inside
>>>>of MAKER using the species name you just created.
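>>>>
>>>>A minimal sketch of the corresponding maker_opts.ctl line, assuming the
>>>>species was named my_species during training (a hypothetical name):
>>>>
>>>>    augustus_species=my_species #hypothetical; use the name you trained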
>>>>
>>>>I'd also recommend letting MAKER run RepeatMasker for you rather than
>>>>giving it the results as GFF3.
>>>>
>>>>--Carson
>>>>
>>>>
>>>>On 7/16/14, 12:30 PM, "Nguyen, Anh-Dao (NIH/NHGRI) [C]"
>>>><nguyenan at mail.nih.gov> wrote:
>>>>
>>>>>Thanks Daniel for your quick response.
>>>>>
>>>>>I did not use another organism's parameter file when running
>>>>>Augustus. I created the parameter file for the genome following their
>>>>>instructions.
>>>>>There were multiple steps to train and run Augustus (Creating gene
>>>>>structures for training AUGUSTUS with CEGMA => parameter file will be
>>>>>created; Creating Hints for AUGUSTUS from ESTs/cDNA sequences;
>>>>>Incorporating Illumina RNAseq into AUGUSTUS with GSNAP, etc.)
>>>>>As I mentioned, I ran Augustus separately because Augustus had not
>>>>>been trained on that genome (no parameter file existed). Otherwise I
>>>>>would have run Augustus inside MAKER.
>>>>> 
>>>>>You suggested using the rm_gff option to specify the RepeatMasker
>>>>>output (sure, I will convert it to .gff3 formatted files). Can I
>>>>>submit two RM .gff3 files, separated by a comma?
>>>>>
>>>>>Anh-Dao
>>>>>
>>>>>
>>>>>On 7/16/14 2:13 PM, "Daniel Ence" <dence at genetics.utah.edu> wrote:
>>>>>
>>>>>>Hi Anh-Dao, 
>>>>>>
>>>>>>In the maker_opts.ctl file, there are options for est and protein
>>>>>>evidence. You'll put all of your fasta est files together in a
>>>>>>comma-separated list in the "est" option, and all of your fasta
>>>>>>protein files in a comma-separated list for the "protein" option.
>>>>>>
>>>>>>You'll specify the SNAP and Genemark files in their respective
>>>>>>options in the control file and pass the augustus and fgenesh
>>>>>>predictions in the "pred_gff" option.
>>>>>>
>>>>>>If you have the RepeatMasker output in gff3 format you can give it to
>>>>>>maker with the "rm_gff" option.
>>>>>>
>>>>>>If you've converted the cufflinks output to gff3, you can give it to
>>>>>>maker with the "est_gff" option. I'm pretty sure Trinity only gives
>>>>>>fasta output, so you would put that in the "est" option, along with
>>>>>>all the other est fasta files.
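>>>>>>
>>>>>>Put together, those entries might look roughly like this in
>>>>>>maker_opts.ctl. All file names except es.mod are hypothetical
>>>>>>placeholders, and this assumes pred_gff takes a comma-separated list
>>>>>>the same way est and protein do:
>>>>>>
>>>>>>    est=ests.fasta,race.fasta,trinity.fasta #hypothetical names
>>>>>>    protein=proteins.fasta,related_proteins.fasta #hypothetical names
>>>>>>    snaphmm=my_genome.hmm #hypothetical SNAP HMM file
>>>>>>    gmhmm=es.mod #GeneMark model from gm_es.pl
>>>>>>    pred_gff=augustus.gff3,fgenesh.gff3 #hypothetical names
>>>>>>    rm_gff=repeatmasker.gff3 #hypothetical name
>>>>>>    est_gff=cufflinks.gff3 #hypothetical name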
>>>>>>
>>>>>>If Augustus isn't trained for your particular organism, then you can
>>>>>>use another organism that augustus is already trained for. The list
>>>>>>of species that augustus has parameter files for is in the README.txt
>>>>>>that came with Augustus. I really recommend that you run Augustus
>>>>>>from inside maker, because then you get all the benefits of maker
>>>>>>passing EST-based hints to augustus at runtime, which can really
>>>>>>improve Augustus' predictive ability.
>>>>>>
>>>>>>When you ran the augustus gene prediction separately, did you use
>>>>>>another organism's parameter file?
>>>>>>
>>>>>>Thanks,
>>>>>>Daniel
>>>>>>
>>>>>>
>>>>>>On Jul 16, 2014, at 11:15 AM, Nguyen, Anh-Dao (NIH/NHGRI) [C]
>>>>>><nguyenan at mail.nih.gov> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>> 
>>>>>>> I would like to conduct a genome annotation and have the following
>>>>>>> data:
>>>>>>> - Two separate RepeatMasker outputs (using -lib and -species
>>>>>>>   options)
>>>>>>> - ESTs and RACE (fasta)
>>>>>>> - proteins (fasta)
>>>>>>> - proteins of related organisms (fasta)
>>>>>>> - SNAP's .hmm file (ran CEGMA, then used cegma2zff.pl to convert to
>>>>>>>   ZFF format, etc.)
>>>>>>> - GeneMark's .hmm file (es.mod file from running gm_es.pl)
>>>>>>> - FGENESH++ and Augustus gene predictions. I wrote scripts to
>>>>>>>   convert the outputs to .gff3 files. I ran Augustus gene
>>>>>>>   prediction separately because the genome has never been trained
>>>>>>>   for Augustus.
>>>>>>> - Cufflinks and Trinity from RNA-Seq
>>>>>>> 
>>>>>>> Could you please let me know how I can specify parameters in the
>>>>>>> maker_opts.ctl file?
>>>>>>> Or do you have other suggestions to re-do the data listed above?
>>>>>>> 
>>>>>>> Thanks.
>>>>>>> Anh-Dao
>>>>>>> 
>>>>>>> _______________________________________________
>>>>>>> maker-devel mailing list
>>>>>>> maker-devel at box290.bluehost.com
>>>>>>> 
>>>>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>





