[maker-devel] Maker_opts.ctl
Carson Holt
carsonhh at gmail.com
Fri Jul 18 11:04:09 MDT 2014
It should just be 'fgenesh'. If it's not there you can still just give
the GFF3.
--Carson
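
The two alternatives in this reply can be sketched as maker_opts.ctl lines. This is only an illustration under stated assumptions: the paths and file names are hypothetical placeholders, and fgenesh_par_file= is, as far as I recall, the companion option in the MAKER 2.x control file, so check it against your own generated maker_opts.ctl.

```
# Option A: let MAKER run FGENESH itself.
# Point at the executable literally named 'fgenesh' (hypothetical path).
fgenesh=/path/to/fgenesh_suite/fgenesh
# Assumed companion option; the parameter file path is a placeholder.
fgenesh_par_file=/path/to/fgenesh_suite/my_organism.par

# Option B: no standalone 'fgenesh' executable available.
# Leave fgenesh= empty and pass the existing predictions as GFF3.
#fgenesh=
#pred_gff=fgenesh_predictions.gff3
```

Only one of the two options would normally be active in a given run.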
On 7/17/14, 8:19 AM, "Nguyen, Anh-Dao (NIH/NHGRI) [C]"
<nguyenan at mail.nih.gov> wrote:
>I am not sure which fgenesh executable file I should use.
>
>fgenesh= #location of fgenesh executable
>
>When I run FGENESH++, I need to run the run_pipe.pl script, which also
>requires specifying a list of other executable programs (such as ppd,
>ppdn+, etc.).
>
>Anh-Dao
>
>
>On 7/16/14 3:32 PM, "Carson Holt" <carsonhh at gmail.com> wrote:
>
>>'all' will use the whole of RepBase, or you can do 'metazoa' like your
>>previous run. Then provide the RepeatModeler file to rmlib=
>>
>>--Carson
>>
>>
>>
>>On 7/16/14, 1:28 PM, "Nguyen, Anh-Dao (NIH/NHGRI) [C]"
>><nguyenan at mail.nih.gov> wrote:
>>
>>>By default, model_org=all. Can I use the de novo repeat library predicted
>>>by RepeatModeler for the rmlib option?
>>>
>>>Anh-Dao
>>>
>>>
>>>
>>>On 7/16/14 3:17 PM, "Carson Holt" <carsonhh at gmail.com> wrote:
>>>
>>>>No. You can provide both to MAKER. The options are model_org= and rmlib=.
>>>>If you let MAKER handle repeat masking, it will differentiate repeat types
>>>>and use soft masking for some and hard masking for others. This increases
>>>>sensitivity of evidence alignments while still maintaining specificity.
>>>>
>>>>--Carson
>>>>
>>>>
>>>>
>>>>On 7/16/14, 1:07 PM, "Nguyen, Anh-Dao (NIH/NHGRI) [C]"
>>>><nguyenan at mail.nih.gov> wrote:
>>>>
>>>>>I will run Augustus and FGENESH++ inside of MAKER using the parameter
>>>>>files for Augustus.
>>>>>I could also run RepeatMasker inside of MAKER. However, I ran RM using two
>>>>>options: -lib (de novo) and -species (known). I got ~45% repeats via the
>>>>>de novo option and ~4% repeats via the known option. As I understand it, RM
>>>>>inside of MAKER uses only the RepBase repeat library and the RepeatRunner
>>>>>protein database.
>>>>>
>>>>>Anh-Dao
>>>>>
>>>>>
>>>>>On 7/16/14 2:36 PM, "Carson Holt" <carsonhh at gmail.com> wrote:
>>>>>
>>>>>>When you ran Augustus separately, it should have created the parameters
>>>>>>needed to run it. Now you should be able to run it inside of MAKER using
>>>>>>the species name you just created.
>>>>>>
>>>>>>I'd also recommend letting MAKER run RepeatMasker for you rather than
>>>>>>giving it the results as GFF3.
>>>>>>
>>>>>>--Carson
>>>>>>
>>>>>>
>>>>>>On 7/16/14, 12:30 PM, "Nguyen, Anh-Dao (NIH/NHGRI) [C]"
>>>>>><nguyenan at mail.nih.gov> wrote:
>>>>>>
>>>>>>>Thanks Daniel for your quick response.
>>>>>>>
>>>>>>>I did not use the parameter file of another organism when running
>>>>>>>Augustus. I created the parameter file for the genome following their
>>>>>>>instructions. There were multiple steps to train and run Augustus
>>>>>>>(creating gene structures for training AUGUSTUS with CEGMA, which
>>>>>>>creates the parameter file; creating hints for AUGUSTUS from ESTs/cDNA
>>>>>>>sequences; incorporating Illumina RNA-seq into AUGUSTUS with GSNAP, etc.).
>>>>>>>As I mentioned, I ran Augustus separately because Augustus had not been
>>>>>>>trained for that genome (no parameter file existed). Otherwise I would
>>>>>>>have run Augustus inside MAKER.
>>>>>>>
>>>>>>>You suggested using the rm_gff option to specify the RepeatMasker output
>>>>>>>(I will of course convert it to .gff3 files). Can I submit two RM .gff3
>>>>>>>files, separated by a comma?
>>>>>>>
>>>>>>>Anh-Dao
>>>>>>>
>>>>>>>
>>>>>>>On 7/16/14 2:13 PM, "Daniel Ence" <dence at genetics.utah.edu> wrote:
>>>>>>>
>>>>>>>>Hi Anh-Dao,
>>>>>>>>
>>>>>>>>In the maker_opts.ctl file, there are options for EST and protein
>>>>>>>>evidence. You'll put all of your FASTA EST files together in a
>>>>>>>>comma-separated list in the "est" option, and all of your FASTA protein
>>>>>>>>files in a comma-separated list for the "protein" option.
>>>>>>>>
>>>>>>>>You'll specify the SNAP and GeneMark files in their respective options
>>>>>>>>in the control file and pass the Augustus and FGENESH predictions in
>>>>>>>>the "pred_gff" option.
>>>>>>>>
>>>>>>>>If you have the RepeatMasker output in GFF3 format, you can give it to
>>>>>>>>MAKER with the "rm_gff" option.
>>>>>>>>
>>>>>>>>If you've converted the Cufflinks output to GFF3, you can give it to
>>>>>>>>MAKER with the "est_gff" option. I'm pretty sure Trinity only gives
>>>>>>>>FASTA output, so you would put that in the "est" option, along with all
>>>>>>>>the other EST FASTA files.
>>>>>>>>
>>>>>>>>If Augustus isn't trained for your particular organism, then you can use
>>>>>>>>another organism that Augustus is already trained for. The list of
>>>>>>>>species that Augustus has parameter files for is in the README.txt that
>>>>>>>>came with Augustus. I really recommend that you run Augustus from inside
>>>>>>>>MAKER, because then you get all the benefits of MAKER passing EST-based
>>>>>>>>hints to Augustus at runtime, which can really improve Augustus'
>>>>>>>>predictive ability.
>>>>>>>>
>>>>>>>>When you ran the Augustus gene prediction separately, did you use
>>>>>>>>another organism's parameter file?
>>>>>>>>
>>>>>>>>Thanks,
>>>>>>>>Daniel
>>>>>>>>
>>>>>>>>
>>>>>>>>On Jul 16, 2014, at 11:15 AM, Nguyen, Anh-Dao (NIH/NHGRI) [C]
>>>>>>>><nguyenan at mail.nih.gov> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I would like to conduct a genome annotation and have the following
>>>>>>>>> data:
>>>>>>>>> - Two separate RepeatMasker outputs (using the -lib and -species
>>>>>>>>>   options)
>>>>>>>>> - ESTs and RACE (fasta)
>>>>>>>>> - proteins (fasta)
>>>>>>>>> - proteins of related organisms (fasta)
>>>>>>>>> - SNAP's .hmm file (ran CEGMA, then used cegma2zff.pl to convert to
>>>>>>>>>   ZFF format, etc.)
>>>>>>>>> - GeneMark's .hmm file (es.mod file from running gm_es.pl)
>>>>>>>>> - FGENESH++ and Augustus gene predictions. I wrote scripts to convert
>>>>>>>>>   the outputs to .gff3 files. I ran the Augustus gene prediction
>>>>>>>>>   separately because the genome has never been trained for Augustus.
>>>>>>>>> - Cufflinks and Trinity from RNA-Seq
>>>>>>>>>
>>>>>>>>> Could you please let me know how I can specify parameters in the
>>>>>>>>> maker_opts.ctl file?
>>>>>>>>> Or do you have other suggestions for redoing the data listed above?
>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>> Anh-Dao
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> maker-devel mailing list
>>>>>>>>> maker-devel at box290.bluehost.com
>>>>>>>>>
>>>>>>>>>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>
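
Taken together, the advice in this thread might translate into a maker_opts.ctl along the following lines. This is a minimal sketch, not a tested configuration: every file name and the species name are hypothetical placeholders, and snaphmm=, gmhmm=, and augustus_species= are the option names as I recall them from the MAKER 2.x control file (the thread itself only names est, protein, est_gff, pred_gff, rm_gff, model_org, rmlib, and fgenesh).

```
#-----EST and protein evidence (comma-separated lists of FASTA files)
est=ests.fasta,race.fasta,trinity_assembly.fasta
protein=proteins.fasta,related_organism_proteins.fasta
# Cufflinks output, after conversion to GFF3
est_gff=cufflinks.gff3

#-----Repeat masking, run by MAKER rather than supplied via rm_gff=
# 'all' uses the whole of RepBase; a narrower choice such as 'metazoa' also works
model_org=all
# De novo repeat library produced by RepeatModeler
rmlib=repeatmodeler_library.fasta

#-----Gene prediction
# SNAP HMM trained from the CEGMA-derived ZFF file (assumed option name)
snaphmm=my_genome.hmm
# GeneMark-ES model file from gm_es.pl (assumed option name)
gmhmm=es.mod
# Species name created when Augustus was trained separately (assumed option name)
augustus_species=my_genome
# Existing FGENESH++ predictions converted to GFF3
pred_gff=fgenesh_predictions.gff3
```

Running Augustus through augustus_species= rather than passing its earlier output in pred_gff= follows the recommendation above to let MAKER supply hints at runtime.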