[maker-devel] Maker_opts.ctl
Nguyen, Anh-Dao (NIH/NHGRI) [C]
nguyenan at mail.nih.gov
Thu Jul 17 08:19:34 MDT 2014
I am not sure which fgenesh executable file I should use.
fgenesh= #location of fgenesh executable
When I run FGENESH++, I need to run the run_pipe.pl script, and of course
I also have to specify a list of other executable programs (such as ppd,
ppdn+, etc.).
Anh-Dao
On 7/16/14 3:32 PM, "Carson Holt" <carsonhh at gmail.com> wrote:
>'all' will use the whole of RepBase, or you can do 'metazoa' like your
>previous run. Then provide the RepeatModeler file to rmlib=
>
>--Carson
>
>
>
>On 7/16/14, 1:28 PM, "Nguyen, Anh-Dao (NIH/NHGRI) [C]"
><nguyenan at mail.nih.gov> wrote:
>
>>By default, model_org=all. Can I use the de novo repeat library predicted
>>by RepeatModeler for the rmlib option?
>>
>>Anh-Dao
>>
>>
>>
>>On 7/16/14 3:17 PM, "Carson Holt" <carsonhh at gmail.com> wrote:
>>
>>>No. You can provide both to MAKER. The options are model_org= and
>>>rmlib=. By letting MAKER handle repeat masking, it will differentiate
>>>repeat types and use soft masking for some and hard masking for others.
>>>This increases sensitivity of evidence alignments while still
>>>maintaining specificity.
>>>
>>>--Carson
>>>
>>>
>>>
>>>On 7/16/14, 1:07 PM, "Nguyen, Anh-Dao (NIH/NHGRI) [C]"
>>><nguyenan at mail.nih.gov> wrote:
>>>
>>>>I will run Augustus and FGENESH++ inside of MAKER using the parameter
>>>>files for Augustus.
>>>>I could also run RepeatMasker inside of MAKER. However, I ran RM using
>>>>two options: -lib (de novo) and -species (known). I got ~45% repeats
>>>>via the de novo option and ~4% repeats via the known option. As I
>>>>understand it, RM inside of MAKER uses only the RepBase repeat library
>>>>and the RepeatRunner protein database.
>>>>
>>>>Anh-Dao
>>>>
>>>>
>>>>On 7/16/14 2:36 PM, "Carson Holt" <carsonhh at gmail.com> wrote:
>>>>
>>>>>When you ran Augustus separately, it should have created the
>>>>>parameters needed to run it. Now you should be able to run it inside
>>>>>of MAKER using the species name you just created.
>>>>>
>>>>>I'd also recommend letting MAKER run RepeatMasker for you rather than
>>>>>giving it the results as GFF3.
>>>>>
>>>>>--Carson
>>>>>
>>>>>
>>>>>On 7/16/14, 12:30 PM, "Nguyen, Anh-Dao (NIH/NHGRI) [C]"
>>>>><nguyenan at mail.nih.gov> wrote:
>>>>>
>>>>>>Thanks Daniel for your quick response.
>>>>>>
>>>>>>I did not use the parameter file of another organism when running
>>>>>>Augustus. I created the parameter file for the genome following
>>>>>>their instructions. There were multiple steps to train and run
>>>>>>Augustus (creating gene structures for training AUGUSTUS with CEGMA,
>>>>>>which creates the parameter file; creating hints for AUGUSTUS from
>>>>>>EST/cDNA sequences; incorporating Illumina RNA-seq into AUGUSTUS
>>>>>>with GSNAP, etc.).
>>>>>>As I mentioned, I ran Augustus separately because Augustus has not
>>>>>>been trained for that genome (no parameter file exists). Otherwise I
>>>>>>would run Augustus inside MAKER.
>>>>>>
>>>>>>You suggested using the rm_gff option to specify the RepeatMasker
>>>>>>output (of course, I will convert it to .gff3 files). Can I submit
>>>>>>two RM .gff3 files, separated by a comma?
>>>>>>
>>>>>>Anh-Dao
>>>>>>
>>>>>>
>>>>>>On 7/16/14 2:13 PM, "Daniel Ence" <dence at genetics.utah.edu> wrote:
>>>>>>
>>>>>>>Hi Anh-Dao,
>>>>>>>
>>>>>>>In the maker_opts.ctl file, there are options for EST and protein
>>>>>>>evidence. You'll put all of your FASTA EST files together in a
>>>>>>>comma-separated list in the "est" option, and all of your FASTA
>>>>>>>protein files in a comma-separated list in the "protein" option.
>>>>>>>
>>>>>>>You'll specify the SNAP and GeneMark files in their respective
>>>>>>>options in the control file and pass the Augustus and FGENESH
>>>>>>>predictions in the "pred_gff" option.
>>>>>>>
>>>>>>>If you have the RepeatMasker output in GFF3 format, you can give it
>>>>>>>to MAKER with the "rm_gff" option.
>>>>>>>
>>>>>>>If you've converted the Cufflinks output to GFF3, you can give it to
>>>>>>>MAKER with the "est_gff" option. I'm pretty sure Trinity only gives
>>>>>>>FASTA output, so you would put that in the "est" option, along with
>>>>>>>all the other EST FASTA files.
>>>>>>>
>>>>>>>If Augustus isn't trained for your particular organism, then you can
>>>>>>>use another organism that Augustus is already trained for. The list
>>>>>>>of species that Augustus has parameter files for is in the
>>>>>>>README.txt that came with Augustus. I really recommend that you run
>>>>>>>Augustus from inside MAKER, because then you get all the benefits of
>>>>>>>MAKER passing EST-based hints to Augustus at runtime, which can
>>>>>>>really improve Augustus' predictive ability.
>>>>>>>
>>>>>>>When you ran the Augustus gene prediction separately, did you use
>>>>>>>another organism's parameter file?
>>>>>>>
>>>>>>>Thanks,
>>>>>>>Daniel
>>>>>>>
>>>>>>>
>>>>>>>On Jul 16, 2014, at 11:15 AM, Nguyen, Anh-Dao (NIH/NHGRI) [C]
>>>>>>><nguyenan at mail.nih.gov> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I would like to conduct a genome annotation and have the following
>>>>>>>> data:
>>>>>>>> - Two separate RepeatMasker outputs (using the -lib and -species
>>>>>>>>   options)
>>>>>>>> - ESTs and RACE (fasta)
>>>>>>>> - proteins (fasta)
>>>>>>>> - proteins of related organisms (fasta)
>>>>>>>> - SNAP's .hmm file (ran CEGMA, then used cegma2zff.pl to convert
>>>>>>>>   to ZFF format, etc.)
>>>>>>>> - GeneMark's .hmm file (es.mod file from running gm_es.pl)
>>>>>>>> - FGENESH++ and Augustus gene predictions. I wrote scripts to
>>>>>>>>   convert the outputs to .gff3 files. I ran the Augustus gene
>>>>>>>>   prediction separately because Augustus has never been trained
>>>>>>>>   for this genome.
>>>>>>>> - Cufflinks and Trinity output from RNA-Seq
>>>>>>>>
>>>>>>>> Could you please let me know how I can specify these parameters in
>>>>>>>> the maker_opts.ctl file?
>>>>>>>> Or do you have other suggestions for redoing the data listed above?
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>> Anh-Dao
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> maker-devel mailing list
>>>>>>>> maker-devel at box290.bluehost.com
>>>>>>>>
>>>>>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
>
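The control-file settings discussed in this thread could be put together roughly as follows. This is only a sketch under stated assumptions, not a tested configuration: the option names (est, est_gff, protein, model_org, rmlib, snaphmm, gmhmm, augustus_species, pred_gff) are the ones named in the thread or found in a standard maker_opts.ctl, while every file name and the species name "my_species" are hypothetical placeholders. It follows Carson's advice to let MAKER run RepeatMasker and Augustus itself rather than passing their results in as GFF3.

```
#-----Genome (hypothetical file name)
genome=genome.fasta

#-----EST Evidence
est=ests.fasta,race.fasta,trinity.fasta #comma-separated FASTA files
est_gff=cufflinks.gff3 #Cufflinks output converted to GFF3

#-----Protein Homology Evidence
protein=proteins.fasta,related_proteins.fasta #comma-separated FASTA files

#-----Repeat Masking (MAKER runs RepeatMasker itself)
model_org=all #whole of RepBase, or a narrower group
rmlib=repeatmodeler_library.fasta #de novo library from RepeatModeler
rm_gff= #left empty when MAKER does the masking

#-----Gene Prediction
snaphmm=snap.hmm #SNAP HMM built from the CEGMA-derived ZFF
gmhmm=es.mod #GeneMark-ES model from gm_es.pl
augustus_species=my_species #species name created during Augustus training
pred_gff=fgenesh.gff3 #external FGENESH++ predictions, if not run inside MAKER
```

If FGENESH is instead run inside MAKER, the pred_gff line would be left empty; which fgenesh executable to point MAKER at is the open question in the most recent message of this thread.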