[maker-devel] Maker_opts.ctl
Nguyen, Anh-Dao (NIH/NHGRI) [C]
nguyenan at mail.nih.gov
Wed Jul 16 13:07:45 MDT 2014
I will run Augustus and FGENESH++ inside of MAKER, using the parameter
files I created for Augustus.
I could also run RepeatMasker inside of MAKER. However, I ran RM with two
options: -lib (de novo library) and -species (known repeats). I got ~45%
repeats with the de novo library and ~4% repeats with the known species.
As I understand it, RM inside of MAKER uses only the RepBase repeat
library and the RepeatRunner protein database.
Anh-Dao
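
A minimal sketch of the maker_opts.ctl entries this message touches on,
assuming Augustus training produced a parameter set and the de novo
repeats are available as a FASTA library. The option names (model_org,
rmlib, repeat_protein, augustus_species) are taken from the standard
maker_opts.ctl template; every file name and the species name are
hypothetical placeholders, not values from this thread.

```
#-----Repeat Masking
model_org=all #RepBase species/clade passed to RepeatMasker
rmlib=denovo_repeats.fa #hypothetical file: organism-specific (de novo) repeat library in FASTA
repeat_protein=te_proteins.fasta #hypothetical path: transposable-element protein file for RepeatRunner

#-----Gene Prediction
augustus_species=my_species #hypothetical name: the Augustus parameter set created during training
```

With rmlib set, the de novo library would be used alongside the RepBase
masking rather than instead of it, so the two separate RM runs would not
need to be supplied as GFF3.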
On 7/16/14 2:36 PM, "Carson Holt" <carsonhh at gmail.com> wrote:
>When you ran Augustus separately, it should have created the parameters
>needed to run it. Now you should be able to run it inside of MAKER using
>the species name you just created.
>
>I'd also recommend letting MAKER run RepeatMasker for you rather than
>giving it the results as GFF3.
>
>--Carson
>
>
>On 7/16/14, 12:30 PM, "Nguyen, Anh-Dao (NIH/NHGRI) [C]"
><nguyenan at mail.nih.gov> wrote:
>
>>Thanks Daniel for your quick response.
>>
>>I did not use another organism's parameter file when running Augustus.
>>I created the parameter file for the genome following their instructions.
>>There were multiple steps to train and run Augustus (creating gene
>>structures for training AUGUSTUS with CEGMA, which creates the parameter
>>file; creating hints for AUGUSTUS from EST/cDNA sequences;
>>incorporating Illumina RNA-seq into AUGUSTUS with GSNAP, etc.).
>>As I mentioned, I ran Augustus separately because Augustus had not been
>>trained on this genome (no parameter file existed). Otherwise I would
>>have run Augustus inside MAKER.
>>
>>You suggested using the rm_gff option to specify the RepeatMasker output
>>(I will convert it to .gff3 formatted files). Can I submit two RM .gff3
>>files, separated by a comma?
>>
>>Anh-Dao
>>
>>
>>On 7/16/14 2:13 PM, "Daniel Ence" <dence at genetics.utah.edu> wrote:
>>
>>>Hi Anh-Dao,
>>>
>>>In the maker_opts.ctl file, there are options for EST and protein
>>>evidence. You'll put all of your FASTA EST files together in a
>>>comma-separated list in the "est" option, and all of your FASTA protein
>>>files in a comma-separated list in the "protein" option.
>>>
>>>You'll specify the SNAP and GeneMark files in their respective options
>>>in the control file and pass the Augustus and FGENESH predictions in
>>>the "pred_gff" option.
>>>
>>>If you have the RepeatMasker output in GFF3 format, you can give it to
>>>MAKER with the "rm_gff" option.
>>>
>>>If you've converted the Cufflinks output to GFF3, you can give it to
>>>MAKER with the "est_gff" option. I'm pretty sure Trinity only gives
>>>FASTA output, so you would put that in the "est" option, along with
>>>all the other EST FASTA files.
>>>
>>>If Augustus isn't trained for your particular organism, then you can use
>>>another organism that Augustus is already trained for. The list of
>>>species that Augustus has parameter files for is in the README.txt that
>>>came with Augustus. I really recommend that you run Augustus from inside
>>>MAKER, because then you get all the benefits of MAKER passing EST-based
>>>hints to Augustus at runtime, which can really improve Augustus'
>>>predictive ability.
>>>
>>>When you ran the Augustus gene prediction separately, did you use
>>>another organism's parameter file?
>>>
>>>Thanks,
>>>Daniel
>>>
>>>
>>>On Jul 16, 2014, at 11:15 AM, Nguyen, Anh-Dao (NIH/NHGRI) [C]
>>><nguyenan at mail.nih.gov> wrote:
>>>
>>>> Hi,
>>>>
>>>> I would like to conduct a genome annotation and have the following
>>>> data:
>>>> - Two separate RepeatMasker outputs (using -lib and -species options)
>>>> - ESTs and RACE (fasta)
>>>> - proteins (fasta)
>>>> - proteins of related organisms (fasta)
>>>> - SNAP's .hmm file (ran CEGMA, then used cegma2zff.pl to convert to
>>>>   ZFF format, etc.)
>>>> - GeneMark's .hmm file (es.mod file from running gm_es.pl)
>>>> - FGENESH++ and Augustus gene predictions. I wrote scripts to convert
>>>>   the outputs to .gff3 files. I ran the Augustus gene prediction
>>>>   separately because Augustus has never been trained on this genome.
>>>> - Cufflinks and Trinity from RNA-Seq
>>>>
>>>> Could you please let me know how I can specify these in the
>>>> maker_opts.ctl file?
>>>> Or do you have other suggestions for redoing any of the data listed
>>>> above?
>>>>
>>>> Thanks.
>>>> Anh-Dao
>>>>
>>>> _______________________________________________
>>>> maker-devel mailing list
>>>> maker-devel at box290.bluehost.com
>>>>
>>>>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>>
>>
>>
>
>
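
The option layout discussed in this thread can be sketched as a
maker_opts.ctl fragment. This is only an illustration: the option names
(est, est_gff, protein, snaphmm, gmhmm, pred_gff, rm_gff) are the ones
named in the replies or found in the standard control-file template, but
every file name except es.mod is a hypothetical placeholder. Whether
rm_gff accepts a comma-separated pair of files is not answered in the
thread, so a single file is shown there.

```
#-----EST Evidence
est=ests.fasta,race.fasta,trinity.fasta #hypothetical files: ESTs, RACE and Trinity transcripts, comma-separated
est_gff=cufflinks.gff3 #hypothetical file: Cufflinks output converted to GFF3

#-----Protein Homology Evidence
protein=proteins.fasta,related_proteins.fasta #hypothetical files, comma-separated

#-----Repeat Masking
rm_gff=repeatmasker.gff3 #hypothetical file: pre-computed RepeatMasker output in GFF3

#-----Gene Prediction
snaphmm=my_species.hmm #hypothetical file: SNAP HMM built from the CEGMA/ZFF training
gmhmm=es.mod #GeneMark model produced by gm_es.pl
pred_gff=augustus.gff3,fgenesh.gff3 #hypothetical files: external predictions converted to GFF3
```

If Augustus is run inside MAKER instead, as recommended above, the
Augustus file would be dropped from pred_gff and the trained species name
set in the control file's Augustus option.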