[maker-devel] Maker_opts.ctl
Daniel Ence
dence at genetics.utah.edu
Wed Jul 16 12:42:16 MDT 2014
Hi Anh-Dao, as I understand it, training and running Augustus creates a set of “param” files that Augustus can use later on. If that’s true, you can copy those files to the “config/species” folder of your Augustus installation, and Augustus (when it is called from inside MAKER) can then use those parameters when it runs.
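The copy step could be sketched in Python as below. This is only an illustration under some assumptions: that training left one directory of parameter files named after the species, that the AUGUSTUS_CONFIG_PATH environment variable points at the Augustus config folder, and that the function name install_augustus_species is made up for this example.

```python
import os
import shutil
from pathlib import Path


def install_augustus_species(trained_dir, config_path=None):
    """Copy a trained species directory into <config>/species/<name>.

    trained_dir: directory holding the parameter files from training
                 (assumed to be named after the species).
    config_path: Augustus config folder; falls back to the
                 AUGUSTUS_CONFIG_PATH environment variable.
    """
    config_root = Path(config_path or os.environ["AUGUSTUS_CONFIG_PATH"])
    src = Path(trained_dir)
    dest = config_root / "species" / src.name
    # copytree raises FileExistsError if dest already exists, so an
    # existing species directory is never silently overwritten.
    shutil.copytree(src, dest)
    return dest
```

After a call such as install_augustus_species("my_species"), the directory name would be the value to try in the augustus_species option of maker_opts.ctl.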
Did you end up with a gff3 file or with files like “exon_prob”, “utr_probs” from augustus? Or did you have both?
I’m pretty sure that you can’t use a comma-separated list for the rm_gff option. You could concatenate the two files and pass the single file to MAKER, but you might also need to sort it by genomic location. Carson could confirm that for me.
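A minimal sketch of the concatenate-and-sort idea, assuming both inputs are valid nine-column GFF3 files. The function name merge_gff3 is hypothetical, and whether MAKER actually requires sorted input is not confirmed here. Feature IDs that collide between the two files are not handled.

```python
def merge_gff3(paths, out_path):
    """Concatenate GFF3 files and sort features by (seqid, start)."""
    features = []
    for path in paths:
        with open(path) as handle:
            for line in handle:
                if line.startswith("##FASTA"):
                    break  # ignore any embedded sequence section
                if not line.strip() or line.startswith("#"):
                    continue  # skip blank lines, comments, directives
                cols = line.rstrip("\n").split("\t")
                if len(cols) != 9:
                    continue  # not a feature line
                features.append((cols[0], int(cols[3]), line.rstrip("\n")))
    # list.sort is stable, so features with the same seqid and start
    # keep their input order (a parent stays ahead of its children).
    features.sort(key=lambda f: (f[0], f[1]))
    with open(out_path, "w") as out:
        out.write("##gff-version 3\n")
        for _seqid, _start, line in features:
            out.write(line + "\n")
    return len(features)
```

For example, merge_gff3(["rm_lib.gff3", "rm_species.gff3"], "rm_all.gff3") would write one file that can be given to rm_gff; the three file names are placeholders.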
~Daniel
On Jul 16, 2014, at 12:30 PM, Nguyen, Anh-Dao (NIH/NHGRI) [C] <nguyenan at mail.nih.gov> wrote:
> Thanks Daniel for your quick response.
>
> I did not use the parameter file of another organism when running Augustus.
> I created the parameter file for the genome following their instructions.
> There were multiple steps to train and run Augustus (Creating gene
> structures for training AUGUSTUS with CEGMA => parameter file will be
> created; Creating Hints for AUGUSTUS from ESTs/cDNA sequences;
> Incorporating Illumina RNAseq into AUGUSTUS with GSNAP, etc.)
> As I mentioned, I ran Augustus separately because Augustus has not been
> trained for that genome (no parameter file exists). Otherwise I would
> run Augustus inside MAKER.
>
> You suggested using the rm_gff option to specify the RepeatMasker output
> (I will convert it to .gff3 format). Can I submit two RM .gff3 files,
> separated by a comma?
>
> Anh-Dao
>
>
> On 7/16/14 2:13 PM, "Daniel Ence" <dence at genetics.utah.edu> wrote:
>
>> Hi Anh-Dao,
>>
>> In the maker_opts.ctl file, there are options for est and protein
>> evidence. You’ll put all of your fasta est files together in a
>> comma-separated list in the “est” option, and all of your fasta protein
>> files in a comma-separated list in the “protein” option.
>>
>> You’ll specify the SNAP and Genemark files in their respective options in
>> the control file and pass the augustus and fgenesh predictions in the
>> “pred_gff” option.
>>
>> If you have the RepeatMasker output in gff3 format you can give it to
>> maker with the “rm_gff” option.
>>
>> If you’ve converted the cufflinks output to gff3, you can give it to
>> maker with the “est_gff” option. I’m pretty sure Trinity only gives fasta
>> output, so you would put that in the “est” option, along with all the
>> other est fasta files.
>>
>> If Augustus isn’t trained for your particular organism, then you can use
>> another organism that augustus is already trained for. The list of
>> species that augustus has parameter files for is in the README.txt that
>> came with Augustus. I really recommend that you run Augustus from inside
>> maker, because then you get all the benefits of maker passing EST-based
>> hints to augustus at runtime, which can really improve Augustus’
>> predictive ability.
>>
>> When you ran the augustus gene prediction separately, did you use another
>> organism’s parameter file?
>>
>> Thanks,
>> Daniel
>>
>>
>> On Jul 16, 2014, at 11:15 AM, Nguyen, Anh-Dao (NIH/NHGRI) [C]
>> <nguyenan at mail.nih.gov> wrote:
>>
>>> Hi,
>>>
>>> I would like to conduct a genome annotation and have the following data:
>>> - Two separate RepeatMasker outputs (using -lib and -species options)
>>> - ESTs and RACE (fasta)
>>> - proteins (fasta)
>>> - proteins of related organisms (fasta)
>>> - SNAP's .hmm file (ran CEGMA, then used cegma2zff.pl to convert to ZFF
>>> format, etc. )
>>> - GeneMark's .hmm file (es.mod file from running gm_es.pl)
>>> - FGENESH++ and Augustus gene predictions. I wrote scripts to convert
>>> the outputs to .gff3 files. I ran the Augustus gene prediction
>>> separately because Augustus has never been trained for this genome.
>>> - Cufflinks and Trinity from RNA-Seq
>>>
>>> Could you please let me know how I can specify these inputs in the
>>> maker_opts.ctl file?
>>> Or do you have other suggestions for re-doing the data listed above?
>>>
>>> Thanks.
>>> Anh-Dao