[maker-devel] Training Augustus

Muriel Gros-Balthazard muriel.grosb at gmail.com
Tue Oct 28 04:04:26 MDT 2014


Hello!

I want to train Augustus for a non-model organism and I have several 
questions about it.

I planned to follow the section "Training ab initio Gene predictors".

First, I need to generate gene models using EST data.
However, how many sequences are necessary?
My genome is 476 Mb and I have millions of RNA-seq sequences, but the run 
takes ages if I use all of them.
I tried with 1000 sequences, which takes 30 min, but is that enough? Or 
should I use more?
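
To make trial runs with different subset sizes easy to repeat, here is a minimal sketch of drawing a random subset of sequences from a FASTA file in a single pass (reservoir sampling). It assumes plain FASTA input; `read_fasta` and `subsample` are hypothetical helper names, not part of MAKER, and the sketch does not answer how many sequences are enough.

```python
import random


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def subsample(records, k, seed=0):
    """Reservoir sampling: keep k records chosen uniformly at random."""
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    reservoir = []
    for i, rec in enumerate(records):
        if i < k:
            reservoir.append(rec)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = rec
    return reservoir
```

For example, `subsample(read_fasta(open("EST.fasta")), 10000)` would keep 10000 records without loading the whole file into memory.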

Secondly, the run produces many GFF files. Should we concatenate them?
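
To illustrate what concatenating would involve, here is a minimal sketch that merges several GFF3 texts into one. It assumes each file is GFF3 and may end with an embedded `##FASTA` section, which is dropped. `merge_gff` is a hypothetical helper; MAKER also ships a `gff3_merge` script for this purpose, which is probably the better choice in practice.

```python
def merge_gff(texts):
    """Merge several GFF3 documents (given as strings) into one string."""
    out = ["##gff-version 3"]  # emit the version header exactly once
    for text in texts:
        for line in text.splitlines():
            if line.startswith("##FASTA"):
                break  # skip the embedded sequence section of this file
            if not line.strip() or line.startswith("#"):
                continue  # skip blank lines, comments and per-file headers
            out.append(line)
    return "\n".join(out) + "\n"
```

Feature lines are kept in the order the files are given, so the result depends on the order of `texts`.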

What should I do after that? The MAKER help explains the procedure for SNAP, but 
I want to use Augustus.
I found a script called |autoAug.pl| to train Augustus.
What do you think of it?

Should I use it this way?

|autoAug.pl --singleCPU --useexisting --genome=mygenome.fasta 
--species=myspeciesname --cdna=EST.fasta --trainingset=genome.gff3|


where EST.fasta is the file I used earlier to generate the gene models 
and genome.gff3 is the resulting gene model file.
However, I don't think I obtained a GFF3 file from the first MAKER run.
So should I generate GFF3 from GFF?
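
One quick way to see whether a .gff file is already in GFF3 format is sketched below. It assumes the GFF3 convention that the first line is `##gff-version 3` and that feature lines have nine tab-separated columns; it is a rough check, not a full validator. `looks_like_gff3` is a hypothetical helper name.

```python
def looks_like_gff3(text):
    """Return True if text declares GFF3 and its feature lines have 9 columns."""
    lines = text.splitlines()
    if not lines or not lines[0].startswith("##gff-version 3"):
        return False
    for line in lines[1:]:
        if line.startswith("##FASTA"):
            break  # anything after this is sequence data, not features
        if not line.strip() or line.startswith("#"):
            continue  # blank lines, comments and directives
        if len(line.split("\t")) != 9:
            return False
    return True
```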

Thanks a lot for your help,

Muriel

