[maker-devel] How to explain the maker results?

Wed May 3 10:10:48 MDT 2017

Use the merged GFF3 (your option 1.2) to train SNAP; otherwise you won’t have enough gene models.
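
To see why a single contig is rarely enough, it can help to count the gene models per contig and in total. The following is only a minimal sketch: the function name and the example lines are made up for illustration, and it assumes a standard tab-separated GFF3 in which gene models have the feature type "gene" in column 3.

```python
from collections import Counter

def count_gene_models(gff3_lines):
    """Count 'gene' features per sequence ID in an iterable of GFF3 lines."""
    counts = Counter()
    for line in gff3_lines:
        if line.startswith("##FASTA"):
            break  # a GFF3 file may end with an embedded FASTA section
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 3 and cols[2] == "gene":
            counts[cols[0]] += 1
    return counts

# Hypothetical example lines, not real MAKER output.
example = [
    "##gff-version 3\n",
    "contig1\tmaker\tgene\t100\t900\t.\t+\t.\tID=g1\n",
    "contig1\tmaker\tmRNA\t100\t900\t.\t+\t.\tID=g1-mRNA-1;Parent=g1\n",
    "contig2\tmaker\tgene\t50\t700\t.\t-\t.\tID=g2\n",
]
per_contig = count_gene_models(example)
total = sum(per_contig.values())
print(dict(per_contig), total)
```

On a real assembly the per-contig counts are typically small compared with the merged total, which is the reason to train on the merged file.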

Info on training can be found on the wiki: http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/MAKER_Tutorial_for_GMOD_Online_Training_2014#Training_ab_initio_Gene_Predictors
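
As a rough outline of one SNAP training round as I understand it from that tutorial, the sketch below only builds and prints the commands; it does not run anything. The helper name and the file names passed to it are hypothetical, and the exact options should be checked against the wiki page before use.

```python
def snap_training_commands(index_log, merged_gff, hmm_name):
    """Return, in order, the shell commands for one round of SNAP training.

    All arguments are illustrative placeholders chosen by the caller.
    """
    return [
        # Merge the per-contig GFF3 files listed in the datastore index.
        f"gff3_merge -s -d {index_log} > {merged_gff}",
        # Convert the merged annotations to ZFF (writes genome.ann and genome.dna).
        f"maker2zff {merged_gff}",
        # Categorize the models and export the unique ones with flanking sequence.
        "fathom -categorize 1000 genome.ann genome.dna",
        "fathom -export 1000 -plus uni.ann uni.dna",
        # Estimate parameters, then assemble them into a single HMM file.
        "forge export.ann export.dna",
        f"hmm-assembler.pl {hmm_name} . > {hmm_name}.hmm",
    ]

# Hypothetical file names for illustration only.
for cmd in snap_training_commands("pyu_master_datastore_index.log", "pyu.all.gff", "pyu"):
    print(cmd)
```

The resulting .hmm file is what you would then give to MAKER for the next run over all contigs.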

You can also find additional detailed info by searching the mailing list archives: http://groups.google.com/group/maker-devel

I’m not sure what you are asking in the last question. Alignment is not a function of training and will not be affected by the HMM. That said, 100% coverage and identity is too strict a threshold, even for data derived from the same species.
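
To illustrate what a relaxed filter might look like, here is a minimal sketch. The cutoffs of 95% identity and 90% query coverage are arbitrary placeholders, not recommended values, and the function and its parameters are hypothetical; it assumes you already have the percent identity, the aligned query coordinates and the query length for each hit.

```python
def passes_threshold(pident, qstart, qend, qlen,
                     min_identity=95.0, min_coverage=90.0):
    """Return True if a hit meets both the identity and query-coverage cutoffs.

    pident is percent identity (0-100); qstart and qend are 1-based
    coordinates of the aligned region on the query; qlen is the query length.
    The default cutoffs are illustrative assumptions only.
    """
    coverage = 100.0 * (abs(qend - qstart) + 1) / qlen
    return pident >= min_identity and coverage >= min_coverage

# A hit covering 290 of 300 residues at 98.5% identity.
print(passes_threshold(98.5, 1, 290, 300))
# The same hit judged against a strict 100%/100% requirement.
print(passes_threshold(98.5, 1, 290, 300, min_identity=100.0, min_coverage=100.0))
```

With the relaxed defaults the example hit is kept, while the 100% requirement rejects it even though it is a near-perfect match.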

—Carson

> On May 3, 2017, at 9:29 AM, dcg at cau.edu.cn wrote:
> 
> Dear sir:
>     I've been using MAKER to annotate my genome. However, there are still some things I don't understand:
> 
>     1. After assembly, I have many contigs. First, I set est2genome=1 and protein2genome=1, with my proteins, ESTs and RNA-seq data. Which of the approaches below is correct?
>     1.1 Each contig has its own GFF. I use each contig's own maker_gff file to build a pyu.hmm (for use with SNAP), and then train that single contig.
>     1.2 I merge all the maker_gff files to build one pyu.hmm (for SNAP), and then use this pyu.hmm to train all the contigs.
>     
>     2. The aim of my project is to find new proteins, so I need to guarantee the rigor of my annotation.
>         My plan is that each predicted protein should align to UniProt (reviewed proteins, about 30K in total) with 100% identity and coverage.
>         However, if I choose method 1.2 above:
>         After the first step (est2genome=1 and protein2genome=1), about 1600 proteins align to UniProt at 100%. After 2 rounds of training (est2genome=0 and protein2genome=0), fewer proteins align at 100%.
>         Is my test method reasonable? Why don't the final results contain more well-aligned proteins?
>         After training and fasta_merge, the results include index_all.log.all.maker.proteins.fasta, index_all.log.all.maker.snap_masked.proteins.fasta and index_all.log.all.maker.non_overlapping_ab_initio.proteins.fasta. Which of these is the final result?
> 
>     
>      I'm looking forward to hearing from you. Thanks!
> Yours sincerely!
> 
>      
> Chao Chao
> dcg at cau.edu.cn
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org