[maker-devel] Differences in non_overlapping protein file between runs

YannDussert dussert.yann at gmail.com
Fri Mar 10 03:53:36 MST 2017


Hi,

Thank you for your answer. To get my GFF with ab-initio predictions, I 
just took the corresponding lines from the MAKER GFF of the previous round.

I can't see any problem with it, it looks like this:

Plvit001	augustus_masked	match	66626	70338	0.85	+	.	ID=Plvit001:hit:12095:4.5.0.0;Name=augustus_masked-Plvit001-abinit-gene-0.7-mRNA-1
Plvit001	augustus_masked	match_part	66626	67586	0.85	+	.	ID=Plvit001:hsp:27621:4.5.0.0;Parent=Plvit001:hit:12095:4.5.0.0;Target=augustus_masked-Plvit001-abinit-gene-0.7-mRNA-1 1 961 +;Gap=M961
Plvit001	augustus	match	66626	70338	1	+	.	ID=Plvit001:hit:12088:4.5.0.0;Name=augustus-Plvit001-abinit-gene-0.0-mRNA-1
Plvit001	augustus	match_part	66626	70096	1	+	.	ID=Plvit001:hsp:27610:4.5.0.0;Parent=Plvit001:hit:12088:4.5.0.0;Target=augustus-Plvit001-abinit-gene-0.0-mRNA-1 1 3471 +;Gap=M3471
Plvit001	augustus_masked	match_part	68166	68486	0.85	+	.	ID=Plvit001:hsp:27622:4.5.0.0;Parent=Plvit001:hit:12095:4.5.0.0;Target=augustus_masked-Plvit001-abinit-gene-0.7-mRNA-1 962 1282 +;Gap=M321
Plvit001	augustus_masked	match_part	69504	70096	0.85	+	.	ID=Plvit001:hsp:27623:4.5.0.0;Parent=Plvit001:hit:12095:4.5.0.0;Target=augustus_masked-Plvit001-abinit-gene-0.7-mRNA-1 1283 1875 +;Gap=M593
Plvit001	augustus_masked	match_part	70174	70338	0.85	+	.	ID=Plvit001:hsp:27624:4.5.0.0;Parent=Plvit001:hit:12095:4.5.0.0;Target=augustus_masked-Plvit001-abinit-gene-0.7-mRNA-1 1876 2040 +;Gap=M165
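
For reference, the extraction can be sketched roughly as below. This is only an illustration, not the exact command I ran: the function names are hypothetical, and the default source set lists only the two sources visible in the lines above (the names MAKER writes for other predictors such as snap or genemark would have to be added after checking column 2 of your own file). It keeps the feature lines whose source column matches, then reports any match_part whose Parent is missing from the extracted set, which is one simple way a hand-filtered file could end up broken.

```python
# Assumption: ab-initio predictions are recognised by GFF3 column 2 (source).
# Only the two sources seen in the excerpt are listed; extend as needed.
AB_INITIO_SOURCES = {"augustus", "augustus_masked"}


def filter_ab_initio(lines, sources=AB_INITIO_SOURCES):
    """Yield GFF3 feature lines whose source column is in `sources`."""
    for line in lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence section: no features after this point
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) == 9 and cols[1] in sources:
            yield line.rstrip("\n")


def orphan_parts(feature_lines):
    """Return IDs of match_part features whose Parent has no match line."""
    match_ids = set()
    parts = []
    for line in feature_lines:
        cols = line.split("\t")
        # Column 9 holds ';'-separated key=value attributes.
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        if cols[2] == "match":
            match_ids.add(attrs.get("ID"))
        elif cols[2] == "match_part":
            parts.append((attrs.get("ID"), attrs.get("Parent")))
    return [part_id for part_id, parent in parts if parent not in match_ids]
```

Typical use would be to open the previous round's GFF, pass the file handle to filter_ab_initio, write the kept lines to the file given to pred_gff, and check that orphan_parts returns an empty list.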


Best regards,

Yann

On 09/03/2017 18:52, Carson Holt wrote:
> My guess is that there is an issue with the GFF3 file you supplied, so its features are not overlapping anything.
>
> —Carson
>
>
>> On Mar 6, 2017, at 9:51 AM, YannDussert <dussert.yann at gmail.com> wrote:
>>
>> Hello,
>>
>> First, thank you for developing MAKER, this is a great annotation tool!
>>
>> I am trying to annotate the genome of a biotrophic oomycete with MAKER. After reading multiple posts on this list, I first used RNA-seq data and a protein set from other oomycetes to create a first training set. I then used augustus, snap (both trained with models from the first round) and genemark for ab-initio gene prediction during a second round (masked and unmasked genome). I ran MAKER with the following options: single_exon=1, split_hit=5000, correct_est_fusion=1.
>>
>> After the second round, I had only around 11000 annotated genes (96% completeness with Busco V2), whereas I'm expecting between 13000 and 17000 genes (numbers from other annotated oomycetes). There were only around 1500 genes in the non_overlapping protein file. After looking at the annotation in a genome browser, one of the problems was apparently gene fusions due to bad protein evidence. Following the advice in another post, I tried running MAKER by passing the ab-initio predictions with pred_gff, to avoid using bad protein hints for the gene predictors. I still have around 11000 annotated genes, but now there are 10000 genes in the non_overlapping protein file. Why this difference? I thought that this file included gene predictions not supported by any evidence. Did I miss something?
>>
>> Thank you in advance for your answer.
>>
>> Best regards,
>> Yann