[maker-devel] Differences in non_overlapping protein file between runs
YannDussert
dussert.yann at gmail.com
Fri Mar 10 03:53:36 MST 2017
Hi,
Thank you for your answer. To get my GFF with the ab-initio predictions, I
just took the corresponding lines from the MAKER GFF of the previous round.
I can't see any problem with it; it looks like this:
Plvit001 augustus_masked match 66626 70338 0.85 + . ID=Plvit001:hit:12095:4.5.0.0;Name=augustus_masked-Plvit001-abinit-gene-0.7-mRNA-1
Plvit001 augustus_masked match_part 66626 67586 0.85 + . ID=Plvit001:hsp:27621:4.5.0.0;Parent=Plvit001:hit:12095:4.5.0.0;Target=augustus_masked-Plvit001-abinit-gene-0.7-mRNA-1 1 961 +;Gap=M961
Plvit001 augustus match 66626 70338 1 + . ID=Plvit001:hit:12088:4.5.0.0;Name=augustus-Plvit001-abinit-gene-0.0-mRNA-1
Plvit001 augustus match_part 66626 70096 1 + . ID=Plvit001:hsp:27610:4.5.0.0;Parent=Plvit001:hit:12088:4.5.0.0;Target=augustus-Plvit001-abinit-gene-0.0-mRNA-1 1 3471 +;Gap=M3471
Plvit001 augustus_masked match_part 68166 68486 0.85 + . ID=Plvit001:hsp:27622:4.5.0.0;Parent=Plvit001:hit:12095:4.5.0.0;Target=augustus_masked-Plvit001-abinit-gene-0.7-mRNA-1 962 1282 +;Gap=M321
Plvit001 augustus_masked match_part 69504 70096 0.85 + . ID=Plvit001:hsp:27623:4.5.0.0;Parent=Plvit001:hit:12095:4.5.0.0;Target=augustus_masked-Plvit001-abinit-gene-0.7-mRNA-1 1283 1875 +;Gap=M593
Plvit001 augustus_masked match_part 70174 70338 0.85 + . ID=Plvit001:hsp:27624:4.5.0.0;Parent=Plvit001:hit:12095:4.5.0.0;Target=augustus_masked-Plvit001-abinit-gene-0.7-mRNA-1 1876 2040 +;Gap=M165
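For anyone who wants to reproduce this extraction step, here is a minimal Python sketch of one way to pull prediction lines out of a MAKER GFF3 and sanity-check them. It is not the exact command used above. The function names, the default source list (only the two sources visible in the excerpt) and the `maker` gene line in the sample are assumptions made for illustration. It also assumes the file is tab-separated with nine columns, as GFF3 requires.

```python
def extract_prediction_lines(gff_lines, sources=("augustus", "augustus_masked")):
    """Return the GFF3 feature lines whose source (column 2) is in `sources`."""
    kept = []
    for line in gff_lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # a GFF3 file can end with an embedded FASTA section
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a well-formed 9-column feature line
        if cols[1] in sources:
            kept.append(line)
    return kept


def orphan_match_parts(feature_lines):
    """Return IDs of match_part features whose Parent ID is not in the lines."""
    ids = set()
    parts = []
    for line in feature_lines:
        cols = line.split("\t")
        # Column 9 holds semicolon-separated key=value attributes.
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        if "ID" in attrs:
            ids.add(attrs["ID"])
        if cols[2] == "match_part":
            parts.append((attrs.get("ID"), attrs.get("Parent")))
    return [fid for fid, parent in parts if parent not in ids]


# Sample records: the augustus lines come from the excerpt above; the "maker"
# gene line is a made-up placeholder to show that other sources are dropped.
sample = [
    "##gff-version 3",
    "\t".join(["Plvit001", "maker", "gene", "66626", "70338", ".", "+", ".",
               "ID=hypothetical-gene-1"]),
    "\t".join(["Plvit001", "augustus_masked", "match", "66626", "70338", "0.85",
               "+", ".",
               "ID=Plvit001:hit:12095:4.5.0.0;"
               "Name=augustus_masked-Plvit001-abinit-gene-0.7-mRNA-1"]),
    "\t".join(["Plvit001", "augustus_masked", "match_part", "66626", "67586",
               "0.85", "+", ".",
               "ID=Plvit001:hsp:27621:4.5.0.0;Parent=Plvit001:hit:12095:4.5.0.0;"
               "Target=augustus_masked-Plvit001-abinit-gene-0.7-mRNA-1 1 961 +;"
               "Gap=M961"]),
    "\t".join(["Plvit001", "augustus", "match", "66626", "70338", "1", "+", ".",
               "ID=Plvit001:hit:12088:4.5.0.0;"
               "Name=augustus-Plvit001-abinit-gene-0.0-mRNA-1"]),
]

kept = extract_prediction_lines(sample)
print(len(kept), "prediction lines kept")
print("match_part features without a parent:", orphan_match_parts(kept))
```

A non-empty list from `orphan_match_parts` would mean that some `match_part` lines were copied without their parent `match` line, which is one way a hand-extracted file could end up inconsistent.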
Best regards,
Yann
On 09/03/2017 18:52, Carson Holt wrote:
> My guess is that there is an issue with the GFF3 file you supplied, so its features are not overlapping anything.
>
> —Carson
>
>
>> On Mar 6, 2017, at 9:51 AM, YannDussert <dussert.yann at gmail.com> wrote:
>>
>> Hello,
>>
>> First, thank you for developing MAKER, this is a great annotation tool!
>>
>> I am trying to annotate the genome of a biotrophic oomycete with MAKER. After reading multiple posts on this list, I first used RNA-seq data and a protein set from other oomycetes to create a first training set. I then used augustus, snap (both trained with models from the first round) and genemark for ab-initio gene prediction during a second round (masked and unmasked genome). I ran MAKER with the following options: single_exon=1, split_hit=5000, correct_est_fusion=1.
>>
>> After the second round, I had only around 11000 annotated genes (96% completeness with BUSCO v2), whereas I was expecting between 13000 and 17000 genes (numbers from other annotated oomycetes). There were only around 1500 genes in the non_overlapping protein file. After looking at the annotation in a genome browser, one of the problems appeared to be gene fusions caused by bad protein evidence. Following the advice in another post, I tried running MAKER with the ab-initio predictions passed through pred_gff, to avoid giving bad protein hints to the gene predictors. I still have around 11000 annotated genes, but now there are 10000 genes in the non_overlapping protein file. Why this difference? I thought that this file contained the gene predictions not supported by any evidence. Did I miss something?
>>
>> Thank you in advance for your answer.
>>
>> Best regards,
>> Yann