[maker-devel] MAKER output has different genes with same name
Matt Simenc
mcsimenc at gmail.com
Mon Jul 18 09:20:44 MDT 2016
Update:
So I isolated a single scaffold to run MAKER on and test different
parameters. With map_forward=1 the duplicates disappeared.
However, this does not entirely resolve the issue for the whole assembly:
there are still some duplicates. I tried the -a command line option, and it
reduced the number of duplicate IDs for different features by 2, but I don't
know what else to do. It's important for me to know whether MAKER is keeping
the features in order, or whether it could be mixing up exons and CDSs
between different gene and mRNA features.
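
One way to check this independently of MAKER might be to scan the merged
GFF3 directly. The following is only a minimal sketch, not a MAKER tool. It
assumes a tab-delimited GFF3 file. The function names (parse_gff3,
duplicate_ids, orphan_children) and the shortened IDs g1 and m1 in the example
are made up for illustration; only the coordinates come from the snippet
quoted below. It lists gene/mRNA IDs that occur at more than one location and
flags any exon or CDS that does not fall inside at least one mRNA record
carrying its Parent ID.

```python
from collections import defaultdict


def parse_gff3(lines):
    """Yield (seqid, type, start, end, strand, attrs) for each 9-column line."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # e.g. sequence lines after a ##FASTA directive
        attrs = dict(f.split("=", 1) for f in cols[8].split(";") if "=" in f)
        yield cols[0], cols[2], int(cols[3]), int(cols[4]), cols[6], attrs


def duplicate_ids(features, types=("gene", "mRNA")):
    """Map each ID seen at more than one distinct location to its locations.

    CDS lines are skipped by default because GFF3 allows the segments of one
    CDS to share a single ID.
    """
    seen = defaultdict(set)
    for seqid, ftype, start, end, strand, attrs in features:
        if ftype in types and "ID" in attrs:
            seen[attrs["ID"]].add((seqid, start, end, strand))
    return {fid: sorted(locs) for fid, locs in seen.items() if len(locs) > 1}


def orphan_children(features, parent_type="mRNA", child_types=("exon", "CDS")):
    """List children not contained in any parent record named by Parent=."""
    parents = defaultdict(set)
    for seqid, ftype, start, end, strand, attrs in features:
        if ftype == parent_type and "ID" in attrs:
            parents[attrs["ID"]].add((seqid, start, end, strand))
    orphans = []
    for seqid, ftype, start, end, strand, attrs in features:
        if ftype not in child_types:
            continue
        for pid in attrs.get("Parent", "").split(","):
            fits = any(
                p_seq == seqid and p_strand == strand
                and p_start <= start and end <= p_end
                for p_seq, p_start, p_end, p_strand in parents.get(pid, ())
            )
            if not fits:
                orphans.append((ftype, seqid, start, end, strand, pid))
    return orphans


# Example: coordinates from the Sacu_v1_s0004 snippet; g1/m1 are placeholder IDs.
example = [
    "Sacu_v1_s0004\tmaker\tgene\t4775142\t4775554\t.\t+\t.\tID=g1;Name=g1",
    "Sacu_v1_s0004\tmaker\tmRNA\t4775142\t4775554\t.\t+\t.\tID=m1;Parent=g1",
    "Sacu_v1_s0004\tmaker\texon\t4775142\t4775330\t.\t+\t.\tID=m1:exon:204;Parent=m1",
    "Sacu_v1_s0004\tmaker\texon\t4775354\t4775554\t.\t+\t.\tID=m1:exon:205;Parent=m1",
    "Sacu_v1_s0004\tmaker\tgene\t4767976\t4768158\t.\t-\t.\tID=g1;Name=g1",
    "Sacu_v1_s0004\tmaker\tmRNA\t4767976\t4768158\t.\t-\t.\tID=m1;Parent=g1",
    "Sacu_v1_s0004\tmaker\texon\t4767976\t4768158\t.\t-\t.\tID=m1:exon:211;Parent=m1",
]
features = list(parse_gff3(example))  # a list, because it is scanned twice
print(duplicate_ids(features))
print(orphan_children(features))
```

To run it on a real file, the example list could be replaced with an open
file handle. An empty orphan list would suggest that every exon and CDS still
sits inside some mRNA with the matching ID, although with duplicated IDs it
cannot tell which of the same-named parents a child was meant to belong to.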
Thanks!
On Sun, Jul 17, 2016 at 4:39 PM, Matt Simenc <mcsimenc at gmail.com> wrote:
> Hi, I figured out the problem. I needed to use map_forward=1. With that
> set, no duplicates.
>
> Matt
>
> On Sat, Jul 16, 2016 at 10:40 PM, Matt Simenc <mcsimenc at gmail.com> wrote:
>
>> I have been using MAKER to iteratively update the previous run's
>> annotations by running the ab initio predictors with fresh training and
>> feeding in the previous run's GFF via the maker_gff option, like this:
>>
>> maker_gff=previous_run.gff
>> est_pass=1
>> altest_pass=1
>> protein_pass=1
>> rm_pass=1
>> model_pass=1
>> pred_pass=1
>> other_pass=0
>>
>>
>> Along the way, non-identical features with the same name seem to
>> accumulate, some covering the same region and some not. When I use
>> fasta_merge -d ...index.log I get sequences for the duplicates. Am I using
>> the control file options incorrectly? Any suggestions on how to select the
>> final models? Or should I redo the runs if I had some settings wrong?
>>
>>
>> Here is a snippet of the gff produced by gff3_merge -d ...index.log
>> showing duplicate models:
>>
>> -------------------------------------
>>
>> *Sacu_v1_s0077 maker gene 136647 138568 . - .
>> ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20;Name=snap_masked-Sacu_v1_s0077-abinit-gene-1.20;score=70.704*
>>
>> *Sacu_v1_s0077 maker mRNA 136647 138568 . - .
>> ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20;Name=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1;_AED=1.00;_eAED=1.00;_QI=0|0|0|0|1|1|5|0|158;score=70.704*
>>
>> Sacu_v1_s0077 maker exon 138512 138568 . - .
>> ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:exon:2329;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1
>>
>> Sacu_v1_s0077 maker exon 138297 138361 . - .
>> ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:exon:2328;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1
>>
>> Sacu_v1_s0077 maker exon 137723 137786 . - .
>> ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:exon:2327;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1
>>
>> Sacu_v1_s0077 maker exon 137578 137643 . - .
>> ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:exon:2326;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1
>>
>> Sacu_v1_s0077 maker exon 136647 136871 . - .
>> ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:exon:2325;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1
>>
>> Sacu_v1_s0077 maker CDS 138512 138568 . - 0
>> ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:cds;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1
>>
>> Sacu_v1_s0077 maker CDS 138297 138361 . - 0
>> ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:cds;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1
>>
>> Sacu_v1_s0077 maker CDS 137723 137786 . - 1
>> ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:cds;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1
>>
>> Sacu_v1_s0077 maker CDS 137578 137643 . - 0
>> ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:cds;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1
>>
>> Sacu_v1_s0077 maker CDS 136647 136871 . - 0
>> ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:cds;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1
>>
>> *Sacu_v1_s0077 maker gene 98236 98541 . - .
>> ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20;Name=snap_masked-Sacu_v1_s0077-abinit-gene-1.20;score=18.18,18.18,18.18*
>>
>> *Sacu_v1_s0077 maker mRNA 98236 98541 . - .
>> ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20;Name=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|1|1|0|101;score=18.18,18.18,18.18*
>>
>>
>>
>>
>> *Sacu_v1_s0004 maker gene 4775142 4775554 . + .
>> ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3;Name=snap_masked-Sacu_v1_s0004-abinit-gene-47.3;score=14.976*
>>
>> *Sacu_v1_s0004 maker mRNA 4775142 4775554 . + .
>> ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1;Parent=snap_masked-Sacu_v1_s0004-abinit-gene-47.3;Name=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1;_AED=1.00;_eAED=1.00;_QI=0|0|0|0|1|1|2|0|129;score=14.976*
>>
>> Sacu_v1_s0004 maker exon 4775142 4775330 . + .
>> ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1:exon:204;Parent=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1
>>
>> Sacu_v1_s0004 maker exon 4775354 4775554 . + .
>> ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1:exon:205;Parent=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1
>>
>> Sacu_v1_s0004 maker CDS 4775142 4775330 . + 0
>> ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1:cds;Parent=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1
>>
>> Sacu_v1_s0004 maker CDS 4775354 4775554 . + 0
>> ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1:cds;Parent=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1
>>
>> *Sacu_v1_s0004 maker gene 4767976 4768158 . - .
>> ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3;Name=snap_masked-Sacu_v1_s0004-abinit-gene-47.3;score=-0.624,-0.624,-0.624*
>>
>> *Sacu_v1_s0004 maker mRNA 4767976 4768158 . - .
>> ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1;Parent=snap_masked-Sacu_v1_s0004-abinit-gene-47.3;Name=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|1|1|0|60;score=-0.624,-0.624,-0.624*
>>
>> Sacu_v1_s0004 maker exon 4767976 4768158 . - .
>> ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1:exon:211;Parent=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1
>>
>> Sacu_v1_s0004 maker CDS 4767976 4768158 . - 0
>> ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1:cds;Parent=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1
>>
>> Sacu_v1_s0004 snap_masked match 4775142 4775554 14.976 + .
>> ID=Sacu_v1_s0004:hit:181:4.5.0.47;Name=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1;score=14.976
>>
>>
>> Here are the models' headers from the maker.proteins.fasta:
>>
>> -------------------------------------
>>
>> >snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1 protein AED:1.00
>> eAED:1.00 QI:0|0|0|0|1|1|2|0|129
>>
>> >snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1 protein AED:1.00
>> eAED:1.00 QI:0|-1|0|0|-1|1|1|0|60
>>
>> >snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1 protein AED:1.00
>> eAED:1.00 QI:0|0|0|0|1|1|5|0|158
>>
>> >snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1 protein AED:1.00
>> eAED:1.00 QI:0|-1|0|0|-1|1|1|0|101
>>
>>
>>
>> Thanks!
>>
>> Matt
>>
>
>
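
For the duplicated protein headers quoted above, a quick way to list every
repeated record name in a FASTA file could look like the sketch below. It is
an illustration only: duplicate_fasta_ids is a hypothetical helper name, the
record name is taken to be the first whitespace-delimited token after ">",
and the sequence lines in the example are placeholders rather than real MAKER
output.

```python
from collections import Counter


def duplicate_fasta_ids(lines):
    """Return {record_id: count} for record IDs that appear more than once."""
    counts = Counter(
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and len(line.strip()) > 1
    )
    return {rid: n for rid, n in counts.items() if n > 1}


# Headers copied from the thread; the sequence lines are placeholders.
example = [
    ">snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1 protein AED:1.00 "
    "eAED:1.00 QI:0|0|0|0|1|1|2|0|129",
    "MPLACEHOLDER",
    ">snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1 protein AED:1.00 "
    "eAED:1.00 QI:0|-1|0|0|-1|1|1|0|60",
    "MPLACEHOLDER",
]
print(duplicate_fasta_ids(example))
```

Passing an open handle on maker.proteins.fasta instead of the example list
would report each name that occurs more than once and how many times.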