[maker-devel] MAKER output has different genes with same name
Carson Holt
carsonhh at gmail.com
Mon Jul 18 09:35:40 MDT 2016
Also, your previous STDERR log shows /state/partition1/ set as your TMP directory. If that is a network-mounted location, failed file locks can let separate threads write duplicate entries to the output files. TMP should normally be set to /tmp, which is usually a locally mounted disk that is independent of, and inaccessible to, other nodes.
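One way to check whether a directory sits on a network mount is to look it up in /proc/mounts. The following is only a minimal sketch for Linux: the helper names and the list of file system types counted as "network" are assumptions made for illustration, not anything MAKER provides.

```python
import os

# File system types treated as network mounts here. This set is an
# illustrative assumption, not an exhaustive list.
NETWORK_FS = {"nfs", "nfs4", "cifs", "smb3", "lustre", "gpfs", "beegfs", "glusterfs"}


def fs_type_for(path, mounts_text):
    """Return the file system type of the mount point that contains path.

    mounts_text is the content of /proc/mounts (Linux only).
    """
    path = os.path.abspath(path)
    best_point, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        point, fstype = fields[1], fields[2]
        prefix = point.rstrip("/") + "/"
        # The longest mount point that is a path prefix wins; on a tie the
        # later entry wins, since later mounts shadow earlier ones.
        if path == point or (path + "/").startswith(prefix):
            if len(point) >= len(best_point):
                best_point, best_type = point, fstype
    return best_type


def is_network_mount(path, mounts_text):
    """True if path lives on one of the file system types in NETWORK_FS."""
    return fs_type_for(path, mounts_text) in NETWORK_FS
```

On a compute node this could be called as is_network_mount("/state/partition1", open("/proc/mounts").read()). A False result only means the type is not in the assumed list, so it is worth looking at the reported type as well.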
Duplicate entries are often a sign of this type of IO error, especially if which entries get duplicated looks somewhat random.
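To see how widespread the duplicates are, the merged GFF3 can be scanned for IDs that appear at more than one location. This is a rough sketch, not a MAKER tool: the function name is made up for illustration, it assumes tab-separated GFF3 columns, and it only looks at gene and mRNA features by default because CDS lines of a single transcript legitimately share one ID.

```python
from collections import defaultdict


def find_duplicate_ids(gff_lines, feature_types=("gene", "mRNA")):
    """Map each ID that occurs at more than one location to its locations."""
    seen = defaultdict(set)
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # an embedded sequence section follows; stop parsing
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] not in feature_types:
            continue
        # Column 9 holds semicolon-separated key=value attributes.
        attrs = dict(
            field.split("=", 1) for field in cols[8].split(";") if "=" in field
        )
        if "ID" in attrs:
            # Record (sequence, start, end, strand) for this ID.
            seen[attrs["ID"]].add((cols[0], int(cols[3]), int(cols[4]), cols[6]))
    return {fid: sorted(locs) for fid, locs in seen.items() if len(locs) > 1}
```

Passing an open file handle for the merged GFF3 as gff_lines returns a dictionary keyed by duplicated ID. Comparing that dictionary between runs would show whether the same IDs are duplicated each time or whether the set changes.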
—Carson
> On Jul 18, 2016, at 9:29 AM, Carson Holt <carsonhh at gmail.com> wrote:
>
> Normally a second run should be done in the same directory rather than by passing in the previous GFF3. GFF3 passthrough is meant as a roundabout way of getting previous results into a new run (for example, a previous version of an annotation set where you need to keep the old annotations for some reason and don’t have access to the original data files). You also lose certain information, for example details that were available in the BLAST reports but cannot be recovered from the GFF3.
>
> Both model_pass and pred_pass should probably be set to 0 if you are letting things rerun by providing snaphmm.
>
> Also check your input GFF3 for duplicates, as those will iteratively feed into the next run.
>
> —Carson
>
>> On Jul 18, 2016, at 9:20 AM, Matt Simenc <mcsimenc at gmail.com> wrote:
>>
>> Update:
>>
>> So I isolated a single scaffold to run MAKER on and test different parameters. With map_forward=1 the duplicates disappeared.
>>
>> However, this does not entirely take care of the issue with the entire assembly. There are still some duplicates. I tried using the -a command line option and it reduced the number of duplicate IDs for different features by 2, but I don't know what to do next. It's important for me to know whether MAKER is keeping the features in order, or whether it could be mixing up exons and CDSs between different gene and mRNA features.
>>
>> Thanks!
>>
>> On Sun, Jul 17, 2016 at 4:39 PM, Matt Simenc <mcsimenc at gmail.com> wrote:
>> Hi, I figured out the problem. I needed to use map_forward=1. With that set, no duplicates.
>>
>> Matt
>>
>> On Sat, Jul 16, 2016 at 10:40 PM, Matt Simenc <mcsimenc at gmail.com> wrote:
>> I have been using MAKER to iteratively update the previous run's annotations by rerunning the ab initio predictors with fresh training and feeding in the previous run's GFF through the maker_gff option, like this:
>>
>> maker_gff=previous_run.gff
>> est_pass=1
>> altest_pass=1
>> protein_pass=1
>> rm_pass=1
>> model_pass=1
>> pred_pass=1
>> other_pass=0
>>
>> Along the way, non-identical features with the same name seem to accumulate, some covering the same region and some not. When I use fasta_merge -d ...index.log I get sequences for the duplicates. Am I using the control file options incorrectly? Any suggestions on how to select final models? Or should I redo the runs if I had some settings wrong?
>>
>>
>>
>> Here is a snippet of the gff produced by gff3_merge -d ...index.log showing duplicate models:
>>
>> -------------------------------------
>>
>> Sacu_v1_s0077 maker gene 136647 138568 . - . ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20;Name=snap_masked-Sacu_v1_s0077-abinit-gene-1.20;score=70.704
>>
>> Sacu_v1_s0077 maker mRNA 136647 138568 . - . ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20;Name=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1;_AED=1.00;_eAED=1.00;_QI=0|0|0|0|1|1|5|0|158;score=70.704
>>
>> Sacu_v1_s0077 maker exon 138512 138568 . - . ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:exon:2329;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1
>>
>> Sacu_v1_s0077 maker exon 138297 138361 . - . ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:exon:2328;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1
>>
>> Sacu_v1_s0077 maker exon 137723 137786 . - . ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:exon:2327;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1
>>
>> Sacu_v1_s0077 maker exon 137578 137643 . - . ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:exon:2326;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1
>>
>> Sacu_v1_s0077 maker exon 136647 136871 . - . ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:exon:2325;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1
>>
>> Sacu_v1_s0077 maker CDS 138512 138568 . - 0 ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:cds;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1
>>
>> Sacu_v1_s0077 maker CDS 138297 138361 . - 0 ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:cds;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1
>>
>> Sacu_v1_s0077 maker CDS 137723 137786 . - 1 ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:cds;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1
>>
>> Sacu_v1_s0077 maker CDS 137578 137643 . - 0 ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:cds;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1
>>
>> Sacu_v1_s0077 maker CDS 136647 136871 . - 0 ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:cds;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1
>>
>> Sacu_v1_s0077 maker gene 98236 98541 . - . ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20;Name=snap_masked-Sacu_v1_s0077-abinit-gene-1.20;score=18.18,18.18,18.18
>>
>>
>> Sacu_v1_s0077 maker mRNA 98236 98541 . - . ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20;Name=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|1|1|0|101;score=18.18,18.18,18.18
>>
>>
>>
>>
>>
>>
>>
>> Sacu_v1_s0004 maker gene 4775142 4775554 . + . ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3;Name=snap_masked-Sacu_v1_s0004-abinit-gene-47.3;score=14.976
>>
>> Sacu_v1_s0004 maker mRNA 4775142 4775554 . + . ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1;Parent=snap_masked-Sacu_v1_s0004-abinit-gene-47.3;Name=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1;_AED=1.00;_eAED=1.00;_QI=0|0|0|0|1|1|2|0|129;score=14.976
>>
>> Sacu_v1_s0004 maker exon 4775142 4775330 . + . ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1:exon:204;Parent=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1
>>
>> Sacu_v1_s0004 maker exon 4775354 4775554 . + . ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1:exon:205;Parent=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1
>>
>> Sacu_v1_s0004 maker CDS 4775142 4775330 . + 0 ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1:cds;Parent=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1
>>
>> Sacu_v1_s0004 maker CDS 4775354 4775554 . + 0 ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1:cds;Parent=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1
>>
>> Sacu_v1_s0004 maker gene 4767976 4768158 . - . ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3;Name=snap_masked-Sacu_v1_s0004-abinit-gene-47.3;score=-0.624,-0.624,-0.624
>>
>> Sacu_v1_s0004 maker mRNA 4767976 4768158 . - . ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1;Parent=snap_masked-Sacu_v1_s0004-abinit-gene-47.3;Name=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|1|1|0|60;score=-0.624,-0.624,-0.624
>>
>> Sacu_v1_s0004 maker exon 4767976 4768158 . - . ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1:exon:211;Parent=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1
>>
>> Sacu_v1_s0004 maker CDS 4767976 4768158 . - 0 ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1:cds;Parent=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1
>>
>>
>> Sacu_v1_s0004 snap_masked match 4775142 4775554 14.976 + . ID=Sacu_v1_s0004:hit:181:4.5.0.47;Name=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1;score=14.976
>>
>>
>>
>> Here are the models' headers from the maker.proteins.fasta:
>>
>> -------------------------------------
>>
>> >snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1 protein AED:1.00 eAED:1.00 QI:0|0|0|0|1|1|2|0|129
>>
>>
>> >snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1 protein AED:1.00 eAED:1.00 QI:0|-1|0|0|-1|1|1|0|60
>>
>> >snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1 protein AED:1.00 eAED:1.00 QI:0|0|0|0|1|1|5|0|158
>>
>>
>> >snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1 protein AED:1.00 eAED:1.00 QI:0|-1|0|0|-1|1|1|0|101
>>
>>
>>
>>
>>
>> Thanks!
>>
>> Matt
>>
>>
>>