<div dir="ltr"><span style="font-size:12.8000001907349px">Update:</span><div style="font-size:12.8000001907349px"><br></div><div style="font-size:12.8000001907349px">I isolated a single scaffold to run MAKER on and tested different parameters. With map_forward=1, the duplicates disappeared. </div><div style="font-size:12.8000001907349px"><br></div><div style="font-size:12.8000001907349px">However, this does not entirely fix the problem for the whole assembly: there are still some duplicates. I tried the -a command-line option, which reduced the number of duplicate IDs for different features by 2, but I don't know what else to try. I need to know whether MAKER is keeping the features in order, or whether it could be mixing up exons and CDSs between different gene and mRNA features.</div><div style="font-size:12.8000001907349px"><br></div><div style="font-size:12.8000001907349px">Thanks!</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sun, Jul 17, 2016 at 4:39 PM, Matt Simenc <span dir="ltr"><<a href="mailto:mcsimenc@gmail.com" target="_blank">mcsimenc@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hi, I figured out the problem. I needed to use map_forward=1. 
With that set, there are no duplicates.<span class="HOEnZb"><font color="#888888"><div><br></div><div>Matt</div></font></span></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Jul 16, 2016 at 10:40 PM, Matt Simenc <span dir="ltr"><<a href="mailto:mcsimenc@gmail.com" target="_blank">mcsimenc@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">I have been using MAKER to iteratively update the previous run's annotations by running the ab initio predictors with fresh training and feeding in the previous run's GFF using the maker_gff option, like this:<div><br></div><div>
<p>maker_gff=previous_run.gff</p>
<p>est_pass=1</p><p>altest_pass=1</p>
<p>protein_pass=1</p>
<p>rm_pass=1</p>
<p>model_pass=1</p>
<p>pred_pass=1</p>
<p>other_pass=0</p><p><br></p><p>Along the way it seems that non-identical features with the same name, some covering the same region and some not, accumulate. When I use fasta_merge -d ...index.log I get sequences for the duplicates. Am I using the control file options incorrectly? Any suggestions how to select final models? Or should I redo the runs if I had some settings wrong?</p><p><br></p><p>Here is a snippet of the gff produced by gff3_merge -d ...index.log showing duplicate models:</p><p>-------------------------------------<br></p><p><b>Sacu_v1_s0077<span> </span>maker<span>  </span>gene<span>   </span>136647<span> </span>138568<span> </span>.<span>      </span>-<span>      </span>.<span>      </span>ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20;Name=snap_masked-Sacu_v1_s0077-abinit-gene-1.20;score=70.704</b></p><p><b>Sacu_v1_s0077<span>  </span>maker<span>  </span>mRNA<span>   </span>136647<span> </span>138568<span> </span>.<span>      </span>-<span>      </span>.<span>      </span>ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20;Name=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1;_AED=1.00;_eAED=1.00;_QI=0|0|0|0|1|1|5|0|158;score=70.704</b></p><p>Sacu_v1_s0077<span>      </span>maker<span>  </span>exon<span>   </span>138512<span> </span>138568<span> </span>.<span>      </span>-<span>      </span>.<span>      </span>ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:exon:2329;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1</p><p>Sacu_v1_s0077<span>        </span>maker<span>  </span>exon<span>   </span>138297<span> </span>138361<span> </span>.<span>      </span>-<span>      </span>.<span>      </span>ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:exon:2328;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1</p><p>Sacu_v1_s0077<span>        </span>maker<span>  </span>exon<span>   </span>137723<span> </span>137786<span> </span>.<span>      </span>-<span>      </span>.<span>      
</span>ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:exon:2327;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1</p><p>Sacu_v1_s0077<span>        </span>maker<span>  </span>exon<span>   </span>137578<span> </span>137643<span> </span>.<span>      </span>-<span>      </span>.<span>      </span>ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:exon:2326;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1</p><p>Sacu_v1_s0077<span>        </span>maker<span>  </span>exon<span>   </span>136647<span> </span>136871<span> </span>.<span>      </span>-<span>      </span>.<span>      </span>ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:exon:2325;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1</p><p>Sacu_v1_s0077<span>        </span>maker<span>  </span>CDS<span>    </span>138512<span> </span>138568<span> </span>.<span>      </span>-<span>      </span>0<span>      </span>ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:cds;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1</p><p>Sacu_v1_s0077<span>      </span>maker<span>  </span>CDS<span>    </span>138297<span> </span>138361<span> </span>.<span>      </span>-<span>      </span>0<span>      </span>ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:cds;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1</p><p>Sacu_v1_s0077<span>      </span>maker<span>  </span>CDS<span>    </span>137723<span> </span>137786<span> </span>.<span>      </span>-<span>      </span>1<span>      </span>ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:cds;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1</p><p>Sacu_v1_s0077<span>      </span>maker<span>  </span>CDS<span>    </span>137578<span> </span>137643<span> </span>.<span>      </span>-<span>      </span>0<span>      </span>ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:cds;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1</p><p>Sacu_v1_s0077<span>      </span>maker<span>  </span>CDS<span>    </span>136647<span> 
</span>136871<span> </span>.<span>      </span>-<span>      </span>0<span>      </span>ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1:cds;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1</p><p><b>Sacu_v1_s0077<span>     </span>maker<span>  </span>gene<span>   </span>98236<span>  </span>98541<span>  </span>.<span>      </span>-<span>      </span>.<span>      </span>ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20;Name=snap_masked-Sacu_v1_s0077-abinit-gene-1.20;score=18.18,18.18,18.18</b></p><p>
</p><p><b>Sacu_v1_s0077<span>   </span>maker<span>  </span>mRNA<span>   </span>98236<span>  </span>98541<span>  </span>.<span>      </span>-<span>      </span>.<span>      </span>ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1;Parent=snap_masked-Sacu_v1_s0077-abinit-gene-1.20;Name=snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|1|1|0|101;score=18.18,18.18,18.18</b></p><p><br></p><p><br></p><p><br></p><p><b>Sacu_v1_s0004<span> </span>maker<span>  </span>gene<span>   </span>4775142<span>        </span>4775554<span>        </span>.<span>      </span>+<span>      </span>.<span>      </span>ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3;Name=snap_masked-Sacu_v1_s0004-abinit-gene-47.3;score=14.976</b></p><p><b>Sacu_v1_s0004<span>  </span>maker<span>  </span>mRNA<span>   </span>4775142<span>        </span>4775554<span>        </span>.<span>      </span>+<span>      </span>.<span>      </span>ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1;Parent=snap_masked-Sacu_v1_s0004-abinit-gene-47.3;Name=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1;_AED=1.00;_eAED=1.00;_QI=0|0|0|0|1|1|2|0|129;score=14.976</b></p><p>Sacu_v1_s0004<span>      </span>maker<span>  </span>exon<span>   </span>4775142<span>        </span>4775330<span>        </span>.<span>      </span>+<span>      </span>.<span>      </span>ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1:exon:204;Parent=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1</p><p>Sacu_v1_s0004<span> </span>maker<span>  </span>exon<span>   </span>4775354<span>        </span>4775554<span>        </span>.<span>      </span>+<span>      </span>.<span>      </span>ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1:exon:205;Parent=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1</p><p>Sacu_v1_s0004<span> </span>maker<span>  </span>CDS<span>    </span>4775142<span>        </span>4775330<span>        </span>.<span>      </span>+<span>      </span>0<span>      
</span>ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1:cds;Parent=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1</p><p>Sacu_v1_s0004<span>      </span>maker<span>  </span>CDS<span>    </span>4775354<span>        </span>4775554<span>        </span>.<span>      </span>+<span>      </span>0<span>      </span>ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1:cds;Parent=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1</p><p><b>Sacu_v1_s0004<span>     </span>maker<span>  </span>gene<span>   </span>4767976<span>        </span>4768158<span>        </span>.<span>      </span>-<span>      </span>.<span>      </span>ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3;Name=snap_masked-Sacu_v1_s0004-abinit-gene-47.3;score=-0.624,-0.624,-0.624</b></p><p><b>Sacu_v1_s0004<span>    </span>maker<span>  </span>mRNA<span>   </span>4767976<span>        </span>4768158<span>        </span>.<span>      </span>-<span>      </span>.<span>      </span>ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1;Parent=snap_masked-Sacu_v1_s0004-abinit-gene-47.3;Name=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|1|1|0|60;score=-0.624,-0.624,-0.624</b></p><p>Sacu_v1_s0004<span>       </span>maker<span>  </span>exon<span>   </span>4767976<span>        </span>4768158<span>        </span>.<span>      </span>-<span>      </span>.<span>      </span>ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1:exon:211;Parent=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1</p><p>Sacu_v1_s0004<span> </span>maker<span>  </span>CDS<span>    </span>4767976<span>        </span>4768158<span>        </span>.<span>      </span>-<span>      </span>0<span>      </span>ID=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1:cds;Parent=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1</p><p>
</p><p>Sacu_v1_s0004<span>    </span>snap_masked<span>    </span>match<span>  </span>4775142<span>        </span>4775554<span>        </span>14.976<span> </span>+<span>      </span>.<span>      </span>ID=Sacu_v1_s0004:hit:181:4.5.0.47;Name=snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1;score=14.976</p><p><br></p><p>Here are the models' headers from maker.proteins.fasta:</p><p>-------------------------------------<br></p><p>>snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1 protein AED:1.00 eAED:1.00 QI:0|0|0|0|1|1|2|0|129</p><p>
</p><p>>snap_masked-Sacu_v1_s0004-abinit-gene-47.3-mRNA-1 protein AED:1.00 eAED:1.00 QI:0|-1|0|0|-1|1|1|0|60</p><p>>snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1 protein AED:1.00 eAED:1.00 QI:0|0|0|0|1|1|5|0|158</p><p>
</p><p>>snap_masked-Sacu_v1_s0077-abinit-gene-1.20-mRNA-1 protein AED:1.00 eAED:1.00 QI:0|-1|0|0|-1|1|1|0|101</p><p><br></p><p><br></p><p>Thanks!</p><span><font color="#888888"><p>Matt</p>
</font></span></div></div>
</blockquote></div><br></div>
</div></div></blockquote></div><br></div>
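<div dir="ltr">One way to check whether gene and mRNA IDs are reused at different locations in a merged GFF3 is sketched here. This is a minimal Python sketch and not part of MAKER; the function name find_duplicate_ids is hypothetical. It assumes a tab-delimited, nine-column GFF3 and only inspects gene and mRNA rows, because GFF3 allows the segments of a single CDS to share one ID. The two sample rows are the gene lines for snap_masked-Sacu_v1_s0077-abinit-gene-1.20 from the snippet quoted in the thread.</div>

```python
from collections import defaultdict


def find_duplicate_ids(lines, feature_types=("gene", "mRNA")):
    """Map each ID seen at two or more distinct locations to those locations.

    Hypothetical helper: `lines` is any iterable of GFF3 text lines.
    Only rows whose type (column 3) is in `feature_types` are examined.
    """
    locations = defaultdict(set)
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # embedded sequence section follows; no more feature rows
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] not in feature_types:
            continue
        # Column 9 holds semicolon-separated key=value attributes.
        attrs = dict(
            field.split("=", 1) for field in cols[8].split(";") if "=" in field
        )
        feature_id = attrs.get("ID")
        if feature_id is not None:
            # Record (seqid, start, end, strand) for this ID.
            locations[feature_id].add(
                (cols[0], int(cols[3]), int(cols[4]), cols[6])
            )
    return {fid: sorted(locs) for fid, locs in locations.items() if len(locs) > 1}


# Sample rows taken from the gff3_merge output quoted in the thread.
sample_rows = [
    "Sacu_v1_s0077\tmaker\tgene\t136647\t138568\t.\t-\t.\t"
    "ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20;"
    "Name=snap_masked-Sacu_v1_s0077-abinit-gene-1.20;score=70.704",
    "Sacu_v1_s0077\tmaker\tgene\t98236\t98541\t.\t-\t.\t"
    "ID=snap_masked-Sacu_v1_s0077-abinit-gene-1.20;"
    "Name=snap_masked-Sacu_v1_s0077-abinit-gene-1.20;score=18.18,18.18,18.18",
]

for fid, locs in find_duplicate_ids(sample_rows).items():
    for seqid, start, end, strand in locs:
        print(fid, seqid, start, end, strand, sep="\t")
```

<div dir="ltr">To run it on a whole file, pass an open file handle instead of sample_rows. An ID reported here with several locations would indicate that child exon and CDS rows sharing the same Parent value cannot be assigned to one gene by ID alone, which is the ambiguity raised in the update.</div>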