<div dir="ltr">Just an add-on to this topic: I have found a suite of GFF utilities here that I hope can help me quickly parse the MAKER GFF.<div><br></div><div><a href="https://github.com/mamarjan/gff3-pltools">https://github.com/mamarjan/gff3-pltools</a><br></div><div><br></div><div>I'll report back on how it goes!</div><div><br></div><div>Best</div><div>L</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Nov 14, 2017 at 5:04 PM, Michael Campbell <span dir="ltr"><<a href="mailto:michael.s.campbell1@gmail.com" target="_blank">michael.s.campbell1@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word">Hi Lahcen,<div><br></div><div>Thanks, the name has served me well for a number of years now :)</div><div><br></div><div>So I started a run with your 11 scaffolds. I gave it the protein file that you sent and used all of Repbase for masking. All of the scaffolds finished without error. I was hoping it would be something simple that just needed another set of eyes, but it looks like that's not the case for this one.</div><div><br></div><div>To further rule out a data issue, I would try running it with the dpp test data that is bundled with MAKER to see if you can get the same error. This data set will run in about a minute. If you are on a cluster, I would try running it with and without submitting it to the nodes, and with and without MPI.</div><div><br></div><div>One thing that I have done in the past is to make a new directory and run MAKER there (this doesn't make a lot of sense, but when the error doesn't make sense either it seems reasonable). </div><div><br></div><div>As far as rerunning MAKER goes, there are a couple of approaches. If you want it to stop complaining about trying too many times on failed contigs, you can increase the number of tries in the opts file. 
The line looks like this:</div><div><br></div><div>tries=2 #number of times to try a contig if there is a failure for some reason</div><div><br></div><div>If you want to run it elsewhere but don't want to redo all of the repeat masking and BLASTing, you can use the GFF3 output from an earlier run. If you used gff3_merge after the first run finished, you got a big GFF3 file with all of the gene models and evidence. If you break up that file by the source column (column 2), you can selectively pass the evidence back to MAKER. If you put all of the repeatmasker and repeatrunner entries into one file and pass it in on this line:</div><div><br></div><div>rm_gff= #pre-identified repeat elements from an external GFF3 file</div><div><br></div><div>you can turn off model_org= and repeat_protein=. This will speed up the next run a lot. Then you can pass in the protein2genome GFF3 data on this line:</div><div><br></div><div>protein_gff= #aligned protein homology evidence from an external GFF3 file</div><div><br></div><div>Don't pass the BLAST GFF3 data in. If you pass GFF3 data in to MAKER, it assumes that the data is polished and will not make any effort to fix alignments. The protein2genome data is polished; est2genome is the equivalent for EST input.</div><div><br></div><div>clean_up is useful if you are running on a file system that limits the number of files that you can write. It removes all of the intermediate files used in the annotation, which takes away the advantage of rerunning in the same directory. clean_try deletes everything first and starts again; it is the one that pretends that the first run never happened. 
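<div><br></div><div>For splitting the merged GFF3 by its source column as described above, here is a minimal Python sketch. It is only an illustration: the function name, the output file names and the grouping table are hypothetical, and the source labels are simply the ones mentioned in this message, so check them against column 2 of your own file before relying on it.</div>

```python
import os
from collections import defaultdict

# Hypothetical mapping from the GFF3 source column (column 2) to an output file.
# The source names are the ones mentioned in this thread; verify them in your file.
SOURCE_GROUPS = {
    "repeatmasker": "repeats.gff",
    "repeatrunner": "repeats.gff",
    "protein2genome": "protein2genome.gff",
    "est2genome": "est2genome.gff",
}


def split_gff3_by_source(gff_path, out_dir):
    """Copy features whose source is listed in SOURCE_GROUPS into per-group files.

    Returns a dict of output file name to number of features written.
    """
    grouped = defaultdict(list)
    with open(gff_path) as handle:
        for line in handle:
            if line.startswith("##FASTA"):
                break  # embedded sequence section: no more feature lines follow
            if line.startswith("#") or not line.strip():
                continue  # skip comments, directives and blank lines
            cols = line.rstrip("\n").split("\t")
            if len(cols) != 9:
                continue  # not a well-formed GFF3 feature line
            target = SOURCE_GROUPS.get(cols[1])
            if target is not None:  # everything else (e.g. raw BLAST hits) is dropped
                grouped[target].append("\t".join(cols) + "\n")

    os.makedirs(out_dir, exist_ok=True)
    written = {}
    for name, lines in grouped.items():
        with open(os.path.join(out_dir, name), "w") as out:
            out.write("##gff-version 3\n")
            out.writelines(lines)
        written[name] = len(lines)
    return written
```

<div>Calling split_gff3_by_source("merged.gff", "split") (both names are placeholders) would leave a repeats file that could go on the rm_gff= line and a protein2genome file that could go on the protein_gff= line. Sources that are not in the table, such as the raw BLAST hits, are deliberately left out.</div>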
</div><div><br></div><div>I cc'd the list on this response just in case anyone else has any ideas or is facing the same error.</div><div><br></div><div>Let me know if any of this helps,</div><div>Mike</div><div><div class="h5"><div><br><div><blockquote type="cite"><div>On Nov 14, 2017, at 10:48 AM, lahcen campbell <<a href="mailto:lahcencampbell@gmail.com" target="_blank">lahcencampbell@gmail.com</a>> wrote:</div><br class="m_-244735716841605874Apple-interchange-newline"><div><div dir="ltr">Hi Michael <div><br></div><div>Nice name, by the way; I have a Michael in my name too :) Lahcen Michael Campbell to be exact, haha. Anyway, thanks for the reply and the offer to help. </div><div><br></div><div>I have attached the file in question below. It's so strange that I had to just leave it alone, because it was making me quite frustrated. Bugs for which there are no common-sense solutions are the worst, because you very easily hit a wall. </div><div><br></div><div>Might it have anything at all to do with the protein homology file I passed in? Note, though, that the same protein files have been used in another MAKER run without issue, so I had more or less ruled that out already; I'm just spitballing at this stage. </div><div><br></div><div><br></div><div>Might I be so cheeky as to ask you one more MAKER-related question, Michael? Feel free to ignore it. I hate to push, but I'm desperate to figure it out and have little time to do so. <br></div><div><br></div><div>I have an issue with a different MAKER analysis. The problem currently affects any new run I attempt on this datastore, which has one successful round with 25,000-odd genes and double that number of transcripts. I attempted to run the second round with a SNAP-trained HMM (the first time passing in a SNAP HMM, following the first round of EST/protein evidence). In this attempt, because we obtained so many genes, I thought I would be more stringent by changing the AED from 1.0 to 0.7. Something I see now I didn't approach in the right way... 
too late now, sadly.</div><div><br></div><div>MAKER finishes fine, but now it views all previous scaffolds as FAILED. Nothing seems to change this, and the datastore is now, for all intents and purposes, locked in a failed state. It keeps mentioning changes to the opts file (which there were) and that the previous runs didn't finish, so it must delete them. I'm pretty sure the results obtained from round 1 are still there, though: all the BLAST files etc. are still there and populated. </div><div><br></div><div>Can you tell me the main differences between clean_up and clean_try, and which one will completely and irreversibly wipe the first run? That is something I don't want to repeat; I just want to progress to the next round. I'm hesitant to run them, but I've backed up the datastore just in case. My next attempt will be to pass the exact same maker_opts file from the round 1 run, with the only change made to clean_try/clean_up. Is this approach misguided? </div><div><br></div><div>Your help is very much appreciated, Michael, so thank you, </div><div>Best</div><div>L</div><div><br></div><div><div class="gmail_chip gmail_drive_chip" style="width:396px;height:18px;max-height:18px;background-color:rgb(245,245,245);padding:5px;color:rgb(34,34,34);font-family:arial;font-style:normal;font-weight:bold;font-size:13px;border:1px solid rgb(221,221,221);line-height:1"><a href="https://drive.google.com/file/d/19ooxfIUGygyW9GBY8uBwCYwjAywRWiL_/view?usp=drive_web" style="display:inline-block;max-width:366px;overflow:hidden;text-overflow:ellipsis;white-space:nowrap;text-decoration:none;padding:1px 0px;border:none" target="_blank"><img style="vertical-align:bottom;border:none" src="https://ssl.gstatic.com/docs/doclist/images/icon_10_generic_list.png"> <span dir="ltr" style="color:rgb(17,85,204);text-decoration:none;vertical-align:bottom">Combined_Protein_homology.fa.<wbr>zip</span></a><img style="opacity:0.55;float:right;display:none"></div><div class="gmail_chip gmail_drive_chip" 
style="width:396px;height:18px;max-height:18px;background-color:rgb(245,245,245);padding:5px;color:rgb(34,34,34);font-family:arial;font-style:normal;font-weight:bold;font-size:13px;border:1px solid rgb(221,221,221);line-height:1"><a href="https://drive.google.com/file/d/1Mwj6Jpf1U9xzQVgxVFqeYyQokFIrDFo5/view?usp=drive_web" style="display:inline-block;max-width:366px;overflow:hidden;text-overflow:ellipsis;white-space:nowrap;text-decoration:none;padding:1px 0px;border:none" target="_blank"><img style="vertical-align:bottom;border:none" src="https://ssl.gstatic.com/docs/doclist/images/icon_10_generic_list.png"> <span dir="ltr" style="color:rgb(17,85,204);text-decoration:none;vertical-align:bottom">SubsampledGenomeFile_n10_<wbr>11MB.fasta</span></a><img style="opacity:0.55;float:right;display:none"></div><br></div><div><br></div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Nov 14, 2017 at 3:08 PM, Michael Campbell <span dir="ltr"><<a href="mailto:michael.s.campbell1@gmail.com" target="_blank">michael.s.campbell1@gmail.com</a><wbr>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word">Hi Lahcen,<div><br></div><div>Nothing comes right to mind for what could be causing this error. If you want to compress your FASTA and send it to me I can try and recreate the error and try and debug it.</div><div><br></div><div>Thanks,</div><div>Mike<br><div><blockquote type="cite"><div><div class="m_-244735716841605874h5"><div>On Nov 14, 2017, at 7:15 AM, lahcen campbell <<a href="mailto:lahcencampbell@gmail.com" target="_blank">lahcencampbell@gmail.com</a>> wrote:</div><br class="m_-244735716841605874m_-8577554359614042639Apple-interchange-newline"></div></div><div><div><div class="m_-244735716841605874h5"><div dir="ltr">Hi MAKER community,<div><br></div><div>I was hoping someone could help me. 
I have a very unusual error with the two different versions of MAKER I have tested so far. This error shouldn't be happening, but it occurs time and again no matter what I try. I have tried using 2.31.6_mpich3_icc and 2.31_mpich3</div><div><br></div><div>Note that version 2.31.6_mpich3_icc is one I have used countless times to produce final MAKER annotations without issue, so it's not that this version has had issues to date. </div><div><br></div><div>Basically, this is a brand new MAKER analysis; I am only trying to train SNAP in this first round. I am following the MakerTutorial as documented this time around, and I can't get past the initial SNAP training stage. </div><div><br></div><div>I have a single genome file with 10 long scaffolds making up just under 11 MB of sequence data (subsampled from my original full-length assembly) on which to train SNAP. The FASTA file is not corrupted, and has been generated in various ways in order to test for formatting issues etc. </div><div><br></div><div>I have only edited the maker_opts file and changed:<br></div><div><br></div><div><b>genome=</b></div><div><b>protein=</b></div><div><b>protein2genome=1</b></div><div><br></div><div>But see attached my MAKER CTL files. </div><div><br></div><div>The error consistently returned to me:</div><div><br></div><div><div><b><i>Skipping the contig because it is too short!!</i></b></div><div><b><i>SeqID: contig_WHATEVER</i></b></div><div><b><i>Length: 0</i></b></div></div><div><br></div><div><i>The sequences are nowhere near too short. This was verified independently outside MAKER to be sure. 
</i></div><div><i><br></i></div><div><i>The headers are as follows:</i></div><div><div><br></div><div>>tig00000458 len=2889428 reads=4143 covStat=1793.77 gappedBases=no class=contig suggestRepeat=no suggestCircular=no</div><div>>tig00000159 len=3515005 reads=5100 covStat=2143.94 gappedBases=no class=contig suggestRepeat=no suggestCircular=no</div><div>>tig00006117 len=1009519 reads=1168 covStat=804.93 gappedBases=no class=contig suggestRepeat=no suggestCircular=no</div><div>>tig00000419 len=2633986 reads=3938 covStat=1519.93 gappedBases=no class=contig suggestRepeat=no suggestCircular=no</div><div>>tig00027677 len=108573 reads=86 covStat=86.05 gappedBases=no class=contig suggestRepeat=no suggestCircular=no</div><div>>tig00021790 len=202251 reads=158 covStat=184.12 gappedBases=no class=contig suggestRepeat=no suggestCircular=no</div><div>>tig00316948 len=280333 reads=237 covStat=253.23 gappedBases=no class=contig suggestRepeat=no suggestCircular=no</div><div>>tig00019606 len=149709 reads=82 covStat=150.02 gappedBases=no class=contig suggestRepeat=no suggestCircular=no</div><div>>tig00023852 len=189461 reads=115 covStat=192.28 gappedBases=no class=contig suggestRepeat=no suggestCircular=no</div><div>>tig00316994 len=19742 reads=1 covStat=0.00 gappedBases=no class=contig suggestRepeat=no suggestCircular=no</div><div><br></div><div>I have just about given up. I have no idea why it's happening; it makes zero sense. </div><div><br></div><div>Any help or information as to why this might be happening would be amazing. </div><div><br></div><div>Thank you in advance. </div><div>Lahcen</div><div><br></div>-- <br><div class="m_-244735716841605874m_-8577554359614042639gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr">==============================<wbr>============<br>> Dr. 
Lahcen Campbell <<div>> Contact: <a href="mailto:lahcencampbell@gmail.com" target="_blank">lahcencampbell@gmail.com</a> <</div><div>> <a href="https://www.ebi.ac.uk/about/people/lahcen-campbell" target="_blank">https://www.ebi.ac.uk/about/<wbr>people/lahcen-campbell</a> <</div><div><div><div>==============================<wbr>============</div></div></div></div></div></div></div></div></div></div></div>
</div></div>
</div></div><span id="m_-244735716841605874m_-8577554359614042639cid:07B64DDF-C3AA-47DA-BD60-41FA6DC97152@cshl.edu"><maker_bopts.ctl></span><span id="m_-244735716841605874m_-8577554359614042639cid:68A7D25B-B32E-4C44-AB8B-920CE17CE120@cshl.edu"><maker_exe.ct<wbr>l></span><span id="m_-244735716841605874m_-8577554359614042639cid:FC931630-8C80-4927-B0BE-40A3EF7BE6D2@cshl.edu"><maker_opts.ctl></span>____________<wbr>______________________________<wbr>_____<br>maker-devel mailing list<br><a href="mailto:maker-devel@box290.bluehost.com" target="_blank">maker-devel@box290.bluehost.co<wbr>m</a><br><a href="http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org" target="_blank">http://box290.bluehost.com/mai<wbr>lman/listinfo/maker-devel_yand<wbr>ell-lab.org</a><br></div></blockquote></div><br></div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="m_-244735716841605874gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr">==============================<wbr>============<br>> Dr. Lahcen Campbell <<div>> Contact: <a href="mailto:lahcencampbell@gmail.com" target="_blank">lahcencampbell@gmail.com</a> <</div><div>> <a href="https://www.ebi.ac.uk/about/people/lahcen-campbell" target="_blank">https://www.ebi.ac.uk/about/<wbr>people/lahcen-campbell</a> <</div><div><div><div>==============================<wbr>============</div></div></div></div></div></div></div></div></div></div></div>
</div>
</div></blockquote></div><br></div></div></div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr">==========================================<br>> Dr. Lahcen Campbell <<div>> Contact: <a href="mailto:lahcencampbell@gmail.com" target="_blank">lahcencampbell@gmail.com</a> <</div><div>> <a href="https://www.ebi.ac.uk/about/people/lahcen-campbell" target="_blank">https://www.ebi.ac.uk/about/people/lahcen-campbell</a> <</div><div><div><div>==========================================</div></div></div></div></div></div></div></div></div></div></div>
</div>