<html dir="ltr">

<head>

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">

<style type="text/css" id="owaParaStyle"></style>

</head>

<body fpstyle="1" ocsi="0">

<div style="direction: ltr;font-family: Tahoma;color: #000000;font-size: 10pt;">Hi Rebecca, as far as pruning down the dataset goes, I think the biggest gains will come from trimming the number of scaffolds that you annotate. What is the N50 of your

 400,000-scaffold set? Usually, scaffolds shorter than 5 kbp or 10 kbp won't contribute much to the gene counts in the end.
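<div><br>
</div>
<div>In case it helps, here is a rough Python sketch of how you could compute the N50 and drop the short scaffolds before annotating. The function names (read_fasta, n50, keep_long) are made up for illustration and are not part of MAKER, and the sketch assumes a plain, uncompressed FASTA file.</div>
<pre>
```python
from typing import Dict, Iterable, Iterator, List, Tuple


def read_fasta(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a plain-text FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A new record starts; emit the previous one, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif line:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def n50(lengths: List[int]) -> int:
    """Largest length L such that scaffolds of length >= L
    together contain at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


def keep_long(records: Iterable[Tuple[str, str]], min_len: int) -> Dict[str, str]:
    """Keep only the scaffolds that are at least min_len bases long."""
    return {name: seq for name, seq in records if len(seq) >= min_len}
```
</pre>
<div>For example, keep_long(read_fasta("scaffolds.fa"), 10000) would keep only the scaffolds of 10 kbp or longer (the file name and cutoff are placeholders). Comparing the N50 before and after filtering gives a quick sense of how much sequence the cutoff removes.</div>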

<div><span style="font-size: 10pt;"><br>

</span></div>

<div>Also, if you can, try to avoid using the altest option. It works completely fine, but those sequences are aligned with tblastx, which takes much longer than blastn or blastx. </div>

<div><br>

</div>

<div>Otherwise, I'd need to see your maker_opts.ctl file to see how you've got things set up. You can attach it to your reply (to the maker-devel list), and I'll take a look. I don't know how to force maker to create fewer files. You definitely want to be able

 to make use of the results from prior runs to save time. </div>

<div><br>

</div>

<div>Thanks,<br>

Daniel</div>

<div><br>

</div>

<div>

<div>

<div>

<div class="BodyFragment"><font size="2">

<div class="PlainText">Daniel Ence<br>

Graduate Student<br>

Eccles Institute of Human Genetics<br>

University of Utah<br>

15 North 2030 East, Room 2100<br>

Salt Lake City, UT 84112-5330</div>

</font></div>

</div>

<div style="font-family: Times New Roman; color: #000000; font-size: 16px">

<hr tabindex="-1">

<div id="divRpF340723" style="direction: ltr;"><font face="Tahoma" size="2" color="#000000"><b>From:</b> maker-devel [maker-devel-bounces@yandell-lab.org] on behalf of Rebecca Harris [rbharris@uw.edu]<br>

<b>Sent:</b> Wednesday, March 19, 2014 7:19 PM<br>

<b>To:</b> maker-devel@yandell-lab.org<br>

<b>Subject:</b> [maker-devel] tradeoff between run time & file number<br>

</font><br>

</div>

<div></div>

<div>

<div dir="ltr">Hi -

<div><br>

</div>

<div>I'm running maker on a dataset of >400,000 scaffolds with MPI -n 64. I've gone through it once - and used the clean_up option because otherwise maker exceeds the cluster's file quota. However, now I'm retraining SNAP and it is taking a very long time -

 probably because it has to go through BLAST again. Is there any way of getting around this? I expect I may have to train SNAP and rerun maker multiple times, and it is taking about 3 weeks to get through my dataset. Is there a way to prune down my original dataset

 based on maker's output?</div>

<div><br>

</div>

<div>Thanks,</div>

<div>Rebecca</div>

</div>

</div>

</div>

</div>

</div>

</div>

</body>

</html>