[maker-devel] Running MAKER on highly fragmented assembly

Mikael Brandström Durling mikael.durling at slu.se
Thu Feb 21 08:39:37 MST 2013


Hi Daniel,
The genomes I work with are on the order of 30-60 Mb. Other assemblies have been quick jobs for MAKER, without any problems.

If I run maker with -debugmpi, I get sets of debug printouts from the different ranks now and then:

COMM INITIALIZATION     |  SEND |  who_I_am                     |  3    -->     0       |  46731
COMM INITIALIZATION     |  SEND |  what_I_want                  |  3    -->     0       |  46732
COMM INITIALIZATION     |  RECV |  what_I_want                  |  0    <--     3       |  312850
COMM HAVE C_RESULT      |  SEND |  c_res_status (no_c_res)      |  0    -->     3       |  312851
HELPER/RESULT REQUESTED |  RECV |  work_order (num_helpers_req) |  0    <--     3       |  312852
COMM HAVE C_RESULT      |  RECV |  c_res_status (is c_res?)     |  3    <--     0       |  46733
HELPER/RESULT REQUESTED |  SEND |  work_order (num_helpers_req) |  3    -->     0       |  46734
HELPER/RESULT REQUESTED |  SEND |  req_stat (no_helpers_avail)  |  0    -->     3       |  312853
HELPER/RESULT REQUESTED |  RECV |  req_stat (is helper avail?)  |  3    <--     0       |  46735
COMM INITIALIZATION     |  RECV |  who_I_am                     |  0    <--     ANY     |  312854

and then they seem to stay waiting while a single rank continues to run the normal analysis. I have already filtered out of the assembly all contigs shorter than the minimum length set in maker_opts.ctl.
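
For reference, a length filter of the kind described above can be sketched in a few lines of Python. This is only an illustration, not the script actually used: the function names and the 60-column output width are made up for the example, and keeping contigs whose length equals the threshold (>=) is an assumption about how the minimum-length setting in maker_opts.ctl is interpreted.

```python
def read_fasta(handle):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records, min_len):
    """Keep records at least min_len bases long (assumed: >= threshold)."""
    for header, seq in records:
        if len(seq) >= min_len:
            yield header, seq


def write_fasta(records, handle, width=60):
    """Write records to an open file handle, wrapping sequences at width."""
    for header, seq in records:
        handle.write(f">{header}\n")
        for i in range(0, len(seq), width):
            handle.write(seq[i:i + width] + "\n")
```

A typical use would open the assembly for reading, pass read_fasta(infile) through filter_by_length with the same threshold as in maker_opts.ctl, and hand the result to write_fasta with an output file.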

I ran strace on the ranks that do nothing, and they seem to loop, repeatedly running a subprocess that basically does a process listing.
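
To see which ranks are still exchanging messages, the -debugmpi printouts shown earlier could be tallied per rank. The sketch below is a hypothetical helper, not part of MAKER. It assumes every debug line has the five "|"-separated fields seen in the excerpt, and that the first rank in the fourth field (left of the arrow) is the rank that printed the line; both are guesses from the excerpt alone.

```python
import re
from collections import Counter

# Pattern inferred from the excerpt: state | SEND/RECV | message | A --> B | number
LINE_RE = re.compile(
    r"^(?P<state>[^|]+)\|\s*(?P<direction>SEND|RECV)\s*\|(?P<message>[^|]+)\|"
    r"\s*(?P<local>\w+)\s*(?P<arrow>-->|<--)\s*(?P<remote>\w+)\s*\|\s*(?P<number>\d+)\s*$"
)


def tally_by_rank(lines):
    """Count debug lines per reporting rank (assumed: the rank left of the arrow)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # silently skip anything that is not a debug line
            counts[match.group("local")] += 1
    return counts
```

Run over the captured output of a stalled job, a rank that stops appearing in the tally while others keep counting up would be a candidate for closer inspection.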

I might be completely off in my guesses about what the problem is. I'm somewhat afraid that I've been bitten by an NFS-related problem, as I have been quite a few times by now. I will soon try to reannotate a genome sequenced by the JGI, where we have 35 Mb in 15 scaffolds, just to make sure that MAKER behaves as expected with that genome.
Mikael


On 20 Feb 2013, at 17:29, Daniel Ence <dence at genetics.utah.edu> wrote:

> Hi Mikael, Depending on the genome size, the assembly you've described shouldn't be too difficult to work with. The process activity that you're describing sounds more like a race condition, where one process is hogging all the work and all the other processes keep trying to find work but keep getting in each other's way.
> 
> How much of the genome has maker completed when the processes start doing this?
> 
> Thanks,
> Daniel
> 
> Daniel Ence
> Graduate Student
> Eccles Institute of Human Genetics
> University of Utah
> 15 North 2030 East, Room 2100
> Salt Lake City, UT 84112-5330
> ________________________________________
> From: maker-devel-bounces at yandell-lab.org [maker-devel-bounces at yandell-lab.org] on behalf of Mikael Brandström Durling [mikael.durling at slu.se]
> Sent: Wednesday, February 20, 2013 6:12 AM
> To: maker-devel at yandell-lab.org
> Subject: [maker-devel] Running MAKER on highly fragmented assembly
> 
> Hi,
> 
> I'm trying to run MAKER on a rather fragmented assembly. I know this is not optimal, as I will most likely miss a substantial part of the gene complement due to the fragmentation. Disregarding this, my question is whether there are other problems with running MAKER on this kind of genome, with roughly 1500 scaffolds and an N50 of 60 kb. I find that MAKER, run with MPI (mpich2), behaves in a rather strange way, with basically one of the ranks staying at 100% CPU and the others lingering at about 0%. Now and then I see a burst of activity in the other ranks before they drop back to low activity. Could this be a result of the fragmentation level, or should I look for other problems (like the all too common problems of running over NFS, with locking etc.)?
> 
> cheers,
> Mikael
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org




