[maker-devel] Profiling MAKER

Carson Holt carsonhh at gmail.com
Fri Sep 18 09:12:14 MDT 2015


Yes, still BioPerl. You’re right, I probably need to switch indexing schemes. I’ve actually made a faidx implementation, but I don’t particularly like it. An NCBI index API might be a better fit.
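
For readers unfamiliar with faidx-style indexing, here is a minimal Python sketch of the idea. It is an illustration only, not the implementation mentioned above and not the samtools code. It assumes the usual five faidx fields (name, length, byte offset of the first base, bases per line, bytes per line) and that every sequence line except the last in a record has the same length. The function names build_index and fetch are hypothetical.

```python
import io


def build_index(handle):
    """Scan a binary FASTA handle and return
    {name: (length, offset, linebases, linewidth)}."""
    index = {}
    name = None
    pos = 0  # byte position of the start of the current line
    for line in handle:
        if line.startswith(b">"):
            # Record name is the first word after '>'
            name = line[1:].split()[0].decode()
            # offset points at the first base after the header line
            index[name] = [0, pos + len(line), 0, 0]
        elif name is not None and line.strip():
            entry = index[name]
            bases = len(line.rstrip(b"\r\n"))
            if entry[2] == 0:
                # First sequence line defines the line geometry
                entry[2], entry[3] = bases, len(line)
            entry[0] += bases
        pos += len(line)
    return {key: tuple(value) for key, value in index.items()}


def fetch(handle, index, name, start, end):
    """Return bases [start, end) (0-based) by seeking, without
    reading the whole record."""
    length, offset, linebases, linewidth = index[name]
    end = min(end, length)
    chunks = []
    p = start
    while p < end:
        line_no, col = divmod(p, linebases)
        take = min(linebases - col, end - p)
        handle.seek(offset + line_no * linewidth + col)
        chunks.append(handle.read(take))
        p += take
    return b"".join(chunks).decode()
```

Both functions expect a handle opened in binary mode, e.g. open(path, "rb"), so that byte offsets are exact. The index is small enough to keep in memory or write to a flat text file, which avoids a database backend for sequence access.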

—Carson




> On Sep 17, 2015, at 8:50 PM, Fields, Christopher J <cjfields at illinois.edu> wrote:
> 
> Possibly. It might also be feasible to use faidx via the samtools API (if we’re intent on that path, there is Bio::DB::Sam, where I added a branch with samtools 1.2 support, so we could possibly tap into faidx at the XS level).
> 
> chris
> 
>> On Sep 17, 2015, at 9:25 PM, Jason Stajich <jason.stajich at gmail.com> wrote:
>> 
>> What about cdbfasta? I wonder if a Perl API to this indexing is possible. Or it could be the NCBI BLAST index, since that is also a dependency in MAKER.
>> On Thu, Sep 17, 2015 at 7:05 PM Fields, Christopher J <cjfields at illinois.edu> wrote:
>> Carson, 
>> 
>> Thanks!  Will pass this on to the folks at NCSA, that should help quite a bit.
>> 
>> Yeah, I kinda think it would be nice to come up with an alternative scheme for FASTA indexing, or at least add some more flexibility (I’m guessing this is still BioPerl?).
>> 
>> chris
>>  
>>> On Sep 16, 2015, at 12:22 PM, Carson Holt <carsonhh at gmail.com> wrote:
>>> 
>>> Sorry for the slow reply.  I’m out of the lab right now and will be for the next two weeks.
>>> 
>>> MAKER uses MPI for parallelization.  So it is optimized for distributed non-shared memory systems, but should still work fine on a shared memory system.
>>> 
>>> With MPI, you specify the number of processes to start using the -n flag for mpiexec. Each MAKER process will need about 2 GB of RAM. It could be more or less depending on the amount of evidence it has to hold in RAM (e.g. deep evidence alignments use more memory). By default each MAKER process will use a single CPU (even though it will start 3 threads, two of the threads will use close to 0% CPU).
>>> 
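As a rough illustration of the sizing guidance above, the following Python sketch estimates how many MAKER processes to request from mpiexec. The 2 GB per process figure comes from the message; the reserve_gb headroom, the function names, and the bare "maker" command name are assumptions made for the example, and real memory use will vary with the evidence depth.

```python
def max_maker_processes(total_ram_gb, cpu_count,
                        gb_per_process=2.0, reserve_gb=4.0):
    """Estimate a process count bounded by both RAM and CPUs.

    gb_per_process: ~2 GB per MAKER process (figure from the thread).
    reserve_gb: headroom left for the OS and other jobs (assumption).
    """
    usable_gb = max(total_ram_gb - reserve_gb, 0)
    by_ram = int(usable_gb // gb_per_process)
    # Each MAKER process uses about one CPU, so cap at the CPU count
    return min(by_ram, cpu_count)


def mpiexec_command(n_processes, maker_cmd="maker"):
    """Build the argument list for launching under MPI.

    maker_cmd is a placeholder; adjust it to the local install.
    """
    return ["mpiexec", "-n", str(n_processes), maker_cmd]
```

On a node with 64 GB of RAM and 16 cores the CPU count is the limit; on a 16 GB node with the same core count, memory becomes the limit instead.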
>>> MAKER will use a lot of IO. Each process will write/read independently of the others, so the more processes you start, the more simultaneous IO you will have. I’ve tried to put most of the very heavy IO operations in /tmp or whatever temporary directory you specify. It is important that you never specify an NFS location for your temporary directory. The rest of the IO will occur in the working directory.
>>> 
>>> Also, the Berkeley DB implementation that sits behind the FASTA indexes for sequence access doesn’t always work well with in-memory scratch. You should always try to set /tmp to a physical drive if possible. You will get several GB of files in /tmp.
>>> 
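One way to check the temporary directory advice above before a run is to look up which filesystem backs the directory. The Python sketch below is an illustration only and is not part of MAKER. It assumes a Linux system where /proc/mounts lists one mount per line as "device mount_point fstype options ...", and it treats nfs/nfs4 as network storage and tmpfs/ramfs as in-memory scratch. Those sets and the function names are assumptions for the example.

```python
import os

NETWORK_FS = {"nfs", "nfs4"}      # assumption: extend for other network filesystems
MEMORY_FS = {"tmpfs", "ramfs"}    # in-memory scratch such as /dev/shm


def fs_type_for(path, mounts_text):
    """Return the fstype of the longest mount point containing path."""
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fstype = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        # ">=" lets a later mount on the same point override an earlier one
        if (path + "/").startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fstype
    return best_type


def classify_tmpdir(path, mounts_text):
    """Classify a temp directory as 'network', 'memory', or 'local'."""
    fstype = fs_type_for(path, mounts_text)
    if fstype in NETWORK_FS:
        return "network"
    if fstype in MEMORY_FS:
        return "memory"
    return "local"
```

On a real system the second argument would be the contents of /proc/mounts, e.g. open("/proc/mounts").read(). A result of "network" or "memory" suggests pointing the temporary directory at a physical local drive instead.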
>>> —Carson
>>> 
>>> 
>>>> On Sep 15, 2015, at 10:39 AM, Fields, Christopher J <cjfields at illinois.edu> wrote:
>>>> 
>>>> We have a group locally (at NCSA) who is interested in profiling MAKER with various performance analysis tools.  They would like to know CPU, RAM, I/O patterns and usage.  In particular, we’re seeing some odd performance problems on a local system which uses a large shared memory cache for storing temp/scratch data (/dev/shm).  
>>>>  
>>>> The question is: are there any particular pain points users and developers know of or could point us to that we can start focusing on? Any help would be greatly appreciated.
>>>> 
>>>> Thanks,
>>>> 
>>>> chris
>>>> 
>>>> Chris Fields
>>>> Technical Lead in Genome Informatics
>>>> High Performance Computing in Biology
>>>> University of Illinois at Urbana-Champaign
>>>> Roy J. Carver Biotechnology Center / W.M. Keck Center
>>>> Carl R. Woese Institute for Genomic Biology
>>>> 
>>>> _______________________________________________
>>>> maker-devel mailing list
>>>> maker-devel at box290.bluehost.com
>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>> 
>> 
> 
