[maker-devel] mpi issue on computing cluster

Carson Holt carsonhh at gmail.com
Tue Apr 17 14:25:51 MDT 2012


Sorry, I missed the ';' at the end of the eval block.

It should be this -->

    my $select;
    eval{
        require Proc::ProcessTable;
        my $obj = new Proc::ProcessTable;
        foreach my $p (@{$obj->table}) {
            #now check for the id
            if ($p->pid == $id){
                $select = $p;
                last;
            }
        }
    };

    return $select;
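
The eval above traps any failure inside Proc::ProcessTable but throws the error message away. As a minimal sketch (not the code shipped in MAKER), the same lookup can be written as a standalone subroutine that also reports what was trapped. The name find_process_by_pid is hypothetical and used only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper name; the original code lives inside Proc/Signal.pm.
sub find_process_by_pid {
    my ($id) = @_;

    my $select;
    my $ok = eval {
        require Proc::ProcessTable;
        my $obj = Proc::ProcessTable->new;
        foreach my $p (@{ $obj->table }) {
            # now check for the id
            if ($p->pid == $id) {
                $select = $p;
                last;
            }
        }
        1;    # reached only if nothing above died
    };

    # Report the trapped error instead of silently discarding it.
    warn "Process table lookup failed: $@" unless $ok;

    return $select;    # undef if not found or if the lookup died
}

# Usage: look up the current process.
my $self = find_process_by_pid($$);
print defined $self
    ? "found pid " . $self->pid . "\n"
    : "pid $$ not found\n";
```

If the module is missing or dies while reading the process table, the subroutine returns undef and prints a warning rather than aborting the caller.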


--Carson

From:  Carson Holt <carsonhh at gmail.com>
Date:  Tue, 17 Apr 2012 13:09:32 -0400
To:  Scott Geib <smg283 at gmail.com>, <maker-devel at yandell-lab.org>
Subject:  Re: [maker-devel] mpi issue on computing cluster

If it's a sharedlibs issue then 'maker -help' would cause the same error.
Try that.


Are you sure that Signal.pm is not what is causing the error?
Try changing lines 136-143 of
/mnt/work/scratch/scottge/maker-2.24/maker/bin/../lib/Proc/Signal.pm
from this -->


    require Proc::ProcessTable;
    my $obj = new Proc::ProcessTable;
    foreach my $p (@{$obj->table}) {
        #now check for the id
        return $p if ($p->pid == $id);
    }

    return undef;



To this -->


    my $select;
    eval{
        require Proc::ProcessTable;
        my $obj = new Proc::ProcessTable;
        foreach my $p (@{$obj->table}) {
            #now check for the id
            if ($p->pid == $id){
                $select = $p;
                last;
            }
        }
    };

    return $select;


If it works, I can generate a cleaner workaround, but I'd like to know if
that is the root of the problem.

Thanks,
Carson


From:  Scott Geib <smg283 at gmail.com>
Date:  Fri, 13 Apr 2012 09:00:29 -1000
To:  <maker-devel at yandell-lab.org>
Subject:  [maker-devel] mpi issue on computing cluster

Hi, 
I am trying to run maker 2.24 on a compute cluster and get the following
error (not worried about Signal.pm error):

an into unknown state (hex char: 29) at
/mnt/work/scratch/scottge/maker-2.24/maker/bin/../lib/Proc/Signal.pm line
138.
Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(388)........:
MPID_Init(139)...............: channel initialization failed
MPIDI_CH3_Init(49)...........: progress_init failed
MPIDI_CH3I_Progress_init(808): This version of MPICH requires the SIGUSR1
signal, but the application has already installed a handler
[proxy:0:0 at r01n11.local] HYD_pmcd_pmip_control_cmd_cb
(./pm/pmiserv/pmip_cb.c:868): assert (!closed) failed
[proxy:0:0 at r01n11.local] HYDT_dmxu_poll_wait_for_event
(./tools/demux/demux_poll.c:77): callback returned error status
[proxy:0:0 at r01n11.local] main (./pm/pmiserv/pmip.c:208): demux engine error
waiting for event
[proxy:0:1 at r01n13.local] HYD_pmcd_pmip_control_cmd_cb
(./pm/pmiserv/pmip_cb.c:868): assert (!closed) failed
[proxy:0:1 at r01n13.local] HYDT_dmxu_poll_wait_for_event
(./tools/demux/demux_poll.c:77): callback returned error status
[proxy:0:1 at r01n13.local] main (./pm/pmiserv/pmip.c:208): demux engine error
waiting for event
[proxy:0:3 at r07n27.local] HYD_pmcd_pmip_control_cmd_cb
(./pm/pmiserv/pmip_cb.c:868): assert (!closed) failed
[proxy:0:3 at r07n27.local] HYDT_dmxu_poll_wait_for_event
(./tools/demux/demux_poll.c:77): callback returned error status
[proxy:0:3 at r07n27.local] main (./pm/pmiserv/pmip.c:208): demux engine error
waiting for event
[mpiexec at r01n11.local] HYDT_bscu_wait_for_completion
(./tools/bootstrap/utils/bscu_wait.c:70): one of the processes terminated
badly; aborting
[mpiexec at r01n11.local] HYDT_bsci_wait_for_completion
(./tools/bootstrap/src/bsci_wait.c:18): launcher returned error waiting for
completion
[mpiexec at r01n11.local] HYD_pmci_wait_for_completion
(./pm/pmiserv/pmiserv_pmci.c:216): launcher returned error waiting for
completion
[mpiexec at r01n11.local] main (./ui/mpich/mpiexec.c:404): process manager
error waiting for completion
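
The MPICH line in the trace above complains that a SIGUSR1 handler was already installed before MPI_Init ran. As a rough sketch, and assuming the handler was set from Perl code (the trace does not confirm this), the %SIG hash can be inspected and the handler reset to the system default. The helper describe_handler is hypothetical, and %SIG does not show handlers installed directly by C libraries.

```perl
use strict;
use warnings;

# Hypothetical helper: describe the Perl-level handler for a signal.
sub describe_handler {
    my ($sig) = @_;
    my $h = $SIG{$sig};
    return 'none set from Perl' unless defined $h && length $h;
    return ref $h eq 'CODE' ? 'Perl code reference' : "$h";
}

# Simulate an application that installs its own SIGUSR1 handler.
$SIG{USR1} = sub { warn "got SIGUSR1\n" };
print "before reset: ", describe_handler('USR1'), "\n";

# Hand the signal back to the system default disposition.
$SIG{USR1} = 'DEFAULT';
print "after reset:  ", describe_handler('USR1'), "\n";
```

Whether resetting the handler before MPI starts is safe depends on why the application installed it in the first place.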

I do not know how mpich2 was compiled; could this be an
--enable-sharedlibs issue?

I may need to contact my cluster support, but I thought I would try here
first.

Thanks

