Segfault and core dump on Summit #31
Open
Description
I need some guidance on how to run MACSio on Summit.
To reproduce: I built the MACSio binary successfully, with the following dependencies:
ldd ~/opt/macsio/macsio
linux-vdso64.so.1 => (0x00007fffb6120000)
libjson-cwx.so.2 => /ccs/home/wgodoy/opt/json-cwx/lib/libjson-cwx.so.2 (0x00007fffb60e0000)
libmpiprofilesupport.so.3 => /autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/spectrum-mpi-10.3.1.2-20200121-awz2q5brde7wgdqqw4ugalrkukeub4eb/lib/libmpiprofilesupport.so.3 (0x00007fffb60b0000)
libmpi_ibm.so.3 => /autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/spectrum-mpi-10.3.1.2-20200121-awz2q5brde7wgdqqw4ugalrkukeub4eb/lib/libmpi_ibm.so.3 (0x00007fffb5f30000)
libstdc++.so.6 => /sw/summit/gcc/6.4.0/lib64/libstdc++.so.6 (0x00007fffb5d20000)
libm.so.6 => /lib64/libm.so.6 (0x00007fffb5c10000)
libgcc_s.so.1 => /sw/summit/gcc/6.4.0/lib64/libgcc_s.so.1 (0x00007fffb5bd0000)
libc.so.6 => /lib64/libc.so.6 (0x00007fffb59e0000)
librt.so.1 => /lib64/librt.so.1 (0x00007fffb59b0000)
libutil.so.1 => /lib64/libutil.so.1 (0x00007fffb5980000)
libhwloc_ompi.so.15 => /autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/spectrum-mpi-10.3.1.2-20200121-awz2q5brde7wgdqqw4ugalrkukeub4eb/lib/libhwloc_ompi.so.15 (0x00007fffb5910000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007fffb58e0000)
libevent-2.1.so.6 => /autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/spectrum-mpi-10.3.1.2-20200121-awz2q5brde7wgdqqw4ugalrkukeub4eb/lib/libevent-2.1.so.6 (0x00007fffb5860000)
libevent_pthreads-2.1.so.6 => /autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/spectrum-mpi-10.3.1.2-20200121-awz2q5brde7wgdqqw4ugalrkukeub4eb/lib/libevent_pthreads-2.1.so.6 (0x00007fffb5830000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fffb57f0000)
libopen-rte.so.3 => /autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/spectrum-mpi-10.3.1.2-20200121-awz2q5brde7wgdqqw4ugalrkukeub4eb/lib/libopen-rte.so.3 (0x00007fffb56e0000)
libopen-pal.so.3 => /autofs/nccs-svm1_sw/summit/.swci/1-compute/opt/spack/20180914/linux-rhel7-ppc64le/gcc-6.4.0/spectrum-mpi-10.3.1.2-20200121-awz2q5brde7wgdqqw4ugalrkukeub4eb/lib/libopen-pal.so.3 (0x00007fffb55f0000)
/lib64/ld64.so.2 (0x00007fffb6140000)
Unfortunately, every combination of input parameters I have tried results in a segmentation fault and a core dump. For example:
jsrun -n 2 ${MACSIO_EXEC} --interface hdf5 --parallel_file_mode MIF 2 --part_size 1M
Results:
cat output.490227
[b28n03:147697] *** Process received signal ***
[b28n03:147697] Signal: Segmentation fault (11)
[b28n03:147697] Signal code: Address not mapped (1)
[b28n03:147697] Failing at address: 0x40
[b28n03:147697] [ 0] [0x2000000504d8]
[b28n03:147697] [ 1] [0x20000004d6b0]
[b28n03:147697] [ 2] /ccs/home/wgodoy/opt/json-cwx/lib/libjson-cwx.so.2(json_object_set_string+0x58)[0x2000000f9838]
[b28n03:147697] [ 3] /ccs/home/wgodoy/opt/json-cwx/lib/libjson-cwx.so.2(json_object_path_set_string+0x34)[0x2000000fa764]
[b28n03:147697] [ 4] /ccs/home/wgodoy/opt/macsio/macsio[0x10018cc0]
[b28n03:147697] [ 5] /ccs/home/wgodoy/opt/macsio/macsio(main+0x900)[0x10005980]
[b28n03:147697] [ 6] /lib64/libc.so.6(+0x25200)[0x200000655200]
[b28n03:147697] [ 7] /lib64/libc.so.6(__libc_start_main+0xc4)[0x2000006553f4]
[b28n03:147697] *** End of error message ***
[b28n03:147696] *** Process received signal ***
[b28n03:147696] Signal: Segmentation fault (11)
[b28n03:147696] Signal code: Address not mapped (1)
[b28n03:147696] Failing at address: 0x40
[b28n03:147696] [ 0] [0x2000000504d8]
[b28n03:147696] [ 1] [0x20000004d6b0]
[b28n03:147696] [ 2] /ccs/home/wgodoy/opt/json-cwx/lib/libjson-cwx.so.2(json_object_set_string+0x58)[0x2000000f9838]
[b28n03:147696] [ 3] /ccs/home/wgodoy/opt/json-cwx/lib/libjson-cwx.so.2(json_object_path_set_string+0x34)[0x2000000fa764]
[b28n03:147696] [ 4] /ccs/home/wgodoy/opt/macsio/macsio[0x10018cc0]
[b28n03:147696] [ 5] /ccs/home/wgodoy/opt/macsio/macsio(main+0x900)[0x10005980]
[b28n03:147696] [ 6] /lib64/libc.so.6(+0x25200)[0x200000655200]
[b28n03:147696] [ 7] /lib64/libc.so.6(__libc_start_main+0xc4)[0x2000006553f4]
[b28n03:147696] *** End of error message ***
ERROR: One or more process (first noticed rank 0) terminated with signal 11 (core dumped)
macsio-log.log:
--------------------------------------------------------Processor 000000-------------------------------------------------------
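Frames [4] and [5] in the backtrace above are the only ones inside the macsio binary, and frame [4] is reported only as a raw address. A possible next step, sketched here and not run for this report, is to extract those addresses and resolve them with addr2line. The helper name macsio_frame_addrs is hypothetical, and file and line information will only appear if the binary was built with debug symbols (-g).

```shell
# Hypothetical helper: print the unique addresses of backtrace frames that
# fall inside the macsio executable (not the shared libraries).
macsio_frame_addrs() {
    # Match ".../macsio[0xADDR]" and ".../macsio(sym+0xOFF)[0xADDR]",
    # then keep only the bracketed address at the end of each match.
    grep -oE '/macsio(\([^)]*\))?\[0x[0-9a-f]+\]' "$1" \
        | grep -oE '0x[0-9a-f]+\]$' \
        | tr -d ']' \
        | sort -u
}

# Example use (assumes addr2line from binutils is available):
#   macsio_frame_addrs output.490227 | xargs addr2line -f -C -e ~/opt/macsio/macsio
```

Both ranks fail at the same addresses, so the duplicate frames collapse to two entries. Resolving 0x10018cc0 should name the caller of json_object_path_set_string.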
Any help would be appreciated!