[CMake] Memory checking MPI programs with valgrind?

Bartlett, Roscoe A rabartl at sandia.gov
Thu Feb 26 13:01:08 EST 2009


> -----Original Message-----
> From: Eric Noulard [mailto:eric.noulard at gmail.com] 
> Sent: Thursday, February 26, 2009 9:29 AM
> To: Bartlett, Roscoe A
> Cc: cmake at cmake.org; Willenbring, James M; Perschbacher, Brent M
> Subject: Re: [CMake] Memory checking MPI programs with valgrind?
> 
> 2009/2/26 Bartlett, Roscoe A <rabartl at sandia.gov>:
> > Hello,
> >
> > Does anyone know how to get CMake/CTest to do memory testing with MPI
> > programs using valgrind?  The problem is that by default valgrind just
> > tests the mpiexec/mpirun driver program and not your user program.  I
> > think there are some options you can pass to valgrind to get it to do
> > this
> 
> mpirun/mpiexec should somehow fork your program such that you 
> need to add:
> 
> --trace-children=yes
> 
> to valgrind in order to make it trace the child process too.

Okay, I think that is working, but valgrind is complaining about the mpiexec program itself, as shown by:


----------------------------------------------------------------------------------------

Changing directory into /home/rabartl/PROJECTS/dashboards/Trilinos.base/MPI_OPT/BUILD/packages/epetra/test/SerialDense 
 20/ 24 Memory Check Epetra_SerialDense_test_MPI_1 Memory check command: /usr/bin/valgrind --trace-children=yes 
 
MemCheck command: /usr/lib64/openmpi/1.2.7-gcc/bin/mpiexec -np 1 ./Epetra_SerialDense_test.exe -v 
Test timeout computed to be: 600 
==18349== Memcheck, a memory error detector. 
==18349== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al. 
==18349== Using LibVEX rev 1658, a library for dynamic binary translation. 
==18349== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP. 
==18349== Using valgrind-3.2.1, a dynamic binary instrumentation framework. 
==18349== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al. 
==18349== For more details, rerun with: -v 
==18349==  
==18350== Memcheck, a memory error detector. 
==18350== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al. 
==18350== Using LibVEX rev 1658, a library for dynamic binary translation. 
==18350== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP. 
==18350== Using valgrind-3.2.1, a dynamic binary instrumentation framework. 
==18350== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al. 
==18350== For more details, rerun with: -v 
==18350==  
==18350==  
==18350== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 5 from 1) 
==18350== malloc/free: in use at exit: 0 bytes in 0 blocks. 
==18350== malloc/free: 0 allocs, 0 frees, 0 bytes allocated. 
==18350== For counts of detected errors, rerun with: -v 
==18350== All heap blocks were freed -- no leaks are possible. 
==18351== Syscall param writev(vector[...]) points to uninitialised byte(s) 
==18351==    at 0x3AD20CC0DC: writev (in /lib64/libc-2.5.so) 
==18351==    by 0x5C57C0C: mca_oob_tcp_msg_send_handler (in /usr/lib64/openmpi/1.2.7-gcc/lib/openmpi/mca_oob_tcp.so) 
==18351==    by 0x5C58C18: mca_oob_tcp_peer_send (in /usr/lib64/openmpi/1.2.7-gcc/lib/openmpi/mca_oob_tcp.so) 
==18351==    by 0x5C5BF57: mca_oob_tcp_send (in /usr/lib64/openmpi/1.2.7-gcc/lib/openmpi/mca_oob_tcp.so) 
==18351==    by 0x82C6810: orte_iof_proxy_svc_publish (in /usr/lib64/openmpi/1.2.7-gcc/lib/openmpi/mca_iof_proxy.so) 
==18351==    by 0x82C62D1: orte_iof_proxy_publish (in /usr/lib64/openmpi/1.2.7-gcc/lib/openmpi/mca_iof_proxy.so) 
==18351==    by 0x3AD182C8C5: orte_iof_base_setup_parent (in /usr/lib64/openmpi/1.2.7-gcc/lib/libopen-rte.so.0.0.0) 
==18351==    by 0x7CBD2D9: orte_odls_default_launch_local_procs (in /usr/lib64/openmpi/1.2.7-gcc/lib/openmpi/mca_odls_default.so) 
==18351==    by 0x401B4D: (within /usr/lib64/openmpi/1.2.7-gcc/bin/orted) 
==18351==    by 0x6067E06: orte_gpr_proxy_deliver_notify_msg (in /usr/lib64/openmpi/1.2.7-gcc/lib/openmpi/mca_gpr_proxy.so) 
==18351==    by 0x60665D1: orte_gpr_proxy_notify_recv (in /usr/lib64/openmpi/1.2.7-gcc/lib/openmpi/mca_gpr_proxy.so) 
==18351==    by 0x3AD183035D: (within /usr/lib64/openmpi/1.2.7-gcc/lib/libopen-rte.so.0.0.0) 
==18351==  Address 0x7FEFFEA91 is on thread 1's stack 
==18352== Memcheck, a memory error detector. 
==18352== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al. 
==18352== Using LibVEX rev 1658, a library for dynamic binary translation. 
==18352== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP. 
==18352== Using valgrind-3.2.1, a dynamic binary instrumentation framework. 
==18352== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al. 
==18352== For more details, rerun with: -v 
----------------------------------------------------------------------------------------


Does anyone know how to get valgrind to not search for errors in the parent but only in the children?
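
One arrangement that may avoid checking the launcher at all is to swap the nesting, so that mpiexec starts valgrind rather than valgrind starting mpiexec. The sketch below only composes and prints the two command lines for comparison; it does not run anything. The paths and test name are copied from the log above, and the function names outer_cmd and inner_cmd are hypothetical helpers for illustration. Whether CTest's memory check step can be made to build the second form is an open assumption here, not something the log confirms.

```shell
#!/bin/sh
# Sketch only: print the two command lines side by side; nothing is executed.
# Paths and the test name are taken from the log above; adjust for your install.
MPIEXEC=/usr/lib64/openmpi/1.2.7-gcc/bin/mpiexec
VALGRIND=/usr/bin/valgrind
TEST_EXE=./Epetra_SerialDense_test.exe

# Form seen in the log: valgrind wraps mpiexec and follows every child,
# so mpiexec and orted are checked along with the test program.
outer_cmd() {
    echo "$VALGRIND --trace-children=yes $MPIEXEC -np $1 $TEST_EXE -v"
}

# Swapped form (assumption: acceptable to your MPI launcher): mpiexec
# launches valgrind, so only the test executable runs under memcheck.
inner_cmd() {
    echo "$MPIEXEC -np $1 $VALGRIND $TEST_EXE -v"
}

outer_cmd 1
inner_cmd 1
```

With the swapped form, each MPI rank gets its own valgrind instance, so the output of several ranks will still be interleaved unless it is redirected per process.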

Thanks,

- Ross


