I don't know whether you've considered this, and I'm not sure it would work in your situation, but you might be able to launch valgrind through mpirun, so that each MPI process runs under its own valgrind instance. For example:

    mpirun <mpirun args> valgrind <valgrind args> <executable> <executable args>

--ND
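To make the pattern above concrete, here is a minimal sketch with the placeholders filled in. The rank count (4), the Valgrind options, and the program name ./my_app are all hypothetical values picked for illustration, so substitute your own. The script only prints the assembled command unless RUN=1 is set, so you can check it before launching anything.

```shell
#!/usr/bin/env bash
# Hypothetical values: adjust the rank count and program path for your job.
NP=4
APP=./my_app

# Valgrind expands %p in --log-file to the process ID, so each MPI rank
# writes its own log file instead of interleaving output on the terminal.
cmd=(mpirun -np "$NP" valgrind --leak-check=full --log-file=valgrind.%p.log "$APP")

# Print the command for review; set RUN=1 to actually execute it.
echo "${cmd[*]}"
if [ "${RUN:-0}" = "1" ]; then
    "${cmd[@]}"
fi
```

Expect the run to be much slower than normal under Valgrind, so a small rank count and a small input are probably the place to start.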