[CMake] FW: Parallel GNU make issue

Hennigan, Gary L glhenni at sandia.gov
Thu Sep 11 14:29:57 EDT 2014


I have a strange, and very frustrating, problem. I have a pretty large piece of software that I build nightly as part of regression testing of my own software. All of the software uses CMake and I use a ctest script, via "ctest -S [script file]", for my nightly regression testing. As I stated, this is a pretty large collection of software, but during development it's not a huge issue because the build is quite parallelizable via GNU make's "-j N" option. On my nightly test platform, a 64-core machine, I can build the whole thing in about an hour, which is a nice manageable amount of time for a nightly regression test. Unfortunately, when I run the build process via ctest, something is causing the parallel make to fail and I'm lucky if the build takes under 15 hours. That is barely practical for a nightly test.

I'm not sure how to find out what's going on. After the ctest build I can go into the build directory, do a "make clean" and then a "make -j 12", for example, and the build flies. Of course I can build the software entirely outside of ctest and it too flies. Only when the build happens as part of ctest does it seem to revert to, essentially, a "make -j 1" and slow to a crawl.

I can look at the process tree, via "ps -ef", during the ctest build and I see the root invocation of gmake and it's fine. For example, it typically looks something like:

  PID  PPID  C STIME TTY          TIME CMD
 6141  6283 96 11:41 pts/0    00:24:40 ctest -VV -S ctest_nightly.cmake -DPROCESSORCOUNT=12
 8032  6141  0 11:42 pts/0    00:00:00 /usr/bin/gmake -i -j 12
 8035  8032  0 11:42 pts/0    00:00:00 /usr/bin/gmake -f CMakeFiles/Makefile2 all
  851  8035  0 11:52 pts/0    00:00:00 /usr/bin/gmake -f packages/ml/src/CMakeFiles/ml.dir/build.make
27797  8035  0 11:46 pts/0    00:00:00 /usr/bin/gmake -f packages/moocho/src/CMakeFiles/moocho.dir/build.make

You can see that the parent make, PID 8032, which is started via ctest (PID 6141), has the appropriate flag, "-j 12", but at this point in the build it's compiling one file at a time. Another odd thing is that I think the build starts out fine, invoking multiple file compilations simultaneously, but after a couple of minutes it reverts to essentially the "make -j 1" behavior. It's like the GNU make jobserver is failing, but I'm not getting any error messages from GNU make to that effect.
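
One way to put a number on "one file at a time" is to count the leaf processes under the top-level gmake in a "ps -ef" snapshot and compare that count with the "-j" value. The following is only a minimal sketch: the function names (parse_ps, descendants, busy_leaves) are hypothetical, and it assumes the listing has a header row containing PID, PPID and CMD, as in the trimmed output above.

```python
from collections import defaultdict

# Snapshot copied from the listing above (UID column already trimmed).
SAMPLE = """\
  PID  PPID  C STIME TTY          TIME CMD
 6141  6283 96 11:41 pts/0    00:24:40 ctest -VV -S ctest_nightly.cmake -DPROCESSORCOUNT=12
 8032  6141  0 11:42 pts/0    00:00:00 /usr/bin/gmake -i -j 12
 8035  8032  0 11:42 pts/0    00:00:00 /usr/bin/gmake -f CMakeFiles/Makefile2 all
  851  8035  0 11:52 pts/0    00:00:00 /usr/bin/gmake -f packages/ml/src/CMakeFiles/ml.dir/build.make
27797  8035  0 11:46 pts/0    00:00:00 /usr/bin/gmake -f packages/moocho/src/CMakeFiles/moocho.dir/build.make
"""


def parse_ps(text):
    """Return {pid: (ppid, cmd)} from ps-style text with a PID/PPID/CMD header."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    pid_i, ppid_i, cmd_i = (header.index(name) for name in ("PID", "PPID", "CMD"))
    procs = {}
    for ln in lines[1:]:
        # Split only up to the CMD column so the command keeps its spaces.
        fields = ln.split(None, cmd_i)
        procs[int(fields[pid_i])] = (int(fields[ppid_i]), fields[cmd_i])
    return procs


def descendants(procs, root):
    """Return every PID below root in the process tree."""
    children = defaultdict(list)
    for pid, (ppid, _cmd) in procs.items():
        children[ppid].append(pid)
    found, stack = [], [root]
    while stack:
        for child in children[stack.pop()]:
            found.append(child)
            stack.append(child)
    return found


def busy_leaves(procs, root):
    """Return descendants of root that have no children of their own."""
    parents = {ppid for ppid, _cmd in procs.values()}
    return [pid for pid in descendants(procs, root) if pid not in parents]


procs = parse_ps(SAMPLE)
leaves = busy_leaves(procs, 8032)
print(len(leaves), "leaf processes under PID 8032:", sorted(leaves))
```

Feeding it a full, untrimmed "ps -ef" capture taken every few seconds during the ctest build would show whether the leaf count really drops from roughly 12 to 1 or 2 after the first few minutes, and at what point in the build that happens.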

If anyone has any suggestions on how I can figure this out I'd appreciate it.
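
A possible check of the jobserver hypothesis is to look at the MAKEFLAGS that each sub-make inherited. GNU make passes the jobserver to sub-makes through a "--jobserver-fds=R,W" entry in MAKEFLAGS (renamed "--jobserver-auth" in GNU make 4.2 and later), so a sub-make whose environment lacks that entry is not sharing the parent's job slots. The sketch below is Linux-only, since it reads /proc/<pid>/environ, and needs to run as the same user as the build; the function names are hypothetical.

```python
import re


def jobserver_info(makeflags):
    """Return the jobserver argument found in a MAKEFLAGS string, or None."""
    match = re.search(r"--jobserver-(?:fds|auth)=(\S+)", makeflags)
    return match.group(1) if match else None


def makeflags_of(pid):
    """Return the MAKEFLAGS a process was started with, or None (Linux only)."""
    # /proc/<pid>/environ holds NUL-separated NAME=value entries.
    with open(f"/proc/{pid}/environ", "rb") as fh:
        for entry in fh.read().split(b"\0"):
            if entry.startswith(b"MAKEFLAGS="):
                return entry[len(b"MAKEFLAGS="):].decode(errors="replace")
    return None


def report(pids):
    """Print, for each PID, whether its MAKEFLAGS carries a jobserver entry."""
    for pid in pids:
        flags = makeflags_of(pid)
        info = jobserver_info(flags) if flags is not None else None
        print(pid, "MAKEFLAGS=%r" % flags, "jobserver=%r" % info)
```

Calling report([8035, 851, 27797]) while the build is running, with the PIDs taken from the current process listing, would show which level of the make hierarchy, if any, has lost the jobserver entry.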

Apologies for the lengthy explanation. I've been pulling my hair out trying to figure out how to solve this issue.

Thanks in advance,
Gary

