ID: 0015873
Project: CMake
Category: CMake
View Status: public
Date Submitted: 2015-12-07 04:16
Last Update: 2016-06-10 14:31
Reporter: Ralf Mitschke
Assigned To: Kitware Robot
Priority: high
Severity: block
Reproducibility: always
Status: closed
Resolution: moved
Platform: 64bit (virtual machine)
OS: Suse Linux Enterprise
OS Version: 12
Product Version: CMake 3.3.2
Target Version:
Fixed in Version:
Summary: 0015873: CMake hangs indefinitely after executing other tools (e.g., gmake, getconf, file)
Description: CMake is used in a virtual machine environment and started from Jenkins for large-scale build automation of many C++ projects (not just a single project).

The error mostly occurs in the compiler detection phase.
The gmake or getconf processes that CMake spawns are reported as zombie processes by the OS.
The cmake process then hangs indefinitely in a select() call on a UNIX pipe that it needs for inter-process communication (a stack trace is posted as additional information).
The issue was even observed earlier in the build, where CMake was executing the process "file", possibly to read the CMakeLists.txt or another input file.

The frequency of the occurrence varies from machine to machine.
But overall, it happens reliably in 1 of 500 runs.

I suspect it is a timing issue that is rarely encountered outside of automation.
As a workaround, I created a patched version of the file
Source/kwsys/ProcessUNIX.c

The file already contains code to disable the pipe select mechanism entirely and fall back to polling the pipes. Setting the following definition to 0 worked around the problem:
# define KWSYSPE_USE_SELECT 0

However, CMake seems to run slower when it does not use select().
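
To illustrate why polling tends to be slower than select(), here is a minimal sketch of reading a pipe by polling. It is not the KWSys code: the function name `read_all_polling` and the 20 ms sleep interval are assumptions made for this example. Every time the pipe is empty, the reader sleeps for a fixed interval, so data that arrives right after a poll waits up to that long before it is noticed.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: read from fd until EOF by polling.
   Stores at most cap bytes in buf and returns the number stored,
   or -1 on error. */
long read_all_polling(int fd, char* buf, size_t cap)
{
  struct timespec nap = { 0, 20 * 1000 * 1000 }; /* 20 ms, arbitrary */
  size_t used = 0;
  int flags = fcntl(fd, F_GETFL);
  if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
    return -1;
  }
  for (;;) {
    char tmp[256];
    ssize_t n = read(fd, tmp, sizeof tmp);
    if (n > 0) {
      /* Keep what fits; extra bytes are read and discarded. */
      for (ssize_t i = 0; i < n && used < cap; ++i) {
        buf[used++] = tmp[i];
      }
    } else if (n == 0) {
      return (long)used; /* EOF: all writers closed the pipe */
    } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
      nanosleep(&nap, NULL); /* nothing available yet: sleep, then retry */
    } else if (errno != EINTR) {
      return -1;
    }
  }
}
```

A select()-based reader would instead block until the descriptor becomes readable and wake up immediately, which is likely the source of the slowdown observed with the workaround.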
Steps To Reproduce:
- Create a simple C++ project
- Run a batch script to start the build 1000 times.
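
Because a hang occurs in only about 1 of 500 runs, a repeat-run harness needs a timeout so that a hung run is detected and killed rather than blocking the whole batch. The following is a minimal sketch under stated assumptions: `run_with_timeout` and `count_hangs` are hypothetical names, the 50 ms poll interval is arbitrary, and the command to run (for example a cmake invocation) is supplied by the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with arguments argv.
   Returns the exit code, -1 on error or abnormal termination,
   or -2 if the command was killed after timeout_sec seconds. */
int run_with_timeout(char* const argv[], int timeout_sec)
{
  struct timespec tick = { 0, 50 * 1000 * 1000 }; /* 50 ms, arbitrary */
  int waited_ms = 0;
  int status = 0;
  pid_t pid = fork();
  if (pid < 0) {
    return -1;
  }
  if (pid == 0) {
    execvp(argv[0], argv);
    _exit(127); /* exec failed */
  }
  for (;;) {
    pid_t r = waitpid(pid, &status, WNOHANG);
    if (r == pid) {
      return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    }
    if (r < 0) {
      return -1;
    }
    if (waited_ms >= timeout_sec * 1000) {
      kill(pid, SIGKILL);       /* treat the run as hung */
      waitpid(pid, &status, 0); /* reap it so no zombie is left */
      return -2;
    }
    nanosleep(&tick, NULL);
    waited_ms += 50;
  }
}

/* Run the command `runs` times and count how many runs timed out. */
int count_hangs(char* const argv[], int runs, int timeout_sec)
{
  int hangs = 0;
  for (int i = 0; i < runs; ++i) {
    if (run_with_timeout(argv, timeout_sec) == -2) {
      ++hangs;
    }
  }
  return hangs;
}
```

Calling `count_hangs` with the configure command, 1000 runs, and a timeout well above the normal configure time would report how many runs hung. A fresh build directory per run is probably needed so that compiler detection is repeated each time.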
Additional Information: Stack trace where CMake hangs in pipe select statement:

#0 0x00007fab1410ea43 in __select_nocancel () from /lib64/libc.so.6
#1 0x00000000007eb662 in cmsysProcess_WaitForData ()
#2 0x00000000005c91ac in cmSystemTools::RunSingleCommand(std::vector<std::string, std::allocator<std::string> > const&, std::string*, std::string*, int*, char const*, cmSystemTools::OutputOption, double) ()
#3 0x000000000053c9f3 in cmGlobalGenerator::Build(std::string const&, std::string const&, std::string const&, std::string const&, std::string&, std::string const&, std::string const&, bool, bool, bool, double, cmSystemTools::OutputOption, std::vector<std::string, std::allocator<std::string> > const&) ()
#4 0x000000000053d093 in cmGlobalGenerator::TryCompile(std::string const&, std::string const&, std::string const&, std::string const&, bool, std::string&, cmMakefile*) ()
#5 0x00000000005745bc in cmMakefile::TryCompile(std::string const&, std::string const&, std::string const&, std::string const&, bool, std::vector<std::string, std::allocator<std::string> > const*, std::string&) ()
#6 0x0000000000676745 in cmCoreTryCompile::TryCompileCode(std::vector<std::string, std::allocator<std::string> > const&) ()
#7 0x0000000000688453 in cmTryCompileCommand::InitialPass(std::vector<std::string, std::allocator<std::string> > const&, cmExecutionStatus&) ()
#8 0x0000000000584374 in cmMakefile::ExecuteCommand(cmListFileFunction const&, cmExecutionStatus&) ()
#9 0x00000000006ac560 in cmIfFunctionBlocker::IsFunctionBlocked(cmListFileFunction const&, cmMakefile&, cmExecutionStatus&) ()
#10 0x00000000005841b1 in cmMakefile::ExecuteCommand(cmListFileFunction const&, cmExecutionStatus&) ()
#11 0x000000000065065f in cmFunctionHelperCommand::InvokeInitialPass(std::vector<cmListFileArgument, std::allocator<cmListFileArgument> > const&, cmExecutionStatus&) ()
#12 0x00000000005845cd in cmMakefile::ExecuteCommand(cmListFileFunction const&, cmExecutionStatus&) ()
#13 0x00000000006ac560 in cmIfFunctionBlocker::IsFunctionBlocked(cmListFileFunction const&, cmMakefile&, cmExecutionStatus&) ()
#14 0x00000000005841b1 in cmMakefile::ExecuteCommand(cmListFileFunction const&, cmExecutionStatus&) ()
#15 0x00000000005848cc in cmMakefile::ReadListFileInternal(char const*, bool, bool) ()
#16 0x0000000000585c6d in cmMakefile::ReadListFile(char const*, bool, bool) ()
#17 0x0000000000586614 in cmMakefile::ReadListFile(char const*) ()
#18 0x0000000000542e3c in cmGlobalGenerator::EnableLanguage(std::vector<std::string, std::allocator<std::string> > const&, cmMakefile*, bool) ()
#19 0x0000000000768d9d in cmGlobalUnixMakefileGenerator3::EnableLanguage(std::vector<std::string, std::allocator<std::string> > const&, cmMakefile*, bool) ()
#20 0x0000000000577313 in cmMakefile::EnableLanguage(std::vector<std::string, std::allocator<std::string> > const&, bool) ()
#21 0x00000000006a6ab7 in cmProjectCommand::InitialPass(std::vector<std::string, std::allocator<std::string> > const&, cmExecutionStatus&) ()
#22 0x0000000000584374 in cmMakefile::ExecuteCommand(cmListFileFunction const&, cmExecutionStatus&) ()
#23 0x00000000005848cc in cmMakefile::ReadListFileInternal(char const*, bool, bool) ()
#24 0x0000000000585c6d in cmMakefile::ReadListFile(char const*, bool, bool) ()
#25 0x0000000000586428 in cmMakefile::ProcessBuildsystemFile(char const*) ()
#26 0x000000000055f9fd in cmLocalGenerator::Configure() ()
#27 0x0000000000795ad6 in cmLocalUnixMakefileGenerator3::Configure() ()
#28 0x000000000054b0dc in cmGlobalGenerator::Configure() ()
#29 0x0000000000768fbc in cmGlobalUnixMakefileGenerator3::Configure() ()
#30 0x0000000000600a07 in cmake::ActualConfigure() ()
#31 0x0000000000605523 in cmake::Configure() ()
#32 0x000000000060946c in cmake::Run(std::vector<std::string, std::allocator<std::string> > const&, bool) ()
#33 0x000000000050eb76 in do_cmake(int, char const* const*) ()
#34 0x00000000005103ee in main ()
Tags: No tags attached.

  Notes
(0040019)
michael.smith (reporter)
2015-12-21 16:55

I've also run into this. Thanks for the patch notes.
(0040176)
Anton Astafiev (reporter)
2016-01-11 06:19
edited on: 2016-01-11 06:20

Same on Fedora 22 (GNU Make 4.0): the gmake process hangs while waiting on another invocation, in my case cp or cat.

(0040177)
Brad King (manager)
2016-01-11 08:58

This is strange. AFAIK CMake uses POSIX-compliant select() calls and the implementation has worked well for years on many platforms.

Re note 0040176: Anton, is it gmake or cmake that hangs?
(0040183)
Ralf Mitschke (reporter)
2016-01-11 10:30

Hi Brad,

I'm not an expert on POSIX select, but I have now hit the same issue with the previous Suse Linux Enterprise version (11).

After doing some digging on the net I found a post from years back where someone explained how a race condition might arise in kwsys/ProcessUNIX.c. However, that was in the context of a processor emulator, which adds another possible source of error:
https://bugs.launchpad.net/qemu/+bug/955379/comments/15

These guys conclude that they have to change something in their emulator:
https://bugs.launchpad.net/qemu/+bug/955379/comments/38

It might also be a problem related to the Suse Linux Enterprise POSIX implementation.
Which UNIX distros are used for CMake regression testing?
Is there anything in the regression tests comparable to the test scenario I outlined?
(0040186)
Brad King (manager)
2016-01-11 11:03

Re note 0040183: Thanks for the research. The conclusion in the latter link is correct. The approach we're using was not our invention and was taken from a tutorial on how to listen for child exit and child output at the same time with select(). It was done around 2003 and I don't remember which tutorial was used or whether it is still available online. However, I've seen the approach recommended many times. It is a proven approach.

I don't know if there is anything in the test suite that would hit the trouble reported here, but CMake has been doing this for over 12 years and has been used in many scripted environments on many POSIX platforms.

I'm quite surprised to learn of platforms that do not implement select() and SIGCHLD in a way that works with this approach, but the discussion here reveals that such platforms exist. Unfortunately I'm not aware of any other approach that uses pure POSIX-only APIs to achieve the same thing. If one is available in a newer POSIX standard then great. Otherwise one will have to investigate platform-specific APIs to fix this on offending platforms.
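
For readers following the thread: listening for child exit and child output at the same time with select() is commonly done with what is often called the self-pipe trick, where a SIGCHLD handler writes a byte into a pipe that is part of the select() set. The sketch below illustrates that general technique only. It is an assumption that this matches the approach meant here, it is not the KWSys code, and all names (`sig_pipe`, `selfpipe_init`, `selfpipe_drain`) are hypothetical. If the signal or the pipe write is never delivered on some platform, the select() call blocks forever, which would look like the hang reported in this issue.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/select.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int sig_pipe[2] = { -1, -1 }; /* hypothetical signal pipe */

/* SIGCHLD handler: write one byte so a pending select() wakes up. */
static void on_sigchld(int signo)
{
  int saved = errno;
  char byte = 1;
  (void)signo;
  if (write(sig_pipe[1], &byte, 1) < 0) {
    /* Pipe full: a wake-up is already pending, nothing to do. */
  }
  errno = saved;
}

/* Create the non-blocking signal pipe and install the handler.
   Must be called before the child is forked. */
int selfpipe_init(void)
{
  struct sigaction sa;
  if (pipe(sig_pipe) < 0) {
    return -1;
  }
  fcntl(sig_pipe[0], F_SETFL, O_NONBLOCK);
  fcntl(sig_pipe[1], F_SETFL, O_NONBLOCK);
  sa.sa_handler = on_sigchld;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
  return sigaction(SIGCHLD, &sa, NULL);
}

/* Collect output from out_fd until EOF and wait for the child to exit,
   using a single select() loop. Returns bytes stored in buf, or -1. */
long selfpipe_drain(pid_t child, int out_fd, char* buf, size_t cap,
                    int* status)
{
  size_t used = 0;
  int out_open = 1;
  int exited = 0;
  while (out_open || !exited) {
    fd_set rd;
    int maxfd = sig_pipe[0];
    FD_ZERO(&rd);
    FD_SET(sig_pipe[0], &rd);
    if (out_open) {
      FD_SET(out_fd, &rd);
      if (out_fd > maxfd) {
        maxfd = out_fd;
      }
    }
    if (select(maxfd + 1, &rd, NULL, NULL, NULL) < 0) {
      if (errno == EINTR) {
        continue; /* interrupted by the signal itself: retry */
      }
      return -1;
    }
    if (out_open && FD_ISSET(out_fd, &rd)) {
      char tmp[256];
      ssize_t n = read(out_fd, tmp, sizeof tmp);
      if (n > 0) {
        for (ssize_t i = 0; i < n && used < cap; ++i) {
          buf[used++] = tmp[i];
        }
      } else if (n == 0 || errno != EINTR) {
        out_open = 0; /* EOF or read error: stop watching output */
      }
    }
    if (FD_ISSET(sig_pipe[0], &rd)) {
      char junk[16];
      while (read(sig_pipe[0], junk, sizeof junk) > 0) {
        /* discard wake-up bytes */
      }
      if (waitpid(child, status, WNOHANG) == child) {
        exited = 1;
      }
    }
  }
  return (long)used;
}
```

The design relies on the kernel delivering SIGCHLD and on the handler's write making the signal pipe readable; a platform or emulator that breaks either guarantee would leave the loop waiting in select() with a zombie child, consistent with the symptoms described above.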
(0040187)
Brad King (manager)
2016-01-11 11:04

When KWSys's Process implementation was first written there was no portable (POSIX + native Windows) library with a compatible license available to do process execution with all the features we needed. Since then "libuv" (http://libuv.org/) has come to exist and looks like a good replacement. If anyone is interested in trying to port CMake over to it, please raise discussion on the developer list:

 https://cmake.org/mailman/listinfo/cmake-developers
(0042896)
Kitware Robot (administrator)
2016-06-10 14:29

Resolving issue as `moved`.

This issue tracker is no longer used. Further discussion of this issue may take place in the current CMake Issues page linked in the banner at the top of this page.

 Issue History
Date Modified Username Field Change
2015-12-07 04:16 Ralf Mitschke New Issue
2015-12-21 16:55 michael.smith Note Added: 0040019
2016-01-11 06:19 Anton Astafiev Note Added: 0040176
2016-01-11 06:20 Anton Astafiev Note Edited: 0040176
2016-01-11 08:58 Brad King Note Added: 0040177
2016-01-11 10:30 Ralf Mitschke Note Added: 0040183
2016-01-11 11:03 Brad King Note Added: 0040186
2016-01-11 11:04 Brad King Note Added: 0040187
2016-06-10 14:29 Kitware Robot Note Added: 0042896
2016-06-10 14:29 Kitware Robot Status new => resolved
2016-06-10 14:29 Kitware Robot Resolution open => moved
2016-06-10 14:29 Kitware Robot Assigned To => Kitware Robot
2016-06-10 14:31 Kitware Robot Status resolved => closed

