MantisBT - CMake
View Issue Details
0015039CMakeCMakepublic2014-07-27 04:072016-06-10 14:31
Mojca Miklavec 
Brad King 
normalmajoralways
closedmoved 
Apple MacOS X10.7
CMake 3.0 
 
0015039: CMake linked against libc++ on OS X 10.7 cuts lines at 1024 characters
As already reported (http://public.kitware.com/pipermail/cmake-developers/2014-June/010712.html [^]) building CMake on OS X 10.7 against libc++ fails because some lines are cut at 1024 characters when reading from link.txt.

If I manually execute a couple of commands that have been "cut", the build eventually succeeds, but the resulting cmake binary is "useless" because any project that requires lines longer than 1024 characters to compile code, fails to build as well.
Compile CMake on OS X 10.7 using

CC='/usr/bin/clang'
CXX='/usr/bin/clang++'
CXXFLAGS='-stdlib=libc++'
Brad King wrote:

The lines are read here:

 http://cmake.org/gitweb?p=cmake.git;a=blob;f=Source/cmcmd.cxx;hb=v3.0.0#l988 [^]

using the GetLineFromStream helper:

 http://cmake.org/gitweb?p=cmake.git;a=blob;f=Source/kwsys/SystemTools.cxx;hb=v3.0.0#l4157 [^]

Note the buffer size is 1024. Something must go wrong with the
stream state to make it look like EOL or EOF to this code.


Ben Boeckel wrote:

The proper way to read lines from a file using iostream is:

    std::string line;
    while (std::getline(istr, line)) {
        // use line
    }

Unfortunately, it looks like there's unavoidable extra logic in there.
Maybe we could do:

    #ifdef BROKEN_STREAMS
    // crutch code
    #else
    // sanity
    #endif
No tags attached.
cxx GetLineFromStream.cxx (2,336) 2014-07-28 14:39
https://public.kitware.com/Bug/file/5202/GetLineFromStream.cxx
Issue History
2014-07-27 04:07Mojca MiklavecNew Issue
2014-07-28 09:39Brad KingAssigned To => Ben Boeckel
2014-07-28 09:39Brad KingStatusnew => assigned
2014-07-28 12:43Brad KingAssigned ToBen Boeckel => Brad King
2014-07-28 12:57Brad KingNote Added: 0036476
2014-07-28 13:02Brad KingNote Added: 0036477
2014-07-28 14:15Mojca MiklavecNote Added: 0036479
2014-07-28 14:39Brad KingFile Added: GetLineFromStream.cxx
2014-07-28 14:40Brad KingNote Added: 0036480
2014-07-28 15:47Mojca MiklavecNote Added: 0036481
2014-07-28 15:52Mojca MiklavecNote Added: 0036482
2014-07-28 16:06Brad KingNote Added: 0036483
2014-07-28 16:08Brad KingNote Added: 0036484
2014-07-28 16:27Mojca MiklavecNote Added: 0036485
2014-07-28 17:26Mojca MiklavecNote Added: 0036486
2014-07-29 09:41Brad KingNote Added: 0036488
2014-07-29 10:09Mojca MiklavecNote Added: 0036490
2014-07-29 10:18Brad KingNote Added: 0036491
2014-07-29 10:28Mojca MiklavecNote Added: 0036492
2014-07-29 10:50Brad KingNote Added: 0036493
2014-07-29 11:13Mojca MiklavecNote Added: 0036494
2016-06-10 14:29Kitware RobotNote Added: 0042591
2016-06-10 14:29Kitware RobotStatusassigned => resolved
2016-06-10 14:29Kitware RobotResolutionopen => moved
2016-06-10 14:31Kitware RobotStatusresolved => closed

Notes
(0036476)
Brad King   
2014-07-28 12:57   
As discussed in the mailing list thread, the GetLineFromStream interface cannot be fully replaced by getline. However, the implementation of GetLineFromStream should be made to work with libc++ on OS X 10.7 (a combination not currently available to me). Basically GetLineFromStream uses gcount() after getline() instead of trusting the stream state (which was buggy on OS X 10.3 and older HP platforms). It is standard conforming AFAIK, so this "bug" is actually the lack of a workaround for a bug on OS X 10.7 in libc++.

The crux of the loop is here:

 http://cmake.org/gitweb?p=cmake.git;a=blob;f=Source/kwsys/SystemTools.cxx;hb=v3.0.0#l4192 [^]

I suspect gcount() comes back incorrectly, either as zero, or such that the following length comparison is tricked into setting haveNewline. Can someone with access to this platform combination debug this, please?
(0036477)
Brad King   
2014-07-28 13:02   
One could also try a patch like this (untested):

diff --git a/Source/kwsys/SystemTools.cxx b/Source/kwsys/SystemTools.cxx
index db94510..6f17b91 100644
--- a/Source/kwsys/SystemTools.cxx
+++ b/Source/kwsys/SystemTools.cxx
@@ -4323,8 +4323,13 @@ bool SystemTools::GetLineFromStream(kwsys_ios::istream& is,
   // been reached. Clear the fail bit just before reading.
   while(!haveNewline &&
         leftToRead != 0 &&
+#if defined(__GNUC__) && __GNUC__ >= 4
+ is.getline(buffer, bufferSize)
+#else
         (is.clear(is.rdstate() & ~kwsys_ios::ios::failbit),
- is.getline(buffer, bufferSize), is.gcount() > 0))
+ is.getline(buffer, bufferSize), is.gcount() > 0)
+#endif
+ )
     {
     // We have read at least one byte.
     haveData = true;
(0036479)
Mojca Miklavec   
2014-07-28 14:15   
What would be a standalone line reading example that I could test on 10.7?
(0036480)
Brad King   
2014-07-28 14:40   
I attached GetLineFromStream.cxx which ports the code in question out to a standalone example. See comments at the top of the file for instructions.
(0036481)
Mojca Miklavec   
2014-07-28 15:47   
I shortened the length to 11 characters and used the string "abcd.efgh.ijkl.mnop.qrstuv". Below is what I get. I'll play with getline next.


clang++ GetLineFromStream.cxx -o GetLineFromStream1
clang++ GetLineFromStream.cxx -o GetLineFromStream2 -stdlib=libc++

> ./GetLineFromStream1
new loop; ()
    haveNewLine: 0
    leftToRead: -1
    is.gcount(): 10
    is.rdstate(): 4
    std::ios::failbit: 4
    strlen(buffer): 10
new loop; (abcd.efgh.)
    haveNewLine: 0
    leftToRead: -1
    is.gcount(): 10
    is.rdstate(): 4
    std::ios::failbit: 4
    strlen(buffer): 10
new loop; (abcd.efgh.ijkl.mnop.)
    haveNewLine: 0
    leftToRead: -1
    is.gcount(): 7
    is.rdstate(): 0
    std::ios::failbit: 4
    strlen(buffer): 6
end of loop; (abcd.efgh.ijkl.mnop.qrstuv)
    haveNewLine: 1
    leftToRead: -1
    is.gcount(): 7
    is.rdstate(): 0
    std::ios::failbit: 4
    strlen(buffer): 6
26

> ./GetLineFromStream2
new loop; ()
    haveNewLine: 0
    leftToRead: -1
    is.gcount(): 11
    is.rdstate(): 4
    std::ios::failbit: 4
    strlen(buffer): 10
end of loop; (abcd.efgh.)
    haveNewLine: 1
    leftToRead: -1
    is.gcount(): 11
    is.rdstate(): 4
    std::ios::failbit: 4
    strlen(buffer): 10
10
(0036482)
Mojca Miklavec   
2014-07-28 15:52   
If I apply the patch for getline, nothing works any longer (I get empty strings both with libc++ and with libstdc++).
(0036483)
Brad King   
2014-07-28 16:06   
Re 0015039:0036481: Okay, so gcount() returns 10 with libstdc++ and 11 with libc++ even though it only stored 10 characters into the buffer. The latter tricks our logic into thinking that a newline has occurred.

There is an example in C++98 27.6.1.3/22 that shows the relationship of getline, rdstate, and gcount that was used as the basis for GetLineFromStream.
(0036484)
Brad King   
2014-07-28 16:08   
Re 0015039:0036482: Oops, we still want the failbit to be cleared on a partial long line. Just remove the ", is.gcount() > 0" from the end of the (,) group in the condition instead.
(0036485)
Mojca Miklavec   
2014-07-28 16:27   
There seems to be a bug in /usr/bin/clang++. Using a newer clang++ (3.4 or 3.5) solves the problem. Here's a minimal example:

#include <iostream>
#include <fstream>
#include <string>

int main()
{
  std::ifstream is("test.txt");
  std::string line;
  const int bufferSize = 11;
  char buffer[bufferSize];
  is.getline(buffer, bufferSize);
  std::cout << is.gcount() << std::endl;
  return 0;
}

With /usr/bin/clang++ and libc++ the code returns 11 instead of 10.

As a consequence the following code happily concludes that we have reached a newline and stops reading in the next loop:

    std::size_t length = std::strlen(buffer);
    if(length < static_cast<std::size_t>(is.gcount()))
      {
      haveNewline = true;
      }
(0036486)
Mojca Miklavec   
2014-07-28 17:26   
The problem with the initial untested patch was probably the fact that is.getline(buffer, bufferSize) doesn't return "true" in case of success, so the result was never processed even when the execution was successful.

I would need more testing, but I suspect that

    #ifdef BROKEN_STREAMS
    // crutch code
    #else
    // sanity
    #endif

would work better even in the case of my buggy clang 3.3 compiler.

Others pointed out that current code would also fail to work properly if the file would contain characters with code zero. But that's probably never the case anyway.
(0036488)
Brad King   
2014-07-29 09:41   
Re 0015039:0036486: The conditional switch between crutch and sanity would now have to be in the haveNewline decision to do something other than trust gcount. Something like

 haveNewline = !is.eof() && !is.fail();
(0036490)
Mojca Miklavec   
2014-07-29 10:09   
There's a serious problem. The approach works perfectly in the sense that it now continues reading until the end of the line. The problem is that the code seems to have read 11 charters indeed. So the first 10 characters are read and appended to the string. But then the 11th character is missing, and 22nd etc.

Instead of "abcd.efgh.ijkl.mnop.qrstuv" it outputs "abcd.efgh.jkl.mnop.qstuv" with "i" and "r" missing.

How horribly broken behaviour!

(I'm unable to figure out how to "reply to" a particular message in this tracker.)
(0036491)
Brad King   
2014-07-29 10:18   
Re 0015039:0036490: Okay, I do not think we need to try to support such a broken compiler/platform combination.

FYI, my "Re" syntax is written by hand using the "~" operator to refer to a particular comment number:

Re 0015039:0036490
(0036492)
Mojca Miklavec   
2014-07-29 10:28   
If you decide not to support this compiler/stdlib combination, I would be grateful for at least a configure-time test raising an error when this particular type of broken behaviour is detected.

Other than that, std::getline(fin, line) instead of GetLineFromStream(fin, line) might work properly. But I didn't test it on insanely long lines.

While this particular bug has been fixed long ago and OS X 10.7 is not so common any longer, the following related bug is still present in the latest and greatest compiler and affects software on both Mac and FreeBSD where libc++ is now the default: http://llvm.org/bugs/show_bug.cgi?id=17782 [^]
(0036493)
Brad King   
2014-07-29 10:50   
Re 0015039:0036492: How does the latter bug affect CMake?
(0036494)
Mojca Miklavec   
2014-07-29 11:13   
Probably and hopefully it doesn't. I'm just pointing out that some serious (known) parsing bugs are still present in the latest versions. (We spent a lot of time debugging another piece of software that kept crashing on OS X 10.9 when parsing PostScript code.)

(Maintainer of clang compilers for MacPorts is also willing to patch the clang distributed with MacPorts if I manage to figure out what change in 3.4 fixed the problem. But I'm still clueless.)
(0042591)
Kitware Robot   
2016-06-10 14:29   
Resolving issue as `moved`.

This issue tracker is no longer used. Further discussion of this issue may take place in the current CMake Issues page linked in the banner at the top of this page.