Notes |
|
(0036476)
|
Brad King
|
2014-07-28 12:57
|
|
As discussed in the mailing list thread, the GetLineFromStream interface cannot be fully replaced by getline. However, the implementation of GetLineFromStream should be made to work with libc++ on OS X 10.7 (a combination not currently available to me). Basically GetLineFromStream uses gcount() after getline() instead of trusting the stream state (which was buggy on OS X 10.3 and older HP platforms). It is standard conforming AFAIK, so this "bug" is actually the lack of a workaround for a bug on OS X 10.7 in libc++.
The crux of the loop is here:
http://cmake.org/gitweb?p=cmake.git;a=blob;f=Source/kwsys/SystemTools.cxx;hb=v3.0.0#l4192 [^]
I suspect gcount() comes back incorrectly, either as zero, or such that the following length comparison is tricked into setting haveNewline. Can someone with access to this platform combination debug this, please? |
|
|
(0036477)
|
Brad King
|
2014-07-28 13:02
|
|
One could also try a patch like this (untested):
diff --git a/Source/kwsys/SystemTools.cxx b/Source/kwsys/SystemTools.cxx
index db94510..6f17b91 100644
--- a/Source/kwsys/SystemTools.cxx
+++ b/Source/kwsys/SystemTools.cxx
@@ -4323,8 +4323,13 @@ bool SystemTools::GetLineFromStream(kwsys_ios::istream& is,
// been reached. Clear the fail bit just before reading.
while(!haveNewline &&
leftToRead != 0 &&
+#if defined(__GNUC__) && __GNUC__ >= 4
+ is.getline(buffer, bufferSize)
+#else
(is.clear(is.rdstate() & ~kwsys_ios::ios::failbit),
- is.getline(buffer, bufferSize), is.gcount() > 0))
+ is.getline(buffer, bufferSize), is.gcount() > 0)
+#endif
+ )
{
// We have read at least one byte.
haveData = true; |
|
|
(0036479)
|
Mojca Miklavec
|
2014-07-28 14:15
|
|
What would be a standalone line reading example that I could test on 10.7? |
|
|
(0036480)
|
Brad King
|
2014-07-28 14:40
|
|
I attached GetLineFromStream.cxx which ports the code in question out to a standalone example. See comments at the top of the file for instructions. |
|
|
(0036481)
|
Mojca Miklavec
|
2014-07-28 15:47
|
|
I shortened the length to 11 characters and used the string "abcd.efgh.ijkl.mnop.qrstuv". Below is what I get. I'll play with getline next.
clang++ GetLineFromStream.cxx -o GetLineFromStream1
clang++ GetLineFromStream.cxx -o GetLineFromStream2 -stdlib=libc++
> ./GetLineFromStream1
new loop; ()
haveNewLine: 0
leftToRead: -1
is.gcount(): 10
is.rdstate(): 4
std::ios::failbit: 4
strlen(buffer): 10
new loop; (abcd.efgh.)
haveNewLine: 0
leftToRead: -1
is.gcount(): 10
is.rdstate(): 4
std::ios::failbit: 4
strlen(buffer): 10
new loop; (abcd.efgh.ijkl.mnop.)
haveNewLine: 0
leftToRead: -1
is.gcount(): 7
is.rdstate(): 0
std::ios::failbit: 4
strlen(buffer): 6
end of loop; (abcd.efgh.ijkl.mnop.qrstuv)
haveNewLine: 1
leftToRead: -1
is.gcount(): 7
is.rdstate(): 0
std::ios::failbit: 4
strlen(buffer): 6
26
> ./GetLineFromStream2
new loop; ()
haveNewLine: 0
leftToRead: -1
is.gcount(): 11
is.rdstate(): 4
std::ios::failbit: 4
strlen(buffer): 10
end of loop; (abcd.efgh.)
haveNewLine: 1
leftToRead: -1
is.gcount(): 11
is.rdstate(): 4
std::ios::failbit: 4
strlen(buffer): 10
10 |
|
|
(0036482)
|
Mojca Miklavec
|
2014-07-28 15:52
|
|
If I apply the patch for getline, nothing works any longer (I get empty strings both with libc++ and with libstdc++). |
|
|
(0036483)
|
Brad King
|
2014-07-28 16:06
|
|
Re 0015039:0036481: Okay, so gcount() returns 10 with libstdc++ and 11 with libc++ even though it only stored 10 characters into the buffer. The latter tricks our logic into thinking that a newline has occurred.
There is an example in C++98 27.6.1.3/22 that shows the relationship of getline, rdstate, and gcount that was used as the basis for GetLineFromStream. |
|
|
(0036484)
|
Brad King
|
2014-07-28 16:08
|
|
Re 0015039:0036482: Oops, we still want the failbit to be cleared on a partial long line. Just remove the ", is.gcount() > 0" from the end of the (,) group in the condition instead. |
|
|
(0036485)
|
Mojca Miklavec
|
2014-07-28 16:27
|
|
There seems to be a bug in /usr/bin/clang++. Using a newer clang++ (3.4 or 3.5) solves the problem. Here's a minimal example:
#include <iostream>
#include <fstream>
#include <string>
int main()
{
std::ifstream is("test.txt");
std::string line;
const int bufferSize = 11;
char buffer[bufferSize];
is.getline(buffer, bufferSize);
std::cout << is.gcount() << std::endl;
return 0;
}
With /usr/bin/clang++ and libc++ the code returns 11 instead of 10.
As a consequence the following code happily concludes that we have reached a newline and stops reading in the next loop:
std::size_t length = std::strlen(buffer);
if(length < static_cast<std::size_t>(is.gcount()))
{
haveNewline = true;
} |
|
|
(0036486)
|
Mojca Miklavec
|
2014-07-28 17:26
|
|
The problem with the initial untested patch was probably the fact that is.getline(buffer, bufferSize) doesn't return "true" in case of success, so the result was never processed even when the execution was successful.
I would need more testing, but I suspect that
#ifdef BROKEN_STREAMS
// crutch code
#else
// sanity
#endif
would work better even in the case of my buggy clang 3.3 compiler.
Others pointed out that current code would also fail to work properly if the file would contain characters with code zero. But that's probably never the case anyway. |
|
|
(0036488)
|
Brad King
|
2014-07-29 09:41
|
|
Re 0015039:0036486: The conditional switch between crutch and sanity would now have to be in the haveNewline decision to do something other than trust gcount. Something like
haveNewline = !is.eof() && !is.fail(); |
|
|
(0036490)
|
Mojca Miklavec
|
2014-07-29 10:09
|
|
There's a serious problem. The approach works perfectly in the sense that it now continues reading until the end of the line. The problem is that the code seems to have read 11 charters indeed. So the first 10 characters are read and appended to the string. But then the 11th character is missing, and 22nd etc.
Instead of "abcd.efgh.ijkl.mnop.qrstuv" it outputs "abcd.efgh.jkl.mnop.qstuv" with "i" and "r" missing.
How horribly broken behaviour!
(I'm unable to figure out how to "reply to" a particular message in this tracker.) |
|
|
(0036491)
|
Brad King
|
2014-07-29 10:18
|
|
Re 0015039:0036490: Okay, I do not think we need to try to support such a broken compiler/platform combination.
FYI, my "Re" syntax is written by hand using the "~" operator to refer to a particular comment number:
Re 0015039:0036490 |
|
|
(0036492)
|
Mojca Miklavec
|
2014-07-29 10:28
|
|
If you decide not to support this compiler/stdlib combination, I would be grateful for at least a configure-time test raising an error when this particular type of broken behaviour is detected.
Other than that, std::getline(fin, line) instead of GetLineFromStream(fin, line) might work properly. But I didn't test it on insanely long lines.
While this particular bug has been fixed long ago and OS X 10.7 is not so common any longer, the following related bug is still present in the latest and greatest compiler and affects software on both Mac and FreeBSD where libc++ is now the default: http://llvm.org/bugs/show_bug.cgi?id=17782 [^] |
|
|
(0036493)
|
Brad King
|
2014-07-29 10:50
|
|
|
|
(0036494)
|
Mojca Miklavec
|
2014-07-29 11:13
|
|
Probably and hopefully it doesn't. I'm just pointing out that some serious (known) parsing bugs are still present in the latest versions. (We spent a lot of time debugging another piece of software that kept crashing on OS X 10.9 when parsing PostScript code.)
(Maintainer of clang compilers for MacPorts is also willing to patch the clang distributed with MacPorts if I manage to figure out what change in 3.4 fixed the problem. But I'm still clueless.) |
|
|
(0042591)
|
Kitware Robot
|
2016-06-10 14:29
|
|
Resolving issue as `moved`.
This issue tracker is no longer used. Further discussion of this issue may take place in the current CMake Issues page linked in the banner at the top of this page. |
|