ID: 0015377
Project: CMake
Category: CMake
View Status: public
Date Submitted: 2015-01-27 18:34
Last Update: 2015-07-08 08:57
Reporter: Ongun Kanat
Assigned To: Brad King
Platform: Linux x86_64
OS: Arch Linux
OS Version: Rolling
Product Version: CMake 3.1.1
Target Version: CMake 3.1.3
Fixed in Version: CMake 3.1.3
Summary: 0015377: CMake cannot test compiler features in Turkish locale
Description: When using the Turkish UTF-8 locale (tr_TR.UTF-8), CMake exits with the error below:

   CMake Error at /usr/share/cmake-3.1/Modules/CMakeTestCCompiler.cmake:78 (CMAKE_DETERMINE_COMPILE_FEATURES):

Exporting the LANG and LC_ALL variables as en_US.UTF-8 works around the problem temporarily.
Steps To Reproduce:
- Download any source with CMake build support
- Run
  $ export LANG=tr_TR.UTF-8
  $ export LC_ALL=tr_TR.UTF-8
  $ cmake
- It will exit.
Additional Information: I suspect that there may be a Turkish 'I' problem in the source code. If it does an uppercase/lowercase conversion, there is a risk that the result of the conversion is wrong (non-English).

For detailed info check: [^]

I'm also adding trace output of cmake.
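
The suspected Turkish 'I' problem can be illustrated with a small sketch. In Turkish casing rules the lowercase of 'I' is the dotless 'ı' (U+0131) and the uppercase of 'i' is the dotted 'İ' (U+0130), so a locale-aware conversion can return something other than the ASCII letter a program expects. The helper name `LowerInLocale` is hypothetical, and this uses wide characters purely for illustration; it is not a trace of what CMake does internally.

```cpp
#include <locale>
#include <stdexcept>

// Lowercase one wide character using the rules of the named locale.
// If the locale is not installed, std::locale's constructor throws
// std::runtime_error and the input is returned unchanged.
wchar_t LowerInLocale(wchar_t c, const char* localeName)
{
  try {
    const std::locale loc(localeName);
    // Dispatches to the std::ctype<wchar_t> facet of 'loc'.
    return std::tolower(c, loc);
  } catch (const std::runtime_error&) {
    return c;
  }
}
```

With the "C" locale, `LowerInLocale(L'I', "C")` gives `L'i'`. On a system where tr_TR.UTF-8 is installed, the same call with that locale name would be expected to give U+0131 instead, which is the kind of mismatch suspected here.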
Tags: linux, locale, make
Attached Files:
- cmakeout.txt (336,038 bytes) 2015-01-27 18:34
- 0001-Encoding-Only-call-setlocale-where-required.patch (3,876 bytes) 2015-02-06 12:19


Ben Boeckel (developer)
2015-02-02 13:54

Should we just force the locale to be either en_US.UTF-8 or C in main()? Or maybe just for try_* functions?
Clinton Stimpson (developer)
2015-02-02 14:27

Probably by setting the locale to C in the try_* functions.
See FindSubversion.cmake as an example.
Ongun Kanat (reporter)
2015-02-04 17:51

Doesn't changing the locale to "C" affect files with UTF-8 names? There may be files with non-ASCII names.
Stephen Kelly (developer)
2015-02-05 14:15

To reproduce on Ubuntu, install the language-pack-tr package. Then:

  $ cat turkish.cmake







  $ LC_ALL=tr_TR.UTF-8 cmake -P turkish.cmake
  CMake Error at turkish.cmake:15 (macro3_i):
    Unknown CMake command "macro3_i".

  $ cat turkish_if.cmake


  $ LC_ALL=tr_TR.UTF-8 cmake -P turkish_if.cmake
  CMake Error at turkish_if.cmake:2 (IF):
    Unknown CMake command "IF".

This bisects to commit v3.1.0-rc1~406^2~1 (Encoding: Add setlocale() to applications., 2014-05-30).

I haven't followed what has been going on regarding encodings, but it seems that if the locale is going to come from the environment, we'd have to use ICU or similar to do case-insensitive comparisons for things like that. ToLower won't cut it.

Also, questions come up about whether TOUPPER should be locale aware so that

 string(TOUPPER "Straße" OUT)

results in "STRASSE" etc. Currently it outputs STRAßE, which is 'wrong'. Or should a new command be added for locale-aware uppercasing etc.?

Also whether new commands like


are needed, whether list(SORT) should be locale aware etc. All that is stuff that ICU gives.
Ben Boeckel (developer)
2015-02-05 15:05

Well, LOCALE_STREQUAL makes no sense because that is closer to Unicode normalization rules than anything else (something I don't want to touch). As for sorting and string(TOUPPER) and string(TOLOWER), having LOCALE_ versions of those makes sense. Outside of LOCALE_* bits, we should probably just force en_US.UTF-8 while saving LC_ALL in main() for use at those places. "Just" need to put icu into CMake's build tree with support for an external one.

Also, seems your commit name is off? I see it as 730e386291cb7aad8f532125216b2ec71d710748 while v3.1.0-rc1~406^2~1 is b70295760c22414ca80f51704ee1ab63872e0a7a.
Clinton Stimpson (developer)
2015-02-05 16:28

Thanks Stephen for narrowing that down and your comments Ben.

Since this is a regression, I see a few possible ways to get the old behavior back:

Change SystemTools.cxx to use:
std::toupper(..., std::locale("C"));
std::tolower(..., std::locale("C"));

Don't assume cmCommand::GetName() returns a lower case string, and always call to lower() on it while comparing with another tolower'd string.

Use SystemTools::Strucmp() for all case independent comparisons.

Remove the setlocale() call in the commit identified by Stephen (this will cause other regressions).
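
A minimal sketch of the first and third options above, assuming the conversion is done per byte with the classic "C" locale so that the result no longer depends on the global locale. The function names `ToLowerClassic` and `EqualsIgnoreCaseClassic` are hypothetical and are not CMake's or KWSys's actual API.

```cpp
#include <locale>
#include <string>

// Lowercase every byte using the classic "C" locale, regardless of
// whatever setlocale() or std::locale::global() has installed.
std::string ToLowerClassic(std::string s)
{
  const std::locale& classic = std::locale::classic();
  for (char& ch : s) {
    ch = std::tolower(ch, classic);
  }
  return s;
}

// Case-insensitive equality built on the same conversion, so both
// sides are always normalized in the same way.
bool EqualsIgnoreCaseClassic(const std::string& a, const std::string& b)
{
  return ToLowerClassic(a) == ToLowerClassic(b);
}
```

With this, `ToLowerClassic("IF")` yields `"if"` and `EqualsIgnoreCaseClassic("MACRO3_I", "macro3_i")` is true, whichever locale the environment requests.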

And yes, it's a good question whether string(TOUPPER ...) should be locale aware.
But I think introducing new string(LOCALE_*) options is separate from fixing the regression.

Any preference for the regression fix, or other ideas?
Brad King (manager)
2015-02-06 11:40

Re 0015377:0037929: Will removing setlocale cause 3.1 to regress from 3.0 capabilities?

Currently the setlocale() call uses only LC_CTYPE. Why is that necessary/sufficient to address 0014934?

Why do we need a locale to handle UTF-8 strings and file names if our implementation is 8-bit clean? I can't imagine every tool in the world needs to link/distribute libicu and a huge amount of locale data to deal with non-ASCII file names.

We don't provide any functionality for conversion or normalization of strings beyond TOUPPER, TOLOWER, and case-insensitive command names. All of these are defined by CMake only for ASCII characters right now.
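
To make the "8-bit clean, ASCII only" point concrete, here is a rough sketch of an uppercase conversion that touches only the bytes 'a' through 'z' and passes every other byte through. The name `AsciiToUpper` is hypothetical; this is an illustration of the idea, not CMake's implementation.

```cpp
#include <string>

// Uppercase only the ASCII letters a-z. Every other byte, including
// the bytes of multi-byte UTF-8 sequences, is left untouched, and no
// locale is consulted at any point.
std::string AsciiToUpper(std::string s)
{
  for (char& ch : s) {
    if (ch >= 'a' && ch <= 'z') {
      ch = static_cast<char>(ch - 'a' + 'A');
    }
  }
  return s;
}
```

Applied to the UTF-8 encoding of the Straße example from earlier in this thread, this produces STRAßE: the two bytes that encode ß are preserved, so the string stays valid UTF-8 even though no locale-aware casing is applied.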
Clinton Stimpson (developer)
2015-02-06 12:00

I also don't think we need ICU.

libarchive uses nl_langinfo(CODESET) for iconv, which requires setlocale(LC_CTYPE) to work with non-ascii filenames.

Perhaps we can just move the setlocale() to go around libarchive calls. This is probably a better way to go.
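
One way to scope setlocale() around specific calls is a small RAII guard, sketched below. The class name `ScopedCtypeLocale` is hypothetical, and this is only an outline of the idea under the assumption that LC_CTYPE is the only category needed; it is not the attached patch.

```cpp
#include <clocale>
#include <string>

// Switches LC_CTYPE to the environment's locale for the lifetime of
// the object and restores the previous setting on destruction.
class ScopedCtypeLocale
{
public:
  ScopedCtypeLocale()
  {
    // Querying with nullptr returns the current setting; copy it,
    // because a later setlocale() call may overwrite that buffer.
    const char* current = std::setlocale(LC_CTYPE, nullptr);
    saved_ = current ? current : "C";
    // An empty name selects the locale from the environment. If that
    // fails, setlocale() leaves the current locale unchanged.
    std::setlocale(LC_CTYPE, "");
  }

  ~ScopedCtypeLocale() { std::setlocale(LC_CTYPE, saved_.c_str()); }

  ScopedCtypeLocale(const ScopedCtypeLocale&) = delete;
  ScopedCtypeLocale& operator=(const ScopedCtypeLocale&) = delete;

private:
  std::string saved_;
};
```

A caller would create a `ScopedCtypeLocale` in the block that invokes the archive routines, so code outside that block keeps running in the default "C" locale. Note that setlocale() changes process-wide state and is not thread-safe, so this only suits code paths where no other thread depends on the locale at the same time.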
Clinton Stimpson (developer)
2015-02-06 12:20

I've attached a patch to remove setlocale() which fixes this Turkish issue, plus it adds setlocale() calls for libarchive to keep the fix for bug 0014934.
Brad King (manager)
2015-02-06 13:40

Re 0015377:0037936: Thanks. Based on that I constructed these commits on top of 3.1.2:

 Do not call setlocale() globally in CMake applications (commit 87be2e14)

 Add setlocale() calls around use of libarchive APIs (commit cd408d93)

Please test.
Clinton Stimpson (developer)
2015-02-06 14:34

I tested stage/no-global-setlocale on examples provided in this bug report and also in bug 0014934, and it works fine for me.
Brad King (manager)
2015-02-10 09:50

The changes linked in 0015377:0037937 have been merged to the 'release' branch for 3.2.0 and also to 'release-3.1' for 3.1.3.
Robert Maynard (manager)
2015-07-08 08:57

Closing resolved issues that have not been updated in more than 4 months.

 Issue History
Date Modified Username Field Change
2015-01-27 18:34 Ongun Kanat New Issue
2015-01-27 18:34 Ongun Kanat File Added: cmakeout.txt
2015-01-27 18:36 Ongun Kanat Tag Attached: linux
2015-01-27 18:36 Ongun Kanat Tag Attached: make
2015-01-27 18:36 Ongun Kanat Tag Attached: locale
2015-02-02 13:54 Ben Boeckel Note Added: 0037879
2015-02-02 14:27 Clinton Stimpson Note Added: 0037880
2015-02-04 17:51 Ongun Kanat Note Added: 0037918
2015-02-05 14:15 Stephen Kelly Note Added: 0037927
2015-02-05 15:05 Ben Boeckel Note Added: 0037928
2015-02-05 16:28 Clinton Stimpson Note Added: 0037929
2015-02-06 11:40 Brad King Note Added: 0037934
2015-02-06 12:00 Clinton Stimpson Note Added: 0037935
2015-02-06 12:19 Clinton Stimpson File Added: 0001-Encoding-Only-call-setlocale-where-required.patch
2015-02-06 12:20 Clinton Stimpson Note Added: 0037936
2015-02-06 13:30 Brad King Assigned To => Brad King
2015-02-06 13:30 Brad King Status new => assigned
2015-02-06 13:30 Brad King Target Version => CMake 3.1.3
2015-02-06 13:40 Brad King Note Added: 0037937
2015-02-06 14:34 Clinton Stimpson Note Added: 0037941
2015-02-10 09:50 Brad King Note Added: 0037951
2015-02-10 09:50 Brad King Status assigned => resolved
2015-02-10 09:50 Brad King Resolution open => fixed
2015-02-10 09:50 Brad King Fixed in Version => CMake 3.1.3
2015-07-08 08:57 Robert Maynard Note Added: 0039046
2015-07-08 08:57 Robert Maynard Status resolved => closed
