ID: 0015004
Project: CMake
Category: CMake
View Status: public
Date Submitted: 2014-07-02 21:49
Last Update: 2016-06-10 14:31
Reporter: Chris Foster
Assigned To: Kitware Robot
Priority: normal
Severity: minor
Reproducibility: always
Status: closed
Resolution: moved
Platform:
OS: linux ubuntu
OS Version: 12.04
Product Version: CMake 2.8.7
Target Version:
Fixed in Version:
Summary: 0015004: string(REGEX REPLACE) doesn't correctly anchor regex with ^ for multiple matches
Description: string(REGEX REPLACE) doesn't seem to anchor correctly with the ^ symbol when there are multiple matches: subsequent matches seem to be anchored at the start of the next substring.
Steps To Reproduce: Run the following CMake script:

-----------
string(REGEX REPLACE "^([^-]*-)" "@\\1#" output "foo-1.2-3")
message(STATUS "\"${output}\"")
-----------

Output is

-- "@foo-#@1.2-#3"

so the pattern has matched twice, whereas it should only match the input string once at the start - the expected output is

-- "@foo-#1.2-3"
Tags: No tags attached.

 Relationships
related to 0014336 (closed, Kitware Robot): REGEX REPLACE does not give correct results for a left anchor (^) combined with careted brackets

  Notes
(0036313)
David Cole (manager)
2014-07-03 02:47

It might make more sense to you if you examine the output of MATCHALL instead:

    string(REGEX MATCHALL "^([^-]*-)" output "fighter-1.2-3")
    message(STATUS "\"${output}\"")

Yields:

    -- "fighter-;1.2-"

The regex matches "fighter-" and also "1.2-" because each match consumes only part of the input, and the search then resumes on the remainder, where ^ can match again. (That is, this input contains two substrings that each run from their own beginning to a "-" character with no "-" before it.)
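
One way to picture this is a loop that matches once, drops the matched prefix, and then matches again on what is left. The sketch below is only an illustration of the observed behavior, not CMake's actual source; the variable names `rest`, `hit`, `hit_len` and `matches` are made up for the example.

```cmake
# Illustrative only: mimics the observed behaviour, not CMake's real source.
set(rest "fighter-1.2-3")   # input that has not been consumed yet
set(matches "")             # accumulated matches, like MATCHALL's output
while(TRUE)
  # A single anchored match against whatever input is left.
  string(REGEX MATCH "^([^-]*-)" hit "${rest}")
  if("${hit}" STREQUAL "")
    break()                 # nothing left that matches
  endif()
  list(APPEND matches "${hit}")
  # Drop the matched prefix; "^" now refers to the start of the remainder.
  string(LENGTH "${hit}" hit_len)
  string(SUBSTRING "${rest}" ${hit_len} -1 rest)
endwhile()
message(STATUS "\"${matches}\"")
```

Run with `cmake -P`, this should print `-- "fighter-;1.2-"`, the same list that MATCHALL reports above.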


To do what you want, simply change your regex to consume the entire input string when performing the match... then, there will only be one match. You can do this by appending ".*$" to your regex:

    string(REGEX MATCHALL "^([^-]*-).*$" output "fighter-1.2-3")
    message(STATUS "\"${output}\"")

Now yields:

    -- "fighter-1.2-3"


And:

    string(REGEX REPLACE "^([^-]*-).*$" "@\\1#" output "fighter-1.2-3")
    message(STATUS "\"${output}\"")

yields:

    -- "@fighter-#"


I will refrain from commenting on whether or not a code change to CMake should be considered in response to this issue... but, at least you have a way within existing CMake to achieve your immediate objective.

CMake's REGEX handling is quirky at best, as noted in many mailing list discussions. People want PCRE, but that's not what CMake provides. Perhaps someday PCRE will be added in parallel to the existing quirky implementation.
(0036314)
David Cole (manager)
2014-07-03 02:52

Also, you can glean some more about what's happening under the hood by using:

    message(STATUS "${CMAKE_MATCH_0}")
    message(STATUS "${CMAKE_MATCH_1}")
    message(STATUS "${CMAKE_MATCH_2}")


For the REGEX REPLACE originally reported, this yields:

    -- 1.2-
    -- 1.2-
    --


And for the one ending with ".*$" it yields:

    -- fighter-1.2-3
    -- fighter-
    --
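
The same variables are also set by `if(... MATCHES ...)`, which performs only a single match, so they offer another route to the output the report expected. This is a minimal sketch under that assumption; `input`, `prefix_len`, `tail` and `output` are hypothetical names chosen for the example.

```cmake
set(input "foo-1.2-3")
if(input MATCHES "^([^-]*-)")
  # CMAKE_MATCH_0 is the whole match, CMAKE_MATCH_1 the first group.
  string(LENGTH "${CMAKE_MATCH_0}" prefix_len)
  # Everything after the matched prefix is kept unchanged.
  string(SUBSTRING "${input}" ${prefix_len} -1 tail)
  set(output "@${CMAKE_MATCH_1}#${tail}")
else()
  # No match at the start of the input: leave the string as it is.
  set(output "${input}")
endif()
message(STATUS "\"${output}\"")
```

Because only one match is attempted, the result here should be `-- "@foo-#1.2-3"`.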
(0036317)
Chris Foster (reporter)
2014-07-03 07:47

Thanks for the suggestions, David. Your workaround is the same one that I ended up using: match the whole string, and pull out the pieces using references as required. In some ways this turned out to be a cleaner way to achieve what I was going for, since it's more explicit.
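
For illustration, a whole-string match with references might look like the sketch below. The actual regex used is not shown in this report, so the pattern and the variable names `name`, `version` and `revision` are assumptions made for the example.

```cmake
set(input "foo-1.2-3")
# The pattern spans the entire input, so it can match only once.
set(pattern "^([^-]*)-([^-]*)-(.*)$")
# Each call replaces the whole string with a single captured piece.
string(REGEX REPLACE "${pattern}" "\\1" name "${input}")
string(REGEX REPLACE "${pattern}" "\\2" version "${input}")
string(REGEX REPLACE "${pattern}" "\\3" revision "${input}")
message(STATUS "${name} / ${version} / ${revision}")
```

With this input the script should print `-- foo / 1.2 / 3`.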

Thanks also for pointing out ${CMAKE_MATCH_N} - I haven't seen that before.

Regarding whether this is indeed a bug or simply a quirky feature - consider the cmake documentation for string(REGEX):

> The following characters have special meaning in regular expressions:
> ^ Matches at beginning of input
> ...

The intent here seems pretty clear, and contrary to the actual behavior. If nothing else, I think it's worth specifying the behavior more clearly.
(0042579)
Kitware Robot (administrator)
2016-06-10 14:29

Resolving issue as `moved`.

This issue tracker is no longer used. Further discussion of this issue may take place in the current CMake Issues page linked in the banner at the top of this page.

 Issue History
Date Modified | Username | Field | Change
2014-07-02 21:49 | Chris Foster | New Issue
2014-07-03 02:47 | David Cole | Note Added: 0036313
2014-07-03 02:52 | David Cole | Note Added: 0036314
2014-07-03 07:47 | Chris Foster | Note Added: 0036317
2014-07-03 09:24 | Brad King | Relationship added | related to 0014336
2016-06-10 14:29 | Kitware Robot | Note Added: 0042579
2016-06-10 14:29 | Kitware Robot | Status | new => resolved
2016-06-10 14:29 | Kitware Robot | Resolution | open => moved
2016-06-10 14:29 | Kitware Robot | Assigned To | => Kitware Robot
2016-06-10 14:31 | Kitware Robot | Status | resolved => closed

