MantisBT - CMake
View Issue Details
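ID: 0015004
Project: CMake
Category: CMake
View Status: public
Date Submitted: 2014-07-02 21:49
Last Update: 2016-06-10 14:31
Reporter: Chris Foster
Assigned To: Kitware Robot
Priority: normal
Severity: minor
Reproducibility: always
Status: closed
Resolution: moved
Platform: linux
OS: ubuntu
OS Version: 12.04
Product Version: CMake 2.8.7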
 
0015004: string(REGEX REPLACE) doesn't correctly anchor regex with ^ for multiple matches
string(REGEX REPLACE) doesn't seem to anchor multiple matches correctly with the ^ symbol - subsequent matches seem to be anchored at the start of the next substring.
Run the following cmake script:

-----------
string(REGEX REPLACE "^([^-]*-)" "@\\1#" output "foo-1.2-3")
message(STATUS "\"${output}\"")
-----------

Output is

-- "@foo-#@1.2-#3"

so the pattern has matched twice, whereas it should only match the input string once at the start - the expected output is

-- "@foo-#1.2-3"
No tags attached.
related to 0014336 (closed, Kitware Robot): REGEX REPLACE does not give correct results for a left anchor (^) combined with careted brackets
Issue History
Date Modified     Username       Field                Change
2014-07-02 21:49  Chris Foster   New Issue
2014-07-03 02:47  David Cole     Note Added: 0036313
2014-07-03 02:52  David Cole     Note Added: 0036314
2014-07-03 07:47  Chris Foster   Note Added: 0036317
2014-07-03 09:24  Brad King      Relationship added   related to 0014336
2016-06-10 14:29  Kitware Robot  Note Added: 0042579
2016-06-10 14:29  Kitware Robot  Status               new => resolved
2016-06-10 14:29  Kitware Robot  Resolution           open => moved
2016-06-10 14:29  Kitware Robot  Assigned To          => Kitware Robot
2016-06-10 14:31  Kitware Robot  Status               resolved => closed

Notes
(0036313)
David Cole   
2014-07-03 02:47   
It might make more sense to you if you examine the output of MATCHALL instead:

    string(REGEX MATCHALL "^([^-]*-)" output "fighter-1.2-3")
    message(STATUS "\"${output}\"")

Yields:

    -- "fighter-;1.2-"

The regex matches "fighter-" and also "1.2-" within the input string since you are not considering the remainder of the input in the matching operation. (i.e. when considering this input, there are two sub-strings that go from their beginnings to a "-" character without any preceding "-" characters...)
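
One rough way to see where the two matches come from is to emulate the repeated search by hand. The sketch below is not CMake's actual implementation. It only assumes that the search restarts on whatever input is left after each match, so that ^ anchors at the start of each remainder. The variable names (input, rest, matches) are made up for illustration.

```cmake
# Hand-rolled emulation of a repeated, restarting search (an assumption about
# the behaviour described above, not code taken from CMake itself).
set(input "fighter-1.2-3")
set(rest "${input}")
set(matches "")

# Each pass runs a fresh, single search against the remaining text, so "^"
# anchors at the start of that remainder rather than the original input.
while(rest MATCHES "^([^-]*-)")
  list(APPEND matches "${CMAKE_MATCH_0}")
  # Drop the matched prefix and continue with what is left.
  string(LENGTH "${CMAKE_MATCH_0}" len)
  string(SUBSTRING "${rest}" ${len} -1 rest)
endwhile()

message(STATUS "matches: \"${matches}\", unmatched tail: \"${rest}\"")
```

Saved to a file and run with cmake -P, this collects the same two pieces that MATCHALL reported above ("fighter-" and "1.2-") and leaves "3" as the unmatched tail.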


To do what you want, simply change your regex to consume the entire input string when performing the match... then, there will only be one match. You can do this by appending ".*$" to your regex:

    string(REGEX MATCHALL "^([^-]*-).*$" output "fighter-1.2-3")
    message(STATUS "\"${output}\"")

Now yields:

    -- "fighter-1.2-3"


And:

    string(REGEX REPLACE "^([^-]*-).*$" "@\\1#" output "fighter-1.2-3")
    message(STATUS "\"${output}\"")

yields:

    -- "@fighter-#"


I will refrain from commenting on whether or not a code change to CMake should be considered in response to this issue... but, at least you have a way within existing CMake to achieve your immediate objective.

CMake's REGEX handling is quirky at best, as noted in many many mailing list discussions. People want PCRE, but that's not what it is. Perhaps someday PCRE will be added in parallel to the existing quirky implementation.
(0036314)
David Cole   
2014-07-03 02:52   
Also, you can glean some more about what's happening under the hood by using:

    message(STATUS "${CMAKE_MATCH_0}")
    message(STATUS "${CMAKE_MATCH_1}")
    message(STATUS "${CMAKE_MATCH_2}")


For the REGEX REPLACE originally reported, this yields:

    -- 1.2-
    -- 1.2-
    --


And for the one ending with ".*$" it yields:

    -- fighter-1.2-3
    -- fighter-
    --
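
For inspecting capture groups without any repeated search involved, if(... MATCHES ...) performs a single match and fills the same CMAKE_MATCH_<n> variables. A minimal sketch, assuming the same sample input; the variable name sample and the second capture group are additions for illustration:

```cmake
# A single match via if(MATCHES); the capture groups land in CMAKE_MATCH_<n>.
set(sample "fighter-1.2-3")

if(sample MATCHES "^([^-]*-)(.*)$")
  message(STATUS "whole match: ${CMAKE_MATCH_0}")  # fighter-1.2-3
  message(STATUS "group 1:     ${CMAKE_MATCH_1}")  # fighter-
  message(STATUS "group 2:     ${CMAKE_MATCH_2}")  # 1.2-3
else()
  message(STATUS "no match")
endif()
```

Because the pattern consumes the whole input, there is only one possible match and the question of where ^ anchors on later searches never arises.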
(0036317)
Chris Foster   
2014-07-03 07:47   
Thanks for the suggestions, David. Your workaround is the same one that I ended up using: match the whole string, and pull out the pieces using references as required. In some ways this turned out to be a cleaner way to achieve what I was going for, since it's more explicit.
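
As a sketch of that approach applied to the input from the original report: capturing the remainder in a second group (an addition to the pattern, not part of the report) lets the replacement put it back, which gives the output the report expected.

```cmake
# Match the whole input once: group 1 is the prefix up to and including the
# first "-", group 2 is everything after it.
string(REGEX REPLACE "^([^-]*-)(.*)$" "@\\1#\\2" output "foo-1.2-3")
message(STATUS "\"${output}\"")  # prints: -- "@foo-#1.2-3"
```

If the input contains no "-" at all, the pattern does not match and the string is returned unchanged.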

Thanks also for pointing out ${CMAKE_MATCH_N} - I haven't seen that before.

Regarding whether this is indeed a bug or simply a quirky feature - consider the cmake documentation for string(REGEX):

> The following characters have special meaning in regular expressions:
> ^ Matches at beginning of input
> ...

The intent here seems pretty clear, and contrary to the actual behavior. If nothing else, I think it's worth specifying the behavior more clearly.
(0042579)
Kitware Robot   
2016-06-10 14:29   
Resolving issue as `moved`.

This issue tracker is no longer used. Further discussion of this issue may take place in the current CMake Issues page linked in the banner at the top of this page.