[cmake-developers] slow regex implementation in RegularExpression

Rolf Eike Beer eike at sf-mail.de
Thu Nov 24 03:44:47 EST 2011


> On 11/24/2011 12:34 AM, Brad King wrote:
>> On 11/23/2011 5:43 PM, Brad King wrote:
>>> On 11/23/2011 12:44 PM, Brad King wrote:
>>>> However, the above does not need to stand in the way of solving the
>>>> problem you're addressing.  We can simply set that goal aside for now
>>>> by not exposing TRE in the CMake language anywhere.  Use it just for
>>>> cmCTestBuildHandler.
>>>
>>> but people kept going on "the above" part of the debate ;)
>>
>> After some more thought, I've realized that no approach currently
>> proposed is practical:
>>
>> - cmCTestBuildHandler can use a list of custom regular expressions
>>   so we cannot assume all of them will be compatible with TRE

Use the old regex handler for all user supplied expressions and use TRE
only for the internal ones?

>> - As David Cole pointed out there are many places, like CTest's
>>   "-R" and "-E" options, that use regular expressions in contexts
>>   where we cannot possibly use a policy.  Any attempt to do so in such
>>   places would just turn into a second API to set the policy in the
>>   local context of the regex.

Either use your approach as specified below, or just document that these
places only understand the old regex language. They will have very few
regexes applied to at most a moderate number of lines, so the huge speed
improvement of TRE would hardly be noticeable here.

[...]
>>  (?#OLD)...   # old
>>  (?#TRE)...   # TRE
>>
>> This is quite easy to implement.  Just take the currently proposed
>> patch that replaces use of cmsys::RegularExpression with the new
>> cmFastRegularExpression wrapper (perhaps renamed cmRegularExpression).
>> Inside the wrapper look for a leading comment of the above form to
>> decide which regex impl to use internally.  Then strip off the prefix
>> and pass the rest of the regex to the underlying implementation. Once
>> this is done update all the default warning and error regular
>> expressions that CTest uses.  Add the (?#TRE) prefix to them.
>>
>> This approach will solve the speed problem, give people access to the
>> TRE extended features when they want it anywhere CMake already uses a
>> regex, has no compatibility problems, is a very narrow second
>> interface, and is extensible for future optional regex behavior.
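
To make the prefix idea above concrete, here is a minimal sketch of the
dispatch step only. It is not taken from the proposed patch: the names
RegexImpl and splitRegexPrefix are hypothetical, and the sketch assumes
that an unprefixed expression keeps using the old implementation.

```cpp
#include <string>
#include <utility>

// Hypothetical names for illustration; not part of CMake or the patch.
enum class RegexImpl { Old, TRE };

// Inspect a leading "(?#OLD)" or "(?#TRE)" comment, pick the matching
// implementation, and return the regex with that prefix stripped off.
// Assumption: expressions without a prefix stay on the old implementation.
std::pair<RegexImpl, std::string> splitRegexPrefix(const std::string& regex)
{
  static const std::string oldTag = "(?#OLD)";
  static const std::string treTag = "(?#TRE)";

  if (regex.compare(0, treTag.size(), treTag) == 0) {
    return { RegexImpl::TRE, regex.substr(treTag.size()) };
  }
  if (regex.compare(0, oldTag.size(), oldTag) == 0) {
    return { RegexImpl::Old, regex.substr(oldTag.size()) };
  }
  return { RegexImpl::Old, regex };
}
```

A wrapper class would call something like this once at compile time of
the expression and hand the stripped string to whichever engine was
selected.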

> I like that proposal a lot, although I'm afraid it is a bit verbose.
> Some of my regexes are already pretty lengthy, pushing the 80-columns
> limit.

For things that are inside CMakeLists.txt we then could introduce a global
flag that decides what to do with unprefixed expressions:

SET(CMAKE_PREFERED_REGEX_STYLE TRE)

The default will of course be "CMake", or whatever this ends up being
called. A lazy user may then just change it to something sensible ;)
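
As a rough sketch of how such a flag could combine with the prefixes:
an explicit prefix always wins, and only unprefixed expressions consult
the preferred style. RegexStyle, RegexChoice and chooseRegexStyle are
hypothetical names, and how the variable value would actually reach the
wrapper is left open here; it is simply passed in as a string.

```cpp
#include <string>

// Hypothetical names for illustration; nothing here exists in CMake.
enum class RegexStyle { CMake, TRE };

struct RegexChoice
{
  RegexStyle style;
  std::string pattern; // the regex with any style prefix removed
};

// preferredStyle stands in for the value of CMAKE_PREFERED_REGEX_STYLE.
// Assumption: any value other than "TRE" (including unset/empty) keeps
// the old "CMake" behavior for unprefixed expressions.
RegexChoice chooseRegexStyle(const std::string& regex,
                             const std::string& preferredStyle)
{
  const std::string oldTag = "(?#OLD)";
  const std::string treTag = "(?#TRE)";

  // An explicit prefix overrides the global preference.
  if (regex.compare(0, treTag.size(), treTag) == 0) {
    return { RegexStyle::TRE, regex.substr(treTag.size()) };
  }
  if (regex.compare(0, oldTag.size(), oldTag) == 0) {
    return { RegexStyle::CMake, regex.substr(oldTag.size()) };
  }

  // No prefix: fall back to the project-wide preference.
  const RegexStyle fallback =
    (preferredStyle == "TRE") ? RegexStyle::TRE : RegexStyle::CMake;
  return { fallback, regex };
}
```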

Eike



