[cmake-developers] slow regex implementation in RegularExpression

David Cole david.cole at kitware.com
Wed Nov 23 14:12:30 EST 2011


On Wed, Nov 23, 2011 at 2:09 PM, David Cole <david.cole at kitware.com> wrote:
> On Wed, Nov 23, 2011 at 2:03 PM, Bill Hoffman <bill.hoffman at kitware.com> wrote:
>> On 11/23/2011 12:51 PM, Brad King wrote:
>>>
>>> On 11/23/2011 12:48 PM, Brad King wrote:
>>>>
>>>> On 11/23/2011 12:43 PM, Brad King wrote:
>>>>>
>>>>> On 11/23/2011 12:34 PM, Alexandru Ciobanu wrote:
>>>>>>
>>>>>> The regex in question is:
>>>>>>     ^[^][:/*?]+\$
>>>>
>>>>  "To include a literal ] in the list, make it either the first item"
>>>
>>> It must be the "[:" in this regex that TRE sees as special since it
>>> allows expressions like "[:digit:]" inside a bracket expression.
>>>
>>> Still, this is a case that my proposed policy would pick up.
>>>
>>> -Brad
>>>
>> I am still very wary about this policy.  For 99% of folks the current regex
>> is just fine.  Making them "eventually" change to get the new regex is
>> making them do work that they don't need or want.  I would rather have two
>> APIs.  I just don't see the big upside of TRE, and I see this causing pain
>> for lots and lots of folks if we push them to make the change.  CMake most
>> likely has 100,000 or more users at this point.  A change like this could
>> easily inflict man-years of effort onto the world, and should not be taken
>> lightly.
>>
>> -Bill
>
> Big upside:    (quoting from Alexandru Ciobanu's email of Nov. 17th
> earlier in this thread)
>
> The impact on the build time is pretty dramatic:
>     CMake: 7h39m
>     CMake + TRE: 1h06m
>

And although there is a big upside, we do still have to be careful.

We have to remember that regexes are also used in ctest -D invocations,
in ctest -S scripts, and in cmake -P scripts, where policies are not
really a reliable mechanism. So in addition to designing a careful
policy, we also have to decide what to do in those cases. The case
behind the big performance gain here is ctest filtering build output
with regexes, and there is no CMake policy mechanism in sight for that
scenario.
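
For reference, here is a minimal sketch of the traditional reading of the
bracket expression quoted earlier in this thread, where a "]" placed first
in the list is a literal and "[:" has no special meaning. It uses Python's
re module purely as a stand-in: that is neither CMake's RegularExpression
class nor TRE, so it only illustrates the intended meaning of the pattern,
not the behavior of either engine. The name is_accepted is hypothetical,
and the sketch assumes the "\$" in the quoted pattern is CMake-level
escaping for a plain "$" anchor.

```python
import re

# The pattern from the thread, with the assumed CMake-level "\$" written
# as a plain "$".  Python's re is a stand-in engine (an assumption of this
# sketch); it is not CMake's RegularExpression or TRE.
PATTERN = re.compile(r"^[^][:/*?]+$")

# Under the traditional reading, the negated bracket expression excludes
# exactly these six characters: ] [ : / * ?
EXCLUDED = "][:/*?"


def is_accepted(name: str) -> bool:
    """Return True if name is non-empty and has no excluded character."""
    return PATTERN.match(name) is not None


if __name__ == "__main__":
    for sample in ["libfoo.so", "a:b", "a]b", "a[b", "dir/file", "what?"]:
        print(f"{sample!r}: {is_accepted(sample)}")
```

An engine that treats "[:" as the start of a character class such as
"[:digit:]" cannot read the same text this way, which is the difference
Brad describes above.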

