[cmake-developers] slow regex implementation in RegularExpression

Alexander Neundorf neundorf at kde.org
Thu Nov 24 13:45:07 EST 2011


On Wednesday 23 November 2011, David Cole wrote:
> On Wed, Nov 23, 2011 at 2:09 PM, David Cole <david.cole at kitware.com> wrote:
> > On Wed, Nov 23, 2011 at 2:03 PM, Bill Hoffman <bill.hoffman at kitware.com> wrote:
> >> On 11/23/2011 12:51 PM, Brad King wrote:
> >>> On 11/23/2011 12:48 PM, Brad King wrote:
> >>>> On 11/23/2011 12:43 PM, Brad King wrote:
> >>>>> On 11/23/2011 12:34 PM, Alexandru Ciobanu wrote:
> >>>>>> The regex in question is:
> >>>>>>     ^[^][:/*?]+\$
> >>>> 
> >>>>  "To include a literal ] in the list, make it either the first item"
> >>> 
> >>> It must be the "[:" in this regex that TRE sees as special since it
> >>> allows expressions like "[:digit:]" inside a bracket expression.
> >>> 
> >>> Still, this is a case that my proposed policy would pick up.
> >>> 
> >>> -Brad
> >> 
> >> I am still very wary about this policy.  For 99% of folks the current
> >> regex is just fine.  Making them "eventually" change to get the new
> >> regex is making them do work that they don't need or want.  I would
> >> rather have two API's.   I just don't see the big upside of TRE, and I
> >> see this causing pain for lots and lots of folks if we push them to
> >> make the change.  CMake has most likely 100,000 or more users at this
> >> point.  A change like this could easily inflict man-years of effort
> >> onto the world, and should not be taken lightly.
> >> 
> >> -Bill
> > 
> > Big upside:    (quoting from Alexandru Ciobanu's email of Nov. 17th
> > earlier in this thread)
> > 
> > The impact on the build time is pretty dramatic:
> >     CMake: 7h39m
> >     CMake + TRE: 1h06m
> 
> And although there is a big upside, we do still have to be careful.
> 
> We have to remember that regexes are used in the context of ctest -D
> invocations, ctest -S script running and cmake -P running, too, where
> policies are not really a reliable mechanism. So in addition to having
> a careful policy, we also have to decide what to do in those cases.
> The case that is in question here for the big performance gain is
> ctest running and filtering build output based on regexes. No cmake
> policy mechanism in sight for that scenario.....

Also, AFAIK TRE supports more syntax than the current regexps. This is good.
But doesn't that mean there could be existing regexps that have no special
meaning right now, but suddenly get a special meaning with TRE and match
differently?

Does e.g. the following produce the same result with TRE? I get ":" as the match here.

set(text "hello world [:digit:]")
if ("${text}" MATCHES ".+([:digit:])")
  message(STATUS "match: \"${CMAKE_MATCH_1}\"")
endif()
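
To spell out why ":" comes back: as far as I can tell, the current engine
reads [:digit:] on its own as an ordinary bracket expression, i.e. the set of
characters ':', 'd', 'i', 'g' and 't', not as a digit class. The following is
only a sketch under that assumption (the variable names are made up for
illustration, and it says nothing about what TRE does with the same patterns).
It can be run with cmake -P:

```cmake
set(text "hello world [:digit:]")

# Greedy ".+" backs off from the end of the string until the group can
# match one character from the set; the trailing "]" is not in the set,
# so the group ends up on the ":" just before it.
if("${text}" MATCHES ".+([:digit:])")
  set(last_match "${CMAKE_MATCH_1}")
endif()
message(STATUS "last_match: \"${last_match}\"")

# "7" is not one of ':', 'd', 'i', 'g', 't', so this yields an empty result.
string(REGEX MATCH "^[:digit:]$" digit_7 "7")

# "d" is in the set, so this yields "d".
string(REGEX MATCH "^[:digit:]$" digit_d "d")

# An explicit range avoids the question of how "[:" is interpreted.
string(REGEX MATCH "^[0-9]$" range_7 "7")

message(STATUS "digit_7: \"${digit_7}\"")
message(STATUS "digit_d: \"${digit_d}\"")
message(STATUS "range_7: \"${range_7}\"")
```

If an engine gives "[:" a special meaning where the current one does not,
patterns like these are the ones whose results could change, while the
explicit [0-9] form should not be affected.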

Alex

