[cmake-developers] [PSA] API changes to use strings

Ben Boeckel ben.boeckel at kitware.com
Thu Feb 6 16:21:43 EST 2014


[ TL;DR: If you're thinking you might need .c_str(), try it without and
  possibly avoid a string reallocation :) . ]

Hi,

I am merging a branch into next later today which changes APIs to use
std::string instead of const char* for things such as:

  - target names
  - property names
  - variable names
  - test names

which should *never* be NULL anyways and usually get turned into strings
behind the scenes anyways. The values for them can be NULL, but the
names are always constructed by CMake itself, so should be empty at
worst. The changes which change call sites to not use .c_str() on
variables which are already strings is too conflict-y at this point, but
should land soon after 3.0 is released. The purpose of landing the API
changes early is so that new code may use the new APIs and my branch
doesn't need to keep being rebased to pick up new changes. For those
interested, here's a comparison of before and after my branch:

    % git grep -w c_str master | wc -l
    5356
    % git grep -w c_str HEAD | wc -l
    4048

Other than those changes listed above, there are also a few other API
changes to be aware of:

  - cmStrCmp() can now take a std::string constructor and stores a
    string internally.
  - The cmMakefile::GetPropertyOrDefinition method is removed; it was
    not used anywhere.
  - cmMakefile::Get{Source,Header}Extensions returns a std::set rather
    than a std::vector since it was never modified and is only iterated
    over and searched in other code.
  - cmLocalGenerator::Convert and related methods now take a string as
    the first argument.

Other API changes will be made as well, but they will be backwards
compatible. Methods which take a value for a property or variable will
gain a std::string overload for that argument while the const char*
variant will check for NULL then pass control off to the std::string
overload.

The commit messages probably need some rewording before final merge.
Squashing is also likely to happen (these changes have been extracted
from my main branch which makes some commits much more substantial).

As a PSA, some other patterns I've seen that I'm trying to kill with
fire on the conflict-y part of the branch:

  { // One branch ("kill c_str").
    stream << str.c_str()

    str = otherstr.c_str()

    str = std::string("static") + otherstr

    const char* cstr = str.c_str()
    // then use cstr in strcmp or casted into string later. These are
    // harder to find. Removing .c_str() calls makes it easier.
  }

  { // Possibly separate branches.
    if (Lookup(var))
    {
        std::string str = Lookup(var);
    }

    vec[vec.size()-1] (use *vec.rbegin())

    if (container.size() > 0) (use .empty())

    sprintf(staticbuf, ...) (use snprintf)
    // Not really a performance thing, just something I noticed while
    // poking around the code for the other changes.
  }

If there are any other patterns that belong on this branch (or in the
sweep), please let me know and I'll throw it on the pile.

The goal of all of this? To make CMake faster by removing the chaff in
callgrind runs. String allocations and strlen/memcmp in string/C string
comparisons are way too high and hide other bottlenecks, so I'm whacking
the vast majority of problems with a very large hammer here.

--Ben


More information about the cmake-developers mailing list