[cmake-developers] Developer tasks - Refactoring

Stephen Kelly steveire at gmail.com
Tue Jun 7 01:42:58 EDT 2016


On 06/06/2016 11:14 PM, Daniel Pfeifer wrote:
> Here is what I found.
>
> * SetLinkScriptShell is called from two places:
> cmMakefileExecutableTargetGenerator::WriteExecutableRule and
> cmMakefileLibraryTargetGenerator::WriteLibraryRules.
> * We can instantiate a cmOutputConverter in those places and then pass
> it along the call tree in all branches that contain calls to Convert.

I don't think that's the right approach. It just worsens spaghetti code
and an antipattern in CMake which is already too widespread.

Stepping back a bit to look at the more general problem in CMake:

It is fairly common in CMake code to have methods used throughout the
codebase which take parameters and then use the parameters in if()
conditions to change the behavior of the method in significant ways.

This is an anti-pattern and a code smell. It couples all code which
calls the method and makes it harder to refactor, as you now see with
Convert(). It violates:

 https://en.wikipedia.org/wiki/Interface_segregation_principle

I fixed some of the instances of it in my
15-years-worth-of-cmake-clean-up-in-9-months last year.

Unfortunately because it is so common in CMake, this pattern still grows
in the codebase. See eg cmake::GetSuppressDeprecatedWarnings and related
methods which take a cmMakefile which defaults to null. Those methods
exist since v3.5.0-rc1~165^2 (Explicitly enable deprecated warnings by
default., 2015-11-29).

Notice how there is only ONE caller where the cmMakefile is not-null!

So it makes sense to take the condition out of the inside of the method
and put it in the call site. It simplifies everything:

 https://github.com/steveire/CMake/commit/86d13501f3253c5489b8f6abf21b6a983481a77c

I didn't manage to get enough interest in that commit or the rest of the
branch to get it merged:

 http://thread.gmane.org/gmane.comp.programming.tools.cmake.devel/15607

but I would like your review of the branch if you have time.

Anyway, back to Convert()...

The reason I listed this refactoring of removal of cmOutputConverter as
a base of cmLocalGenerator in the OP of this thread is to create
separation of compute-time code from generate-time code. So let's keep
that in mind as a goal.

Generate time should be just for generating. If we have that strict
design, it benefits the daemon because that needs to configure and
compute, and it benefits a multi-toolchain feature architecture because
that needs to configure for multiple toolchains, compute across the
result and then generate once (I'm talking about an idea evolved from
http://thread.gmane.org/gmane.comp.programming.tools.cmake.devel/7272 in
case you never followed that).

Notice that Convert() is called at configure time, compute time and
generate time currently.

In the very end:

1) It should only be called at configure and compute time
2) CMake should create some state at compute time using APIs similar to
Convert()
3) Configure-time and compute-time Convert() behavior and APIs will not
necessarily be the same because the needs are different - you will see
that if you analyze the users of the Convert method and the OutputFormat
enum.

So, look into the Convert() methods. There are two overloads. Both do
some computation, then call ConvertToOutputFormat().

What does ConvertToOutputFormat() do? It does nothing at all if the
OutputFormat is UNCHANGED.

What is the default value of OutputFormat? It is UNCHANGED!

I guess you can see where this is going:

1) The parts of Convert() which come before the ConvertToOutputFormat()
calls should be extracted into two new overloads, maybe called
RawConvert(). The name is not great, but in the end the existing
Convert() methods can probably be deleted and the Raw* ones renamed.
2) Convert() should call RawConvert() and pass the result to
ConvertToOutputFormat().
3) Callers of Convert() should be ported to RawConvert(). A quick way to
do most of this is to find all callers of Convert() which pass only two
arguments and change those to RawConvert(). Note that there are other
callers which explicitly pass UNCHANGED!
4) Analyze remaining callers of Convert() and inspect the OutputFormat
they pass. For example see the output of 'git grep -w RESPONSE'. That
enum value can probably be removed and the EscapeForShell inlined into
the callers.
5) Do the same analysis for the other enum values. SHELL is used a lot.
In the end some new method might make sense for the SHELL case instead
of that enum.
6) Look at the result and see if there is an obvious way to go from
here, including for the LinkScriptShell case. The case where that
condition is used might be the last remaining case when the above is
done, and a different design can be considered.
7) The OutputFormat enum and the LinkScriptShell can probably be removed
in the end, with their callers ported to more-purposeful methods.

The whole point of this is to reach the essence of what all this
Convert() stuff in CMake is really about. It's currently very confused
and fixing that should be easy in part (such as porting to
RawConvert()), and have an immediate positive effect.

Does that make sense?

Thanks,

Steve.



More information about the cmake-developers mailing list