[CMake] Reusing an already built object

Michael Hertling mhertling at online.de
Tue Jun 15 20:08:19 EDT 2010


On 06/13/2010 10:08 PM, Linghua Tseng wrote:
> On 06/12/2010 23:30:50 Michael Hertling wrote:
>> On 06/12/2010 04:10 AM, Linghua Tseng wrote:
>>> ...
>> Look at the following CMakeLists.txt:
>>
>> project(main)
>> cmake_minimum_required(VERSION 2.8)
>>
>> add_library(gen1 STATIC src1.c)
>> set(gen_src2_SRCS gen_src2.c)
>> add_executable(gen_src2 ${gen_src2_SRCS})
>> target_link_libraries(gen_src2 gen1)
>> get_target_property(gen_src2_EXE gen_src2 LOCATION)
>> add_custom_command(
>>        OUTPUT src2.c
>>        COMMAND ${gen_src2_EXE}
>>        ARGS > src2.c
>>        DEPENDS gen_src2
>> )
>>
>> add_library(gen2 STATIC ${PROJECT_BINARY_DIR}/src2.c)
>> set(gen_src3_SRCS gen_src3.c)
>> add_executable(gen_src3 ${gen_src3_SRCS})
>> target_link_libraries(gen_src3 gen2 gen1)
>> get_target_property(gen_src3_EXE gen_src3 LOCATION)
>> add_custom_command(
>>        OUTPUT src3.c
>>        COMMAND ${gen_src3_EXE}
>>        ARGS > src3.c
>>        DEPENDS gen_src3
>> )
>>
>> add_library(gen3 STATIC ${PROJECT_BINARY_DIR}/src3.c)
>> set(gen_src4_SRCS gen_src4.c)
>> add_executable(gen_src4 ${gen_src4_SRCS})
>> target_link_libraries(gen_src4 gen3 gen2 gen1)
>> get_target_property(gen_src4_EXE gen_src4 LOCATION)
>> add_custom_command(
>>        OUTPUT src4.c
>>        COMMAND ${gen_src4_EXE}
>>        ARGS > src4.c
>>        DEPENDS gen_src4
>> )
>>
>> set(mylib_SRCS src1.c
>>               ${PROJECT_BINARY_DIR}/src2.c
>>               ${PROJECT_BINARY_DIR}/src3.c
>>               ${PROJECT_BINARY_DIR}/src4.c
>> )
>> add_library(mylib ${mylib_SRCS})
>>
>> After cmaking, a "make | grep Building" yields:
>>
>> [  6%] Building C object CMakeFiles/gen1.dir/src1.c.o
>> [ 13%] Building C object CMakeFiles/gen_src2.dir/gen_src2.c.o
>> [ 26%] Building C object CMakeFiles/gen2.dir/src2.c.o
>> [ 33%] Building C object CMakeFiles/gen_src3.dir/gen_src3.c.o
>> [ 46%] Building C object CMakeFiles/gen3.dir/src3.c.o
>> [ 53%] Building C object CMakeFiles/gen_src4.dir/gen_src4.c.o
>> [ 66%] Building C object CMakeFiles/mylib.dir/src1.c.o
>> [ 73%] Building C object CMakeFiles/mylib.dir/src2.c.o
>> [ 80%] Building C object CMakeFiles/mylib.dir/src3.c.o
>> [ 86%] Building C object CMakeFiles/mylib.dir/src4.c.o
>>
>> Thus, the sources whose object files will be incorporated in the
>> executables as well as in your library are compiled just twice, and this
>> is unavoidable, or at least shouldn't be bypassed, as AN has pointed out.
> 
> I knew this approach, and I also found that I can write this line in the end of CMakeLists.txt:
>   add_library(mylib gen4 gen3 gen2 gen1)
> instead of re-listing sources.
> (Assume that you wrote: add_library(gen4 STATIC ...) before)

This won't work as desired since gen{4,3,2,1} aren't source files; in
fact, it will result in the same "Cannot find source file" errors you
mention in your later example. Nevertheless, there is a chance that it
deceptively appears to work, see below.

> Now I modified something in order to explain the next issue:
> project(main)
> cmake_minimum_required(VERSION 2.8)
> 
> add_library(src1 STATIC src1.c)
> set(lower_layer_lib_LIBRARIES src1)
> 
> set(gen_src2_SRCS gen_src2.c)
> add_executable(gen_src2 ${gen_src2_SRCS})
> target_link_libraries(gen_src2 src1)
> get_target_property(gen_src2_EXE gen_src2 LOCATION)
> add_custom_command(
>         OUTPUT src2.c
>         COMMAND ${gen_src2_EXE}
>         ARGS > src2.c
>         DEPENDS gen_src2
> )

BTW, you don't need to bother with the LOCATION target property here;
ADD_CUSTOM_COMMAND() is smart enough to figure out the executable by
itself.
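
A minimal sketch of that simplification, assuming the same target and
file names as in your example: when the first word of COMMAND is the
name of a target created by ADD_EXECUTABLE(), CMake substitutes the path
of the built executable at build time. The output redirection relies on
the command being run through a shell, as with the Makefile generators.

```cmake
# Sketch only: same names as in the quoted example, LOCATION lookup dropped.
add_executable(gen_src2 gen_src2.c)
target_link_libraries(gen_src2 src1)

add_custom_command(
        OUTPUT src2.c
        # "gen_src2" is an executable target, so CMake replaces it with
        # the path of the built binary.
        COMMAND gen_src2 > src2.c
        # Re-run the generator whenever the executable is rebuilt.
        DEPENDS gen_src2
)
```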

> add_library(src2 STATIC src2.c)
> set(lower_layer_lib_LIBRARIES src2 ${lower_layer_lib_LIBRARIES})
> 
> set(gen_src3_SRCS gen_src3.c)
> add_executable(gen_src3 ${gen_src3_SRCS})
> target_link_libraries(gen_src3 src2 src1)
> get_target_property(gen_src3_EXE gen_src3 LOCATION)
> add_custom_command(
>         OUTPUT src3.c
>         COMMAND ${gen_src3_EXE}
>         ARGS > src3.c
>         DEPENDS gen_src3
> )
> add_library(src3 STATIC src3.c)
> set(lower_layer_lib_LIBRARIES src3 ${lower_layer_lib_LIBRARIES})
> 
> set(gen_src4_SRCS gen_src4.c)
> add_executable(gen_src4 ${gen_src4_SRCS})
> target_link_libraries(gen_src4 src3 src2 src1)
> get_target_property(gen_src4_EXE gen_src4 LOCATION)
> add_custom_command(
>         OUTPUT src4.c
>         COMMAND ${gen_src4_EXE}
>         ARGS > src4.c
>         DEPENDS gen_src4
> )
> add_library(src4 STATIC src4.c)
> set(lower_layer_lib_LIBRARIES src4 ${lower_layer_lib_LIBRARIES})
> 
> # Yes, it works.
> add_library(lower_layer_lib ${lower_layer_lib_LIBRARIES})

This line expands to add_library(lower_layer_lib src4 src3 src2 src1),
and since you have src{4,3,2,1}.c available from the custom commands -
you're doing an in-source-build, I suppose - CMake takes them as the
sources for lower_layer_lib. Rename the libraries from srcN to genN
like in the previous example, and you'll probably see that it's not
working anymore. Subsequently, do a "touch gen{4,3,2,1}.c" in the
source directory, and you'll probably see that it's working again.
So, you mean libraries, but CMake looks for sources and finds them
due to the particular file naming in this case; thus, it works
by accident. As proof, "make VERBOSE=1" and see how lower_layer_lib
gets built from freshly compiled object files instead of libraries
generated before; consequently, the source files are compiled twice.
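
The experiment can be sketched as follows, assuming the gen1..gen4
library targets from the first example; it is meant as an illustration
of the lookup, not as something to keep in a real project.

```cmake
# Sketch of the experiment described above; gen1..gen4 are the library
# targets from the first example.
add_library(gen1 STATIC src1.c)
# ... gen2, gen3, gen4 defined as before ...

# The arguments after the target name are source file names, not targets.
# CMake looks for gen4, gen3, ... as files, trying the known extensions
# (gen4.c, gen4.cxx, ...), and reports "Cannot find source file" unless
# such files happen to exist.
add_library(mylib gen4 gen3 gen2 gen1)
```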

>>> I know someone said I can build static libraries for avoiding this,
>>> but it will fall into another issue:
>>> [Cmake] How do I link a static library into a library
>>> http://www.cmake.org/pipermail/cmake/2004-April/004990.html
>>> Therefore, it still cannot solve my problem.
>>
>> If I understand correctly, the concern of that thread's OP was to
>> enhance a static library with another one that was built externally,
>> i.e. outside the project. In your project, you can decide when and from
>> which object files a static library is built, so you don't need to stick
>> to a single library, in particular. Instead, you can adapt to your, say,
>> incremental build process and generate one static library per step which
>> will be used in later steps in order to avoid numerous recompilations of
>> the same source files.
> 
> In my simplified example, the above approach is really good.
> But it's bad in the multi-layer building architecture.

No, I wouldn't say so. IMO, w.r.t. the mentioned approach, there is no
fundamental difference whether your sources are organized hierarchically
or centrally, i.e. all in one directory. In the former case, of course,
you must communicate lists of executables, libraries, generated source
files etc. across the various directory levels, but this should be
manageable using the means provided by CMake.

> Assume that libhigher_layer_lib.a contains all objects of liblower_layer_lib1.a liblower_layer_lib2.a, ..., and so on.
> Of course, I don't need to build liblower_layer_lib###.a if CMake can link these object files directly.

AFAIK, CMake can't do that, but if higher_layer_lib isn't your final
product there's no need to incorporate any lower_layer_lib code in it;
just use all these libraries together where you would have solely used
higher_layer_lib otherwise.
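
A minimal sketch of that idea; the names lower_layer_lib1,
lower_layer_lib2, app and the source file names are placeholders, not
taken from your project. The libraries are collected in a list variable,
and consumers link against the whole list instead of one merged archive.

```cmake
# Hypothetical names: lower_layer_lib1/2, app and the sources are placeholders.
add_library(lower_layer_lib1 STATIC lower_src1.c)
add_library(lower_layer_lib2 STATIC lower_src2.c)
add_library(higher_layer_lib STATIC higher_src1.c)

# Collect everything a consumer needs in one list, higher layers first
# so that static link order resolves symbols from the lower layers ...
set(higher_layer_LIBRARIES higher_layer_lib lower_layer_lib1 lower_layer_lib2)

# ... and link against the list instead of a single merged archive.
add_executable(app app.c)
target_link_libraries(app ${higher_layer_LIBRARIES})
```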

> In the following example, I reduce them to a file `liblower_layer_lib.a'.
> 
> For the multi-layer case, I appended these lines to CMakeLists.txt:
>   set(lower_layer_LIBRARIES lower_layer_lib)
>   set(higher_layer_lib_SRCS higher_src1.c)
>   add_library(higher_layer_lib STATIC ${higher_layer_lib_SRCS} ${lower_layer_LIBRARIES})
> And then I got these error messages:
>   CMake Error in CMakeLists.txt:
>     Cannot find source file "lower_layer_lib".  Tried extensions .c .C .c++ .cc
>     .cpp .cxx .m .M .mm .h .hh .h++ .hm .hpp .hxx .in .txx

Yes, since lower_layer_lib isn't a source file but a target, and in
general, you may not mention a target amongst the source files of
another target.

> To replace them by the 4 lines "seems" to work:
>   set(lower_layer_LIBRARIES lower_layer_lib)
>   set(higher_layer_lib_SRCS higher_src1.c)
>   add_library(higher_layer_lib STATIC ${higher_layer_lib_SRCS})                         
>   target_link_libraries(higher_layer_lib ${lower_layer_LIBRARIES})      
> It yields:
>   Scanning dependencies of target higher_layer_lib
>   [ 88%] Building C object CMakeFiles/higher_layer_lib.dir/higher_src1.c.o
>   Linking C static library libhigher_layer_lib.a
>   [ 88%] Built target higher_layer_lib
> But the real commands are:
>   /usr/bin/gcc    -o CMakeFiles/higher_layer_lib.dir/higher_src1.c.o   -c /home/uranus/lab/test/cmake2/higher_src1.c
>   /usr/bin/ar cr libhigher_layer_lib.a  CMakeFiles/higher_layer_lib.dir/higher_src1.c.o
>   /usr/bin/ranlib libhigher_layer_lib.a
> 
> Yes, libhigher_layer_lib.a only contains higher_src1.c.o.
> It doesn't contain any objects in liblower_layer_lib.a.

This is because you can't link a static library against another static
one in order to incorporate the latter's code in the former. Here, the
target_link_libraries() just serves as a hint for CMake to track these
targets' interdependency, i.e. higher_layer_lib gets rebuilt when it's
found to be older than lower_layer_lib.
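
As a sketch of what that dependency is still good for, assume a
hypothetical executable app (with a source file app.c) which consumes
higher_layer_lib. As far as I know, CMake carries the link dependencies
of a static library over to whatever links against it, so both archives
should end up on the link line of app.

```cmake
add_library(higher_layer_lib STATIC higher_src1.c)
# Does not copy any objects into libhigher_layer_lib.a; it only records
# that users of higher_layer_lib also need lower_layer_lib.
target_link_libraries(higher_layer_lib lower_layer_lib)

# Hypothetical consumer: app and app.c are not part of the original example.
add_executable(app app.c)
# Naming higher_layer_lib alone is enough; lower_layer_lib follows it.
target_link_libraries(app higher_layer_lib)
```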

> OK. I know that I can re-list the sources into add_library(higher_layer_lib STATIC ...).
> I can also layer the sources by using a lot of SET commands.
> But we have to re-compile everything in each higher layer.

In the most extreme scenario, build a static library from each source
file and pass it along with those libraries from the lower layers to
the next higher layer to be used there. Thus, in each layer, you
compile that layer's sources only.
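
A rough sketch of this extreme variant for one layer; layer_SRCS,
layer_LIBS, lower_layer_LIBS, layer_exe and the file names are all
hypothetical. If the single-object libraries refer to each other, their
order on the link line may need some care.

```cmake
# Sketch: all variable, target and file names here are hypothetical.
set(layer_SRCS a.c b.c)
set(layer_LIBS "")
foreach(src ${layer_SRCS})
    # Strip directory and extension: a.c -> a
    get_filename_component(name ${src} NAME_WE)
    # One static library per source file.
    add_library(layer_${name} STATIC ${src})
    list(APPEND layer_LIBS layer_${name})
endforeach()

# This layer's executable uses its own libraries plus the lower ones.
add_executable(layer_exe main.c)
target_link_libraries(layer_exe ${layer_LIBS} ${lower_layer_LIBS})

# Hand everything up to the next layer; only meaningful in a directory
# that was entered via add_subdirectory().
set(layer_LIBS ${layer_LIBS} ${lower_layer_LIBS} PARENT_SCOPE)
```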

> Look at the following tree:
>   layer1/ (this layer is required to re-compile 200 files)
>     layer2a/ (this layer is required to re-compile 100 files)
>       layer3a/ (50 files)
>       layer3b/ (50 files)
>     layer2b/ (this layer is required to re-compile 100 files)
>       layer3c/ (50 files)
>       layer3d/ (50 files)
> 
> Note that we have to build executable files in each layer,
> and these executable files need to link the objects in its own layer and sub-layers.

layer1/layer2a/layer3a/CMakeLists.txt:
add_library(3a-lib STATIC src-3a-lib.c)
add_executable(3a-exe src-3a-exe.c)
target_link_libraries(3a-exe 3a-lib)
set(3a-libs 3a-lib PARENT_SCOPE)

layer1/layer2a/CMakeLists.txt:
add_subdirectory(layer3a)
add_subdirectory(layer3b)
add_library(2a-lib STATIC src-2a-lib.c)
add_executable(2a-exe src-2a-exe.c)
target_link_libraries(2a-exe 2a-lib ${3a-libs} ${3b-libs})
set(2a-libs 2a-lib ${3a-libs} ${3b-libs} PARENT_SCOPE)

layer1/CMakeLists.txt:
add_subdirectory(layer2a)
add_subdirectory(layer2b)
add_library(1-lib STATIC src-1-lib.c)
add_executable(1-exe src-1-exe.c)
target_link_libraries(1-exe 1-lib ${2a-libs} ${2b-libs})
set(1-libs 1-lib ${2a-libs} ${2b-libs} PARENT_SCOPE)

etc., plus custom commands as needed. Of course, it's a headache, but
it reflects your build procedure and avoids multiple compilations of
source files unless they get incorporated in multiple targets.
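
The layer2a file above reads ${3b-libs}, which the listings don't
define. A sketch of the sibling directory, assuming it simply mirrors
layer3a (the source file names are made up accordingly):

```cmake
# layer1/layer2a/layer3b/CMakeLists.txt (assumed to mirror layer3a)
add_library(3b-lib STATIC src-3b-lib.c)
add_executable(3b-exe src-3b-exe.c)
target_link_libraries(3b-exe 3b-lib)
# Publish this directory's libraries to layer2a, which reads ${3b-libs}.
set(3b-libs 3b-lib PARENT_SCOPE)
```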

> Thus, we MUST build liblayer3a.a, liblayer3b.a, liblayer3c.a, liblayer3d.a, liblayer2a.a, liblayer2b.a, and liblayer1.a.
> But GNU make can just link these objects directly, so it doesn't need to build these static libraries.

Linking against a static library which contains a single object file is
nearly the same as linking against this object file directly, so if you
can achieve it with Make and the object files you can achieve it with
CMake and static libraries, too.

> If we have to spend 30 minutes to build this kind of source tree by using GNU make,
> we have to spend 30 * 3 = 90 minutes to build it by using CMake.
> I think it's really not reasonable.

Where does this factor 3 come from? If I understand your build process
correctly every source file must be compiled at most once for all your
executables and another time for every final library it becomes a part
of. How many final libraries incorporating the same object file do you
have in your project? For each object file, this is typically only one
if you use static libraries exclusively. Thus, a CMake-based build
should need at most twice as many compilations as an exhaustively
optimized Makefile-based approach. In practice, I would expect a
noticeably smaller factor, so I doubt it's reasonable to turn away
from CMake just to save those extra compilations.

Besides, are shared libraries perhaps an alternative? Look:

CMAKE_MINIMUM_REQUIRED(VERSION 2.8 FATAL_ERROR)
PROJECT(SHRDLIBS C)
FILE(WRITE f.c "void f(void){}\n")
FILE(WRITE g.c "void g(void){}\n")
ADD_LIBRARY(f SHARED f.c)
ADD_LIBRARY(g SHARED g.c)
FILE(WRITE dummy.c "")
ADD_LIBRARY(h SHARED dummy.c)
TARGET_LINK_LIBRARIES(h g f)
FILE(WRITE main.c "int main(void){return 0;}\n")
ADD_EXECUTABLE(main main.c)
TARGET_LINK_LIBRARIES(main h)

After cmaking and "make VERBOSE=1", one can see:

[...]
gcc -fPIC -shared -Wl,-soname,libh.so -o libh.so
CMakeFiles/h.dir/dummy.c.o libg.so libf.so [...]
[...]
gcc CMakeFiles/main.dir/main.c.o -o main -rdynamic libh.so libg.so
libf.so [...]
[...]

That means you could build, say, pseudo shared libraries which don't
contain code but refer to other shared libraries, and the latter are
pulled in whenever one links against the former. Thus, you could compile
your sources to shared libraries, combine them into other shared
libraries, and use these in later steps or higher layers of your build
process without any repeated compilations at all, which addresses your
original concern.
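
Applied to one of your layers, this might look like the following
sketch. It follows the f/g/h pattern above; the target names, the
source file names and the 2a-own library are assumptions, and I have
only the Linux/gcc behaviour shown above in mind.

```cmake
# Sketch with hypothetical names, following the f/g/h pattern above.
add_library(3a-lib SHARED src-3a-lib.c)
add_library(3b-lib SHARED src-3b-lib.c)
add_library(2a-own SHARED src-2a-lib.c)

# Code-free umbrella library for layer 2a; dummy.c is an empty source file.
file(WRITE ${CMAKE_CURRENT_BINARY_DIR}/dummy.c "")
add_library(2a-lib SHARED ${CMAKE_CURRENT_BINARY_DIR}/dummy.c)
target_link_libraries(2a-lib 2a-own 3a-lib 3b-lib)

# Linking against the umbrella pulls in the three real libraries.
add_executable(2a-exe src-2a-exe.c)
target_link_libraries(2a-exe 2a-lib)
```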

Regards,

Michael

