ID: 0015157
Project: CMake
Category: Modules
View Status: public
Date Submitted: 2014-09-14 12:39
Last Update: 2016-06-10 14:31
Reporter: bchretien
Assigned To: James Bigler
Priority: high
Severity: major
Reproducibility: always
Status: closed
Resolution: moved
Platform / OS / OS Version: (not specified)
Product Version: CMake 3.0.2
Target Version: (not specified)
Fixed in Version: (not specified)
Summary: 0015157: FindCUDA.cmake: separate compilation not working as expected
Description: I tried using separate compilation for CUDA with CMake, but I ran into the issue described by someone else on Stack Overflow: https://stackoverflow.com/questions/22540783/nvlink-error-when-linking-cuda-code-against-cuda-static-library-cmake

In the CUDA SDK samples, there's a "simpleSeparateCompilation" sample that uses a Makefile to build a device static library and link it to the executable. I tried to adapt it to CMake, but the same CUDA linker error arises.
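For readers without the sample at hand, the Makefile flow described above boils down to roughly the following nvcc invocations. This is a sketch, not a copy of the sample's Makefile: architecture flags and include paths are omitted, and the `RUN` dry-run switch and the function name are illustration-only helpers. By default the commands are only printed.

```shell
#!/bin/sh
# Sketch of the device static library flow, NOT the sample's actual Makefile.
# RUN defaults to "echo", so the commands are only printed; use RUN= to execute them.
RUN=${RUN-echo}
NVCC=${NVCC:-nvcc}

build_with_device_library() {
  # 1. Compile the library source to relocatable device code (-dc).
  $RUN "$NVCC" -dc simpleDeviceLibrary.cu -o simpleDeviceLibrary.o
  # 2. Archive it into a static library; no device link happens at this point.
  $RUN "$NVCC" -lib simpleDeviceLibrary.o -o simpleDeviceLibrary.a
  # 3. Compile the executable's source, also as relocatable device code.
  $RUN "$NVCC" -dc simpleSeparateCompilation.cu -o simpleSeparateCompilation.o
  # 4. Let nvcc perform both the device link and the host link in one step.
  $RUN "$NVCC" simpleSeparateCompilation.o simpleDeviceLibrary.a -o simpleSeparateCompilation
}

build_with_device_library
```

The point of interest is step 2: the archive still contains unresolved, relocatable device code, so the final nvcc call can resolve the executable's device symbols against it.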

If this is simply because it should be done some other way, then the documentation in FindCUDA.cmake should be extended to say so. Otherwise, this may be a bug in the module.
Steps To Reproduce: Add the enclosed CMakeLists to $CUDA_HOME/samples/0_Simple/simpleSeparateCompilation.

$ mkdir /tmp/test && cd /tmp/test
$ cmake $CUDA_HOME/samples/0_Simple/simpleSeparateCompilation
$ make
...
[100%] Building NVCC intermediate link file CMakeFiles/simpleSeparateCompilation.dir/./simpleSeparateCompilation_intermediate_link.o
nvlink error : Undefined reference to '_Z13multiplyByTwof' in '/tmp/test/CMakeFiles/simpleSeparateCompilation.dir//./simpleSeparateCompilation_generated_simpleSeparateCompilation.cu.o'
nvlink error : Undefined reference to '_Z11divideByTwof' in '/tmp/test/CMakeFiles/simpleSeparateCompilation.dir//./simpleSeparateCompilation_generated_simpleSeparateCompilation.cu.o'
CMakeFiles/simpleSeparateCompilation.dir/build.make:61: recipe for target 'CMakeFiles/simpleSeparateCompilation.dir/./simpleSeparateCompilation_intermediate_link.o' failed
Additional Information: I tested this on Arch Linux with CUDA 6.5 and CMake 3.0.2.
Tags: CMake, CUDA, FindCUDA, linker, nvcc
Attached Files: CMakeLists.txt (519 bytes) 2014-09-14 12:39

  Notes
(0036802)
James Bigler (developer)
2014-09-15 13:48

This is what I use to compile simpleSeparateCompilation:

include_directories(
  common/inc
  shared/inc
  )

set(CUDA_SEPARABLE_COMPILATION ON)
cuda_add_executable(simpleSeparateCompilation
  simpleDeviceLibrary.cuh
  simpleDeviceLibrary.cu
  simpleSeparateCompilation.cu
  OPTIONS -gencode=arch=compute_20,code=sm_20 -gencode=arch=compute_30,code=sm_30
  )
(0036803)
bchretien (reporter)
2014-09-15 14:10

When you use this CMakeLists.txt, you don't actually have a separate static library containing device code that you can use elsewhere.

The default Makefile of simpleSeparateCompilation generates simpleDeviceLibrary.a, which you can then use when compiling some executable or library. The typical use case would be: you have a very large (e.g. templated) device codebase, and you want to build it just once, so that code using it (a CUDA executable or library) only needs to parse the headers and link against the static library that contains the device code, thus hopefully making compilation faster. This is what is advertised on slide 5 (http://on-demand.gputechconf.com/gtc-express/2012/presentations/gpu-object-linking.pdf).
(0036804)
James Bigler (developer)
2014-09-15 15:13

Sorry, I missed the part about the library.

This gets a bit more complex, so let me see if I can break it down.

Ultimately this is a problem because CMake can't replace the linker for all generators. Instead of replacing the link phase we have to create an intermediary link file. This takes object files generated by CUDA and does all the device code linking and produces another object file that gets linked into the executable module (shared library or executable).
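As a rough illustration of what such an intermediate link file amounts to outside of CMake, the sketch below runs a separate device-link step and then an ordinary host link. The file names (a.cu, b.cu, app), the default CUDA_HOME prefix, the choice of g++ as host linker, and the `RUN` dry-run switch are all assumptions made for the example; FindCUDA generates its own commands and names.

```shell
#!/bin/sh
# Sketch of a separate device-link step; file names are hypothetical.
# RUN defaults to "echo", so the commands are only printed; use RUN= to execute them.
RUN=${RUN-echo}
NVCC=${NVCC:-nvcc}
CXX=${CXX:-g++}
CUDA_HOME=${CUDA_HOME:-/usr/local/cuda}   # assumed install prefix

link_with_intermediate_file() {
  # Compile both sources to objects that carry relocatable device code.
  $RUN "$NVCC" -dc a.cu -o a.o
  $RUN "$NVCC" -dc b.cu -o b.o
  # Device link: resolve device symbols across a.o and b.o into one extra object.
  $RUN "$NVCC" -dlink a.o b.o -o intermediate_link.o
  # Host link: the regular linker gets the original objects plus the extra one.
  $RUN "$CXX" a.o b.o intermediate_link.o -L"$CUDA_HOME/lib64" -lcudart -o app
}

link_with_intermediate_file
```

Every object whose device symbols must be resolved has to be listed in the `-dlink` step; anything left out there is invisible to the device linker, even if it is present in the later host link.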

The problem you are having is that the intermediate link file is being generated when making simpleDeviceLibrary. At this point your library's symbols are resolved and now there is nothing for simpleSeparateCompilation.cu to link against.

At some point you would need to generate a different intermediate link file based on the object files made for simpleSeparateCompilation and the library simpleDeviceLibrary (or the object files from simpleDeviceLibrary). There would also need to be a way to indicate that the intermediate link file for simpleSeparateCompilation needs to link from simpleDeviceLibrary since we don't know what the libraries are that will be linked later.

In addition, you probably don't want to create the intermediate link file for simpleDeviceLibrary if it is a STATIC library. But STATIC could mean either a static host library whose device code is already fully linked, or a library that is static for both host and device code, in which case you need to do something special.

I'm just not sure how to construct this to do the right thing with the current way CMake is implemented. I'm trying to do certain things outside of the standard CMake build workflow and there just aren't all the right pieces of information during configure (they are created at generation time when there are very few hooks).

There might be a way to work around this, though I feel it is rather ugly.

You can get the list of object files used for separable compilation with this variable:

${target}_SEPARABLE_COMPILATION_OBJECTS

You could try this:

CUDA_ADD_LIBRARY(simpleDeviceLibrary STATIC simpleDeviceLibrary.cu)
list(APPEND simpleSeparateCompilation_SEPARABLE_COMPILATION_OBJECTS
            ${simpleDeviceLibrary_SEPARABLE_COMPILATION_OBJECTS})
CUDA_ADD_EXECUTABLE(simpleSeparateCompilation simpleSeparateCompilation.cu)
TARGET_LINK_LIBRARIES(simpleSeparateCompilation simpleDeviceLibrary)

I fear, though, that simpleDeviceLibrary will still have an intermediate link file in it. I don't know of a good way to do this without a new implementation of cuda_add_library. It would look identical to the one in FindCUDA.cmake, but with the intermediate file generation removed:

  # Add a link phase for the separable compilation if it has been enabled. If
  # it has been enabled then the ${cuda_target}_SEPARABLE_COMPILATION_OBJECTS
  # variable will have been defined.
  CUDA_LINK_SEPARABLE_COMPILATION_OBJECTS("${link_file}" ${cuda_target} "${_options}" "${${cuda_target}_SEPARABLE_COMPILATION_OBJECTS}")
(0036805)
bchretien (reporter)
2014-09-15 16:15

Thanks for looking into it.

The intermediate link file is indeed a problem if I set:

SET(CUDA_SEPARABLE_COMPILATION ON)


FYI, I made a dummy project on GitHub to test things (https://github.com/bchretien/cmake-cuda-static). There's currently a shell script (build.sh) that compiles the library and the executable for testing, and a CMakeLists.txt in the src folder. There's also a **very** ugly CMake macro that attempts to run the same commands that worked in the shell script.

As for changing FindCUDA.cmake for testing:

  # Add the library.
  SET(${cuda_target}_SEPARABLE_COMPILATION_OBJECTS "")
  SET(link_file "")
  add_library(${cuda_target} ${_cmake_options}
    ${_generated_files}
    ${_sources}
    ${link_file}
    )
  ...

Then with (in my dummy project):

  SET(CUDA_SEPARABLE_COMPILATION ON)
  CUDA_ADD_LIBRARY(dummy STATIC dummy.cu)
  CUDA_ADD_EXECUTABLE(dummy_exe dummy_exe.cu)
  TARGET_LINK_LIBRARIES(dummy_exe dummy)

The correct library is generated (libdummy.a), so this is indeed where things go awry. Even so, linking the executable then fails. However, using the proper linker commands (cf. build.sh), compilation works:

  # Make executable and link with static library
  nvcc ${NVCC_ARGS} -dlink ${LIB_STATIC} -c ${EXE_SRC}
  nvcc ${NVCC_ARGS} ${LIB_STATIC} ${EXE_OBJ} -o ${EXE_NAME}
(0036830)
James Bigler (developer)
2014-09-19 02:33
edited on: 2014-09-19 02:35

I finally had some time to dig into this. Most of this is observations on my part and not really actionable items.

1. If you want to statically link your device code (i.e. not resolve device symbols), you can't create the intermediate link file for the static library. There would now be two ways of linking the library (host code is STATIC, and device code is STATIC).

I'm not sure how to represent this option at this point. A couple of options are
a. Add some sort of argument to the command that has to be parsed out. This is pretty messy.
b. Add another "external variable" like CUDA_SEPARABLE_COMPILATION.

2. You need to tell whichever executable module you are building to link against the static library, and you must do this before you call cuda_add_{library,executable}. This is somewhat unfortunate, since in CMake this is usually done after the target is created (target_link_libraries). We can't do it then, because the variables or properties we need to set must be present when we create the target. I might be able to figure this out with files created at generate time. I was just hoping to avoid creating yet another script file for execution.

3. You must link against the static library when creating the intermediate link file and when you link the host code. In addition there should be a dependency between this intermediate link file and the static library.

4. If I had a reliable way to replace the host linker this would be *a lot* less messy, but alas the only way you saw to do that was to create your own custom target which opens up a lot of other issues.
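Points 1 and 3 can be sketched as plain command lines, using the file names from the dummy project mentioned earlier in the thread (dummy.cu, dummy_exe.cu, libdummy.a). This is only an illustration of the link order under those assumptions, not what FindCUDA currently generates; the intermediate file name, the default CUDA_HOME prefix, g++ as host linker, and the `RUN` dry-run switch are made up for the example.

```shell
#!/bin/sh
# Sketch of linking an executable against a static library of device code.
# RUN defaults to "echo", so the commands are only printed; use RUN= to execute them.
RUN=${RUN-echo}
NVCC=${NVCC:-nvcc}
CXX=${CXX:-g++}
CUDA_HOME=${CUDA_HOME:-/usr/local/cuda}   # assumed install prefix

link_against_device_library() {
  # Point 1: build the static library WITHOUT an intermediate (device) link,
  # so its device symbols stay available to whoever links against it.
  $RUN "$NVCC" -dc dummy.cu -o dummy.o
  $RUN "$NVCC" -lib dummy.o -o libdummy.a
  # Compile the executable's source as relocatable device code.
  $RUN "$NVCC" -dc dummy_exe.cu -o dummy_exe.o
  # Point 3, first half: the library takes part in the executable's device link.
  $RUN "$NVCC" -dlink dummy_exe.o libdummy.a -o dummy_exe_link.o
  # Point 3, second half: the library is linked again for the host code.
  $RUN "$CXX" dummy_exe.o dummy_exe_link.o libdummy.a -L"$CUDA_HOME/lib64" -lcudart -o dummy_exe
}

link_against_device_library
```

In a build system, dummy_exe_link.o would also need a dependency on libdummy.a so that it is regenerated whenever the library changes.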

(0036878)
Nick Maludy (reporter)
2014-09-30 13:20

I am also having this problem.

I am trying to create a library (static) that exposes several device functions. Due to this bug these same linker errors are preventing me from successfully linking with this library.
(0036880)
James Bigler (developer)
2014-09-30 15:10

Out of curiosity, Nick, would you ever be interested in a static library whose separable compilation is resolved at the library itself? (This is what happens now.)

Do you plan on relying on transitive linking? A->B->C (A and B are static and need to both be linked to C)
(0036881)
Nick Maludy (reporter)
2014-10-01 09:15
edited on: 2014-10-01 10:33

James,

1) I don't want it the way it is now, because I'm not able to link against device functions in that library.

2) I don't plan on relying on transitive linking. I would ideally like the device code to be in a shared library that I can load at runtime.

Thinking about transitive linking, it worries me that I would have a utility library, two "middle" libraries that link against the utility library, and then a "super" library that links against both "middle" libraries. In this case I would be afraid that the symbols from the utility library would be duplicated, coming from both "middle" libraries. Example:

utility -> middle 1 ->----> super
        -> middle 2 ->--^

Edit: Transitive linking would be nice if the duplicate symbol issue above was not a problem.

(0040466)
Too-Ticky (reporter)
2016-02-10 13:15

Is there likely to be a fix for this issue at some point in the future? Thanks.
(0040693)
Karl Ljungkvist (reporter)
2016-03-15 10:19
edited on: 2016-03-15 10:27

I too am experiencing this issue and would be very interested in a fix. It is a real problem not to be able to encapsulate CUDA code in a library, especially for larger software libraries where usability is of high importance.

As it stands, FindCUDA does not offer working functionality for separable compilation, so if this is not going to be fixed within another two years, I think you need to update the documentation with a disclaimer.

(0040696)
Karl Ljungkvist (reporter)
2016-03-15 13:36

I should admit that I do not have a complete understanding of the linking/library creation process, but as I understand it, there are two problems here:

1. The additional link phase for the library removes the symbols for CUDA functions/data.
2. The additional link phase for the application using these functions/data cannot resolve them since they are external.

Probably, I'm missing some major point, but wouldn't everything be solved by removing both additional link phases altogether? The simple example here: http://stackoverflow.com/a/33233086/2702753 can be compiled successfully using the listed Makefile.
(0042629)
Kitware Robot (administrator)
2016-06-10 14:29

Resolving issue as `moved`.

This issue tracker is no longer used. Further discussion of this issue may take place in the current CMake Issues page linked in the banner at the top of this page.

 Issue History
Date Modified Username Field Change
2014-09-14 12:39 bchretien New Issue
2014-09-14 12:39 bchretien File Added: CMakeLists.txt
2014-09-14 13:20 bchretien Tag Attached: CMake
2014-09-14 13:20 bchretien Tag Attached: CUDA
2014-09-14 13:20 bchretien Tag Attached: FindCUDA
2014-09-14 13:20 bchretien Tag Attached: linker
2014-09-14 13:20 bchretien Tag Attached: nvcc
2014-09-15 09:44 Brad King Assigned To => James Bigler
2014-09-15 09:44 Brad King Status new => assigned
2014-09-15 13:48 James Bigler Note Added: 0036802
2014-09-15 14:10 bchretien Note Added: 0036803
2014-09-15 15:13 James Bigler Note Added: 0036804
2014-09-15 16:15 bchretien Note Added: 0036805
2014-09-19 02:33 James Bigler Note Added: 0036830
2014-09-19 02:35 James Bigler Note Edited: 0036830
2014-09-30 13:20 Nick Maludy Note Added: 0036878
2014-09-30 15:10 James Bigler Note Added: 0036880
2014-10-01 09:15 Nick Maludy Note Added: 0036881
2014-10-01 10:32 Nick Maludy Note Edited: 0036881
2014-10-01 10:33 Nick Maludy Note Edited: 0036881
2016-02-10 13:15 Too-Ticky Note Added: 0040466
2016-03-15 10:19 Karl Ljungkvist Note Added: 0040693
2016-03-15 10:27 Karl Ljungkvist Note Edited: 0040693
2016-03-15 13:36 Karl Ljungkvist Note Added: 0040696
2016-06-10 14:29 Kitware Robot Note Added: 0042629
2016-06-10 14:29 Kitware Robot Status assigned => resolved
2016-06-10 14:29 Kitware Robot Resolution open => moved
2016-06-10 14:31 Kitware Robot Status resolved => closed

