[CMake] Better BlueGene/P and cross-compile support for CMake
Todd Gamblin
tgamblin at llnl.gov
Fri Jun 25 19:08:01 EDT 2010
On Jun 25, 2010, at 12:06 PM, Brad King wrote:
> Alexander Neundorf wrote:
>> On Wednesday 23 June 2010, Todd Gamblin wrote:
>>> 3a. First, using this setup, FindMPI fails because the last library it
>>> needs is in /bgsys/drivers/ppcfloor/runtime/SPI, not
>>> /bgsys/drivers/ppcfloor/runtime/SPI/lib. CMAKE_FIND_ROOT_PATH seems to
>>> assume that its elements are just above a lib directory, but that's not how
>>> things are structured on BG/P (you can blame IBM). So, if I just use the
>>> root path for searching, find_library fails for libs in runtime/SPI.
>>
>> I guess you need a special Platform/BlueGeneP.cmake file, which not simply
>> includes UnixPaths.cmake, but sets some special directories.
>> Please add a bug report for this on http://public.kitware.com/Bug/ too.
>
> Good idea. There is a distinction to understand here:
>
> (1) CMAKE_FIND_ROOT_PATH is meant for cross-compiling and should
> appear only in toolchain files. It re-roots all search paths.
> Usually the goal is to find libraries only from the target
> platform SDK (and not the host), which is why we suggest using
>
> set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
>
> Typically CMAKE_FIND_ROOT_PATH should be set to the *top* of the
> SDK tree, such as "/bgsys/drivers/ppcfloor" in your case.
This makes sense, but it's actually slightly more complicated for BG/P. I just got off the phone with our lead BG sysadmin (Adam Bertsch) about this, and here's what I got about the layout of that directory:
/bgsys/drivers/ppcfloor
    This is the root of the BG system software release ("driver" in IBM-speak), and it
    contains files for the frontends, the I/O nodes, and the compute nodes. We probably
    don't want to make this the system root for the backend, because it would allow other
    types of binaries into the build.

/bgsys/drivers/ppcfloor/mcp-2.6.16.60
    This is the root of the I/O node system image. This is the 32-bit PPC
    Linux (pretty sure it's SuSE too) that runs on the I/O nodes (IONs) of BG/P. I do tools
    development, and it's possible to run daemons like TotalView on the IONs, so I might
    at some point try to make a platform/toolchain file for this, but not in the near future, if
    at all. This shouldn't go in the plain BG/P toolchain file.

    FYI -- the version number here is the Linux kernel version, which *could* change, but
    typically IBM doesn't upgrade the BG ION kernels during a particular BG product's
    lifetime, or at least they haven't yet, for L or P.
Now, for the Compute Nodes (CNs). Adam essentially said that there *is* no formal documentation about where you should look to find CN system libraries. However, there are the MPI compiler scripts in /bgsys/drivers/ppcfloor/comm/bin. These were written by Argonne and are now included with the BG driver distribution, and they have to link against all the BG system libraries. I ran all the compilers with -show, and unioned all the args I was getting with that command. This narrows it down to a few directories:
/bgsys/drivers/ppcfloor/comm/default/lib
/bgsys/drivers/ppcfloor/comm/sys/lib
/bgsys/drivers/ppcfloor/runtime/SPI
That's it. Those are really the only backend directories that contain libraries used by apps. The CN Kernel (CNK) OS is minimal, so this makes sense, and if you look in those directories, they contain MPI libs and various lower-level comm APIs like DCMF, which BG uses. Now, there are also directories for the particular toolchains, and I'm not sure you care about these, because the BG compilers already link against the libraries in these directories. For GNU, the backend compiler runtime libs are in:
/bgsys/drivers/ppcfloor/gnu-linux
And for XL compilers, the runtime libraries seem to be in:
/opt/ibmcmp/<compiler>/bg/<version>/bglib
I believe the compiler already puts these in the runtime path (or they're somewhere in the CNK runtime load path), so I don't think they belong in a toolchain file.
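For anyone who wants to repeat the -show survey described above, here is a rough sketch of how it could be scripted in CMake's script mode. The wrapper names (mpicc, mpicxx, mpif77, mpif90) are assumptions on my part; substitute whatever actually lives in comm/bin on your system. It only collects -L directories, which is all the survey above needed.

```cmake
# survey_mpi_wrappers.cmake -- run with: cmake -P survey_mpi_wrappers.cmake
# Wrapper names are assumptions; list whatever is in comm/bin on your system.
set(wrapper_dir /bgsys/drivers/ppcfloor/comm/bin)
set(wrappers mpicc mpicxx mpif77 mpif90)

set(lib_dirs)
foreach(w ${wrappers})
  if(EXISTS "${wrapper_dir}/${w}")
    # Ask the wrapper to print the underlying compile/link line.
    execute_process(
      COMMAND "${wrapper_dir}/${w}" -show
      OUTPUT_VARIABLE cmdline
      ERROR_QUIET
      OUTPUT_STRIP_TRAILING_WHITESPACE)
    # Pull out every -L<dir> argument from the reported line.
    string(REGEX MATCHALL "-L[^ \t\n]+" flags "${cmdline}")
    foreach(f ${flags})
      string(REGEX REPLACE "^-L" "" dir "${f}")
      list(APPEND lib_dirs "${dir}")
    endforeach()
  endif()
endforeach()

# Union of the directories across all wrappers.
if(lib_dirs)
  list(REMOVE_DUPLICATES lib_dirs)
endif()
foreach(d ${lib_dirs})
  message(STATUS "library dir: ${d}")
endforeach()
```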
> (2) CMAKE_SYSTEM_PREFIX_PATH, CMAKE_SYSTEM_LIBRARY_PATH, and
> similar variables are meant for controlling the search *within*
> the target platform SDK to find different types of files. See
> help for find_library, find_program, and find_path for specific
> details. This is where the "<prefix>/lib" assumption occurs:
>
> - find_library looks in <CMAKE_SYSTEM_PREFIX_PATH>/lib
> - find_library looks in CMAKE_SYSTEM_LIBRARY_PATH, but the
> Modules/Platform/UnixPaths.cmake file sets this variable
> with "/lib" at the end of most of its paths.
Ok, given all that I've said above, there is still a small problem. The BG CNK system libraries are rooted in /bgsys/drivers/ppcfloor/, but /lib and /lib64 under that prefix are FRONT END libraries. This means that you really shouldn't look in those two directories. /lib64 you don't need to worry about, since the BG backend is 32-bit, and /lib you *probably* don't need to worry about too much because it only has 3 libraries. The only one I could see having a name conflict is libedit.a, but I guess that is a minimal risk. So it's *probably* safe to set your CMAKE_FIND_ROOT_PATH to /bgsys/drivers/ppcfloor, but technically it's outside your definition above. The sad truth is that there is no consistent target environment directory on BG/P.
Thinking about this some more, maybe you just don't *need* a find root on BG/P, since there really isn't a valid "default" location for any libraries you'd look for with find_library, find_package, etc., the way there is on Linux. Things aren't installed there on BG; they're already linked into the binaries by the compiler wrappers. So maybe the right thing to do, given the way CMake defines CMAKE_FIND_ROOT_PATH, is to leave it empty. You can get all the system information you need from the MPI compiler on BG/P, and that's autodetected by FindMPI, so maybe that is the way to go.
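A minimal sketch of what that "no find root, just ask the MPI wrapper" approach might look like in a project. This assumes a CMake new enough (3.9 or later) that FindMPI honors MPI_C_COMPILER and provides the MPI::MPI_C imported target; older releases use different variable names. The wrapper name mpicc and the source file main.c are placeholders.

```cmake
cmake_minimum_required(VERSION 3.9)
project(bgp_app C)

# No CMAKE_FIND_ROOT_PATH is set. FindMPI interrogates the wrapper compiler
# for its include and library paths. The wrapper would normally be passed on
# the command line (mpicc is an assumed name):
#   cmake -DMPI_C_COMPILER=/bgsys/drivers/ppcfloor/comm/bin/mpicc ..
find_package(MPI REQUIRED)

add_executable(app main.c)   # main.c is a placeholder source file
target_link_libraries(app PRIVATE MPI::MPI_C)
```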
> What Alex suggests is to create a Modules/Platform/BlueGeneP.cmake
> file that defines CMAKE_SYSTEM_LIBRARY_PATH to something like
>
> /runtime/SPI
>
> and other paths relative to CMAKE_FIND_ROOT_PATH entries. This
> tells CMake how to find libraries within the target platform SDK.
>
> IIRC, you can try this out by setting CMAKE_SYSTEM_LIBRARY_PATH
> directly in your toolchain file, but I do not remember for sure
> if it works there.
The problem here is that on BG machines, there are frequently a lot of custom package directories that you want to supply to the compiler. At Argonne, these are all in /soft/apps. At LLNL, they're in various common places, and users also build their own libraries for the backend. These are typically installed in a common place and used by lots of people, but it's not a standard place, so I don't think it's something that should go in the toolchain file. You want a user to be able to specify it as an option to a CMake build.
The problem right now is really that the find_ commands ignore both PATHS and HINTS in a cross-compile environment, even when they might be correct. From the documentation of find_library, PATHS are supposed to come from user-supplied locations, and HINTS are supposed to come from system introspection. Given that, shouldn't you accept PATHS in a cross-compile environment? This would allow users to specify custom libraries for the target platform even when the toolchain file does this:
> set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
I feel like this is the right thing to do, since it's assuming the user knows what they're doing when they tell you the location of a library. I can see why you would want to ignore HINTS in this scenario, as you're assuming that the introspection isn't going to be right for a cross-compile.
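For what it's worth, find_library does accept a per-call NO_CMAKE_FIND_ROOT_PATH option, so an individual project can opt a user-supplied location out of re-rooting. A minimal sketch, assuming a hypothetical library "foo" and a hypothetical cache variable FOO_ROOT that the user sets at configure time:

```cmake
# FOO_ROOT and the library name "foo" are hypothetical. The user passes e.g.
#   cmake -DFOO_ROOT=/soft/apps/foo ..
set(FOO_ROOT "" CACHE PATH "Install prefix of a user-built backend library")

find_library(FOO_LIBRARY
  NAMES foo
  PATHS "${FOO_ROOT}/lib"
  NO_DEFAULT_PATH            # search only the user-supplied location
  NO_CMAKE_FIND_ROOT_PATH)   # do not re-root it under CMAKE_FIND_ROOT_PATH

if(NOT FOO_LIBRARY)
  message(FATAL_ERROR "libfoo not found under FOO_ROOT='${FOO_ROOT}'")
endif()
```

This has to be written into every find call, though, so it doesn't help with existing Find modules that don't pass the option.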
Now if you allow PATHS outside the CMAKE_FIND_ROOT_PATH and disallow HINTS, you still have the problem that FindMPI uses HINTS to locate the compiler libraries, so it won't find things if you set the CMAKE_FIND_ROOT_PATH to empty. But maybe the right thing to do is to set things up the way Brad is suggesting, with CMAKE_FIND_ROOT_PATH set to /bgsys/drivers/ppcfloor and CMAKE_SYSTEM_LIBRARY_PATH set to /runtime/SPI and the various other directories under ppcfloor/comm. As I mentioned above, it's not so dangerous to search ppcfloor/lib, as there are only 3 frontend libraries there, and adding these system directories as special cases would allow them to be found by FindMPI on BG/P. And users could still specify external PATHS for custom libraries in shared filesystems outside the target environment root. I think this would at least get my builds to work.
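To make that concrete, here is a sketch of just the search-path part of such a setup (compiler and system-name settings omitted). It is untested, and whether CMAKE_SYSTEM_LIBRARY_PATH can be set from a toolchain file at all, rather than from a Modules/Platform file, is the open question Brad raised above.

```cmake
# Search-path portion of a BG/P toolchain (or platform) file -- a sketch only.
set(CMAKE_FIND_ROOT_PATH /bgsys/drivers/ppcfloor)
set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)

# These are relative to the CMAKE_FIND_ROOT_PATH entries, which re-root them,
# so they resolve to the three backend library directories listed earlier.
set(CMAKE_SYSTEM_LIBRARY_PATH
  /runtime/SPI
  /comm/default/lib
  /comm/sys/lib)
```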
Or, maybe find_library needs a new type of parameter for systems that have a target platform library directory but ALSO share filesystems that contain target binaries with the target runtime environment. This is the setup on the BG systems and the Cray XT systems right now, and I think this is what's giving the build so many problems.
-Todd