[cmake-developers] Experiments in CMake support for Clang (header & standard) modules

Fri Aug 24 05:35:31 EDT 2018

On 24/08/18 02:32, David Blaikie wrote:
> On Tue, Jul 24, 2018 at 3:20 PM Stephen Kelly <steveire at gmail.com> wrote:
>
>     David Blaikie wrote:
>
>     > (just CC'ing you Richard in case you want to read my ramblings/spot any
>     > inaccuracies, etc)
>     >
>     > Excuse the delay - coming back to this a bit now. Though the varying
>     > opinions on what modules will take to integrate with build system still
>     > weighs on me a bit
>
>     Can you explain what you mean by 'weighs on you'? Does that mean
>     you see it
>     as tricky now?
>
>
> Yes, to some extent. If the build system is going to require the 
> compiler-callsback-to-buildsystem that it sounds like (from 
> discussions with Richard & Nathan, etc) is reasonable - yeah, I'd say 
> that's a bigger change to the way C++ is compiled than I was 
> expecting/thinking of going into this.

Yes.

>     I've kind of been assuming that people generally think it is not
>     tricky, and
>     I'm just wrong in thinking it is and I'll eventually see how it is
>     all
>     manageable.
>
>
> I think it's manageable - the thing that weighs on me, I suppose, is 
> whether or not the community at large will "buy" it, as such.

Yes, that has been my point since I first started talking about modules. 
I don't think modules will gain a critical mass of adoption as currently 
designed (and as currently designed to work with buildsystems).

> And part of that is on the work we're doing to figure out the 
> integration with build systems, etc, so that there's at least the 
> first few pieces of support that might help gain user adoption to 
> justify/encourage/provide work on further support, etc...

Yes, reading the document Nathan sent us on June 12th this year, it 
seems that CMake would have to implement a server mode so that the 
compiler will invoke it with RPC. That server will also need to consume 
some data generated by CMake during buildsystem generation (eg user 
specified flags) and put that together with information sent by the 
compiler (eg ) in order to formulate a response. It's complex. Maybe 
CMake and other buildsystem generators can do it, but there are many 
bespoke systems out there which would have to have some way to justify 
the cost of developing such a thing.
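
To make the shape of that work concrete, here is a rough sketch of the lookup such a server would have to perform. It is not the protocol from Nathan's document: the JSON request/response format, the `GENERATED` table, the `build/` paths and the `answer_query` name are all assumptions made up for illustration.

```python
import json

# Hypothetical data written by the generator at buildsystem-generation time.
# The layout and the build/ paths are assumptions, not a real CMake format.
GENERATED = {
    "flags": ["-fmodules-ts"],  # stand-in for user-specified flags
    "modules": {
        "AbstractFruit": {
            "source": "AbstractFruit.cppm",
            "pcm": "build/AbstractFruit.pcm",
        },
    },
}


def answer_query(generated, request_json):
    """Combine generation-time data with a compiler query into a response.

    The request is assumed to look like {"module": "<name>"}.
    """
    request = json.loads(request_json)
    name = request["module"]
    entry = generated["modules"].get(name)
    if entry is None:
        # The server knows nothing about this module; the compiler would
        # have to fail or fall back to some other lookup.
        return json.dumps({"status": "unknown-module", "module": name})
    return json.dumps({
        "status": "ok",
        "source": entry["source"],
        "pcm": entry["pcm"],
        "flags": generated["flags"],
    })


print(answer_query(GENERATED, '{"module": "AbstractFruit"}'))
print(answer_query(GENERATED, '{"module": "Grape"}'))
```

Even in this toy form, the server depends on a table that only the generator can produce, which is the part a bespoke system would have to build from scratch.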

>     > The build.sh script shows the commands required to build it (though I
>     > haven't checked the exact fmodule-file dependencies to check that they're
>     > all necessary, etc) - and with current Clang top-of-tree it does build and
>     > run the example dinnerparty program.
>
>     Ok. I tried with my several-weeks-old checkout and it failed on
>     the first
>     command with -modules-ts in it (for AbstractFruit.cppm - the
>     simplest one).
>
>     I'll update my build and try again, but that will take some time.
>
>
> Huh - I mean it's certainly a moving target - I had to file/workaround 
> a few bugs to get it working as much as it is, so not /too/ 
> surprising. Did you get it working in the end? If not, could you 
> specify the exact revision your compiler's at and show the complete 
> output?

Yes, I got it working. See

  https://www.mail-archive.com/cmake-developers@cmake.org/msg18623.html

>     > But I'm not sure how best to determine the order in which to build
>     > files within a library - that's where the sort of -MM-esque stuff, etc,
>     > would be necessary.
>
>     Would it? I thought the -MM stuff would mostly be necessary for
>     determining
>     when to rebuild? Don't we need to determine the build order before
>     the first
>     build of anything? The -MM stuff doesn't help that.
>
>
> -MM produces output separate from the compilation (so far as I can 
> tell - clang++ -MM x.cpp doesn't produce anything other than the 
> makefile fragment on stdout) & finds all the headers, etc. So that's 
> basically the same as what we'd need here

Are you sure? I thought compiling with -MM gives us information that we 
need before we compile the first time. Sorry if that was not clear from 
what I wrote above. I see a chicken-egg problem. However, I assume I'm 
just misunderstanding you (you said that -MM would be used to determine 
build order for the initial build) so let's just drop this.
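
For reference, the ordering problem itself is small once the import graph is known; the hard part is obtaining that graph before anything has been compiled. A minimal sketch, assuming the graph below (the `FruitBowl`, `Apple` and `Grape` edges are made up for illustration and may not match the repo):

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical import graph: module -> set of modules it imports.
IMPORTS = {
    "AbstractFruit": set(),
    "Apple": {"AbstractFruit"},
    "Grape": {"AbstractFruit"},
    "FruitBowl": {"Apple", "Grape"},
}


def build_order(imports):
    """Return an order in which every module comes after what it imports."""
    # TopologicalSorter treats the dict values as predecessors, so
    # static_order() yields imported modules before their importers.
    return list(TopologicalSorter(imports).static_order())


order = build_order(IMPORTS)
print(order)
```

Here `AbstractFruit` always comes first and `FruitBowl` last; the relative order of `Apple` and `Grape` is unspecified, which is also where a build tool could compile in parallel.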

> Looking at your example - if you have a library for all the fruits and 
> libabstractfruit, libfruitsalad, libnotfruitsalad, and libbowls - then 
> you'd have one module interface for each of those (AbstractFruit.cppm, 
> FruitSalad.cppm, NotFruitSalad.cppm, Bowls.cppm) that would be 
> imported (so replace "import Apple", "import Grape" with "import 
> FruitSalad", etc... ) & the implementations could be in multiple files 
> if desired (Apple.cpp, Grape.cpp, etc).

Could you show me what that would look like for the repo? I am 
interested to know if this approach means concatenating the content of 
multiple files (eg Grape.h and Apple.h) and porting that result to a 
module. My instinct says that won't gain adoption.

>     >> Ok. That's not much better though. It still means editing/generating the
>     >> buildsystem each time you add an import.
>     >
>     >
>     > Isn't that true today with headers, though?
>
>     No. Imagine you implemented FruitBowl.cpp in revision 1 such that
>     it did not
>     #include Grape.h and it did not add the Grape to the bowl.
>
>     Then you edit FruitBowl.cpp to #include Grape.h and add the Grape
>     to the
>     bowl. Because Grape.h and Apple.h are in the same directory (which
>     you
>     already have a -Ipath/to/headers for in your buildsystem), in this
>     (today)
>     scenario, you don't have to edit the buildsystem.
>
>
> Well, you don't have to do it manually, but your build system ideally 
> should reflect this new dependency so it knows to rebuild 
> FruitBowl.cpp if Grape.h changes.

I never said it had to be done manually in the real world. I mentioned 
that in the context of your script. The point I keep making is that the 
buildsystem has to be regenerated.
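
For contrast, in the header case today the new dependency on Grape.h shows up in the makefile fragment that the compiler emits, and the build tool reads it on the next run without regenerating anything. A sketch of reading such a fragment, assuming the simple form (one rule, backslash line continuations, no escaped spaces in paths); the `FRAGMENT` text is an illustrative example, not captured compiler output:

```python
def parse_make_fragment(text):
    """Parse 'target: prereq prereq ...' rules into {target: [prereqs]}."""
    # Join backslash-continued lines before splitting into rules.
    joined = text.replace("\\\n", " ")
    deps = {}
    for line in joined.splitlines():
        if ":" not in line:
            continue
        target, _, prereqs = line.partition(":")
        deps[target.strip()] = prereqs.split()
    return deps


# Illustrative fragment in the style of -MM output.
FRAGMENT = "FruitBowl.o: FruitBowl.cpp Apple.h \\\n  Grape.h\n"

deps = parse_make_fragment(FRAGMENT)
print(deps)
```

This only tells the build tool when to rebuild FruitBowl.o. It does not add a new build step, which is what an `import` of a not-yet-built module requires.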

>     Perhaps. I notice that running CMake on my
>     llvm/clang/clang-tools-extra
>     checkout takes a non-zero amount of time, and for other
>     buildsystems takes a
>     significantly non-zero amount of time.
>
>     Many buildsystem generators already avoid the time/complexity of
>     automatically regenerating the buildsystem when needed. Users have
>     to leave
>     their IDE and run a script on the command line.
>
>
> That surprises me a bit

Yes, there is a large diversity out there in the world regarding how 
things work.

>     I wonder if people will use C++ modules if CMake/their generator
>     has to be
>     re-run (automatically or through explicit user action) every time
>     they add
>     'import foo;' to their C++ code... What do you think?
>
>
> If it's automatic & efficient (I hope it doesn't redo all the work of 
> discovery for all files - just the ones that have changed) it seems 
> plausible to me.

At least in the CMake case, the logic is currently coarse - if the 
buildsystem needs to be regenerated, the entire configure and generate 
steps are invoked. Maybe that can be changed, but that's just more 
effort required on the part of all buildsystem generators, including 
bespoke ones. I think the level of effort being pushed on buildsystems 
is not well appreciated by the modules proposal.

What I see as a worst-case scenario is:

* Modules gets added to the standard to much applause
* Users realize that they have to rename all of their .h files to cppm 
and carefully change those files to use imports. There are new 
requirements regarding where imports can appear, and things don't work 
at first because of various reasons.
* Maybe some users think that creating a module per library is a better 
idea, so they concat those new cppm files, sorting all the imports to 
the top.
* Porting to Modules is hard anyway, because dependencies also need to 
be updated etc. Developers don't get benefits until everything is 'just 
right'.
* Some popular buildsystems develop the features to satisfy the new 
requirements
* Most buildsystems, which are bespoke, don't implement the GCC 
oracle-type stuff and just fudge things with parsing headers using a 
simple script which looks for imports. It kind of works, but is fragile.
* Lots of time is spent on buildsystems being regenerated, because the 
bespoke systems don't get optimized in this new way.
* After a trial run, most organizations that try modules reverse course 
and stop using them.
* Modules deemed to have failed.

Maybe I'm being too negative, but this seems to be the likely result to 
me. I think there are more problems lurking that we don't know about 
yet. But, I've said this before, and I still hope I'm wrong and just 
missing something.
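
To illustrate the "simple script which looks for imports" in the list above, and why it is fragile, here is a minimal sketch. The regular expression and the sample source are assumptions for illustration (`Banana` is a made-up module name); this is not how any existing tool works.

```python
import re

# Naive pattern: an optional 'export', then 'import <name>;' at line start.
IMPORT_RE = re.compile(
    r"^\s*(?:export\s+)?import\s+([A-Za-z_][\w.]*)\s*;",
    re.MULTILINE,
)


def scan_imports(source):
    """Return every module name that textually looks like an import."""
    return IMPORT_RE.findall(source)


# Only 'Apple' is really imported: 'Grape' is disabled by the
# preprocessor and 'Banana' sits inside a block comment.
FRUIT_BOWL = """\
module FruitBowl;
import Apple;
#if 0
import Grape;
#endif
/*
import Banana;
*/
"""

found = scan_imports(FRUIT_BOWL)
print(found)  # ['Apple', 'Grape', 'Banana']
```

The script reports three imports where the compiler would see one, because it knows nothing about preprocessing or comments. Getting this right means running something close to the real front end, which is the cost the bespoke systems are unlikely to pay.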

> Sorry for the rather long delay on this - hopefully it helps us 
> converge a little.
>
> I'll try to find some time to get back to my original prototype & your 
> replies to do with that to see if I can flesh out the simpler "one 
> module per library (with some of the inefficiency of just assuming 
> strong dependencies between libraries, rather than the fine grained 
> stuff we could do with -MM-esque support), no external modules" 
> scenario (& maybe the retro/"header modules" style, rather than/in 
> addition to the new C++ modules TS/atom style) - would be great to 
> have a reasonable prototype of that as a place to work from, I think.

Yes, sounds interesting.

There are other things we would want to explore then too. In particular, 
in my repo, all of the examples are part of the same buildsystem. We 
should model external dependencies too - ie, pretend each library has a 
standalone/hermetic buildsystem. That would mean that AbstractFruit 
would generate its own pcm files to build itself, but each dependency 
would also have to generate the AbstractFruit pcm files in order to 
compile against it as an external library (because pcm files will never 
be part of an install step, or a linux package or anything - they are 
not distribution artifacts).

Thanks,

Stephen.
