[CMake] [PATCH] major performance improvement for the C dependency scanner
Alexander Neundorf
a.neundorf-work at gmx.net
Wed Nov 30 13:24:12 EST 2005
Hi,
the attached patch reduces the time a cmake-generated Makefile needs
until it actually starts to compile something on my box from 23 seconds
down to 7 seconds. This is still much too long, but already a lot better.
The box is a PIII/450 MHz, and the depend.make file is about 8500 lines
long.
The patch does the following: cmDependsC::Scan() scans a C file line by
line for included files. If a header is included by multiple files, it is
scanned again for each of them. The patch introduces a cache that stores
all include lines found per file.
When a file is about to be scanned, the cache is checked first, and if it
already contains the file, the file isn't scanned again.
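To illustrate the idea, here is a minimal sketch of such a per-file cache. It is not the code from the attached patch: the class name IncludeCache, the readFile callback, and the simplified include regex are all assumptions made for this example, and the real cmDependsC scanner does more (for instance, resolving headers against the search path).

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <regex>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the cache described above: maps a file name to
// the include names found in that file the first time it was scanned.
class IncludeCache {
public:
  // readFile is an assumed callback that returns the text of a file; it
  // keeps the sketch independent of the file system.
  explicit IncludeCache(std::function<std::string(const std::string&)> readFile)
    : ReadFile(std::move(readFile)) {
    assert(ReadFile);  // an empty callback would make every scan fail
  }

  // Returns the include names of `path`, scanning the file only on first use.
  const std::vector<std::string>& Includes(const std::string& path) {
    auto it = Cache.find(path);
    if (it != Cache.end()) {
      ++Hits;  // already scanned: reuse the stored result
      return it->second;
    }
    ++Scans;
    return Cache.emplace(path, Scan(ReadFile(path))).first->second;
  }

  int Scans = 0;  // files that were actually scanned
  int Hits = 0;   // lookups answered from the cache

private:
  // Line-by-line scan with a simplified include pattern (an assumption,
  // not the regex CMake uses).
  static std::vector<std::string> Scan(const std::string& text) {
    static const std::regex re(
      R"re(^[ \t]*#[ \t]*include[ \t]*["<]([^">]+)[">])re");
    std::vector<std::string> found;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
      std::smatch m;
      if (std::regex_search(line, m, re)) {
        found.push_back(m[1].str());
      }
    }
    return found;
  }

  std::function<std::string(const std::string&)> ReadFile;
  std::map<std::string, std::vector<std::string>> Cache;
};
```

Calling Includes() twice with the same path runs the regex scan only once; the second call is a map lookup. The sketch keys the cache on the path string exactly as given, so the same header reached through different spellings (absolute, relative, containing "../") would be cached separately.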
The patch has one issue: my depend.make went down from 8500 lines to 6500
lines. I'm not sure where this comes from. I guess it must be related to
the header search path. I noticed that some of the files listed in
depend.make have absolute paths, some have relative paths, and some
paths also contain "../". Maybe this is somehow related to the 2000 fewer
lines in depend.make.
930 files were scanned, and several thousand were served from the cache.
Further ideas on how to make it faster:
* maybe the line-by-line reading is not optimal
The complete file could first be read into memory and then parsed from
there. I'm not sure how much this would gain.
* in most C files the include lines are at the top of the file
It would be nice if this could somehow be exploited. Maybe, once the
complete file is in memory and before actually parsing it line by line,
go through it once and simply count the '#' characters it contains. Then
parse it line by line and count the '#' characters again. Once all of
them have been found, stop processing this file. I would hope that the
benefit of stopping the (expensive regexp) line-by-line parsing earlier
is bigger than the cost of a (cheap) single-byte comparison over the
whole file.
* cache the contents of the new m_fileCache in a file on disk
When a cmDependsC object is created, fill m_fileCache with the contents
of the saved file. Then check for each file whether it has changed since
the cache was written to disk. If that's the case, remove the entry from
the cache. I think this could be a big speedup, but it is slightly beyond
my cmake-hacking skills.
* make the m_fileCache more global
In the current patch the m_fileCache is created and deleted for every
cmDependsC object. I don't know under which circumstances such an object
is created. In my project three of them were created (maybe one for C
files, one for C++ files, and another one?).
If the cache were shared among all of them, it could be a significant
gain. Maybe it should be cleared for every new target. I don't know where
this would have to be done in the code.
What do you think?
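The '#'-counting idea from the list above could look roughly like the sketch below. This is only an illustration and has not been measured: ScanWithEarlyStop, ScanResult, and the simplified include regex are made up for the example. It also skips the regex on lines that contain no '#' at all, which is an extra shortcut beyond what the idea describes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result type for this sketch.
struct ScanResult {
  std::vector<std::string> includes;
  std::size_t linesRead = 0;  // lines looked at before the scan stopped
};

// Scans `text` (a whole file already held in memory) for include lines and
// stops as soon as every '#' in the file has been seen.
ScanResult ScanWithEarlyStop(const std::string& text) {
  // Simplified include pattern (an assumption, not the regex CMake uses).
  static const std::regex re(
    R"re(^[ \t]*#[ \t]*include[ \t]*["<]([^">]+)[">])re");

  ScanResult result;

  // Cheap pass: one byte comparison per character of the file.
  std::size_t remaining =
    static_cast<std::size_t>(std::count(text.begin(), text.end(), '#'));

  std::istringstream in(text);
  std::string line;
  // Expensive pass: stop once no '#' is left in the rest of the file.
  while (remaining > 0 && std::getline(in, line)) {
    ++result.linesRead;
    const std::size_t hashes =
      static_cast<std::size_t>(std::count(line.begin(), line.end(), '#'));
    if (hashes == 0) {
      continue;  // extra shortcut: a line without '#' cannot be an include
    }
    assert(hashes <= remaining);  // per-line counts sum to the file total
    remaining -= hashes;
    std::smatch m;
    if (std::regex_search(line, m, re)) {
      result.includes.push_back(m[1].str());
    }
  }
  return result;
}
```

For a file whose only '#' characters are in two include lines at the top, the loop ends after the second line regardless of how long the rest of the file is. A single '#' near the end of the file (for example a closing #endif of an include guard) forces a full pass, so the gain depends heavily on the code base.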
Bye
Alex
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cmDependsC.patch
Type: text/x-diff
Size: 3756 bytes
Desc: not available
Url : http://public.kitware.com/pipermail/cmake/attachments/20051130/72734051/cmDependsC.bin