ID: 0003349
Project: CMake
Category: CMake
View Status: public
Date Submitted: 2006-06-08 02:03
Last Update: 2008-01-04 14:24
Reporter: Thomas Zander
Assigned To: Brad King
Priority: high
Severity: minor
Reproducibility: always
Status: closed
Resolution: fixed
Summary: 0003349: Installing should not install unchanged files
Description: It should be clear that copying files (even more so for things like libraries) while source and target have the same dates, and thus have the same content, should be avoided.

Therefore I would like to request that cmake + make be a bit smarter in choosing which files to install, and first do a stat to figure out whether a file actually has to be installed.

  Notes
(0004175)
David Jarvie (reporter)
2006-06-08 04:53

Please, if this is implemented, only make it an option. When creating packages (e.g. using checkinstall), it's essential that ALL files can be installed, whether changed or not.
(0004773)
Alex Neundorf (developer)
2006-08-27 11:11

Is it currently possible to force installing everything (even if it hasn't changed)?
If so, I think this bug can be closed.
(0004804)
Brad King (manager)
2006-08-29 15:29

This has already been implemented (but could be made faster, see below). Currently the goal is to minimize the rebuilding needed by projects built against the installed headers/libraries. Therefore the installation installs missing files but compares the contents of already-installed files against the to-be-installed files. Whether a file already exists or not CMake always reports all installations to make it clear that the file is up to date. The check for unchanged files can be disabled by setting CMAKE_INSTALL_ALWAYS=1 in the environment before installing.

Comparing file contents to avoid installing unchanged files does minimize unnecessary updating of installed file times, but it is expensive. The second installation may be even slower than the first because the contents of both the old and new files must all be read and compared instead of just copied once.

I propose an md5sum-based approach to make the comparison faster. The installer can keep a map of the form

  {/full/path/to/file} -> {md5 timestamp, md5 sum}

Whenever a file is to be installed, both its source location and destination location entries in this map are created or updated. Whenever a file's timestamp is newer than the md5 timestamp recorded for it in the table, its md5 sum is recomputed. This map can then be used for very quick comparison of files, which should greatly speed up second installations. After installation the map can be serialized to disk and reloaded later to avoid computing the sums from scratch every time.
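
The proposed map could look roughly like the following Python sketch. This is only an illustration of the idea in this note, not CMake code: the class name `Md5Cache`, its methods, and the JSON cache file are all hypothetical, and the "md5 timestamp" is assumed to be the file's modification time at the moment the sum was computed.

```python
import hashlib
import json
import os


class Md5Cache:
    """Hypothetical map: absolute path -> [mtime when hashed (ns), md5 hex digest]."""

    def __init__(self, cache_file):
        self.cache_file = cache_file
        self.entries = {}
        # Reload a previously serialized map, if there is one.
        if os.path.exists(cache_file):
            with open(cache_file) as f:
                self.entries = json.load(f)

    def digest(self, path):
        path = os.path.abspath(path)
        mtime = os.stat(path).st_mtime_ns
        entry = self.entries.get(path)
        # Recompute only if there is no entry yet or the file is newer
        # than the timestamp recorded in the table.
        if entry is None or mtime > entry[0]:
            with open(path, "rb") as f:
                entry = [mtime, hashlib.md5(f.read()).hexdigest()]
            self.entries[path] = entry
        return entry[1]

    def same_content(self, source, destination):
        # A missing destination never counts as "unchanged".
        if not os.path.exists(destination):
            return False
        return self.digest(source) == self.digest(destination)

    def save(self):
        # Serialize the map so later installations can reuse the sums.
        with open(self.cache_file, "w") as f:
            json.dump(self.entries, f)
```

With such a cache, a second installation only needs one `stat` per file plus a dictionary lookup, as long as neither file has been touched since its sum was recorded.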
(0004806)
Brad King (manager)
2006-08-29 15:31

See notes in bug#2691 made at "9:36 AM 08-25-2006" for other details.
(0004811)
Thomas Zander (reporter)
2006-08-29 17:09

Would you consider using 'stat'?
(0004812)
Brad King (manager)
2006-08-29 17:14

Whether the file time in the build tree is newer or older than that in the install tree does not matter. It is easy to construct cases where either file is older than the other both when installation is desired and when it is not desired.

If I download a new source tarball and re-install the project all the headers would get updated if stat were used even if they didn't change. Then anything built using those headers would have to rebuild.

The only reliable way to do this is to compare file contents. The md5sum solution is an optimization of this comparison.
(0004815)
Thomas Zander (reporter)
2006-08-30 04:26

A couple of observations:
- Overwriting an unchanged header after downloading a new source tarball is a corner case that should not be optimized for at the cost of all other cases.
- Files unpacked from tarballs have file dates. It's a bug in the tarball creation if unchanged files have a newer file date.
- This bug report is about doing less work (being faster) at install time, NOT about avoiding overwriting existing files. I expect the file date of the source to be used on the target file anyway, avoiding unneeded recompiles of dependent sources.
- stat also has 'filesize' to tell whether the file changed without doing a calculation. Combined with the date, this allows minimal disk access to answer the question of whether it's safe to not install something.
- All installation software on unix uses stat in the manner I described above. It's a proven method.
(0004818)
Brad King (manager)
2006-08-31 11:09

We do use stat's st_size info to determine if the files are different before reading them. Only if the summary information cannot be used to see that the files are different do we actually do the comparison. The problem you are seeing is that the files are NOT different and it actually takes longer to determine that they are not different than when they have changed.
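
As a rough illustration of the check described in this note (size first, contents only when the sizes match), here is a Python sketch. The function name `files_differ` and the chunk size are made up for the example; this is not the actual CMake implementation.

```python
import os


def files_differ(source, destination, chunk_size=65536):
    """Return True if the files differ, using the cheap size check first."""
    if not os.path.exists(destination):
        return True
    # Summary information: different sizes prove the files differ
    # without reading any file contents.
    if os.stat(source).st_size != os.stat(destination).st_size:
        return True
    # Same size: the contents have to be read and compared.
    with open(source, "rb") as a, open(destination, "rb") as b:
        while True:
            chunk_a = a.read(chunk_size)
            chunk_b = b.read(chunk_size)
            if chunk_a != chunk_b:
                return True
            if not chunk_a:
                # Both files ended without a mismatch.
                return False
```

The sketch shows why the unchanged case is the slow one: a difference can end the loop early, but identical files are only confirmed after both have been read to the end.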

Do you have any references for the widespread use of timestamps for installation?

Computing the md5 sum is not a bad solution: debian packages always install md5sum files to verify the rest of the installation. This solution provides the option of including such a manifest.
(0009387)
Brad King (manager)
2007-10-05 09:48

Looking back at this I don't know what I was thinking before. Clearly the file times are the correct choice. The following changes replace file content comparison with file modification time checks. The installed files are given the file times of their source files:

/cvsroot/CMake/CMake/Source/cmFileCommand.cxx,v <-- cmFileCommand.cxx
new revision: 1.89; previous revision: 1.88
/cvsroot/CMake/CMake/Source/cmSystemTools.cxx,v <-- cmSystemTools.cxx
new revision: 1.350; previous revision: 1.349
/cvsroot/CMake/CMake/Source/cmSystemTools.h,v <-- cmSystemTools.h
new revision: 1.142; previous revision: 1.141
(0009388)
Brad King (manager)
2007-10-05 09:53

The environment variable CMAKE_INSTALL_ALWAYS may be set to override the file time check. This will cause all files to always be installed with the current time.
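
A minimal Python sketch of how the file time check and the override could fit together. It assumes the check is "install only if the source file is newer than the installed file", which is how the later discussion in this report describes it, and it treats any non-empty value of CMAKE_INSTALL_ALWAYS as "set". The function name `install_if_newer` is hypothetical and this is not the code from cmFileCommand.cxx.

```python
import os
import shutil


def install_if_newer(source, destination):
    """Copy source to destination if needed; return True if a copy was made."""
    # Assumption: any non-empty value counts as "set".
    always = bool(os.environ.get("CMAKE_INSTALL_ALWAYS"))
    if not always and os.path.exists(destination):
        # File time check: skip when the installed file is not older.
        if os.stat(source).st_mtime_ns <= os.stat(destination).st_mtime_ns:
            return False
    shutil.copyfile(source, destination)
    if not always:
        # Normal install: give the installed file its source file's times.
        st = os.stat(source)
        os.utime(destination, ns=(st.st_atime_ns, st.st_mtime_ns))
    # With the override, the copy simply keeps the current time.
    return True
```

Under these assumptions, an installed file that was edited after the last build is left alone by a normal install and is only replaced when the override is set.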
(0009389)
Brad King (manager)
2007-10-05 10:03

Fix typo in previous commit:

/cvsroot/CMake/CMake/Source/cmSystemTools.cxx,v <-- cmSystemTools.cxx
new revision: 1.351; previous revision: 1.350
(0009391)
Brad King (manager)
2007-10-05 11:26

I'm closing this report since the requested change has been implemented.
(0009944)
Alex Neundorf (developer)
2007-12-16 14:31

Alexander Neundorf wrote:
> On Sunday 16 December 2007, Brad wrote:
> > Alexander Neundorf wrote:
> > > not too long ago you changed the install behaviour so that the file
> > > contents are not compared anymore, but only the modification times.
> > > Today I noticed a (at least for me) somewhat unexpected behaviour with
> > > that. I had modified an installed file, and then did "make install" and
> > > expected that this would overwrite my modifications. It didn't, I guess
> > > because the installed file was newer than the file from the build tree.
> > > So, this behaves differently than the previous version did.
> > >
> > > I know there is some cmake variable to force it to overwrite everything.
> >
> > export CMAKE_INSTALL_ALWAYS=1
> > make install
> >
> > However that will give all the files the current time as their
> > timestamp. A normal installation will no longer install anything that
> > isn't modified after that. That's probably okay though.
>
> Yes.
>
> > > Do you know how others (autotools, qmake) behave in this case?
> >
> > According to Zander in the bug report mentioned above:
> >
> > "all installation software on unix uses stat in the manner I described
> > above. It's a proven method."
>
> I was surprised that my change was not overwritten, so now I tested with an
> autotools program (chrpath). I did "make install", edited the installed
> doc/chrpath-0.13/INSTALL, did "make install" again, and my change was gone.
>
> > > One could also argue that it is correct not to overwrite the modified
> > > installed file, since the user modified it on purpose.
> >
> > If the user modified it more recently than the version that is about to
> > be installed was modified then the file should probably not be installed.
>
> This is the question.
>
> So the old behaviour was "install if it is different", the new behaviour
> is "install if to-be-installed file is newer".
> How about "install if file is not guaranteed to be still the same", i.e. if
> size or time differs (just differs, no matter if older or newer) ? This
> would be closer to the old behaviour and more similar to autotools
> behaviour.

We have to be careful about using size due to windows newline issues.
Just the timestamp is probably enough. However even that could be a
problem if we want it to be *exactly* the same due to filesystem time
resolution differences.

> > > Maybe the files should be overwritten if the dates differ, not only if
> > > the date of the installed file is older than the one of the source/build
> > > tree file ?
> >
> > I don't think so. That goes counter to the reasoning above and could
> > silently blow away user changes.
>
> This is true, but it is different from what many people are probably used
> to. If they don't like the current behaviour, they could argue "but I did
> make install, and after that the installed stuff should be what I had in the
> build tree".
>
> > I guess we need another option CMAKE_INSTALL_FORCE to force installation
>
> I'd try to get away without another option.
(0010074)
Brad King (manager)
2008-01-04 14:24

I've updated the install decision to install if the file times are at least 1 second apart no matter which file is newer:

/cvsroot/CMake/CMake/Source/cmFileCommand.cxx,v <-- cmFileCommand.cxx
new revision: 1.94; previous revision: 1.93

This should account for modification time truncation on poor filesystems, always install when the files are different, and still give installed files their source file's time.
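
The updated decision can be sketched in Python as follows. The names `needs_install` and `install` are invented for the example and the code is only an approximation of the rule stated in this note, not the change made in cmFileCommand.cxx.

```python
import os
import shutil

ONE_SECOND_NS = 1_000_000_000


def needs_install(source, destination):
    """Install if the destination is missing or the times are >= 1 second apart."""
    if not os.path.exists(destination):
        return True
    delta = os.stat(source).st_mtime_ns - os.stat(destination).st_mtime_ns
    # The direction does not matter, only the distance between the times.
    return abs(delta) >= ONE_SECOND_NS


def install(source, destination):
    """Copy when needed and give the installed file its source file's times."""
    if not needs_install(source, destination):
        return False
    shutil.copyfile(source, destination)
    st = os.stat(source)
    os.utime(destination, ns=(st.st_atime_ns, st.st_mtime_ns))
    return True
```

Because the installed file receives the source file's time, a repeated install finds the times equal (or less than a second apart on a filesystem that truncates them) and skips the copy, while an installed file edited by hand has a later time and is overwritten.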

 Issue History
Date Modified Username Field Change
2007-10-05 09:48 Brad King Note Added: 0009387
2007-10-05 09:53 Brad King Note Added: 0009388
2007-10-05 10:03 Brad King Note Added: 0009389
2007-10-05 11:26 Brad King Status assigned => closed
2007-10-05 11:26 Brad King Note Added: 0009391
2007-10-05 11:26 Brad King Resolution open => fixed
2007-12-16 14:31 Alex Neundorf Note Added: 0009944
2007-12-16 14:31 Alex Neundorf Status closed => assigned
2007-12-16 14:31 Alex Neundorf Resolution fixed => reopened
2008-01-04 14:24 Brad King Status assigned => closed
2008-01-04 14:24 Brad King Note Added: 0010074
2008-01-04 14:24 Brad King Resolution reopened => fixed

