Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change build log to always log the most recent input mtime #1753

Merged
merged 1 commit into from
Mar 20, 2021

Conversation

jdrouhard
Copy link
Contributor

Fixes #1162 and is a continuation of the original idea in #1165.

If an edge's output files' mtimes are compared to the most recent
input's mtime, edges might be calculated as clean even if they are
actually dirty. While an edge's command is running its rule to produce
its outputs and an input to the edge is updated before the outputs are
written to disk, then subsequent runs will think that the outputs are
newer than the inputs, even though the inputs have actually been updated
and may be different than what were used to produce those outputs.

Ninja will now restat all inputs just prior to running an edge's command
and remember the most recent input mtime. When the command completes,
it will stat any discovered dependencies from dep files (if necessary),
recalculate the most recent input mtime, and log it to the build log
file. On subsequent runs, ninja will use this value to compare to the
edge's most recent input's mtime to determine whether the outputs are
dirty.

This extends the methodology used by restat rules to work in all cases.
Restat rules are still unique in that they will clean the edge's output
nodes recursively if the edge's command did not change the output, but
in all cases, the mtime recorded in the log file is now the most recent
input mtime. See the new tests for more clarification.

@jdrouhard jdrouhard force-pushed the log_input_mtime branch 2 times, most recently from 8b36c11 to 7df1459 Compare March 15, 2020 21:46
@jdrouhard
Copy link
Contributor Author

@jhasse let me know when you get a chance to review this or if you have any questions about it.

@bradking you might be interested in this as well. I know you have quite a bit of knowledge about the dependency system of ninja, and the dyndep functionality should make this change work even better.

src/build_test.cc Show resolved Hide resolved
src/build.cc Outdated Show resolved Hide resolved
src/build.cc Outdated
i != deps_nodes.end(); ++i) {
(*i)->StatIfNecessary(disk_interface_, err);
if ((*i)->mtime() > most_recent_input)
most_recent_input = (*i)->mtime();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if one of the just-discovered inputs was updated while the command was running, but after the command loaded it? That's the same race. I don't think we can do much about it though. We have no way to know what its mtime was when the command loaded it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One possible way to address this would be to have the depfile generated by the compiler somehow encode the mtime the file had when the compiler read it. That's way beyond the scope of this PR though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a test that explicitly shows this scenario. Unfortunately like you say, there's nothing we can do. The only way this will work 100% is to use the dyndep functionality to know what the inputs are before running the command.

This code here is just to ensure that discovered deps don't cause subsequent runs to immediately find the output dirty (since a discovered dep might have a newer time than the ones known to the command initially).

for (vector<Node*>::iterator i = edge->inputs_.begin();
i != edge->inputs_.end() - edge->order_only_deps_; ++i) {
if (!(*i)->Stat(disk_interface_, err))
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously this extra stat() call was made only on restat edge inputs in FinishCommand's node_cleaned case. Now we're calling stat() a second time on every input of every edge that runs. This may not matter because in practice the stat() result will probably be in the filesystem's hot cache, and the time it takes is likely small compared to the time it will take the edge's command to run (whose child process will normally actually open and read the whole input file).

I wonder if we can get away with StatIfNecessary here. Even if the input has been updated since our original stat round when ninja started, we still know that the outputs will not be older than that.

Copy link
Contributor Author

@jdrouhard jdrouhard Mar 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought hard about using StatIfNecessary here for awhile. I opted to re-stat because input nodes might be on multiple edges and later edges might be reading a different version of the input if that input is changed after one edge runs but before another begins.

This would cause a false positive on the subsequent run since the recorded mtime would reflect that input's original mtime even though the edge actually ran with the most recent version.

I suppose it's a tradeoff between saving unnecessary rebuilds on subsequent runs vs stat'ing more often while building.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining. If a performance problem surfaces we can re-consider. Otherwise I think Stat() is fine for now.

Copy link
Contributor

@bradking bradking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jhasse I've completed a basic review of this and in principle it LGTM and the tests look adequate.

for (vector<Node*>::iterator i = edge->inputs_.begin();
i != edge->inputs_.end() - edge->order_only_deps_; ++i) {
if (!(*i)->Stat(disk_interface_, err))
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining. If a performance problem surfaces we can re-consider. Otherwise I think Stat() is fine for now.

@jdrouhard
Copy link
Contributor Author

I'm going to squash these review commits. Should be good to go in a sec.

@jdrouhard
Copy link
Contributor Author

@jhasse ping, this is ready for merging!

Copy link
Collaborator

@jhasse jhasse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to take a closer look and really understand the changes. I don't know when I will find the time to do that, sorry.

src/build.cc Outdated Show resolved Hide resolved
src/build_log.cc Show resolved Hide resolved
@jhasse jhasse added this to the 1.11.0 milestone Mar 25, 2020
@jhasse
Copy link
Collaborator

jhasse commented Apr 15, 2020

Should we bump the build_log version for this?

@jdrouhard
Copy link
Contributor Author

Should we bump the build_log version for this?

Maybe? The actual format of the log file didn't change, just the interpretation of one of the fields. Going forward, this code can still use build logs generated by previous versions just fine. But if we want to bump it because of the semantic difference, I'm ok with that too.

@jhasse
Copy link
Collaborator

jhasse commented Apr 15, 2020

Could older ninja versions use the log generated by this PR though?

@jdrouhard
Copy link
Contributor Author

Could older ninja versions use the log generated by this PR though?

Yes. Older versions would simply interpret the output mtime column as the output file's actual mtime when the command completed. Since this PR changes that time to be an earlier time than the current code, ninja versions that don't have this PR will be more aggressive at deciding outputs are older and therefore dirty.

@jdrouhard
Copy link
Contributor Author

@jhasse any word on when you can merge this? I just rebased onto current master so it should all be up to date again. Let me know if there's anything else you'd like to see first.

@jhasse
Copy link
Collaborator

jhasse commented Feb 11, 2021

Thanks for rebasing. I'll test your PR in the coming weeks and then merge it if I don't find anything.

@jdrouhard
Copy link
Contributor Author

@jhasse rebased again. I've been using this branch at work in our large codebase for months and it's been working perfectly, how has your testing with it been going?

Copy link
Collaborator

@jhasse jhasse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also no issues on my site :)

src/graph.cc Outdated Show resolved Hide resolved
@jdrouhard
Copy link
Contributor Author

@jhasse ready to merge soon? :D

@jdrouhard
Copy link
Contributor Author

@jhasse are you waiting on anything else before merging this?

@jdrouhard
Copy link
Contributor Author

@jhasse rebased this on master

If an edge's output files' mtimes are compared to the most recent
input's mtime, edges might be calculated as clean even if they are
actually dirty. While an edge's command is running its rule to produce
its outputs and an input to the edge is updated before the outputs are
written to disk, then subsequent runs will think that the outputs are
newer than the inputs, even though the inputs have actually been updated
and may be different than what were used to produce those outputs.

Ninja will now restat all inputs just prior to running an edge's command
and remember the most recent input mtime. When the command completes,
it will stat any discovered dependencies from dep files (if necessary),
recalculate the most recent input mtime, and log it to the build log
file. On subsequent runs, ninja will use this value to compare to the
edge's most recent input's mtime to determine whether the outputs are
dirty.

This extends the methodology used by restat rules to work in all cases.
Restat rules are still unique in that they will clean the edge's output
nodes recursively if the edge's command did not change the output, but
in all cases, the mtime recorded in the log file is now the most recent
input mtime. See the new tests for more clarification.
@jhasse
Copy link
Collaborator

jhasse commented Mar 20, 2021

Thanks!

I'm a little bit unsure about increasing the log version.

It would also be interesting to know how much this changes performance.

Merging this will actually increase the exposure. Let's see, I can't promise that we won't actually need to rethink this change before the release.

@jhasse jhasse merged commit 2b97efa into ninja-build:master Mar 20, 2021
@jdrouhard
Copy link
Contributor Author

It would also be interesting to know how much this changes performance.

If it suffers too much, a simple change to get performance back on par is to use StatIfNecessary() instead of Stat() when starting an edge and looping over all known input nodes to get the most recent input mtime. It would speed things up at the cost of potentially marking things dirty that don't need to be. I believe there was a discussion about this trade off in the PR review.

Merging this will actually increase the exposure. Let's see, I can't promise that we won't actually need to rethink this change before the release.

Let's hope it works out! I've been trying to get this issue fixed in ninja for over 5 years now so I'm pretty excited. Thanks for merging!

@jdrouhard jdrouhard deleted the log_input_mtime branch March 22, 2021 13:51
jdrouhard added a commit to jdrouhard/ninja that referenced this pull request Mar 22, 2021
…re checking the build log (Fixes ninja-build#1932)

This is a followup fix for ninja-build#1753.

`build_log()` will always be valid (even for new builds). We should be
checking for the output being older than the input before we check the
build log for additional possible conditions that would make the output
dirty.
jdrouhard added a commit to jdrouhard/ninja that referenced this pull request Mar 22, 2021
…re checking the build log

This is a followup fix for ninja-build#1753. Fixes ninja-build#1932

`build_log()` will always be valid (even for new builds). We should be
checking for the output being older than the input before we check the
build log for additional possible conditions that would make the output
dirty.
jdrouhard added a commit to jdrouhard/ninja that referenced this pull request Mar 23, 2021
After ninja-build#1753, every file that was a non-order-only input to any edge was
being restat for each edge that had that same file as an input. Turns
out that's a lot of stat calls, so a simple tradeoff here is to just
stat each input file once during the build and use that mtime for each
edge that has it as an input when determining the most recent input
mtime.

This could potentially lead to unnecessary rebuilds (if one edge that
shares an input with another edge is started after the input is changed
but that input was changed after the first output's edge stat'd it).
The significant speed increase is worth it.
jhasse added a commit that referenced this pull request Mar 24, 2021
Revert #1753 and add additional tests to expose previously untested behavior
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Long running compile commands can cause incorrect dirty status computation
3 participants