Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change build log to always log the most recent input mtime #1753

Merged
merged 1 commit into from
Mar 20, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
75 changes: 40 additions & 35 deletions src/build.cc
Original file line number Diff line number Diff line change
Expand Up @@ -270,19 +270,11 @@ bool Plan::CleanNode(DependencyScan* scan, Node* node, string* err) {
#define MEM_FN mem_fn // mem_fun was removed in C++17.
#endif
if (find_if(begin, end, MEM_FN(&Node::dirty)) == end) {
// Recompute most_recent_input.
Node* most_recent_input = NULL;
for (vector<Node*>::iterator i = begin; i != end; ++i) {
if (!most_recent_input || (*i)->mtime() > most_recent_input->mtime())
most_recent_input = *i;
}

// Now, this edge is dirty if any of the outputs are dirty.
// If the edge isn't dirty, clean the outputs and mark the edge as not
// wanted.
bool outputs_dirty = false;
if (!scan->RecomputeOutputsDirty(*oe, most_recent_input,
&outputs_dirty, err)) {
if (!scan->RecomputeOutputsDirty(*oe, &outputs_dirty, err)) {
return false;
}
if (!outputs_dirty) {
Expand Down Expand Up @@ -696,6 +688,20 @@ bool Builder::StartEdge(Edge* edge, string* err) {
return false;
}

// Find the most recent mtime of any (existing) non-order-only input
Node* most_recent_input = NULL;
for (vector<Node*>::iterator i = edge->inputs_.begin();
i != edge->inputs_.end() - edge->order_only_deps_; ++i) {
if (!(*i)->Stat(disk_interface_, err))
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously this extra stat() call was made only on restat edge inputs in FinishCommand's node_cleaned case. Now we're calling stat() a second time on every input of every edge that runs. This may not matter because in practice the stat() result will probably be in the filesystem's hot cache, and the time it takes is likely small compared to the time it will take the edge's command to run (whose child process will normally actually open and read the whole input file).

I wonder if we can get away with StatIfNecessary here. Even if the input has been updated since our original stat round when ninja started, we still know that the outputs will not be older than that.

Copy link
Contributor Author

@jdrouhard jdrouhard Mar 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought hard about using StatIfNecessary here for awhile. I opted to re-stat because input nodes might be on multiple edges and later edges might be reading a different version of the input if that input is changed after one edge runs but before another begins.

This would cause a false positive on the subsequent run since the recorded mtime would reflect that input's original mtime even though the edge actually ran with the most recent version.

I suppose it's a tradeoff between saving unnecessary rebuilds on subsequent runs vs stat'ing more often while building.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining. If a performance problem surfaces we can re-consider. Otherwise I think Stat() is fine for now.

if (!most_recent_input || (*i)->mtime() > most_recent_input->mtime())
most_recent_input = *i;
}

edge->most_recent_input_ = most_recent_input;
if (most_recent_input)
edge->most_recent_input_mtime_ = most_recent_input->mtime();

// start command computing and run it
if (!command_runner_->StartCommand(edge)) {
err->assign("command '" + edge->EvaluateCommand() + "' failed.");
Expand Down Expand Up @@ -744,20 +750,18 @@ bool Builder::FinishCommand(CommandRunner::Result* result, string* err) {
return plan_.EdgeFinished(edge, Plan::kEdgeFailed, err);
}

// Restat the edge outputs
TimeStamp output_mtime = 0;
bool restat = edge->GetBindingBool("restat");
TimeStamp most_recent_input_mtime = 0;
if (!config_.dry_run) {
// Restat the edge outputs
bool node_cleaned = false;

for (vector<Node*>::iterator o = edge->outputs_.begin();
o != edge->outputs_.end(); ++o) {
TimeStamp new_mtime = disk_interface_->Stat((*o)->path(), err);
TimeStamp old_mtime = (*o)->mtime();
(*o)->Stat(disk_interface_, err);
TimeStamp new_mtime = (*o)->mtime();
if (new_mtime == -1)
return false;
if (new_mtime > output_mtime)
output_mtime = new_mtime;
if ((*o)->mtime() == new_mtime && restat) {
if (old_mtime == new_mtime && edge->GetBindingBool("restat")) {
// The rule command did not change the output. Propagate the clean
// state through the build graph.
// Note that this also applies to nonexistent outputs (mtime == 0).
Expand All @@ -767,33 +771,34 @@ bool Builder::FinishCommand(CommandRunner::Result* result, string* err) {
}
}

if (node_cleaned) {
TimeStamp restat_mtime = 0;
// If any output was cleaned, find the most recent mtime of any
// (existing) non-order-only input or the depfile.
for (vector<Node*>::iterator i = edge->inputs_.begin();
i != edge->inputs_.end() - edge->order_only_deps_; ++i) {
TimeStamp input_mtime = disk_interface_->Stat((*i)->path(), err);
if (input_mtime == -1)
return false;
if (input_mtime > restat_mtime)
restat_mtime = input_mtime;
}
// Use the time from the most recent input that was computed when the edge was
// started, not the mtime of the node as it is now. There could have been other edges
// that restat'd the input node and detected a change, but for *this* edge, we want
// the mtime as it was when the command began.
most_recent_input_mtime = edge->most_recent_input_mtime_;

// If there were any added deps, compute the most recent input mtime
for (vector<Node*>::iterator i = deps_nodes.begin();
i != deps_nodes.end(); ++i) {
(*i)->StatIfNecessary(disk_interface_, err);
if ((*i)->mtime() > most_recent_input_mtime)
most_recent_input_mtime = (*i)->mtime();
}

if (node_cleaned) {
// If any output was cleaned, take into account the mtime of the depfile
string depfile = edge->GetUnescapedDepfile();
if (restat_mtime != 0 && deps_type.empty() && !depfile.empty()) {
if (most_recent_input_mtime != 0 && deps_type.empty() && !depfile.empty()) {
TimeStamp depfile_mtime = disk_interface_->Stat(depfile, err);
if (depfile_mtime == -1)
return false;
if (depfile_mtime > restat_mtime)
restat_mtime = depfile_mtime;
if (depfile_mtime > most_recent_input_mtime)
most_recent_input_mtime = depfile_mtime;
}

// The total number of edges in the plan may have changed as a result
// of a restat.
status_->PlanHasTotalEdges(plan_.command_edge_count());

output_mtime = restat_mtime;
}
}

Expand All @@ -807,7 +812,7 @@ bool Builder::FinishCommand(CommandRunner::Result* result, string* err) {

if (scan_.build_log()) {
if (!scan_.build_log()->RecordCommand(edge, start_time_millis,
end_time_millis, output_mtime)) {
end_time_millis, most_recent_input_mtime)) {
*err = string("Error writing to build log: ") + strerror(errno);
return false;
}
Expand Down
10 changes: 5 additions & 5 deletions src/build_log.cc
Original file line number Diff line number Diff line change
Expand Up @@ -116,9 +116,9 @@ BuildLog::LogEntry::LogEntry(const string& output)
: output(output) {}

BuildLog::LogEntry::LogEntry(const string& output, uint64_t command_hash,
int start_time, int end_time, TimeStamp restat_mtime)
int start_time, int end_time, TimeStamp mtime)
jhasse marked this conversation as resolved.
Show resolved Hide resolved
: output(output), command_hash(command_hash),
start_time(start_time), end_time(end_time), mtime(restat_mtime)
start_time(start_time), end_time(end_time), mtime(mtime)
{}

BuildLog::BuildLog()
Expand Down Expand Up @@ -303,7 +303,7 @@ LoadStatus BuildLog::Load(const string& path, string* err) {
*end = 0;

int start_time = 0, end_time = 0;
TimeStamp restat_mtime = 0;
TimeStamp mtime = 0;

start_time = atoi(start);
start = end + 1;
Expand All @@ -319,7 +319,7 @@ LoadStatus BuildLog::Load(const string& path, string* err) {
if (!end)
continue;
*end = 0;
restat_mtime = strtoll(start, NULL, 10);
mtime = strtoll(start, NULL, 10);
start = end + 1;

end = (char*)memchr(start, kFieldSeparator, line_end - start);
Expand All @@ -343,7 +343,7 @@ LoadStatus BuildLog::Load(const string& path, string* err) {

entry->start_time = start_time;
entry->end_time = end_time;
entry->mtime = restat_mtime;
entry->mtime = mtime;
if (log_version >= 5) {
char c = *end; *end = '\0';
entry->command_hash = (uint64_t)strtoull(start, NULL, 16);
Expand Down
2 changes: 1 addition & 1 deletion src/build_log.h
Original file line number Diff line number Diff line change
Expand Up @@ -73,7 +73,7 @@ struct BuildLog {

explicit LogEntry(const std::string& output);
LogEntry(const std::string& output, uint64_t command_hash,
int start_time, int end_time, TimeStamp restat_mtime);
int start_time, int end_time, TimeStamp mtime);
};

/// Lookup a previously-run command by its output path.
Expand Down
160 changes: 160 additions & 0 deletions src/build_test.cc
Original file line number Diff line number Diff line change
Expand Up @@ -609,6 +609,15 @@ bool FakeCommandRunner::StartCommand(Edge* edge) {
if (fs_->ReadFile(edge->inputs_[0]->path(), &content, &err) ==
DiskInterface::Okay)
fs_->WriteFile(edge->outputs_[0]->path(), content);
} else if (edge->rule().name() == "long-cc" ||
edge->rule().name() == "long-cc2") {
for (vector<Node*>::iterator out = edge->outputs_.begin();
out != edge->outputs_.end(); ++out) {
fs_->Tick();
fs_->Tick();
fs_->Tick();
fs_->Create((*out)->path(), "");
}
} else {
printf("unknown command\n");
return false;
Expand Down Expand Up @@ -664,6 +673,15 @@ bool FakeCommandRunner::WaitForCommand(Result* result) {
else
result->status = ExitSuccess;

// These rules simulate an external process modifying files while the build command
// runs. See TestInputMtimeRaceCondition and TestInputMtimeRaceConditionWithDepFile.
// Note: only the "first" time the rule is run per test is the file modified, so the
// test can verify that subsequent runs without the race have no work to do.
if (edge->rule().name() == "long-cc" && fs_->now_ == 4)
fs_->files_["in1"].mtime = 3;
if (edge->rule().name() == "long-cc2" && fs_->now_ == 4)
fs_->files_["header.h"].mtime = 3;
jdrouhard marked this conversation as resolved.
Show resolved Hide resolved

// Provide a way for test cases to verify when an edge finishes that
// some other edge is still active. This is useful for test cases
// covering behavior involving multiple active edges.
Expand Down Expand Up @@ -2266,6 +2284,148 @@ TEST_F(BuildWithDepsLogTest, DepsIgnoredInDryRun) {
builder.command_runner_.release();
}

TEST_F(BuildWithDepsLogTest, TestInputMtimeRaceCondition) {
string err;
const char* manifest =
"rule long-cc\n"
" command = long-cc\n"
"build out: long-cc in1\n";

State state;
ASSERT_NO_FATAL_FAILURE(AddCatRule(&state));
ASSERT_NO_FATAL_FAILURE(AssertParse(&state, manifest));

BuildLog build_log;
ASSERT_TRUE(build_log.Load("build_log", &err));
ASSERT_TRUE(build_log.OpenForWrite("build_log", *this, &err));

DepsLog deps_log;
ASSERT_TRUE(deps_log.Load("ninja_deps", &state, &err));
ASSERT_TRUE(deps_log.OpenForWrite("ninja_deps", &err));

BuildLog::LogEntry* log_entry = NULL;
{
Builder builder(&state, config_, &build_log, &deps_log, &fs_, &status_, 0);
builder.command_runner_.reset(&command_runner_);
command_runner_.commands_ran_.clear();

// Run the build, out gets built, dep file is created
EXPECT_TRUE(builder.AddTarget("out", &err));
ASSERT_EQ("", err);
EXPECT_TRUE(builder.Build(&err));
ASSERT_EQ(1u, command_runner_.commands_ran_.size());

// See that an entry in the logfile is created. the input_mtime is 1 since that was
// the mtime of in1 when the command was started
log_entry = build_log.LookupByOutput("out");
ASSERT_TRUE(NULL != log_entry);
ASSERT_EQ(1u, log_entry->mtime);

builder.command_runner_.release();
}

{
Builder builder(&state, config_, &build_log, &deps_log, &fs_, &status_, 0);
builder.command_runner_.reset(&command_runner_);
command_runner_.commands_ran_.clear();

// Trigger the build again - "out" should rebuild despite having a newer mtime than
// "in1", since "in1" was touched during the build of out (simulated by changing its
// mtime in the the test builder's WaitForCommand() which runs before FinishCommand()
command_runner_.commands_ran_.clear();
state.Reset();
EXPECT_TRUE(builder.AddTarget("out", &err));
ASSERT_EQ("", err);
EXPECT_TRUE(builder.Build(&err));
ASSERT_EQ(1u, command_runner_.commands_ran_.size());

// Check that the logfile entry is still correct
log_entry = build_log.LookupByOutput("out");
ASSERT_TRUE(NULL != log_entry);
ASSERT_EQ(fs_.files_["in1"].mtime, log_entry->mtime);
builder.command_runner_.release();
}

{
Builder builder(&state, config_, &build_log, &deps_log, &fs_, &status_, 0);
builder.command_runner_.reset(&command_runner_);
command_runner_.commands_ran_.clear();

// And a subsequent run should not have any work to do
command_runner_.commands_ran_.clear();
state.Reset();
EXPECT_TRUE(builder.AddTarget("out", &err));
ASSERT_EQ("", err);
EXPECT_TRUE(builder.AlreadyUpToDate());

builder.command_runner_.release();
}
}

TEST_F(BuildWithDepsLogTest, TestInputMtimeRaceConditionWithDepFile) {
string err;
const char* manifest =
"rule long-cc2\n"
" command = long-cc2\n"
"build out: long-cc2 in1\n"
" deps = gcc\n"
" depfile = in1.d\n";

fs_.Create("header.h", "");
fs_.Create("in1.d", "out: header.h");

State state;
ASSERT_NO_FATAL_FAILURE(AddCatRule(&state));
ASSERT_NO_FATAL_FAILURE(AssertParse(&state, manifest));

BuildLog build_log;
ASSERT_TRUE(build_log.Load("build_log", &err));
ASSERT_TRUE(build_log.OpenForWrite("build_log", *this, &err));

DepsLog deps_log;
ASSERT_TRUE(deps_log.Load("ninja_deps", &state, &err));
ASSERT_TRUE(deps_log.OpenForWrite("ninja_deps", &err));

{
Builder builder(&state, config_, &build_log, &deps_log, &fs_, &status_, 0);
builder.command_runner_.reset(&command_runner_);
command_runner_.commands_ran_.clear();

// Run the build, out gets built, dep file is created
EXPECT_TRUE(builder.AddTarget("out", &err));
ASSERT_EQ("", err);
EXPECT_TRUE(builder.Build(&err));
ASSERT_EQ(1u, command_runner_.commands_ran_.size());

// See that an entry in the logfile is created. the mtime is 3 due to header.h being
// updated during the build of out (simulated by changing its mtime in the the test
// builder's WaitForCommand() which runs before FinishCommand()
BuildLog::LogEntry* log_entry = build_log.LookupByOutput("out");
ASSERT_TRUE(NULL != log_entry);
ASSERT_EQ(3u, log_entry->mtime);

builder.command_runner_.release();
}

{
// Trigger the build again - "out" won't rebuild since its newest mtime (header.h)
// wasn't known when out was originally built and was only discovered via the deps file
// when the command completed. Subsequent runs will see out's recorded mtime equal to
// the actual most recent mtime.
Builder builder(&state, config_, &build_log, &deps_log, &fs_, &status_, 0);
builder.command_runner_.reset(&command_runner_);
command_runner_.commands_ran_.clear();

command_runner_.commands_ran_.clear();
state.Reset();
EXPECT_TRUE(builder.AddTarget("out", &err));
ASSERT_EQ("", err);
EXPECT_TRUE(builder.AlreadyUpToDate());

builder.command_runner_.release();
}
}

/// Check that a restat rule generating a header cancels compilations correctly.
TEST_F(BuildTest, RestatDepfileDependency) {
ASSERT_NO_FATAL_FAILURE(AssertParse(&state_,
Expand Down
Loading