Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't perform any new queries while reading a query result on disk #91919

Merged
merged 5 commits into from
Jan 8, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
69 changes: 69 additions & 0 deletions compiler/rustc_query_system/src/dep_graph/graph.rs
Original file line number Diff line number Diff line change
Expand Up @@ -171,6 +171,62 @@ impl<K: DepKind> DepGraph<K> {
K::with_deps(None, op)
}

/// Used to wrap the deserialization of a query result from disk,
/// This method enforces that no new `DepNodes` are created during
/// query result deserialization.
///
/// Enforcing this makes the query dep graph simpler - all nodes
/// must be created during the query execution, and should be
/// created from inside the 'body' of a query (the implementation
/// provided by a particular compiler crate).
///
/// Consider the case of three queries `A`, `B`, and `C`, where
/// `A` invokes `B` and `B` invokes `C`:
///
/// `A -> B -> C`
///
/// Suppose that decoding the result of query `B` required re-computing
/// the query `C`. If we did not create a fresh `TaskDeps` when
/// decoding `B`, we would still be using the `TaskDeps` for query `A`
/// (if we needed to re-execute `A`). This would cause us to create
/// a new edge `A -> C`. If this edge did not previously
/// exist in the `DepGraph`, then we could end up with a different
/// `DepGraph` at the end of compilation, even if there were no
/// meaningful changes to the overall program (e.g. a newline was added).
/// In addition, this edge might cause a subsequent compilation run
/// to try to force `C` before marking other necessary nodes green. If
/// `C` did not exist in the new compilation session, then we could
/// get an ICE. Normally, we would have tried (and failed) to mark
/// some other query green (e.g. `item_children`) which was used
/// to obtain `C`, which would prevent us from ever trying to force
/// a non-existent `D`.
Aaron1011 marked this conversation as resolved.
Show resolved Hide resolved
///
/// It might be possible to enforce that all `DepNode`s read during
/// deserialization already exist in the previous `DepGraph`. In
/// the above example, we would invoke `D` during the deserialization
/// of `B`. Since we correctly create a new `TaskDeps` from the decoding
/// of `B`, this would result in an edge `B -> D`. If that edge already
/// existed (with the same `DepPathHash`es), then it should be correct
/// to allow the invocation of the query to proceed during deserialization
/// of a query result. We would merely assert that the dep-graph fragment
/// that would have been added by invoking `C` while decoding `B`
/// is equivalent to the dep-graph fragment that we already instantiated for B
/// (at the point where we successfully marked B as green).
///
/// However, this would require additional complexity
/// in the query infrastructure, and is not currently needed by the
/// decoding of any query results. Should the need arise in the future,
/// we should consider extending the query system with this functionality.
pub fn with_query_deserialization<OP, R>(&self, op: OP) -> R
where
OP: FnOnce() -> R,
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add a fast path if the dep-graph is disabled?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should never be called, if the dep-graph is disabled, right? Maybe we could add a debug assertion that the dep-graph is indeed enabled.

let mut deps = TaskDeps::default();
deps.read_allowed = false;
let deps = Lock::new(deps);
K::with_deps(Some(&deps), op)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we prefer to use an enum for deps:

enum TaskDepsRef<'a> {
  Allow(&'a Lock<TaskDeps>),
  Ignore,
  Forbid,
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened #92681

}

/// Starts a new dep-graph task. Dep-graph tasks are specified
/// using a free function (`task`) and **not** a closure -- this
/// is intentional because we want to exercise tight control over
Expand Down Expand Up @@ -251,6 +307,7 @@ impl<K: DepKind> DepGraph<K> {
reads: SmallVec::new(),
read_set: Default::default(),
phantom_data: PhantomData,
read_allowed: true,
}))
};
let result = K::with_deps(task_deps.as_ref(), || task(cx, arg));
Expand Down Expand Up @@ -362,6 +419,11 @@ impl<K: DepKind> DepGraph<K> {
if let Some(task_deps) = task_deps {
let mut task_deps = task_deps.lock();
let task_deps = &mut *task_deps;

if !task_deps.read_allowed {
panic!("Illegal read of: {:?}", dep_node_index);
}

if cfg!(debug_assertions) {
data.current.total_read_count.fetch_add(1, Relaxed);
}
Expand Down Expand Up @@ -1115,6 +1177,12 @@ pub struct TaskDeps<K> {
reads: EdgesVec,
read_set: FxHashSet<DepNodeIndex>,
phantom_data: PhantomData<DepNode<K>>,
/// Whether or not we allow `DepGraph::read_index` to run.
/// This is normally true, except inside `with_query_deserialization`,
/// where it set to `false` to enforce that no new `DepNode` edges are
/// created. See the documentation of `with_query_deserialization` for
/// more details.
read_allowed: bool,
}

impl<K> Default for TaskDeps<K> {
Expand All @@ -1125,6 +1193,7 @@ impl<K> Default for TaskDeps<K> {
reads: EdgesVec::new(),
read_set: FxHashSet::default(),
phantom_data: PhantomData,
read_allowed: true,
}
}
}
Expand Down
9 changes: 7 additions & 2 deletions compiler/rustc_query_system/src/query/plumbing.rs
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,6 @@ use crate::query::job::{
report_cycle, QueryInfo, QueryJob, QueryJobId, QueryJobInfo, QueryShardJobId,
};
use crate::query::{QueryContext, QueryMap, QuerySideEffects, QueryStackFrame};

use rustc_data_structures::fingerprint::Fingerprint;
use rustc_data_structures::fx::{FxHashMap, FxHasher};
#[cfg(parallel_compiler)]
Expand Down Expand Up @@ -515,7 +514,13 @@ where
// Some things are never cached on disk.
if query.cache_on_disk {
let prof_timer = tcx.dep_context().profiler().incr_cache_loading();
let result = query.try_load_from_disk(tcx, prev_dep_node_index);

michaelwoerister marked this conversation as resolved.
Show resolved Hide resolved
// The call to `with_query_deserialization` enforces that no new `DepNodes`
// are created during deserialization. See the docs of that method for more
// details.
let result = dep_graph
.with_query_deserialization(|| query.try_load_from_disk(tcx, prev_dep_node_index));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can even forbid queries inside the invocation code. Can this call be moved around the NotYetStarted branch of try_execute_query? It's not supposed to invoke any user-provided code directly, only call into the DepGraph to setup a TaskDeps.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this earlier, but there are some invocations of def_span within the query infrastructure itself. I think those could be removed in a follow-up PR, since this PR should catch almost all of the cases.


prof_timer.finish_with_query_invocation_id(dep_node_index.into());

if let Some(result) = result {
Expand Down