feat: Dataflow analysis framework #1476

Open
wants to merge 176 commits into
base: main
acl-cqc (Contributor) commented Aug 28, 2024

Forwards analysis only ATM; parametrized over the abstract domain, hence intended to support not only constant folding but (in the future) devirtualization, intergraph-edge-insertion, etc. See #1603 for an "example" use for constant-folding, where it does better than the existing code.

Much complexity is to do with "native" (irrespective of the underlying domain) treatment of Sum types necessary for proper understanding of control flow (e.g. conditionals, loops, CFGs).

Note: we'll be able to separate the DFContext from the HugrView, which'll be neater, if we do #1636 first.


Intended as a development of #1157, with significant changes:

  • Constant-folding and ValueHandle now stripped out, these will follow in a second PR

  • Everything is now in hugr-passes

  • Underlying domain of values abstracted over a trait AbstractValue (ValueHandle will implement this), which represents non-Sum values

  • datalog uses PartialValue wrapped around the AbstractValue to represent (Partial)Sums and make into a BoundedLattice

  • The old PV is gone (PartialValue directly implements BoundedLattice)

  • Interpretation of leaf (extension) ops is handled by the DFContext trait (although MakeTuple and Untuple are handled by the framework: really, prelude MakeTuple is just core Tag, and Untuple is a single-Case Conditional with passthrough wires); the framework handles routing of sums through these ops and all containers, and also loading constants (with the DFContext handling non-Sum leaf Values).

  • Various refactoring of handling values (inc. in datalog) - variant_values+as_sum + more use of rows rather than indexing (this got rid of a bunch of unwraps and so on), significant refactoring of join/meet (and no _unsafe).

  • I've managed to refactor tests not to use ValueHandle etc. - they are only dealing with sum/loop/conditional routing after all. dataflow/test.rs uses about the simplest possible TestContext which provides zero information after any leaf-op - so we only get the framework-provided handling of Tag/MakeTuple/etc.

propolutate_out_wires is largely superseded by passing root-node inputs into Machine::run, but is still available for tests.
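
To make the lattice structure described above concrete, here is a minimal sketch of a partial value wrapped around an abstract domain. The names `AbstractValue`, `PartialValue` and `join` follow the vocabulary of this description, but the bodies (including the `try_join` helper and the use of `BTreeMap`) are assumptions for illustration, not the code in hugr-passes.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the PR's `AbstractValue` trait: a domain of
/// non-Sum values that can try to combine two of its own elements.
pub trait AbstractValue: Clone + PartialEq {
    /// Join two abstract values; `None` means no single value describes both.
    fn try_join(self, other: Self) -> Option<Self> {
        if self == other { Some(self) } else { None }
    }
}

/// Simplified sketch of a bounded lattice wrapped around `V`.
#[derive(Clone, Debug, PartialEq)]
pub enum PartialValue<V> {
    /// No value reaches this wire (so far).
    Bottom,
    /// A known non-Sum value from the underlying domain.
    Value(V),
    /// A Sum with one or more possible tags, each with its row of elements.
    PartialSum(BTreeMap<usize, Vec<PartialValue<V>>>),
    /// Nothing is known.
    Top,
}

impl<V: AbstractValue> PartialValue<V> {
    /// Least upper bound of two partial values.
    pub fn join(self, other: Self) -> Self {
        use PartialValue::*;
        match (self, other) {
            (Bottom, x) | (x, Bottom) => x,
            (Top, _) | (_, Top) => Top,
            (Value(a), Value(b)) => a.try_join(b).map_or(Top, Value),
            (PartialSum(mut a), PartialSum(b)) => {
                for (tag, row_b) in b {
                    match a.remove(&tag) {
                        // Tag seen on only one side: keep its row unchanged.
                        None => {
                            a.insert(tag, row_b);
                        }
                        // Tag seen on both sides: join the rows elementwise.
                        Some(row_a) => {
                            if row_a.len() != row_b.len() {
                                return Top;
                            }
                            let row = row_a
                                .into_iter()
                                .zip(row_b)
                                .map(|(x, y)| x.join(y))
                                .collect();
                            a.insert(tag, row);
                        }
                    }
                }
                PartialSum(a)
            }
            // A Sum joined with a non-Sum value: give up.
            _ => Top,
        }
    }
}

/// Toy domain for experimentation: integers that only join when equal.
impl AbstractValue for i64 {}
```

Joining two `PartialSum`s with different tags keeps both tags, which is how a sketch like this would let a conditional or loop see every variant that may reach it.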

@acl-cqc acl-cqc force-pushed the acl/const_fold2 branch 2 times, most recently from 1594e6f to e1c49d7 Compare August 28, 2024 17:31
@acl-cqc acl-cqc requested a review from doug-q September 2, 2024 09:09
@acl-cqc acl-cqc changed the title DRAFT(v2?) Datalog-style constant-folding skeleton feat: Dataflow analysis framework and use for constant-folding Sep 2, 2024
codecov bot commented Sep 2, 2024

Codecov Report

Attention: Patch coverage is 85.45619% with 161 lines in your changes missing coverage. Please review.

Project coverage is 85.87%. Comparing base (5562c91) to head (da2981c).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| hugr-passes/src/dataflow/datalog.rs | 80.00% | 30 Missing and 27 partials ⚠️ |
| hugr-passes/src/dataflow/partial_value.rs | 86.72% | 47 Missing and 2 partials ⚠️ |
| hugr-passes/src/dataflow.rs | 43.18% | 25 Missing ⚠️ |
| hugr-passes/src/dataflow/results.rs | 74.19% | 13 Missing and 3 partials ⚠️ |
| hugr-passes/src/dataflow/value_row.rs | 70.83% | 14 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1476      +/-   ##
==========================================
+ Coverage   85.79%   85.87%   +0.07%     
==========================================
  Files         136      142       +6     
  Lines       25180    26287    +1107     
  Branches    22092    23199    +1107     
==========================================
+ Hits        21603    22573     +970     
- Misses       2427     2532     +105     
- Partials     1150     1182      +32     
| Flag | Coverage Δ |
|---|---|
| python | 92.42% <ø> (ø) |
| rust | 84.99% <85.45%> (+0.13%) ⬆️ |

* DFContext reinstate fn hugr(), drop AsRef requirement (fixes StackOverflow)
* test_tail_loop_iterates_twice: use tail_loop_builder_exts, fix from #1332(?)
* Fix only-one-DataflowContext asserts using Arc::ptr_eq
acl-cqc (Contributor, Author) commented Oct 28, 2024

I had to recombine DFContext with the Hugr to get round some Rust mutable/immutable reference ownership issues in #1603; I tried AsRef<Hugr> (sufficient to get you the HugrView methods - or indeed just requiring HugrView, as implementing AsRef gets you that anyway), but have gone back to Deref<impl HugrView> as this should allow running analysis on a region (e.g. DescendantsView). I guess I could be persuaded to do something like

trait DFContext<V> {
  type View: HugrView; // as at present
  fn view(&self) -> &Self::View;
}

but the extra .view() before every HugrView call seems a bit of a pain (@doug-q do you feel strongly?).
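
The trade-off between the two shapes can be sketched as follows. Everything here is a toy: `HugrView` is reduced to a single hypothetical `node_count` method, and `ToyHugr`, `Ctx`, `ViewContext`, `count_via_deref` and `count_via_view` are invented names, so this only illustrates the call-site difference and is not the hugr API.

```rust
use std::ops::Deref;

/// Toy stand-in for `HugrView`, with a single hypothetical method.
pub trait HugrView {
    fn node_count(&self) -> usize;
}

pub struct ToyHugr {
    pub nodes: usize,
}

impl HugrView for ToyHugr {
    fn node_count(&self) -> usize {
        self.nodes
    }
}

/// Accessor-style context: the view is reached through `.view()`.
pub trait ViewContext {
    type View: HugrView;
    fn view(&self) -> &Self::View;
}

/// A context that borrows the graph it analyses.
pub struct Ctx<'a>(pub &'a ToyHugr);

impl Deref for Ctx<'_> {
    type Target = ToyHugr;
    fn deref(&self) -> &ToyHugr {
        self.0
    }
}

impl ViewContext for Ctx<'_> {
    type View = ToyHugr;
    fn view(&self) -> &ToyHugr {
        self.0
    }
}

/// Deref-style: view methods are called directly on the context.
pub fn count_via_deref<C>(ctx: &C) -> usize
where
    C: Deref,
    C::Target: HugrView,
{
    ctx.node_count()
}

/// Accessor-style: every view call needs the extra `.view()`.
pub fn count_via_view<C: ViewContext>(ctx: &C) -> usize {
    ctx.view().node_count()
}
```

Both functions return the same answer for the same context; the only difference is the extra hop at each call site, which is the "bit of a pain" mentioned above.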

/// If this PartialSum had multiple possible tags; or if `typ` was not a [TypeEnum::Sum]
/// supporting the single possible tag with the correct number of elements and no row variables;
/// or if converting a child element failed via [PartialValue::try_into_value].
pub fn try_into_value<VE, SE, V2: TryFrom<V, Error = VE> + TryFrom<Sum<V2>, Error = SE>>(
Contributor Author

Yes, this is highly parametric. The first case will be the "easy" one, where V2 is hugr_core::ops::Value, in which case VE = Infallible and SE = ConstTypeError. But I think it's important to keep this open for other analysis domains.

It might be more legible if written <V2, VE, SE> ..... where V2: TryFrom<V, Error=VE> + TryFrom<Sum<V2>, Error=SE>, happy to do that if anyone shouts.

The bigger issue may be nomenclature - this relates to ExtractValueError so perhaps should be (try_)extract_value? And also to Machine::try_read_wire_value - so, try_extract_wire_value?? However I quite like the word "concrete" so could perhaps use that naming to tie the three together - "try_into_concrete", "try_read_wire_concrete", and, erm, "ConcretizationError" (hmmm, not so good....). Thoughts?
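
For readers unfamiliar with the two-`TryFrom` bound, here is a small sketch of the `where`-clause form on toy types. `Partial`, `Sum`, `Concrete` and `ExtractError` are simplified stand-ins invented for this example (the real `PartialValue`, `Sum` and `ExtractValueError` differ), and the rule that tags above 1 are rejected is an arbitrary assumption so that the sum conversion can fail.

```rust
/// Toy concrete sum: one tag plus its elements.
#[derive(Debug, PartialEq)]
pub struct Sum<T> {
    pub tag: usize,
    pub values: Vec<T>,
}

/// Toy abstract value: a leaf from domain `V`, a sum with one known tag, or unknown.
pub enum Partial<V> {
    Leaf(V),
    Sum(usize, Vec<Partial<V>>),
    Top,
}

#[derive(Debug, PartialEq)]
pub enum ExtractError<VE, SE> {
    /// The abstract value does not pin down a single concrete value.
    NotConcrete,
    /// Converting a leaf failed.
    Leaf(VE),
    /// Building the concrete sum failed.
    Sum(SE),
}

impl<V> Partial<V> {
    pub fn try_into_value<V2, VE, SE>(self) -> Result<V2, ExtractError<VE, SE>>
    where
        V2: TryFrom<V, Error = VE> + TryFrom<Sum<V2>, Error = SE>,
    {
        match self {
            Partial::Top => Err(ExtractError::NotConcrete),
            Partial::Leaf(v) => V2::try_from(v).map_err(ExtractError::Leaf),
            Partial::Sum(tag, elems) => {
                // Convert the children first, stopping at the first failure.
                let values = elems
                    .into_iter()
                    .map(|e| e.try_into_value::<V2, VE, SE>())
                    .collect::<Result<Vec<V2>, ExtractError<VE, SE>>>()?;
                V2::try_from(Sum { tag, values }).map_err(ExtractError::Sum)
            }
        }
    }
}

/// Toy target type playing the role of a fully known constant.
#[derive(Debug, PartialEq)]
pub enum Concrete {
    Int(i64),
    Sum(Sum<Concrete>),
}

/// Leaves always convert, so the leaf error type is `Infallible`.
impl From<i64> for Concrete {
    fn from(n: i64) -> Self {
        Concrete::Int(n)
    }
}

/// Sums may be rejected; the "tag at most 1" rule is purely illustrative.
impl TryFrom<Sum<Concrete>> for Concrete {
    type Error = String;
    fn try_from(s: Sum<Concrete>) -> Result<Self, String> {
        if s.tag > 1 {
            Err(format!("unsupported tag {}", s.tag))
        } else {
            Ok(Concrete::Sum(s))
        }
    }
}
```

With `Concrete` as the target, `VE` is `Infallible` (from the `From<i64>` impl) and `SE` is `String`, mirroring the "easy" case described above where the leaf conversion cannot fail but the sum conversion can.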

Contributor

> It might be more legible if written <V2, VE, SE> ..... where V2: TryFrom<V, Error=VE> + TryFrom<Sum<V2>, Error=SE>, happy to do that if anyone shouts.

I think you should!

Contributor Author

Done - yes good to standardize the order of the args with try_read_wire_value.

Contributor Author

Any thoughts on naming?


impl<V: AbstractValue> Machine<V> {
/// Provide initial values for some wires.
// Likely for test purposes only - should we make non-pub or #[cfg(test)] ?
Contributor Author

Not sure quite what to do here. @doug-q may have opinions?

Contributor Author

We think this potentially useful (albeit not very powerful, as any prepopulated value will be joined with anything computed). A version that "breaks" the wire (replacing any value computed, without join) would be more effective, also dangerous....
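
A tiny sketch of why joining limits what a prepopulated value can do, using a toy flat lattice over integers. `Flat`, `prepopulated` and `broken` are hypothetical names for this illustration only; they are not part of `Machine`.

```rust
/// Toy flat lattice: Bottom below every Known(n), Top above them all.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Flat {
    Bottom,
    Known(i64),
    Top,
}

impl Flat {
    pub fn join(self, other: Flat) -> Flat {
        match (self, other) {
            (Flat::Bottom, x) | (x, Flat::Bottom) => x,
            (a, b) if a == b => a,
            _ => Flat::Top,
        }
    }
}

/// Prepopulation as described: the seed is joined with whatever is computed,
/// so it can only move the result up the lattice.
pub fn prepopulated(seed: Flat, computed: Flat) -> Flat {
    seed.join(computed)
}

/// The hypothetical "breaking" variant: the seed replaces the computed value.
pub fn broken(seed: Flat, _computed: Flat) -> Flat {
    seed
}
```

If the analysis computes `Known(2)` for a wire seeded with `Known(1)`, the joined result is `Top`, whereas the breaking variant would report `Known(1)` regardless of what actually flows there, which is where the danger lies.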

.push((root, IncomingPort::from(i), PartialValue::Top));
}
}
// Note/TODO, if analysis is running on a subregion then we should do similar
Contributor Author

I'll raise an issue for this. It's not clear how to identify nonlocal edges into a subregion, or to differentiate between nonlocal edges whose source we can't see, and ports with no edge at all, where perhaps "Bottom" is appropriate (it really can't happen, that bit of Hugr can't run....).

@acl-cqc acl-cqc self-assigned this Nov 6, 2024
_e: &ExtensionOp,
_ins: &[PartialValue<V>],
_outs: &mut [PartialValue<V>],
) {
Contributor

You don't need to provide default implementations for trait methods, you can just write fn f(...); instead of fn f(...) { }

Contributor Author

But this impl is totally ok as a default - it means, I know nothing about any leaf op
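
The point about the empty default body can be illustrated with a toy trait. The names `LeafInterpreter`, `interpret_leaf_op`, `Abs`, `KnowsNothing`, `KnowsAdd` and `run_leaf` are assumptions for this sketch, and ops are identified by a plain string rather than an `ExtensionOp`.

```rust
/// Toy abstract value for one wire: unknown, or a known integer.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Abs {
    Top,
    Known(i64),
}

pub trait LeafInterpreter {
    /// The default body leaves `outs` untouched, which models
    /// "I know nothing about any leaf op".
    fn interpret_leaf_op(&self, _op: &str, _ins: &[Abs], _outs: &mut [Abs]) {}
}

/// Relies entirely on the default: no leaf op is understood.
pub struct KnowsNothing;
impl LeafInterpreter for KnowsNothing {}

/// Overrides the default for a single hypothetical op called "add".
pub struct KnowsAdd;
impl LeafInterpreter for KnowsAdd {
    fn interpret_leaf_op(&self, op: &str, ins: &[Abs], outs: &mut [Abs]) {
        if op == "add" {
            if let ([Abs::Known(a), Abs::Known(b)], [out]) = (ins, outs) {
                *out = Abs::Known(a + b);
            }
        }
    }
}

/// Driver: outputs start at Top and are refined only if the context knows the op.
pub fn run_leaf(ctx: &impl LeafInterpreter, op: &str, ins: &[Abs]) -> Abs {
    let mut outs = [Abs::Top];
    ctx.interpret_leaf_op(op, ins, &mut outs);
    outs[0]
}
```

A context that does not override the method still gives a sound (if uninformative) answer, which is why a default body can be a deliberate choice rather than a redundant one.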

@cqc-alec cqc-alec requested review from croyzor and removed request for cqc-alec November 7, 2024 09:37
acl-cqc (Contributor, Author) commented Nov 8, 2024

Note: we could separate the DFContext from the HugrView, which would be significantly neater, if we did #1636 first; we need #1636 because ValidationLevel::run_validated_pass only gives us a &mut impl HugrMut.

6 participants