We want to compute clusters on the dataflow graph that identify which parts of the code are highly dependent on each other (i.e., "belong together"). Open question: when can we ignore directed edges that point in the opposite direction, and when do we have to traverse them?
Each cluster should be output as a set of flowR node IDs, to be evaluated further later on.
Step 1: Simple/naive cluster calculation using reachability analysis.
Step 2: The result will likely be one large cluster, because many parts of the code share dependencies on setup steps, reused functions, etc. We can implement a "bottleneck" node calculation that splits clusters at such nodes and creates separate clusters that each contain the "bottleneck" node. Open question: what constitutes a "bottleneck" node, i.e., when is it reused often enough, and when is the cluster around it small enough, for the cluster to be splittable?
This should be implemented as a separate "post analysis" module rather than as a pipeline step.
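To make the two steps concrete, here is a minimal sketch, not a proposal for the final implementation. It assumes a few things that are not settled above: `NodeId` and `Edge` are hypothetical stand-ins rather than flowR's actual types, edges point from a dependent node to the node it depends on, Step 1 ignores edge direction entirely, and a "bottleneck" is simply any node with at least `minIncoming` incoming edges.

```typescript
// Hypothetical stand-ins for illustration; these are NOT flowR's real types.
type NodeId = number;
// Assumption: an edge points from a dependent node to the node it depends on.
type Edge = readonly [from: NodeId, to: NodeId];

/**
 * Step 1 (naive): nodes that reach each other when edge direction is ignored
 * form one cluster. Nodes in `skip` are left out of the graph entirely.
 */
function naiveClusters(
  nodes: readonly NodeId[],
  edges: readonly Edge[],
  skip: ReadonlySet<NodeId> = new Set<NodeId>()
): Set<NodeId>[] {
  const adjacent = new Map<NodeId, NodeId[]>();
  for (const n of nodes) {
    if (!skip.has(n)) adjacent.set(n, []);
  }
  for (const [from, to] of edges) {
    if (skip.has(from) || skip.has(to)) continue;
    adjacent.get(from)?.push(to);
    adjacent.get(to)?.push(from);
  }
  const seen = new Set<NodeId>();
  const clusters: Set<NodeId>[] = [];
  for (const start of Array.from(adjacent.keys())) {
    if (seen.has(start)) continue;
    // Depth-first traversal collecting everything reachable from `start`.
    const cluster = new Set<NodeId>();
    const stack: NodeId[] = [start];
    seen.add(start);
    while (stack.length > 0) {
      const current = stack.pop() as NodeId;
      cluster.add(current);
      for (const next of adjacent.get(current) ?? []) {
        if (!seen.has(next)) {
          seen.add(next);
          stack.push(next);
        }
      }
    }
    clusters.push(cluster);
  }
  return clusters;
}

/**
 * Step 2 (assumed heuristic): treat every node with at least `minIncoming`
 * incoming edges as a bottleneck, cluster the graph without those nodes, then
 * copy each bottleneck into every cluster that has a non-bottleneck neighbor
 * of it. A bottleneck adjacent only to other bottlenecks ends up in no cluster.
 */
function splitAtBottlenecks(
  nodes: readonly NodeId[],
  edges: readonly Edge[],
  minIncoming: number
): Set<NodeId>[] {
  const incoming = new Map<NodeId, number>();
  for (const [, to] of edges) {
    incoming.set(to, (incoming.get(to) ?? 0) + 1);
  }
  const bottlenecks = new Set<NodeId>(
    nodes.filter((n) => (incoming.get(n) ?? 0) >= minIncoming)
  );
  const clusters = naiveClusters(nodes, edges, bottlenecks);
  for (const [from, to] of edges) {
    for (const cluster of clusters) {
      if (bottlenecks.has(to) && !bottlenecks.has(from) && cluster.has(from)) {
        cluster.add(to);
      }
      if (bottlenecks.has(from) && !bottlenecks.has(to) && cluster.has(to)) {
        cluster.add(from);
      }
    }
  }
  return clusters;
}

// Usage: node 0 plays the role of a shared setup step used by two code parts.
const exampleNodes: NodeId[] = [0, 1, 2, 3, 4];
const exampleEdges: Edge[] = [[1, 0], [2, 1], [3, 0], [4, 3]];
const merged = naiveClusters(exampleNodes, exampleEdges); // one cluster with all five nodes
const split = splitAtBottlenecks(exampleNodes, exampleEdges, 2); // {0, 1, 2} and {0, 3, 4}
```

The fixed in-degree threshold is only a placeholder for the open question of what makes a node a "bottleneck"; a real criterion would probably also take the size of the surrounding cluster into account.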