[SPARK-6418] Add simple per-stage visualization to the UI [WIP] #5547

@ghost ghost commented Apr 17, 2015

Hello!

I am working on adding a graph visualization to the Spark stage page (screenshots for 10, 100, and 1000 task graphs attached). Below are some details about my implementation:

The x-axis is the time axis (with the launch time of the first task as t = 0) and the y-axis is the task # (pretty self-explanatory). At present, I'm limiting the max number of tasks displayed on the graph to 1000 but I intend to add functionality to select the graph task-range during the second iteration.
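A minimal sketch of the axis convention described above, not the code from this patch. The task shape (`index`, `launchTime`, `finishTime` in epoch milliseconds) and the function name `toRelativeTimeline` are assumptions made for illustration, as is the choice to take t = 0 from the earliest launch among the tasks actually shown.

```javascript
// Hypothetical task shape: { index, launchTime, finishTime }, times in epoch ms.
const MAX_TASKS_DISPLAYED = 1000; // the display cap mentioned in the description

function toRelativeTimeline(tasks, maxTasks = MAX_TASKS_DISPLAYED) {
  if (tasks.length === 0) return [];
  // Only the first maxTasks tasks are plotted.
  const shown = tasks.slice(0, maxTasks);
  // t = 0 is the earliest launch time among the tasks shown (an assumption).
  const t0 = shown.reduce((min, t) => Math.min(min, t.launchTime), Infinity);
  return shown.map((t) => ({
    index: t.index,             // y-axis: task number
    start: t.launchTime - t0,   // x-axis: ms since the first launch
    end: t.finishTime - t0,
  }));
}
```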

You can hover over a specific part of the graph to see a tool-tip indicating the task #, phase, and the time taken by that phase (pretty useful when there are a lot of tasks on the graph). Also, the x-axis unit is milliseconds (if the longest task takes less than a second), seconds (if the longest task takes more than a second and less than a minute), and so on for minutes and hours.
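The unit rule above could be sketched roughly as follows. The function name `pickTimeUnit` and the returned `{ label, divisor }` shape are hypothetical, and the handling of exact boundaries (for example, a longest task of exactly one second) is an assumption, since the description does not specify it.

```javascript
// Picks the x-axis unit from the duration of the longest task, in milliseconds.
// Exact boundary values fall into the larger unit (an assumption).
function pickTimeUnit(longestTaskMs) {
  const SECOND = 1000;
  const MINUTE = 60 * SECOND;
  const HOUR = 60 * MINUTE;
  if (longestTaskMs < SECOND) return { label: 'ms', divisor: 1 };
  if (longestTaskMs < MINUTE) return { label: 's', divisor: SECOND };
  if (longestTaskMs < HOUR) return { label: 'min', divisor: MINUTE };
  return { label: 'h', divisor: HOUR };
}
```

Dividing each time value by `divisor` would then give the number to show on the axis in the chosen unit.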

I also intend to add shuffle read and write times to the graph (once my midterms are over), and would love to get some suggestions, feedback, and criticism.

SCREENSHOTS

[Screenshots of the graph for 10, 100, and 1000 tasks]

@kayousterhout
Contributor

cc @andrewor14 @pwendell @sryza

@kayousterhout
Contributor

Jenkins, this is ok to test

@andrewor14
Contributor

ok to test

@SparkQA

SparkQA commented Apr 20, 2015

Test build #30610 has finished for PR 5547 at commit 7fac1eb.

  • This patch fails RAT tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@andrewor14
Contributor

@pwendell and I talked about this and #2342 a little bit offline. Our feeling is that this is a more elegant representation of task times than #2342, especially when there are many tasks within a stage. One concern I have, however, is what happens when you zoom (does it currently support zooming?). It would make little sense to zoom without keeping the axes, but my impression is that implementing this is pretty hard since we're directly using d3.

Bonus: It doesn't have to be part of this patch, but it would be really cool if there were a mode where we can align the breakdown of the task times along the vertical axis. Right now you can't really compare the serialization time of the first task with that of the last task, let alone track whether it has grown incrementally over time. Realistically we will implement this separately, say for 1.5, but I imagine this bonus feature is gonna be immensely useful.

@@ -0,0 +1,118 @@
function renderJobsGraphs(data) {
/* show visualization toggle */

Can you use 2 spaces for indents throughout all javascript files, instead of tab characters?

@pwendell
Contributor

Thanks a lot for submitting this. It is a cool feature - we'll need to think about whether we like this charting library vs the one in the timeline view PR. I am going to defer to @kayousterhout to give a more thorough review, but I mentioned a few things inline.

I am a bit concerned about the scalability here. I tried a job locally with 1000 tasks and it took more than 10 seconds to generate the graph. It would be good to explore what part takes a long time. I did some quick profiling and it looks like getOffsetHeight in the dimple library was the culprit... that may be tough to improve on.

Also, it might be nice to memoize the rendered graph in case someone opens and closes the tab multiple times.
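One way the memoization suggestion could look, as a sketch only: the helper name `memoizeByKey` and the idea of keying the cache by a caller-supplied string (such as a stage identifier) are assumptions, and the actual rendering function is left abstract.

```javascript
// Wraps an expensive render function so it runs at most once per key.
// Later calls with the same key return the cached result unchanged.
function memoizeByKey(render) {
  const cache = new Map();
  return function (key, data) {
    if (!cache.has(key)) {
      cache.set(key, render(data));
    }
    return cache.get(key);
  };
}
```

With a wrapper like this, reopening the visualization tab would reuse the result of the first render instead of rebuilding the graph, at the cost of showing stale output if the underlying data changes after the first call.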

@punya
Contributor

punya commented Apr 22, 2015

Given the increasing complexity of the status pages' UI logic, does it make sense to move from manually toggling CSS classes using jQuery to a modern single-page application framework such as Angular, React, or Ember?

@punya
Contributor

punya commented Apr 23, 2015

Also, if you're looking for a reasonable path to panning/zooming, you might want to take a look at http://plottablejs.org/.

@SparkQA

SparkQA commented Apr 24, 2015

Test build #30910 has finished for PR 5547 at commit 5c3a2a6.

  • This patch fails RAT tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@ghost
Author

ghost commented Apr 24, 2015

Thank you all for your feedback, and I apologize for my late reply (it’s been a rough week of midterms).

@pwendell - I’ve addressed all your inline comments (memoization, JavaScript indentation, JSON lists, etc.) in my latest commit. As for the load time of the graph, it has improved a bit after moving from string representations to JSON arrays, but only by a small factor.

When you say you’re skeptical about the graph scalability, what is the maximum number of tasks you want displayed on the graph? I’m thinking of keeping it to 1000 (at the most), and having the users select a task range if they want to view a different region of tasks (say tasks 1200-2000 for example).
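A rough sketch of the task-range selection idea. The function name `selectTaskRange`, the `index` field on each task, and the choice to treat both ends of the range as inclusive are assumptions for illustration.

```javascript
// Returns the tasks whose index lies in [first, last], capped at maxTasks,
// so a range such as 1200-2000 can be plotted without exceeding the limit.
function selectTaskRange(tasks, first, last, maxTasks = 1000) {
  if (first > last) {
    throw new RangeError('first must not exceed last');
  }
  return tasks
    .filter((t) => t.index >= first && t.index <= last)
    .slice(0, maxTasks);
}
```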

My reason for the above is that the task stages become too cluttered above a certain number, so it’s better to keep a limit, or alternatively, increase the max height of the graph (which would involve a lot more scrolling though).

@andrewor14 - The visualization doesn’t currently support zooming, and it will definitely be pretty challenging to implement it on top of D3.js. However, the task-range functionality I mentioned above can serve as a pseudo-zoom feature since a user can select a task range and hence zoom into the graph.

Also, breaking down the task times along the vertical axis shouldn’t be that difficult so we can definitely add that later on if required (provided this patch gets accepted haha).

@punya - I haven’t looked into using Angular/Ember yet, and I’ll definitely check out plottable.js.

@AmplabJenkins

Can one of the admins verify this patch?

@ghost
Author

ghost commented Apr 27, 2015

After some further research, I have come to the conclusion that if we want to display more than 1000 tasks on the graph, d3.js and libraries built on top of it aren't the best choice, since rendering that many SVG elements is bound to be slow above a certain limit.

@punya
Contributor

punya commented Apr 27, 2015

If there are over 1000 tasks, it seems like it would be more valuable to see

  • statistical information about all the tasks
  • details about a smaller subset chosen in some way (like a drill down view)

Rendering 10k tasks is possible using a canvas, but it's unclear to me what a user would do with that much density of information.
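As an illustration of the "statistical information" option, here is a sketch that condenses task durations into a five-number summary. The names `quantile` and `summarizeDurations`, the choice of statistics, and the nearest-rank quantile method are all assumptions; the thread does not say which statistics would be shown.

```javascript
// Nearest-rank quantile on an already sorted, non-empty array.
function quantile(sorted, q) {
  const rank = Math.ceil(q * sorted.length) - 1;
  const idx = Math.min(sorted.length - 1, Math.max(0, rank));
  return sorted[idx];
}

// Summarizes task durations (in ms) instead of drawing every task.
function summarizeDurations(durationsMs) {
  if (durationsMs.length === 0) return null;
  const sorted = [...durationsMs].sort((a, b) => a - b);
  return {
    min: sorted[0],
    p25: quantile(sorted, 0.25),
    median: quantile(sorted, 0.5),
    p75: quantile(sorted, 0.75),
    max: sorted[sorted.length - 1],
  };
}
```

A summary like this stays the same size no matter how many tasks a stage has, which is the property that a per-task SVG rendering lacks.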


@ghost
Author

ghost commented Apr 27, 2015

That's exactly my viewpoint but to be very honest, I haven't used Spark much so I'm not sure what the average use case is.

@ghost
Author

ghost commented Apr 29, 2015

The Spark administrators have decided to go forward with #2342, so I'm closing this pull request.

@ghost ghost closed this Apr 29, 2015
asfgit pushed a commit that referenced this pull request May 4, 2015
This patch adds the functionality to display the RDD DAG on the SparkUI.

This DAG describes the relationships between
- an RDD and its dependencies,
- an RDD and its operation scopes, and
- an RDD's operation scopes and the stage / job hierarchy

An operation scope here refers to the existing public APIs that created the RDDs (e.g. `textFile`, `treeAggregate`). In the future, we can expand this to include higher level operations like SQL queries.

*Note: This blatantly stole a few lines of HTML and JavaScript from #5547 (thanks shroffpradyumn!)*

Here's what the job page looks like:
<img src="https://issues.apache.org/jira/secure/attachment/12730286/job-page.png" width="700px"/>
and the stage page:
<img src="https://issues.apache.org/jira/secure/attachment/12730287/stage-page.png" width="300px"/>

Author: Andrew Or <andrew@databricks.com>

Closes #5729 from andrewor14/viz2 and squashes the following commits:

666c03b [Andrew Or] Round corners of RDD boxes on stage page (minor)
01ba336 [Andrew Or] Change RDD cache color to red (minor)
6f9574a [Andrew Or] Add tests for RDDOperationScope
1c310e4 [Andrew Or] Wrap a few more RDD functions in an operation scope
3ffe566 [Andrew Or] Restore "null" as default for RDD name
5fdd89d [Andrew Or] children -> child (minor)
0d07a84 [Andrew Or] Fix python style
afb98e2 [Andrew Or] Merge branch 'master' of github.com:apache/spark into viz2
0d7aa32 [Andrew Or] Fix python tests
3459ab2 [Andrew Or] Fix tests
832443c [Andrew Or] Merge branch 'master' of github.com:apache/spark into viz2
429e9e1 [Andrew Or] Display cached RDDs on the viz
b1f0fd1 [Andrew Or] Rename OperatorScope -> RDDOperationScope
31aae06 [Andrew Or] Extract visualization logic from listener
83f9c58 [Andrew Or] Implement a programmatic representation of operator scopes
5a7faf4 [Andrew Or] Rename references to viz scopes to viz clusters
ee33d52 [Andrew Or] Separate HTML generating code from listener
f9830a2 [Andrew Or] Refactor + clean up + document JS visualization code
b80cc52 [Andrew Or] Merge branch 'master' of github.com:apache/spark into viz2
0706992 [Andrew Or] Add link from jobs to stages
deb48a0 [Andrew Or] Translate stage boxes taking into account the width
5c7ce16 [Andrew Or] Connect RDDs across stages + update style
ab91416 [Andrew Or] Introduce visualization to the Job Page
5f07e9c [Andrew Or] Remove more return statements from scopes
5e388ea [Andrew Or] Fix line too long
43de96e [Andrew Or] Add parent IDs to StageInfo
6e2cfea [Andrew Or] Remove all return statements in `withScope`
d19c4da [Andrew Or] Merge branch 'master' of github.com:apache/spark into viz2
7ef957c [Andrew Or] Fix scala style
4310271 [Andrew Or] Merge branch 'master' of github.com:apache/spark into viz2
aa868a9 [Andrew Or] Ensure that HadoopRDD is actually serializable
c3bfcae [Andrew Or] Re-implement scopes using closures instead of annotations
52187fc [Andrew Or] Rat excludes
09d361e [Andrew Or] Add ID to node label (minor)
71281fa [Andrew Or] Embed the viz in the UI in a toggleable manner
8dd5af2 [Andrew Or] Fill in documentation + miscellaneous minor changes
fe7816f [Andrew Or] Merge branch 'master' of github.com:apache/spark into viz
205f838 [Andrew Or] Reimplement rendering with dagre-d3 instead of viz.js
5e22946 [Andrew Or] Merge branch 'master' of github.com:apache/spark into viz
6a7cdca [Andrew Or] Move RDD scope util methods and logic to its own file
494d5c2 [Andrew Or] Revert a few unintended style changes
9fac6f3 [Andrew Or] Re-implement scopes through annotations instead
f22f337 [Andrew Or] First working implementation of visualization with vis.js
2184348 [Andrew Or] Translate RDD information to dot file
5143523 [Andrew Or] Expose the necessary information in RDDInfo
a9ed4f9 [Andrew Or] Add a few missing scopes to certain RDD methods
6b3403b [Andrew Or] Scope all RDD methods
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request May 28, 2015
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request Jun 12, 2015
nemccarthy pushed a commit to nemccarthy/spark that referenced this pull request Jun 19, 2015
This pull request was closed.