[SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect #41964
@juliuszsompolski @grundprinzip Can you help review this PR? This builds off of #41443, so the commit to look at is fcf0ca6. Also cc @rednaxelafx @gengliangwang
Since Spark Connect is mostly about SQL, shall we show the SQL execution links in the UI?
Reviewed the connect service parts; didn't review the ui package.
@jasonli-db What is your opinion of this one?
Reviewed and LGTM on the connect part; didn't review details of the UI and state store part.
val executionIdOpt: Option[String] = Option(jobStart.properties)
  .flatMap { p => Option(p.getProperty(SQLExecution.EXECUTION_ID_KEY)) }
Discussed offline with @jasonli-db and @jdesjean: executions that don't start a Spark job (e.g. SELECT 1) will not be recorded. For that reason, as a follow-up, it would be best if jobTags could also be added to the SparkListenerSQLExecutionStart event.
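The null-safe lookup in the snippet under review can be tried in isolation. The following is a minimal sketch, not the PR's code: `ExecutionIdLookup` and `executionIdOf` are hypothetical names, and the key string is hard-coded on the assumption that it matches `SQLExecution.EXECUTION_ID_KEY`, so that the example has no Spark dependency.

```scala
import java.util.Properties

object ExecutionIdLookup {
  // Assumed to match SQLExecution.EXECUTION_ID_KEY; hard-coded to avoid a Spark dependency.
  val ExecutionIdKey = "spark.sql.execution.id"

  // Hypothetical helper mirroring the reviewed snippet: both the Properties object
  // and the property value may be null, so each one is wrapped in Option.
  def executionIdOf(properties: Properties): Option[String] =
    Option(properties).flatMap(p => Option(p.getProperty(ExecutionIdKey)))
}
```

A job started outside a SQL execution yields `None` here. An execution that never starts a job never reaches a job-start handler at all, which is the gap described in the comment above.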
@gengliangwang Finish time is when execution finishes; close time is when all the results have been transferred. Execution time is the time the query was executing, while duration is end to end. The distinction is meaningful for queries with large results, or e.g. complex queries that spend a lot of time in the optimizer. It's the same as in the Thriftserver UI.
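The two derived values in that explanation can be sketched as simple arithmetic on three timestamps. This is an illustration only: `ExecutionTimings`, `Timestamps`, and the field names are hypothetical and are not taken from the PR, and it assumes that both measurements start from the same start timestamp.

```scala
object ExecutionTimings {
  // Hypothetical record of one execution's timestamps, in epoch milliseconds.
  final case class Timestamps(startMs: Long, finishMs: Long, closeMs: Long)

  // Execution time: how long the query itself was executing (start to finish).
  def executionTimeMs(t: Timestamps): Long = t.finishMs - t.startMs

  // Duration: end to end, including transferring results to the client (start to close).
  def durationMs(t: Timestamps): Long = t.closeMs - t.startMs
}
```

For a query that finishes quickly but returns a large result, the duration can be much larger than the execution time, which is why both columns are shown.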
Thanks for catching this. It's been fixed.
@juliuszsompolski @gengliangwang I've updated the screenshots. I added the operationID and sparkSessionTags. For now, I kept the jobTag since it seems it could still be used, if only internally. I'll remove it if/when we come to a final consensus to do so. Please take a look!
@jasonli-db I have to say the current table is quite wide. Can we at least put the operationID column after the "State" column, since it is not commonly useful? A good way to resolve this is to have checkboxes and make operationID/sparkSessionTags/jobTag additional metrics.
  </span>
</td>
<td>
  {if (info.isExecutionActive) "RUNNING" else info.state}
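The state cell above collapses every still-active execution into a single "RUNNING" label. A minimal sketch of that display rule follows; the `State` hierarchy and the definition of `isExecutionActive` are assumptions made for illustration and do not reproduce the PR's actual state set.

```scala
object DisplayState {
  // Hypothetical subset of execution states; the real set lives in the PR's listener code.
  sealed trait State
  case object Started extends State
  case object Finished extends State
  case object Failed extends State

  // Assumption: an execution counts as active until it reaches a terminal state.
  def isExecutionActive(state: State): Boolean = state == Started

  // Mirrors the reviewed expression: active executions show "RUNNING",
  // everything else shows its own state name.
  def label(state: State): String =
    if (isExecutionActive(state)) "RUNNING" else state.toString
}
```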
LGTM, I think it makes more sense, thanks!
@jasonli-db Thanks for the work. I will merge the PR once the tests pass.
Thanks, merging to master.
@jasonli-db This PR has a conflict with branch-3.5.
## What changes were proposed in this pull request?
Add a new Spark UI page to display session and execution information for Spark Connect. This builds off the work in SPARK-43923 (apache#41443) that adds the relevant SparkListenerEvents and mirrors the ThriftServerPage in the Spark UI for JDBC/ODBC.
<img width="1709" alt="Screenshot 2023-07-27 at 11 29 22 PM" src="https://github.com/apache/spark/assets/65624911/934b7c69-3b44-460b-8fbb-36a9eb3f0798">
<img width="1716" alt="Screenshot 2023-07-27 at 11 29 15 PM" src="https://github.com/apache/spark/assets/65624911/33dbe6ab-44bf-49a5-ad4c-5ba4a476a1f0">
### Why are the changes needed?
This gives users a way to access session and execution information for Spark Connect via the UI and provides the frontend interface for the related SparkListenerEvents.
### Does this PR introduce _any_ user-facing change?
Yes, it will add a new tab/page in the Spark UI
### How was this patch tested?
Unit tests
Closes apache#41964 from jasonli-db/spark-connect-ui.
Authored-by: Jason Li <jason.li@databricks.com>
Signed-off-by: Gengliang Wang <gengliang@apache.org>
(cherry picked from commit f8786f0)
Thanks @gengliangwang. Since the conflict was trivial (imports), I resolved it and raised #42224 for branch-3.5.
## What changes were proposed in this pull request?
Add a new Spark UI page to display session and execution information for Spark Connect. This builds off the work in SPARK-43923 (#41443) that adds the relevant SparkListenerEvents and mirrors the ThriftServerPage in the Spark UI for JDBC/ODBC.
<img width="1709" alt="Screenshot 2023-07-27 at 11 29 22 PM" src="https://github.com/apache/spark/assets/65624911/934b7c69-3b44-460b-8fbb-36a9eb3f0798">
<img width="1716" alt="Screenshot 2023-07-27 at 11 29 15 PM" src="https://github.com/apache/spark/assets/65624911/33dbe6ab-44bf-49a5-ad4c-5ba4a476a1f0">
### Why are the changes needed?
This gives users a way to access session and execution information for Spark Connect via the UI and provides the frontend interface for the related SparkListenerEvents.
### Does this PR introduce _any_ user-facing change?
Yes, it will add a new tab/page in the Spark UI
### How was this patch tested?
Unit tests
Closes #41964 from jasonli-db/spark-connect-ui.
Authored-by: Jason Li <jason.li@databricks.com>
Signed-off-by: Gengliang Wang <gengliang@apache.org>
(cherry picked from commit f8786f0)
Closes #42224 from juliuszsompolski/SPARK-44394-3.5.
Authored-by: Jason Li <jason.li@databricks.com>
Signed-off-by: Gengliang Wang <gengliang@apache.org>
What changes were proposed in this pull request?
Add a new Spark UI page to display session and execution information for Spark Connect. This builds off the work in SPARK-43923 (#41443) that adds the relevant SparkListenerEvents and mirrors the ThriftServerPage in the Spark UI for JDBC/ODBC.
Why are the changes needed?
This gives users a way to access session and execution information for Spark Connect via the UI and provides the frontend interface for the related SparkListenerEvents.
Does this PR introduce any user-facing change?
Yes, it will add a new tab/page in the Spark UI.
How was this patch tested?
Unit tests