-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-46807][DOCS] Add automation notice to generated SQL error class docs #44847
Closed
nchammas
wants to merge
2
commits into
apache:master
from
nchammas:SPARK-46807-sql-error-automation-notice
Closed
[SPARK-46807][DOCS] Add automation notice to generated SQL error class docs #44847
nchammas
wants to merge
2
commits into
apache:master
from
nchammas:SPARK-46807-sql-error-automation-notice
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
I don't think an Apache license header is required for generated files. I can remove that if the committers would like. |
cc @dongjoon-hyun @HyukjinKwon - The build is passing, and I believe it includes the core tests that check the error docs. |
HyukjinKwon
approved these changes
Jan 23, 2024
Merged to master. |
4 tasks
cloud-fan
pushed a commit
that referenced
this pull request
Apr 19, 2024
### What changes were proposed in this pull request? Consolidate all error documentation into a single page of error states and conditions. Each condition and sub-condition will have a link anchor that allows for direct references. Here are some examples: ``` sql-error-conditions.html#cannot-update-field sql-error-conditions.html#cannot-update-field-array-type sql-error-conditions.html#cannot-update-field-interval-type sql-error-conditions.html#cannot-update-field-map-type ``` The table is styled to make it easier to read: - Sub-conditions are indented relative to their parent condition. - Long condition names like `UDTF_INVALID_ALIAS_IN_REQUESTED_ORDERING_STRING_FROM_ANALYZE_METHOD` wrap in a visually pleasing manner, _without_ interfering with how search engines will index those names or preventing users from copy-pasting the whole name as they would normally. The new documentation is generated dynamically via `docs/util/build-error-docs.py` as part of the documentation build. No generated files will be tracked in git. TODO: - [ ] Figure out what, if anything, we will do about the links we are breaking by deleting the old error pages. - [x] Make sure we are using the correct terminology once SPARK-46810 is resolved. - [x] Add opening prose for the main error conditions page. - [x] Update the main Jekyll build to generate the new documentation on the fly. ### Why are the changes needed? The current error documentation has several problems that make it difficult to maintain and difficult to use: 1. The current documentation consists of many Markdown files that are both programmatically generated and checked in to git. This combination leads to problems like the checked in files getting out of sync with the source they are generated from, and leads to unnecessary code churn and problems like the one that precipitated #44847. 2. The individual pages like `docs/sql-error-conditions-cannot-load-state-store-error-class.md` are sparse and don't add much value as standalone pages. They also clutter the top-level namespace. There are around 60 of these individual pages, one for each error condition that has sub-conditions. The number of pages to manage leads to other, more subtle problems as well. Though there are 60 individual pages, only 22 of them are captured in `docs/_data/menu-sql.yaml`. 3. The current process to generate this documentation is awkward. We are using a) a Spark Scala test that requires b) a special environment variable to be set so that c) we read some JSON, in order to d) generate Markdown files, which in turn e) get compiled into HTML. Fundamentally, we are just generating HTML from JSON. It seems inappropriate to couple this process to the Spark build. ### Does this PR introduce _any_ user-facing change? Yes, it overhauls the error documentation. Unless we add redirects, several links that work against the Spark 3.5 documentation will no longer work against the Spark 4.0 documentation. One developer-facing change included in this PR is that the main documentation build with Jekyll now depends on our Python development environment. This is in order to generate the error documentation, since that depends on our ability to render Markdown. Previously, only the SQL and Python API docs depended on the Python environment. ### How was this patch tested? I built the docs and reviewed the results in my browser. <img width="400" src="https://github.com/apache/spark/assets/1039369/effdeea9-a5f8-4471-873f-9746480e1b97"> <br> <img width="400" src="https://github.com/apache/spark/assets/1039369/278af3de-a6a3-4872-84ee-97371f92232e"> <br> <img width="400" src="https://github.com/apache/spark/assets/1039369/e11dc3bf-bfda-4d70-bd3d-605d48d06d26"> <br> <img width="400" src="https://github.com/apache/spark/assets/1039369/d935fdd7-af56-45f6-9d6b-0c5471b675b6"> ### Was this patch authored or co-authored using generative AI tooling? No. Closes #44971 from nchammas/SPARK-46935-revamp-error-docs. Authored-by: Nicholas Chammas <nicholas.chammas@gmail.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
What changes were proposed in this pull request?
Add an HTML comment to all generated SQL error class documents warning the reader that they should not edit the file.
Why are the changes needed?
This should help prevent problems like the one in #44825.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
I ran the following command and reviewed the modified documents:
SPARK_GENERATE_GOLDEN_FILES=1 build/sbt "core/testOnly *SparkThrowableSuite -- -t \"Error classes match with document\""
Was this patch authored or co-authored using generative AI tooling?
No.