
[SPARK-49941][CORE] Rename errorClass to condition in errors of the JSON format #48431

Open
wants to merge 3 commits into base: master
Conversation

MaxGekk (Member) commented Oct 12, 2024

What changes were proposed in this pull request?

In the PR, I propose to rename the errorClass field to condition in errors in the JSON formats MINIMAL and STANDARD.

For example:

{
  "condition" : "DIVIDE_BY_ZERO",
  "sqlState" : "22012",
  "messageParameters" : { "config" : "CONFIG"}
}
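To illustrate what the rename means for downstream consumers of these JSON payloads, here is a small sketch (hypothetical, not part of the PR; the actual Spark code is Scala) that reads an error in the new format and falls back to the legacy key for output produced before this change:

```python
import json

def parse_error(payload: str) -> dict:
    """Parse a Spark error JSON payload, accepting both the new
    "condition" key and the legacy "errorClass" key (pre-SPARK-49941)."""
    error = json.loads(payload)
    # Prefer the new key; fall back to the old one for older outputs.
    condition = error.get("condition", error.get("errorClass"))
    return {
        "condition": condition,
        "sqlState": error.get("sqlState"),
        "messageParameters": error.get("messageParameters", {}),
    }

new_style = '{"condition": "DIVIDE_BY_ZERO", "sqlState": "22012", "messageParameters": {"config": "CONFIG"}}'
old_style = '{"errorClass": "DIVIDE_BY_ZERO", "sqlState": "22012"}'
print(parse_error(new_style)["condition"])  # DIVIDE_BY_ZERO
print(parse_error(old_style)["condition"])  # DIVIDE_BY_ZERO
```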

Why are the changes needed?

To follow the new naming convention introduced by #44902.

Does this PR introduce any user-facing change?

Yes. Error outputs in the MINIMAL and STANDARD JSON formats now contain the field "condition" instead of "errorClass".

How was this patch tested?

By running the affected tests:

$ build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite"

Was this patch authored or co-authored using generative AI tooling?

No.

MaxGekk changed the title from "[WIP] Rename errorClass to condition in errors of the JSON format" to "[SPARK-49941][CORE] Rename errorClass to condition in errors of the JSON format" on Oct 12, 2024
MaxGekk marked this pull request as ready for review on October 12, 2024 at 19:19
MaxGekk (Member, Author) commented Oct 12, 2024

@srielau @panbingkun @nchammas @cloud-fan could you review the PR, please?

panbingkun (Contributor)

LGTM.
nit: Do we need to update the following?
[screenshot omitted]

MaxGekk (Member, Author) commented Oct 14, 2024

@panbingkun Thank you for the review.

  1. UIUtils: the name errorClass here is a regexp group name, not related to these changes.
  2. SQLJsonProtocolSuite: it is just an example in a test; it can handle the old input.

panbingkun (Contributor) commented Oct 15, 2024

  1. UIUtils

Yeah!

  • Yes, it is indeed the group name of a regular expression. It is not related to these changes, and calling it errorClass will not affect the final functionality.
    (PS: As a follow-up, I'm not sure if we need to rename it to condition to reduce misunderstanding, because from the UT it seems that its purpose is to obtain the value of the condition (in the past, it was called errorClass):
    val e1 = "Job aborted due to stage failure: Task 0 in stage 1.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1.0 (TID 1) (10.221.98.22 executor driver): org.apache.spark.SparkArithmeticException: [DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set \"spark.sql.ansi.enabled\" to \"false\" to bypass this error.\n== SQL (line 1, position 8) ==\nselect a/b from src\n ^^^\n\n\tat org.apache.spark.sql.errors.QueryExecutionErrors$.divideByZeroError(QueryExecutionErrors.scala:226)\n\tat org.apache.spark.sql.errors.QueryExecutionErrors.divideByZeroError(QueryExecutionErrors.scala)\n\tat org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(generated.java:54)\n\tat org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)\n\tat org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)\n\tat org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:388)\n\tat org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:890)\n\tat org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:890)\n\tat org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)\n\tat org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)\n\tat org.apache.spark.rdd.RDD.iterator(RDD.scala:328)\n\tat org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)\n\tat org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)\n\tat org.apache.spark.scheduler.Task.run(Task.scala:141)\n\tat org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:592)\n\tat org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1474)\n\tat org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:595)\n\tat 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:750)\n\nDriver stacktrace:"
    val cell1 = UIUtils.errorMessageCell(e1)
    assert(cell1 === <td>{"DIVIDE_BY_ZERO"}{UIUtils.detailsUINode(isMultiline = true, e1)}</td>)

    (Because after this PR, I believe that in the error log there should only be condition and no errorClass.)
  • Okay.
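The group-name question above can be illustrated with a short sketch. The pattern below is a hypothetical simplification of what is being discussed (the real UIUtils code is Scala and its pattern may differ): a named group called errorClass captures the bracketed condition from an exception message, and renaming the group to condition would change only the name used to retrieve the match, not the matching behavior.

```python
import re

# Hypothetical, simplified pattern: a named group ("errorClass") captures the
# bracketed error condition at the start of an exception message. Only the
# group NAME would change if it were renamed to "condition".
ERROR_CONDITION_RE = re.compile(r"\[(?P<errorClass>[A-Z][A-Z_]*)\]")

def extract_condition(message: str):
    # Return the bracketed condition, or None if the message has no such tag.
    m = ERROR_CONDITION_RE.search(message)
    return m.group("errorClass") if m else None

msg = "org.apache.spark.SparkArithmeticException: [DIVIDE_BY_ZERO] Division by zero."
print(extract_condition(msg))  # DIVIDE_BY_ZERO
```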

nchammas (Contributor) left a comment


LGTM, but is there a summary of why we are going with condition rather than errorCondition?

I remember seeing a discussion somewhere about this, but it's not on the ticket associated with this PR.
