perf: Add metric for time spent casting in native scan #919

andygrove · 2024-09-06T16:58:27Z

Which issue does this PR close?

N/A

Rationale for this change

Make it easy to see how much of the native ScanExec time is spent casting columns to different types (this usually means unpacking dictionaries).
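
To illustrate what "unpacking dictionaries" means here, the following is a minimal std-only sketch and not the Arrow or Comet implementation. The function name `unpack_dictionary` and the slice-based representation are assumptions made for illustration: a dictionary-encoded column stores small integer keys that index into a table of distinct values, and unpacking materializes one value per row.

```rust
/// Hypothetical helper (not an Arrow API): expands a dictionary-encoded
/// column into a plain column with one value per row.
///
/// `keys[i]` is an index into `values`, or `None` for a null row.
/// In this sketch an out-of-range key is also mapped to `None`.
fn unpack_dictionary<T: Clone>(keys: &[Option<usize>], values: &[T]) -> Vec<Option<T>> {
    keys.iter()
        .map(|&key| key.and_then(|k| values.get(k).cloned()))
        .collect()
}
```

Because every row is copied out of the dictionary, the cost grows with the number of rows rather than the number of distinct values, which is one reason it is useful to time this step separately.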

Example from TPC-DS q9:

DataFusion metrics in native explain output:

metrics=[
  output_rows=2097152, 
  elapsed_compute=21.847892ms, 
  cast_time=21.631731ms]

Full plan:

AggregateExec: mode=Partial, gby=[], aggr=[count, avg, avg], metrics=[output_rows=1, elapsed_compute=9.481194ms]
  ProjectionExec: expr=[col_1@1 as col_0, col_2@2 as col_1], metrics=[output_rows=400519, elapsed_compute=50.596µs]
    FilterExec: col_0@0 IS NOT NULL AND col_0@0 >= 81 AND col_0@0 <= 100, metrics=[output_rows=400519, elapsed_compute=4.753725ms]
      ScanExec: source=[CometScan parquet  (unknown)], schema=[col_0: Int32, col_1: Decimal128(7, 2), col_2: Decimal128(7, 2)], metrics=[output_rows=2097152, elapsed_compute=21.847892ms, cast_time=21.631731ms]
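
In the ScanExec metrics above, cast_time (21.631731ms) is about 99% of elapsed_compute (21.847892ms). The sketch below shows one way such a timer could accumulate wall-clock time around each cast. It uses only the standard library; `CastTimer`, `time`, and `fraction_of` are hypothetical names for illustration and are not the types used in scan.rs.

```rust
use std::time::{Duration, Instant};

/// Hypothetical accumulator for time spent in cast operations.
#[derive(Debug, Default)]
struct CastTimer {
    total: Duration,
}

impl CastTimer {
    /// Runs `f`, adds its wall-clock duration to the running total,
    /// and returns whatever `f` returned.
    fn time<T>(&mut self, f: impl FnOnce() -> T) -> T {
        let start = Instant::now();
        let out = f();
        self.total += start.elapsed();
        out
    }

    /// Fraction of `elapsed_compute` attributable to casting.
    /// Returns 0.0 when `elapsed_compute` is zero.
    fn fraction_of(&self, elapsed_compute: Duration) -> f64 {
        if elapsed_compute.is_zero() {
            0.0
        } else {
            self.total.as_secs_f64() / elapsed_compute.as_secs_f64()
        }
    }
}
```

Wrapping each per-batch cast in `timer.time(|| ...)` keeps the measurement at batch granularity, so the timing overhead is not paid per row.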

What changes are included in this PR?

How are these changes tested?

viirya · 2024-09-09T22:43:06Z

spark/src/main/scala/org/apache/spark/sql/comet/CometMetricNode.scala

+  def scanMetrics(sc: SparkContext): Map[String, SQLMetric] = {
+    Map(
+      "cast_time" ->
+        SQLMetrics.createNanoTimingMetric(sc, "Total time for casting arrays"))


"arrays" in "casting arrays" may be confused with the Spark array type, so the metric could be read as time spent casting Spark arrays.

Maybe "casting columns"?

comphead · 2024-09-09T23:09:07Z

native/core/src/execution/operators/scan.rs

@@ -345,16 +347,20 @@ struct ScanStream<'a> {
    baseline_metrics: BaselineMetrics,
    /// Cast options
    cast_options: CastOptions<'a>,
+    /// Timer for cast operations


Is this elapsed time?

comphead · 2024-09-09T23:20:25Z

spark/src/main/scala/org/apache/spark/sql/comet/CometScanExec.scala

@@ -198,9 +198,12 @@ case class CometScanExec(
    // Tracking scan time has overhead, we can't afford to do it for each row, and can only do
    // it for each batch.
    if (supportsColumnar) {
-      Some("scanTime" -> SQLMetrics.createNanoTimingMetric(sparkContext, "scan time"))
+      Seq(


Should it be a Map instead of a Seq?

andygrove · 2024-09-10T21:12:08Z

@viirya @comphead I have addressed feedback. Thanks.

comphead

lgtm thanks @andygrove

add native metric for time spent casting in native scan

fb923a6

andygrove marked this pull request as draft September 6, 2024 16:58

andygrove added 3 commits September 6, 2024 11:59

save progress

4e8c841

save progress

42a0acb

fix test

609e0bb

andygrove marked this pull request as ready for review September 9, 2024 20:54

andygrove requested review from comphead and viirya September 9, 2024 22:29

viirya reviewed Sep 9, 2024

comphead reviewed Sep 9, 2024

address feedback

a3fd3f8

andygrove changed the title ~~perf: Add native metric for time spent casting in native scan~~ perf: Ad metric for time spent casting in native scan Sep 10, 2024

andygrove changed the title ~~perf: Ad metric for time spent casting in native scan~~ perf: Add metric for time spent casting in native scan Sep 10, 2024

viirya approved these changes Sep 10, 2024

comphead approved these changes Sep 10, 2024

andygrove merged commit c0ec49b into apache:main Sep 10, 2024
76 checks passed

andygrove deleted the scan-cast-timer branch September 10, 2024 23:22
