Custom aggregation with doric syntax #312

Merged
alfonsorr merged 3 commits into hablapps:main from feature/custom_typed_aggregation
Jan 24, 2023

Conversation

alfonsorr
Member

Description

This pull request makes it possible to define custom aggregation functions using doric syntax.

Example: a mean implemented with customAgg in doric

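// Mean of the "id" column. The buffer is a struct holding (running sum, running count).
// The struct fields are unnamed, so Spark gives them the default names col1 and col2.
val complexAggWithoutNames = customAgg[Long, Row, Double](
  col[Long]("id"),          // input column
  struct(lit(0L), lit(0L)), // zero value: sum = 0, count = 0
  // update: fold one input value y into the buffer x
  (x, y) =>
    struct(
      x.getChild[Long]("col1") + y,
      x.getChild[Long]("col2") + 1L.lit
    ),
  // merge: combine two partial buffers x and y
  (x, y) =>
    struct(
      x.getChild[Long]("col1") + y.getChild[Long]("col1"),
      x.getChild[Long]("col2") + y.getChild[Long]("col2")
    ),
  // result: sum divided by count
  x => x.getChild[Long]("col1") / x.getChild[Long]("col2")
)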
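
To make the roles of the four arguments in the example above easier to follow, here is a minimal plain-Scala model of the same mean computation. It is only a sketch and not doric or Spark code: `MeanModel`, `Buffer`, `update`, `merge`, `finish`, and `mean` are hypothetical names chosen for illustration, and the tuple stands in for the `(col1, col2)` struct buffer.

```scala
// Plain-Scala model of the zero / update / merge / result pieces.
// Hypothetical names; this does not use or reproduce doric's API.
object MeanModel {
  // Buffer mirrors the struct in the example: (running sum, running count).
  type Buffer = (Long, Long)

  // Zero value, the counterpart of struct(lit(0L), lit(0L)).
  val zero: Buffer = (0L, 0L)

  // Fold one input value into a partial buffer.
  def update(buffer: Buffer, value: Long): Buffer =
    (buffer._1 + value, buffer._2 + 1L)

  // Combine two partial buffers, for example from different partitions.
  def merge(left: Buffer, right: Buffer): Buffer =
    (left._1 + right._1, left._2 + right._2)

  // Turn the final buffer into the result: sum / count.
  def finish(buffer: Buffer): Double =
    buffer._1.toDouble / buffer._2

  // Simulate the aggregation over values split into partitions:
  // each partition is folded with update, then the partials are merged.
  def mean(partitions: Seq[Seq[Long]]): Double =
    finish(partitions.map(_.foldLeft(zero)(update)).foldLeft(zero)(merge))
}
```

For instance, `MeanModel.mean(Seq(Seq(1L, 2L), Seq(3L, 4L)))` folds each inner sequence separately and then merges the two partial buffers, which is why the merge function has to be supplied alongside the update function.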

Related Issues and dependencies

How Has This Been Tested?

The tests cover both simple and complex zero values.

  • This pull request contains the appropriate tests?:
    • YES

@alfonsorr alfonsorr self-assigned this Dec 19, 2022
@alfonsorr alfonsorr requested a review from a team as a code owner December 19, 2022 14:10
@github-actions github-actions bot added the spark_2.4, spark_3.0, spark_3.1, spark_3.2, and spark_3.3 labels Dec 19, 2022
@github-actions

github-actions bot commented Dec 19, 2022

:octocat: This is an auto-generated comment created by:

  • Date : 2023-01-24 09:39:49 +0000 (UTC)
  • Workflow : PR comment
  • Job name : create_test_summary_report
  • Run : 3994850431
  • Commit : a9308e7 Added documentation to new functionality
Triggered by: eruizalo (actor, triggering actor, and sender)

Test summary report 📊

Spark version   Testing
2.4.1           588 passed, 2 skipped
2.4.2           588 passed, 2 skipped
2.4.3           588 passed, 2 skipped
2.4.4           588 passed, 2 skipped
2.4.5           588 passed, 2 skipped
2.4.6           589 passed, 2 skipped
2.4.7           589 passed, 2 skipped
2.4             589 passed, 2 skipped
3.0.0           621 passed, 2 skipped
3.0.1           621 passed, 2 skipped
3.0.2           621 passed, 2 skipped
3.0             621 passed, 2 skipped
3.1.0           649 passed, 2 skipped
3.1.1           649 passed, 2 skipped
3.1.2           649 passed, 2 skipped
3.1             649 passed, 2 skipped
3.2.0           653 passed, 2 skipped
3.2.1           653 passed, 2 skipped
3.2             653 passed, 2 skipped
3.3.0           653 passed, 2 skipped
3.3             653 passed, 2 skipped

Collaborator

@eruizalo eruizalo left a comment

This is so powerful that I think we should advertise it as it deserves, with some documentation.
Otherwise it could sink into oblivion.

@codecov

codecov bot commented Dec 20, 2022

Codecov Report

Merging #312 (a9308e7) into main (c78276a) will increase coverage by 0.07%.
The diff coverage is 100.00%.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #312      +/-   ##
==========================================
+ Coverage   97.35%   97.42%   +0.07%     
==========================================
  Files          58       60       +2     
  Lines        1134     1163      +29     
  Branches       22       14       -8     
==========================================
+ Hits         1104     1133      +29     
  Misses         30       30              
Flag          Coverage Δ
spark-2.4.x   94.69% <0.00%> (-0.19%) ⬇️
spark-3.0.x   96.48% <0.00%> (-0.18%) ⬇️
spark-3.1.x   97.29% <0.00%> (-0.18%) ⬇️
spark-3.2.x   97.53% <100.00%> (+0.06%) ⬆️
spark-3.3.x   ?

Flags with carried forward coverage won't be shown.

Impacted Files                                          Coverage Δ
...c/main/scala/doric/syntax/AggregationColumns.scala   100.00% <ø> (ø)
core/src/main/scala/doric/doric.scala                   94.74% <100.00%> (+0.62%) ⬆️
...3.2_3.3/scala/doric/sqlExpressions/CustomAgg.scala   100.00% <100.00%> (ø)
..._3.3/scala/doric/syntax/AggregationColumns32.scala   100.00% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update c78276a...a9308e7.

@alfonsorr alfonsorr force-pushed the feature/custom_typed_aggregation branch from f2590dd to 6a636d7 Compare December 22, 2022 12:53
@eruizalo eruizalo force-pushed the feature/custom_typed_aggregation branch from 6a636d7 to c6faf3f Compare January 22, 2023 13:09
@alfonsorr alfonsorr enabled auto-merge (squash) January 23, 2023 14:11
@eruizalo eruizalo force-pushed the feature/custom_typed_aggregation branch from c6faf3f to a9308e7 Compare January 24, 2023 09:28
@alfonsorr alfonsorr merged commit 6893276 into hablapps:main Jan 24, 2023
Labels
spark_2.4, spark_3.0, spark_3.1, spark_3.2, spark_3.3
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[Feature request]: User defined aggregations using doric syntax
2 participants