BigDecimalBuilder and arithmetic operations. #9950

Merged: 54 commits, Jun 4, 2024

@GregoryTravis GregoryTravis commented May 14, 2024

Pull Request Description

Implements BigDecimalBuilder and arithmetic operations.

Important Notes

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

  • The documentation has been updated, if necessary.
  • Screenshots/screencasts have been attached, if there are any visual changes. For interactive or animated visual changes, a screencast is preferred.
  • All code follows the Scala, Java, TypeScript, and Rust style guides. In case you are using a language not listed above, follow the Rust style guide.
  • Unit tests have been written where possible.

    _ -> x

java_to_enso x = case x of
    _ : BigDecimal -> Decimal.Value x
Member

java_to_enso is a conversion function that wraps java.math.BigDecimal into Enso's Decimal by calling the Value constructor. That's correct wrapping. What kind of problem is associated with it? Are you searching for alternatives?

You can call the factory in Java, if you want. If you create a method:

public static Object factory(Function<BigDecimal,Object> f) {
  return f.apply(BigDecimal.valueOf(1));
}

and call it from Enso as

IO.println <| factory Decimal.Value

then you should get Decimal.Value 1. Other than this, I am not sure what to do/change.

@@ -651,6 +651,8 @@ add_specs suite_builder =

Decimal.new "12" . div (Decimal.new "0") . should_fail_with Arithmetic_Error

((Decimal.new "1") / (Decimal.new "3")) . should_fail_with Arithmetic_Error
Member

Why does it fail? I don't understand.

Contributor Author

This fails because Decimal defaults to unlimited precision unless you pass an explicit Math_Context value, and 1/3 has a non-terminating decimal expansion, so there is no exact result to return.

One solution might be to use a Math_Context by default, but that could be unexpected and would lose some of the value of Decimal, which is that it attempts to handle precision automatically. I think we'll have to use this in anger before we can decide what the proper defaults should be.
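The behaviour described here can be reproduced directly with java.math.BigDecimal, which Decimal wraps. The sketch below is a minimal illustration, assuming that Enso's Math_Context plays the same role as Java's MathContext; the class and method names are hypothetical and are not taken from the PR.

```java
import java.math.BigDecimal;
import java.math.MathContext;

public class DecimalDivisionDemo {
  // Divide with unlimited precision. Throws ArithmeticException when the
  // exact quotient has a non-terminating decimal expansion (for example 1/3).
  public static BigDecimal divideExact(String a, String b) {
    return new BigDecimal(a).divide(new BigDecimal(b));
  }

  // Divide with a bounded number of significant digits.
  // MathContext(int) uses the HALF_UP rounding mode.
  public static BigDecimal divideRounded(String a, String b, int precision) {
    return new BigDecimal(a).divide(new BigDecimal(b), new MathContext(precision));
  }

  public static void main(String[] args) {
    // 1/4 terminates, so the exact division succeeds.
    System.out.println(divideExact("1", "4"));
    // 1/3 succeeds once the precision is limited.
    System.out.println(divideRounded("1", "3", 5));
    // 1/3 with unlimited precision has no exact representation.
    try {
      divideExact("1", "3");
    } catch (ArithmeticException e) {
      System.out.println("ArithmeticException: " + e.getMessage());
    }
  }
}
```

The trade-off mentioned above shows up here: a default precision would make the last call succeed, but it would silently round results that callers may expect to be exact.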

Member

I see. I don't think it's a problem for it to fail by default, I just asked because I did not understand.

What is the error message in that case? I think we should try to ensure the error message is clear and ideally proposes the fix (use Math_Context to limit precision). Do you think that would be possible?

Contributor Author

Done.
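One way such a message could be produced on the Java side is sketched below. This is only an illustration of the idea of pointing the user at Math_Context: the class name, the method, and the message text are all hypothetical, and the wording actually adopted in the PR is not shown in this conversation.

```java
import java.math.BigDecimal;

public class HelpfulDivision {
  // Hypothetical message text, not the wording used in the PR.
  public static final String HINT =
      "Non-terminating decimal expansion: use a Math_Context to limit precision.";

  // Exact division that replaces the generic failure with a hint about the fix.
  public static BigDecimal divide(BigDecimal a, BigDecimal b) {
    if (b.signum() == 0) {
      // Division by zero is a different problem, so it is reported separately.
      throw new ArithmeticException("Division by zero");
    }
    try {
      return a.divide(b);
    } catch (ArithmeticException e) {
      // With a non-zero divisor, the usual cause of failure is a quotient
      // that has no finite decimal representation.
      throw new ArithmeticException(HINT);
    }
  }

  public static void main(String[] args) {
    System.out.println(divide(new BigDecimal("1"), new BigDecimal("8")));
    try {
      divide(BigDecimal.ONE, new BigDecimal("3"));
    } catch (ArithmeticException e) {
      System.out.println(e.getMessage());
    }
  }
}
```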

Member

@radeusgd radeusgd left a comment

Looks good overall, but let's see how much the benchmarks are affected.

@GregoryTravis
Contributor Author

The benchmark comparison below shows some overhead for Auto (untyped) columns. When we know that the column is not a Decimal column, we can skip the enso_to_java call. For Auto columns we don't know, so there is a moderate increase. (There are also some large increases here and there across individual runs, but we can likely attribute these to specialization differences.)

Before:

==> develop-1-out <==
[info] Benchmark                                                        Mode  Cnt    Score    Error  Units
[info] Column_from_vector_1000000.Floats_type_Auto                      avgt   10  247.595 ± 18.008  ms/op
[info] Column_from_vector_1000000.Floats_type_Float                     avgt   10   31.470 ±  4.074  ms/op
[info] Column_from_vector_1000000.Integers_type_Auto                    avgt   10   69.273 ±  4.829  ms/op
[info] Column_from_vector_1000000.Integers_type_Float                   avgt   10   31.690 ±  3.473  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_64_bit          avgt   10   64.998 ±  8.192  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_checked_16_bit  avgt   10   37.326 ±  4.339  ms/op
[info] Benchmark results reported into /Users/gmt/dev/enso/enso/std-bits/benchmarks/./bench-report.xml
[success] Total time: 438 s (07:18), completed May 31, 2024, 11:54:02 AM

==> develop-2-out <==
[info] Benchmark                                                        Mode  Cnt    Score    Error  Units
[info] Column_from_vector_1000000.Floats_type_Auto                      avgt   10   75.944 ±  8.773  ms/op
[info] Column_from_vector_1000000.Floats_type_Float                     avgt   10   31.072 ±  4.970  ms/op
[info] Column_from_vector_1000000.Integers_type_Auto                    avgt   10  241.330 ± 20.980  ms/op
[info] Column_from_vector_1000000.Integers_type_Float                   avgt   10   30.721 ±  1.055  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_64_bit          avgt   10   60.078 ±  2.408  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_checked_16_bit  avgt   10   36.734 ±  0.363  ms/op
[info] Benchmark results reported into /Users/gmt/dev/enso/enso/std-bits/benchmarks/./bench-report.xml
[success] Total time: 454 s (07:34), completed May 31, 2024, 12:01:45 PM

==> develop-3-out <==
[info] Benchmark                                                        Mode  Cnt   Score   Error  Units
[info] Column_from_vector_1000000.Floats_type_Auto                      avgt   10  68.399 ± 5.905  ms/op
[info] Column_from_vector_1000000.Floats_type_Float                     avgt   10  31.594 ± 3.769  ms/op
[info] Column_from_vector_1000000.Integers_type_Auto                    avgt   10  67.762 ± 8.037  ms/op
[info] Column_from_vector_1000000.Integers_type_Float                   avgt   10  31.338 ± 3.247  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_64_bit          avgt   10  60.283 ± 2.907  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_checked_16_bit  avgt   10  34.154 ± 0.773  ms/op
[info] Benchmark results reported into /Users/gmt/dev/enso/enso/std-bits/benchmarks/./bench-report.xml
[success] Total time: 456 s (07:36), completed May 31, 2024, 12:09:29 PM

==> develop-4-out <==
[info] Benchmark                                                        Mode  Cnt    Score    Error  Units
[info] Column_from_vector_1000000.Floats_type_Auto                      avgt   10  243.994 ± 24.811  ms/op
[info] Column_from_vector_1000000.Floats_type_Float                     avgt   10   29.229 ±  1.239  ms/op
[info] Column_from_vector_1000000.Integers_type_Auto                    avgt   10   65.617 ±  4.283  ms/op
[info] Column_from_vector_1000000.Integers_type_Float                   avgt   10   30.766 ±  1.230  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_64_bit          avgt   10   60.589 ±  2.214  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_checked_16_bit  avgt   10   36.111 ±  0.419  ms/op
[info] Benchmark results reported into /Users/gmt/dev/enso/enso/std-bits/benchmarks/./bench-report.xml
[success] Total time: 433 s (07:13), completed May 31, 2024, 12:16:49 PM

After:

==> bd-builder-1-out <==
[info] Benchmark                                                        Mode  Cnt    Score    Error  Units
[info] Column_from_vector_1000000.Floats_type_Auto                      avgt   10   79.058 ±  6.305  ms/op
[info] Column_from_vector_1000000.Floats_type_Float                     avgt   10   29.113 ±  1.639  ms/op
[info] Column_from_vector_1000000.Integers_type_Auto                    avgt   10  301.773 ±  8.683  ms/op
[info] Column_from_vector_1000000.Integers_type_Float                   avgt   10   31.930 ±  2.706  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_64_bit          avgt   10   64.694 ± 19.604  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_checked_16_bit  avgt   10   35.522 ±  4.283  ms/op
[info] Benchmark results reported into /Users/gmt/dev/enso/enso/std-bits/benchmarks/./bench-report.xml
[success] Total time: 436 s (07:16), completed May 31, 2024, 11:12:04 AM

==> bd-builder-2-out <==
[info] Benchmark                                                        Mode  Cnt   Score    Error  Units
[info] Column_from_vector_1000000.Floats_type_Auto                      avgt   10  99.010 ± 13.458  ms/op
[info] Column_from_vector_1000000.Floats_type_Float                     avgt   10  32.937 ±  4.830  ms/op
[info] Column_from_vector_1000000.Integers_type_Auto                    avgt   10  90.345 ± 12.654  ms/op
[info] Column_from_vector_1000000.Integers_type_Float                   avgt   10  31.036 ±  0.938  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_64_bit          avgt   10  60.885 ±  5.146  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_checked_16_bit  avgt   10  36.196 ±  2.464  ms/op
[info] Benchmark results reported into /Users/gmt/dev/enso/enso/std-bits/benchmarks/./bench-report.xml
[success] Total time: 458 s (07:38), completed May 31, 2024, 11:19:51 AM

==> bd-builder-3-out <==
[info] Benchmark                                                        Mode  Cnt    Score    Error  Units
[info] Column_from_vector_1000000.Floats_type_Auto                      avgt   10  100.144 ± 13.658  ms/op
[info] Column_from_vector_1000000.Floats_type_Float                     avgt   10   33.181 ±  3.559  ms/op
[info] Column_from_vector_1000000.Integers_type_Auto                    avgt   10   88.074 ± 10.780  ms/op
[info] Column_from_vector_1000000.Integers_type_Float                   avgt   10   36.510 ±  9.231  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_64_bit          avgt   10   66.077 ±  9.958  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_checked_16_bit  avgt   10   49.034 ± 28.563  ms/op
[info] Benchmark results reported into /Users/gmt/dev/enso/enso/std-bits/benchmarks/./bench-report.xml
[success] Total time: 456 s (07:36), completed May 31, 2024, 11:27:35 AM

==> bd-builder-4-out <==
[info] Benchmark                                                        Mode  Cnt    Score   Error  Units
[info] Column_from_vector_1000000.Floats_type_Auto                      avgt   10   97.633 ± 8.217  ms/op
[info] Column_from_vector_1000000.Floats_type_Float                     avgt   10   30.699 ± 1.601  ms/op
[info] Column_from_vector_1000000.Integers_type_Auto                    avgt   10  251.654 ± 6.396  ms/op
[info] Column_from_vector_1000000.Integers_type_Float                   avgt   10   37.055 ± 6.273  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_64_bit          avgt   10   64.939 ± 8.321  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_checked_16_bit  avgt   10   39.428 ± 5.024  ms/op
[info] Benchmark results reported into /Users/gmt/dev/enso/enso/std-bits/benchmarks/./bench-report.xml
[success] Total time: 448 s (07:28), completed May 31, 2024, 11:35:13 AM

@GregoryTravis
Contributor Author

I also followed @JaroslavTulach's suggestion to try doing the conversion in Java rather than Enso. This was consistently slower for Auto columns, because of the overhead involved in determining if the value is a Decimal.

Conversion in Enso:

==> bd-builder-1-out <==
[info] Benchmark                                                        Mode  Cnt    Score    Error  Units
[info] Column_from_vector_1000000.Floats_type_Auto                      avgt   10   79.058 ±  6.305  ms/op
[info] Column_from_vector_1000000.Floats_type_Float                     avgt   10   29.113 ±  1.639  ms/op
[info] Column_from_vector_1000000.Integers_type_Auto                    avgt   10  301.773 ±  8.683  ms/op
[info] Column_from_vector_1000000.Integers_type_Float                   avgt   10   31.930 ±  2.706  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_64_bit          avgt   10   64.694 ± 19.604  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_checked_16_bit  avgt   10   35.522 ±  4.283  ms/op
[info] Benchmark results reported into /Users/gmt/dev/enso/enso/std-bits/benchmarks/./bench-report.xml
[success] Total time: 436 s (07:16), completed May 31, 2024, 11:12:04 AM

==> bd-builder-2-out <==
[info] Benchmark                                                        Mode  Cnt   Score    Error  Units
[info] Column_from_vector_1000000.Floats_type_Auto                      avgt   10  99.010 ± 13.458  ms/op
[info] Column_from_vector_1000000.Floats_type_Float                     avgt   10  32.937 ±  4.830  ms/op
[info] Column_from_vector_1000000.Integers_type_Auto                    avgt   10  90.345 ± 12.654  ms/op
[info] Column_from_vector_1000000.Integers_type_Float                   avgt   10  31.036 ±  0.938  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_64_bit          avgt   10  60.885 ±  5.146  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_checked_16_bit  avgt   10  36.196 ±  2.464  ms/op
[info] Benchmark results reported into /Users/gmt/dev/enso/enso/std-bits/benchmarks/./bench-report.xml
[success] Total time: 458 s (07:38), completed May 31, 2024, 11:19:51 AM

==> bd-builder-3-out <==
[info] Benchmark                                                        Mode  Cnt    Score    Error  Units
[info] Column_from_vector_1000000.Floats_type_Auto                      avgt   10  100.144 ± 13.658  ms/op
[info] Column_from_vector_1000000.Floats_type_Float                     avgt   10   33.181 ±  3.559  ms/op
[info] Column_from_vector_1000000.Integers_type_Auto                    avgt   10   88.074 ± 10.780  ms/op
[info] Column_from_vector_1000000.Integers_type_Float                   avgt   10   36.510 ±  9.231  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_64_bit          avgt   10   66.077 ±  9.958  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_checked_16_bit  avgt   10   49.034 ± 28.563  ms/op
[info] Benchmark results reported into /Users/gmt/dev/enso/enso/std-bits/benchmarks/./bench-report.xml
[success] Total time: 456 s (07:36), completed May 31, 2024, 11:27:35 AM

==> bd-builder-4-out <==
[info] Benchmark                                                        Mode  Cnt    Score   Error  Units
[info] Column_from_vector_1000000.Floats_type_Auto                      avgt   10   97.633 ± 8.217  ms/op
[info] Column_from_vector_1000000.Floats_type_Float                     avgt   10   30.699 ± 1.601  ms/op
[info] Column_from_vector_1000000.Integers_type_Auto                    avgt   10  251.654 ± 6.396  ms/op
[info] Column_from_vector_1000000.Integers_type_Float                   avgt   10   37.055 ± 6.273  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_64_bit          avgt   10   64.939 ± 8.321  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_checked_16_bit  avgt   10   39.428 ± 5.024  ms/op
[info] Benchmark results reported into /Users/gmt/dev/enso/enso/std-bits/benchmarks/./bench-report.xml
[success] Total time: 448 s (07:28), completed May 31, 2024, 11:35:13 AM

Conversion in Java:

==> unwrap-1-out <==
[info] Benchmark                                                        Mode  Cnt    Score    Error  Units
[info] Column_from_vector_1000000.Floats_type_Auto                      avgt   10  104.597 ±  8.526  ms/op
[info] Column_from_vector_1000000.Floats_type_Float                     avgt   10   29.955 ±  3.730  ms/op
[info] Column_from_vector_1000000.Integers_type_Auto                    avgt   10  488.053 ± 45.333  ms/op
[info] Column_from_vector_1000000.Integers_type_Float                   avgt   10   30.483 ±  2.630  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_64_bit          avgt   10   62.697 ±  7.275  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_checked_16_bit  avgt   10   36.915 ±  3.878  ms/op
[info] Benchmark results reported into /Users/gmt/dev/enso/enso/std-bits/benchmarks/./bench-report.xml
[success] Total time: 468 s (07:48), completed May 31, 2024, 10:31:26 AM

==> unwrap-2-out <==
[info] Benchmark                                                        Mode  Cnt    Score   Error  Units
[info] Column_from_vector_1000000.Floats_type_Auto                      avgt   10  142.805 ± 3.764  ms/op
[info] Column_from_vector_1000000.Floats_type_Float                     avgt   10   28.391 ± 0.933  ms/op
[info] Column_from_vector_1000000.Integers_type_Auto                    avgt   10  142.432 ± 1.591  ms/op
[info] Column_from_vector_1000000.Integers_type_Float                   avgt   10   29.657 ± 5.220  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_64_bit          avgt   10   61.872 ± 5.949  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_checked_16_bit  avgt   10   37.365 ± 7.797  ms/op
[info] Benchmark results reported into /Users/gmt/dev/enso/enso/std-bits/benchmarks/./bench-report.xml
[success] Total time: 453 s (07:33), completed May 31, 2024, 10:39:09 AM

==> unwrap-3-out <==
[info] Benchmark                                                        Mode  Cnt    Score    Error  Units
[info] Column_from_vector_1000000.Floats_type_Auto                      avgt   10  160.711 ± 34.721  ms/op
[info] Column_from_vector_1000000.Floats_type_Float                     avgt   10   30.132 ±  1.331  ms/op
[info] Column_from_vector_1000000.Integers_type_Auto                    avgt   10  146.983 ±  5.286  ms/op
[info] Column_from_vector_1000000.Integers_type_Float                   avgt   10   34.722 ±  6.795  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_64_bit          avgt   10   61.491 ±  7.501  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_checked_16_bit  avgt   10   35.412 ±  0.756  ms/op
[info] Benchmark results reported into /Users/gmt/dev/enso/enso/std-bits/benchmarks/./bench-report.xml
[success] Total time: 454 s (07:34), completed May 31, 2024, 10:46:51 AM

==> unwrap-4-out <==
[info] Benchmark                                                        Mode  Cnt    Score    Error  Units
[info] Column_from_vector_1000000.Floats_type_Auto                      avgt   10  174.826 ± 47.514  ms/op
[info] Column_from_vector_1000000.Floats_type_Float                     avgt   10   31.441 ±  4.557  ms/op
[info] Column_from_vector_1000000.Integers_type_Auto                    avgt   10  192.710 ± 40.308  ms/op
[info] Column_from_vector_1000000.Integers_type_Float                   avgt   10   31.303 ±  0.687  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_64_bit          avgt   10   62.761 ±  4.445  ms/op
[info] Column_from_vector_1000000.Integers_type_Integer_checked_16_bit  avgt   10   37.496 ±  2.895  ms/op
[info] Benchmark results reported into /Users/gmt/dev/enso/enso/std-bits/benchmarks/./bench-report.xml
[success] Total time: 460 s (07:40), completed May 31, 2024, 10:54:39 AM
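
A rough way to summarize the Auto rows from the runs above is sketched below. Some runs land in a much slower mode (around 240 ms/op or more), so the sketch compares the best score of each configuration across its four runs. Using the minimum as the summary is an assumption made for this illustration, not something the benchmarks establish, and the class and method names are hypothetical. The scores are copied from the logs in this conversation.

```java
import java.util.Arrays;

public class AutoColumnSummary {
  // Scores in ms/op for the two Auto benchmarks, one entry per run.
  public static final double[] FLOATS_DEVELOP = {247.595, 75.944, 68.399, 243.994};
  public static final double[] FLOATS_ENSO = {79.058, 99.010, 100.144, 97.633};
  public static final double[] FLOATS_JAVA = {104.597, 142.805, 160.711, 174.826};
  public static final double[] INTS_DEVELOP = {69.273, 241.330, 67.762, 65.617};
  public static final double[] INTS_ENSO = {301.773, 90.345, 88.074, 251.654};
  public static final double[] INTS_JAVA = {488.053, 142.432, 146.983, 192.710};

  // Best (lowest) score across the runs of one configuration.
  public static double best(double[] runs) {
    return Arrays.stream(runs).min().getAsDouble();
  }

  // Ratio of the best candidate score to the best baseline score.
  public static double slowdown(double[] baseline, double[] candidate) {
    return best(candidate) / best(baseline);
  }

  public static void main(String[] args) {
    System.out.printf("Floats Auto, conversion in Enso: x%.2f%n",
        slowdown(FLOATS_DEVELOP, FLOATS_ENSO));
    System.out.printf("Floats Auto, conversion in Java: x%.2f%n",
        slowdown(FLOATS_DEVELOP, FLOATS_JAVA));
    System.out.printf("Integers Auto, conversion in Enso: x%.2f%n",
        slowdown(INTS_DEVELOP, INTS_ENSO));
    System.out.printf("Integers Auto, conversion in Java: x%.2f%n",
        slowdown(INTS_DEVELOP, INTS_JAVA));
  }
}
```

Because each configuration has only four runs and the error bars are wide, these ratios are indicative at best.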

@GregoryTravis GregoryTravis added the "CI: Ready to merge" and "CI: Clean build required" labels and removed the "CI: Ready to merge" label on Jun 4, 2024
@GregoryTravis GregoryTravis merged commit 5fad355 into develop Jun 4, 2024
39 checks passed
@GregoryTravis GregoryTravis deleted the wip/gmt/8355-bigdecimal-builder branch June 4, 2024 17:59