perf: Collect Parquet dictionary binary as view #17475

Merged
merged 1 commit into from
Jul 8, 2024

Commits on Jul 7, 2024

  1. perf: collect Parquet dictionary binary as view

    This optimizes how Parquet dictionary-encoded binary data is collected. Instead of
    pushing the items one at a time into a buffer, the dictionary itself is used as
    the buffer and views are made into it. This should not only speed up the
    Parquet decoder, but also reduce memory consumption and speed up
    subsequent operations.
    
    I ran a small benchmark, but the result should not be given much weight.
    
    ```
    Benchmark 1: After Optimization
      Time (mean ± σ):      2.007 s ±  0.005 s    [User: 1.712 s, System: 0.523 s]
      Range (min … max):    2.000 s …  2.013 s    10 runs
    
    Benchmark 2: Before Optimization
      Time (mean ± σ):      2.285 s ±  0.009 s    [User: 1.956 s, System: 0.595 s]
      Range (min … max):    2.274 s …  2.306 s    10 runs
    
    Summary
      After Optimization ran
        1.14 ± 0.01 times faster than Before Optimization
    ```
    coastalwhite committed Jul 7, 2024
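
To make the idea in the commit message concrete, here is a minimal sketch of the two collection strategies. It is not the Polars implementation: `View`, `encode_plain`, `parse_plain_dictionary`, `collect_copy`, `collect_views`, and `get` are hypothetical names, and the view here is a bare `(offset, len)` pair rather than the 16-byte view layout used by Arrow-style binary view arrays. The only format detail assumed is that a PLAIN-encoded Parquet byte-array dictionary page stores each value as a 4-byte little-endian length followed by the bytes.

```rust
/// Simplified, hypothetical view: where a value lives in the dictionary buffer.
#[derive(Clone, Copy, Debug, PartialEq)]
struct View {
    offset: u32,
    len: u32,
}

/// Build a PLAIN-style dictionary page: a 4-byte little-endian length
/// prefix followed by the raw bytes of each value.
fn encode_plain(values: &[&[u8]]) -> Vec<u8> {
    let mut page = Vec::new();
    for value in values.iter() {
        page.extend_from_slice(&(value.len() as u32).to_le_bytes());
        page.extend_from_slice(value);
    }
    page
}

/// Walk the dictionary page once and record one view per entry.
/// Trailing bytes shorter than a length prefix are ignored.
fn parse_plain_dictionary(page: &[u8]) -> Vec<View> {
    let mut views = Vec::new();
    let mut pos = 0usize;
    while pos + 4 <= page.len() {
        let len = u32::from_le_bytes([page[pos], page[pos + 1], page[pos + 2], page[pos + 3]]);
        pos += 4;
        assert!(pos + len as usize <= page.len(), "truncated dictionary value");
        views.push(View { offset: pos as u32, len });
        pos += len as usize;
    }
    views
}

/// "Before" strategy: every row copies its value's bytes into a fresh
/// offsets + values buffer, so repeated values are duplicated.
fn collect_copy(page: &[u8], dict: &[View], indices: &[u32]) -> (Vec<u32>, Vec<u8>) {
    let mut offsets = vec![0u32];
    let mut values = Vec::new();
    for &i in indices {
        values.extend_from_slice(get(page, dict[i as usize]));
        offsets.push(values.len() as u32);
    }
    (offsets, values)
}

/// "After" strategy: the dictionary page stays the only byte buffer and
/// each row copies just a small fixed-size view.
fn collect_views(dict: &[View], indices: &[u32]) -> Vec<View> {
    indices.iter().map(|&i| dict[i as usize]).collect()
}

/// Resolve a view back to its bytes in the shared dictionary buffer.
fn get(page: &[u8], view: View) -> &[u8] {
    let start = view.offset as usize;
    &page[start..start + view.len as usize]
}
```

With a dictionary of `"ab"` and `"xyz"` and row indices `[1, 0, 1, 1]`, `collect_copy` writes 11 value bytes, while `collect_views` writes four views and leaves the 13-byte dictionary page as the only byte buffer. In this sketch the cost per row no longer depends on the value length, which is the kind of saving the commit message describes.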