Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor tuple forming function for bulk decompression of text columns #6448

Merged
merged 1 commit into from
Dec 21, 2023

Conversation

akuzm
Copy link
Member

@akuzm akuzm commented Dec 20, 2023

This commit makes some mechanical changes to the tuple forming function make_next_tuple(), to prepare it for the subsequent introduction of bulk decompression for text columns. It also simplifies the layout of data for the compressed columns to contain less indirections. This is important because tuple forming is a hot function that we try to keep simple.

Disable-check: force-changelog-file

This commit makes some mechanical changes to the tuple forming function
make_next_tuple(), to prepare it for the subsequent introduction of bulk
decompression for text columns. It also simplifies the layout of data
for the compressed columns to contain less indirections. This is
important because tuple forming is a hot function that we try to
keep simple.
Copy link

@konskov, @gayyappan: please review this pull request.

Powered by pull-review

Comment on lines +13 to +23
/* How to obtain the decompressed datum for individual row. */
typedef enum
{
/* For row-by-row decompression. */
DecompressionIterator *iterator;

DT_Default = -2,
DT_Iterator = -1,
DT_Invalid = 0,
/*
* For bulk decompression and vectorized filters, mutually exclusive
* with the above.
* Any positive number is also valid for the decompression type. It means
* arrow array of a fixed-size by-value type, with size given by the number.
*/
ArrowArray *arrow;
} DecompressionType;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will later have two new options for arrow arrays for text columns, with dictionary encoding and without.

Comment on lines +173 to +174
/* No variable-width columns support bulk decompression. */
Assert(false);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here will be the implementation for text arrow arrays.

@akuzm akuzm mentioned this pull request Dec 20, 2023
7 tasks
Copy link

codecov bot commented Dec 20, 2023

Codecov Report

Attention: 5 lines in your changes are missing coverage. Please review.

Comparison is base (4f2f658) 87.33% compared to head (ec41710) 87.31%.

Files Patch % Lines
tsl/src/nodes/decompress_chunk/compressed_batch.c 92.75% 3 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6448      +/-   ##
==========================================
- Coverage   87.33%   87.31%   -0.03%     
==========================================
  Files         187      187              
  Lines       41820    41775      -45     
  Branches     9313     9289      -24     
==========================================
- Hits        36525    36474      -51     
- Misses       3623     3627       +4     
- Partials     1672     1674       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@akuzm akuzm merged commit ed59c05 into timescale:main Dec 21, 2023
47 of 48 checks passed
@akuzm akuzm deleted the tuple-forming branch December 21, 2023 10:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants