[C++][Parquet] Read string columns directly into STRING_VIEW arrays and cast to LARGE_STRING if necessary #43068

Open
1 of 2 tasks
felipecrv opened this issue Jun 26, 2024 · 1 comment

felipecrv commented Jun 26, 2024

Describe the enhancement requested

This would fix two issues for the price of one:

  1. Reading from Parquet into schemas that use the new STRING_VIEW type
  2. Reading LARGE_STRING_ARRAY from Parquet ([C++] Parquet reader is unable to read LargeString columns #39682)
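To make the "read as views, then cast" idea concrete, here is a minimal, self-contained sketch of the conversion step. It does not use the Arrow or Parquet C++ APIs: `View`, `MakeView`, `GetView`, `LargeString` and `ToLargeString` are hypothetical names, and the struct is only a simplified model of the 16-byte view layout (values of up to 12 bytes stored inline, longer values referenced by prefix, buffer index and offset). The point it illustrates is that the target of the cast uses 64-bit offsets over one contiguous data buffer.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <string_view>
#include <vector>

// Simplified stand-in for a 16-byte string view (hypothetical, not Arrow's type).
// length <= 12: payload holds the bytes inline.
// length  > 12: payload holds a 4-byte prefix, an int32 buffer index, an int32 offset.
struct View {
  int32_t length;
  char payload[12];
};
static_assert(sizeof(View) == 16, "views are 16 bytes wide");

constexpr int32_t kInlineSize = 12;

// Build a view for `s`, appending long values to buffers[0].
View MakeView(std::string_view s, std::vector<std::string>& buffers) {
  View v{};
  v.length = static_cast<int32_t>(s.size());
  if (v.length <= kInlineSize) {
    if (!s.empty()) std::memcpy(v.payload, s.data(), s.size());
    return v;
  }
  if (buffers.empty()) buffers.emplace_back();
  const int32_t buffer_index = 0;
  const int32_t offset = static_cast<int32_t>(buffers[0].size());
  buffers[0].append(s);
  std::memcpy(v.payload, s.data(), 4);           // prefix
  std::memcpy(v.payload + 4, &buffer_index, 4);  // which data buffer
  std::memcpy(v.payload + 8, &offset, 4);        // where in that buffer
  return v;
}

// Resolve a view to the bytes it refers to (no copy).
std::string_view GetView(const View& v, const std::vector<std::string>& buffers) {
  if (v.length <= kInlineSize) {
    return std::string_view(v.payload, static_cast<std::size_t>(v.length));
  }
  int32_t buffer_index = 0;
  int32_t offset = 0;
  std::memcpy(&buffer_index, v.payload + 4, 4);
  std::memcpy(&offset, v.payload + 8, 4);
  return std::string_view(buffers[static_cast<std::size_t>(buffer_index)])
      .substr(static_cast<std::size_t>(offset), static_cast<std::size_t>(v.length));
}

// Simplified stand-in for a large-string array: 64-bit offsets + contiguous data.
struct LargeString {
  std::vector<int64_t> offsets;
  std::string data;
};

// The "cast": gather every view's bytes into one buffer and record 64-bit offsets.
LargeString ToLargeString(const std::vector<View>& views,
                          const std::vector<std::string>& buffers) {
  LargeString out;
  out.offsets.reserve(views.size() + 1);
  out.offsets.push_back(0);
  for (const View& v : views) {
    out.data.append(GetView(v, buffers));
    out.offsets.push_back(static_cast<int64_t>(out.data.size()));
  }
  return out;
}
```

In this sketch the cast is the only step that copies string bytes; a reader that targets the view type directly would skip it entirely.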

This issue also depends on:

Component(s)

C++, Parquet

mapleFU commented Jun 26, 2024

Related: apache/arrow-rs#5530

"Zero-copy" reads could also be applied here for non-Delta string encodings.
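A rough sketch of what zero-copy could look like for the PLAIN encoding, where each BYTE_ARRAY value is stored as a 4-byte little-endian length followed by its bytes. The function names `AppendPlainByteArray` and `DecodePlainByteArrays` are hypothetical and this is not the Parquet C++ decoder; it only shows that the decoded values can be views pointing into the page buffer, which therefore has to stay alive as long as the views do. Delta encodings that reconstruct values from shared prefixes would not allow this directly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <string_view>
#include <vector>

// Append one value in PLAIN BYTE_ARRAY form: 4-byte little-endian length, then the bytes.
// (Helper used only to build example input.)
void AppendPlainByteArray(std::string& page, std::string_view value) {
  const uint32_t len = static_cast<uint32_t>(value.size());
  for (int shift = 0; shift < 32; shift += 8) {
    page.push_back(static_cast<char>((len >> shift) & 0xFF));
  }
  page.append(value);
}

// Decode `num_values` PLAIN BYTE_ARRAY values as views into `page`.
// No string bytes are copied; the caller must keep `page` alive.
std::vector<std::string_view> DecodePlainByteArrays(std::string_view page,
                                                    std::size_t num_values) {
  std::vector<std::string_view> values;
  values.reserve(num_values);
  std::size_t pos = 0;
  for (std::size_t i = 0; i < num_values; ++i) {
    if (page.size() - pos < 4) throw std::runtime_error("truncated length prefix");
    uint32_t len = 0;
    for (int b = 0; b < 4; ++b) {
      len |= static_cast<uint32_t>(static_cast<unsigned char>(page[pos + b])) << (8 * b);
    }
    pos += 4;
    if (page.size() - pos < len) throw std::runtime_error("truncated value");
    values.push_back(page.substr(pos, len));  // view into the page, not a copy
    pos += len;
  }
  return values;
}
```

Each returned view's `data()` pointer lies inside the original page buffer, which is the property a STRING_VIEW-producing reader could exploit.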

felipecrv changed the title from "[C++][Parquet] Read string columns directly into STRING_VIEW arrays and cast to LARGE_STRING_VIEW if necessary" to "[C++][Parquet] Read string columns directly into STRING_VIEW arrays and cast to LARGE_STRING if necessary" on Jul 21, 2024