fix: ensure parquet options for scanning are used from SessionState #21

Closed
wants to merge 3 commits into from

alexwilcoxson-rel
Collaborator

Description

When the ParquetExec plan is created in DeltaScanBuilder, the Parquet options on the ParquetExec are left at their default values. This PR instead clones the ParquetOptions from the SessionState held by the builder.

This allows you to create your SessionContext/SessionState with additional Parquet reader options enabled (row filter pushdown, page index, row group bloom filter pruning, etc.) and have the Delta scan honor them.
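
To make the behavior change concrete, here is a minimal, self-contained sketch of the idea. The types below are simplified stand-ins written for this illustration, not the actual DataFusion or delta-rs types, and the field names and defaults are assumptions chosen only to mirror the options mentioned above.

```rust
// Simplified stand-ins for illustration only; these are NOT the real
// DataFusion `ParquetOptions` / `SessionState` or delta-rs types.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct ParquetOptions {
    pub pushdown_filters: bool,     // row filter pushdown
    pub enable_page_index: bool,    // page index pruning
    pub bloom_filter_on_read: bool, // row group bloom filter pruning
}

// Hypothetical session state carrying the user's reader options.
#[derive(Clone, Debug, Default)]
pub struct SessionState {
    pub parquet: ParquetOptions,
}

// Hypothetical scan node that records the options it was built with.
#[derive(Debug)]
pub struct ParquetScan {
    pub options: ParquetOptions,
}

/// Behavior described as the problem: the session is ignored and the
/// scan is built with default options.
pub fn build_scan_with_defaults(_state: &SessionState) -> ParquetScan {
    ParquetScan {
        options: ParquetOptions::default(),
    }
}

/// Behavior described as the fix: the scan clones its options from the
/// session state, so user-enabled reader features carry through.
pub fn build_scan_from_session(state: &SessionState) -> ParquetScan {
    ParquetScan {
        options: state.parquet.clone(),
    }
}
```

In this model, a session created with `pushdown_filters: true` yields a scan with pushdown enabled only through `build_scan_from_session`; `build_scan_with_defaults` silently drops the setting, which is the gap this PR describes.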

Related Issue(s)

Documentation

HawaiianSpork and others added 3 commits June 20, 2024 22:35
By casting the read record batch to the Delta schema, DataFusion can read tables whose underlying Parquet files can be cast to the desired schema.
HawaiianSpork
HawaiianSpork previously approved these changes Jun 24, 2024
HawaiianSpork force-pushed the schema_adapter branch 2 times, most recently from 27b9639 to d48425e June 28, 2024 02:30
alexwilcoxson-rel changed the base branch from schema_adapter to main July 24, 2024 15:30
alexwilcoxson-rel dismissed HawaiianSpork’s stale review July 24, 2024 15:30

The base branch was changed.
