[FEAT][Table-Read-Schema 1/3] Split reading tabular file formats into 2 method calls #1010

Merged: jaychia merged 11 commits into main from jay/read-with-schemas-4 on Jun 8, 2023

jaychia (Contributor) commented Jun 6, 2023

  • Splits Table reading of JSON/CSV/Parquet files into 2 separate calls:
      • One call infers the schema
      • One call reads the file and enforces a schema

This reduces code complexity and gives us better control over read behavior.
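
To make the split concrete, here is a minimal sketch of the two-call pattern for CSV, using only the Python standard library. The names `infer_schema` and `read_with_schema` are hypothetical and chosen for illustration; they are not the functions this PR adds to `daft/table/schema_inference.py` or `daft/table/table_io.py`, and the int/float/str inference rule is an assumption rather than Daft's actual behavior.

```python
import csv
import io

# Hypothetical helpers for illustration only; these are NOT Daft's actual functions.


def _infer_type(values):
    """Pick the narrowest type (int, then float, then str) that parses every value."""
    for candidate in (int, float):
        try:
            for v in values:
                candidate(v)
            return candidate
        except ValueError:
            continue
    return str


def infer_schema(csv_text, sample_rows=100):
    """Call 1: inspect the header and a sample of rows, return {column: type}."""
    reader = csv.DictReader(io.StringIO(csv_text))
    sample = [row for _, row in zip(range(sample_rows), reader)]
    return {
        name: _infer_type([row[name] for row in sample])
        for name in (reader.fieldnames or [])
    }


def read_with_schema(csv_text, schema):
    """Call 2: read every row and cast each column to the type the schema names."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = set(schema) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"columns missing from file: {sorted(missing)}")
    # Columns absent from the schema are dropped; a failed cast raises ValueError.
    return [{name: cast(row[name]) for name, cast in schema.items()} for row in reader]


data = "id,score,name\n1,2.5,a\n2,3,b\n"
schema = infer_schema(data)            # schema is decided once, up front
rows = read_with_schema(data, schema)  # the read only enforces, it never guesses
```

Keeping inference out of the read path means a caller can also pass in a schema obtained elsewhere (for example, from a different file or from the user), and the read step then fails loudly if the data does not fit it.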

codecov bot commented Jun 6, 2023

Codecov Report

Merging #1010 (cfa288a) into main (9128b06) will decrease coverage by 0.46%.
The diff coverage is 96.49%.

@@            Coverage Diff             @@
##             main    #1010      +/-   ##
==========================================
- Coverage   84.87%   84.42%   -0.46%     
==========================================
  Files         184      187       +3     
  Lines       15922    16415     +493     
==========================================
+ Hits        13514    13858     +344     
- Misses       2408     2557     +149     

| Impacted Files | Coverage Δ |
|---|---|
| daft/table/table_io.py | 94.50% <92.85%> (-3.30%) ⬇️ |
| daft/table/schema_inference.py | 97.05% <97.05%> (ø) |
| daft/execution/logical_op_runners.py | 97.22% <100.00%> (-0.08%) ⬇️ |
| daft/runners/partitioning.py | 80.90% <100.00%> (-3.02%) ⬇️ |
| daft/runners/runner_io.py | 91.66% <100.00%> (-0.23%) ⬇️ |

... and 35 files with indirect coverage changes

jaychia changed the title from "Split reading tabular file formats into 2 method calls" to "[Table-Read-Schema 1/3] Split reading tabular file formats into 2 method calls" on Jun 7, 2023
jaychia force-pushed the jay/read-with-schemas-4 branch from a348a99 to f4a7835 on June 7, 2023 12:32
jaychia merged commit 4e8b1ad into main on Jun 8, 2023
jaychia deleted the jay/read-with-schemas-4 branch on June 8, 2023 12:49
jaychia added the "enhancement" (New feature or request) label on Jun 14, 2023
jaychia changed the title from "[Table-Read-Schema 1/3] Split reading tabular file formats into 2 method calls" to "[FEAT][Table-Read-Schema 1/3] Split reading tabular file formats into 2 method calls" on Jun 14, 2023