[BUG] Avoid reconstructing sql query in read_sql #2818
Merged
When Daft executes a read_sql scan task, it calls the read_sql function in table_io.py. This function then calls the .read method on the SQLConnection object. However, the .read method reconstructs the SQL query and wraps it in another layer of subqueries.

This happens because the .read method constructs a SQL query from additional predicates, projections, and limits, then executes it. The scan task is already given a constructed query with pushdowns applied, so this reconstruction is unnecessary.

This PR removes the .read method and instead exposes the execute_sql_query method. Having .read do construction and execution together is confusing.
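
The extra subquery layer can be illustrated with a minimal sketch. This is not Daft's actual code: the helper bodies below, the `sqlite3` backend, the table `t`, and the subquery alias are all assumptions chosen for illustration. Only the name `execute_sql_query` comes from the PR description. The sketch builds a query with pushdowns once (as planning would), then compares re-wrapping it before execution (the old `.read`-style path) with executing it directly.

```python
import sqlite3


def construct_sql_query(sql, projection=None, predicate=None, limit=None):
    """Hypothetical helper: wrap `sql` in a subquery and apply pushdowns."""
    columns = ", ".join(projection) if projection else "*"
    query = f"SELECT {columns} FROM ({sql}) AS subquery"
    if predicate:
        query += f" WHERE {predicate}"
    if limit is not None:
        query += f" LIMIT {limit}"
    return query


def execute_sql_query(conn, sql):
    """Hypothetical stand-in: run the query exactly as given."""
    return conn.execute(sql).fetchall()


# Small in-memory table so the sketch is runnable on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])

# Query built once with pushdowns, as the scan task would receive it.
planned = construct_sql_query(
    "SELECT * FROM t", projection=["id"], predicate="id > 2", limit=1
)

# Old path: construct again before executing, adding a redundant subquery.
rewrapped = construct_sql_query(planned)
rows_old = execute_sql_query(conn, rewrapped)

# New path: execute the already-constructed query directly.
rows_new = execute_sql_query(conn, planned)

print(planned)
print(rewrapped)
```

Both paths return the same rows here, but the re-wrapped query carries one more level of nesting than the planned one, which is the redundancy the PR removes.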