improving duckdb + autopandas performance #469

Merged
merged 15 commits into from
May 1, 2023


@edublancas edublancas commented Apr 28, 2023

Describe your changes

  • improved performance when using DuckDB with autopandas (via DuckDB's native .df())
  • added notebook with DuckDB benchmarks
  • moved ResultSet constructor to the top
  • refactored ResultSet, no longer a subclass of list (using composition instead of inheritance)
  • added missing ResultSet tests
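The two structural changes listed above (delegating DataFrame conversion to the driver's native `.df()` and making `ResultSet` wrap its data rather than subclass `list`) can be illustrated with a minimal sketch. This is not the code merged in this PR: `FakeNativeResult`, `to_frame`, and the lazy `_fetch` helper are hypothetical names chosen for illustration, and the stand-in `.df()` returns a plain dict of columns so the example runs without duckdb or pandas installed. With a real DuckDB result, `.df()` would return a pandas DataFrame.

```python
class FakeNativeResult:
    """Hypothetical stand-in for a driver result that can export itself in bulk.

    DuckDB results expose a .df() method that returns a pandas DataFrame;
    here .df() returns a dict of columns so the sketch has no dependencies.
    """

    def __init__(self, columns, rows):
        self.columns = columns
        self._rows = rows

    def fetchall(self):
        return list(self._rows)

    def df(self):
        # Column-oriented export produced directly by the "driver".
        return {
            name: [row[i] for row in self._rows]
            for i, name in enumerate(self.columns)
        }


class ResultSet:
    """Holds a result by composition instead of subclassing list."""

    def __init__(self, native_result):
        self._native = native_result
        self._rows = None  # rows are fetched lazily, only if row access is needed

    def _fetch(self):
        if self._rows is None:
            self._rows = self._native.fetchall()
        return self._rows

    # Only the sequence behaviour that callers rely on is exposed explicitly.
    def __len__(self):
        return len(self._fetch())

    def __iter__(self):
        return iter(self._fetch())

    def __getitem__(self, index):
        return self._fetch()[index]

    def to_frame(self):
        # Prefer the driver's bulk export when it exists ...
        native_df = getattr(self._native, "df", None)
        if callable(native_df):
            return native_df()
        # ... otherwise assemble columns from the fetched rows.
        rows = self._fetch()
        return {
            name: [row[i] for row in rows]
            for i, name in enumerate(self._native.columns)
        }


result = ResultSet(FakeNativeResult(["id", "name"], [(1, "a"), (2, "b")]))
frame = result.to_frame()
```

Because the wrapper does not inherit from `list`, it is free to skip materializing Python row tuples when the caller only wants a DataFrame, which is presumably where the speedup with autopandas comes from.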

Issue number

Closes #451

Checklist before requesting a review


📚 Documentation preview 📚: https://jupysql--469.org.readthedocs.build/en/469/

@edublancas edublancas mentioned this pull request Apr 28, 2023
@edublancas edublancas marked this pull request as ready for review April 28, 2023 18:20

@idomic idomic left a comment

Please fix

src/tests/test_resultset.py (resolved)
src/sql/connection.py (resolved)
benchmarks/duckdb.ipynb (resolved)
@edublancas edublancas requested a review from idomic April 28, 2023 20:03

@yafimvo yafimvo left a comment

@edublancas Nice improvement! ~6.5 sec vs ~260 ms

Should we see the benchmark in the docs?


idomic commented May 1, 2023

> Should we see the benchmark in the docs?

It's intentionally not part of the docs, as this is how it should operate regardless (let's say without JupySQL).

@idomic idomic merged commit ad625e3 into master May 1, 2023
@idomic idomic deleted the duckdb-df branch May 1, 2023 13:24
Successfully merging this pull request may close these issues.

improving performance when converting DuckDB's results to pandas
3 participants