Allow Spark SQL as a dialect #968

Merged on Dec 24, 2023 (30 commits)
Changes from 28 commits
6da5cc5
add basic spark support to library
gilandose Dec 18, 2023
280b646
adding tests
gilandose Dec 18, 2023
145ae6a
formatting
gilandose Dec 18, 2023
7cd9e61
add spark connection
gilandose Dec 18, 2023
0f6a328
add spark connection
gilandose Dec 18, 2023
0b7e3bf
fixed test and formating
gilandose Dec 18, 2023
e60f0ea
added docs
gilandose Dec 19, 2023
c4acca1
exclude execution
gilandose Dec 19, 2023
96846d4
documentation updates
gilandose Dec 19, 2023
c0723d7
adjust doc string for close
gilandose Dec 19, 2023
3de5f61
add generic
gilandose Dec 19, 2023
0b968dc
integrated better with existing functionality
gilandose Dec 19, 2023
e6a25a7
finishing integration tests
gilandose Dec 20, 2023
a971600
pass config and alias correctly
gilandose Dec 20, 2023
a64b2a3
fixed issue with backticks and also implemented fake cursor
gilandose Dec 20, 2023
0a9b4de
change configuration name
gilandose Dec 20, 2023
189e4a0
fix env variable error integration tests CI
gilandose Dec 20, 2023
dc5e208
fixing lint errors
gilandose Dec 20, 2023
a4ef89b
change log formating
gilandose Dec 20, 2023
45e7b6c
metadata ipynb
gilandose Dec 20, 2023
96173de
addressing comments
gilandose Dec 21, 2023
df5925e
update changelog
gilandose Dec 21, 2023
27dca2b
changelog
gilandose Dec 21, 2023
998b7d8
Merge remote-tracking branch 'upstream/master'
gilandose Dec 21, 2023
743dee8
fix row count
gilandose Dec 21, 2023
ac20efd
spelling
gilandose Dec 21, 2023
7bf2b5b
spelling
gilandose Dec 21, 2023
36aaa10
remove pypark dev dependency
gilandose Dec 21, 2023
db024fb
review comments
gilandose Dec 23, 2023
cfb5431
missed readStream in connection.py
gilandose Dec 24, 2023
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,9 @@

## 0.10.7dev

* [Feature] Add Spark Connection as a dialect for JupySQL ([#965](https://github.com/ploomber/jupysql/issues/965))
(by [@gilandose](https://github.com/gilandose))

## 0.10.6 (2023-12-21)

* [Fix] Fix error when `%sql` includes a query with negative numbers ([#958](https://github.com/ploomber/jupysql/issues/958))
1 change: 1 addition & 0 deletions doc/_toc.yml
@@ -43,6 +43,7 @@ parts:
- file: integrations/duckdb-native
- file: integrations/compatibility
- file: integrations/chdb
- file: integrations/spark

- caption: API Reference
chapters:
16 changes: 16 additions & 0 deletions doc/api/configuration.md
@@ -234,6 +234,22 @@ value enables the ones from previous values plus new ones:
- `2`: All feedback
- Footer to distinguish pandas/polars data frames from JupySQL's result sets

## `lazy_execution`

Default: `False`

Return a lazy relation to the dataset rather than executing the query through JupySQL.

```{code-cell} ipython3
%config SqlMagic.lazy_execution = True
df = %sql SELECT * FROM languages
```

```{code-cell} ipython3
%config SqlMagic.lazy_execution = False
res = %sql SELECT * FROM languages
```
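
The difference between the two settings above can be pictured with a small stand-alone sketch. It does not use JupySQL or Spark: `LazyRelation` and `run` are hypothetical names invented for this illustration, and SQLite stands in for the backend. The assumption being illustrated is that a lazy result defers the query until the caller asks for rows, while the default path fetches them immediately.

```python
import sqlite3


class LazyRelation:
    """Hypothetical stand-in for a lazy relation: it stores the query
    and only runs it when collect() is called."""

    def __init__(self, conn, query):
        self.conn = conn
        self.query = query
        self.executed = False

    def collect(self):
        # The query reaches the database only at this point.
        self.executed = True
        return self.conn.execute(self.query).fetchall()


def run(conn, query, lazy_execution=False):
    """Return a deferred relation when lazy_execution is True,
    otherwise execute right away and return the rows."""
    relation = LazyRelation(conn, query)
    return relation if lazy_execution else relation.collect()


# Small in-memory table so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE languages (name TEXT, rating INTEGER)")
conn.executemany(
    "INSERT INTO languages VALUES (?, ?)",
    [("Python", 10), ("Scala", 8)],
)

query = "SELECT name, rating FROM languages ORDER BY name"
lazy = run(conn, query, lazy_execution=True)    # nothing executed yet
eager = run(conn, query, lazy_execution=False)  # rows fetched immediately
```

Here `lazy` stays unevaluated until `lazy.collect()` is called, whereas `eager` already holds the rows.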

## `named_parameters`

```{versionadded} 0.9
1 change: 1 addition & 0 deletions doc/conf.py
@@ -27,6 +27,7 @@
"integrations/oracle.ipynb",
"integrations/snowflake.ipynb",
"integrations/redshift.ipynb",
"integrations/spark.ipynb",
]
nb_execution_in_temp = True
nb_execution_show_tb = True
18 changes: 17 additions & 1 deletion doc/integrations/compatibility.md
@@ -114,4 +114,20 @@ These table reflects the compatibility status of JupySQL `>=0.7`
- Listing tables with `%sqlcmd tables` ✅
- Listing columns with `%sqlcmd columns` ✅
- Parametrized SQL queries via `{{parameter}}` ✅
- Interactive SQL queries via `--interact` ✅

## Spark

- Running queries with `%%sql` ✅
- CTEs with `%%sql --save NAME` ✅
- Plotting with `%%sqlplot boxplot` ❓
- Plotting with `%%sqlplot bar` ✅
- Plotting with `%%sqlplot pie` ✅
- Plotting with `%%sqlplot histogram` ✅
- Plotting with `ggplot` ✅
- Profiling tables with `%sqlcmd profile` ✅
- Listing tables with `%sqlcmd tables` ❌
- Listing columns with `%sqlcmd columns` ❌
- Parametrized SQL queries via `{{parameter}}` ✅
- Interactive SQL queries via `--interact` ✅
- Persisting Dataframes via `--persist` ✅