Adding support for SQL as execution engine #306

Merged
merged 226 commits into from
Apr 11, 2021
Changes from 215 commits
Commits
308d99c
Merging Recent SQL Executor changes
19thyneb Oct 15, 2020
daa9a0d
Fix to Validator
19thyneb Oct 16, 2020
804b0dc
Fix Bug with Widget Rendering
19thyneb Oct 17, 2020
676f3e0
Added Number of Observations to MetaData, Fixed Interestingness issue…
19thyneb Oct 19, 2020
8763df9
Re-added Licensing Headers
19thyneb Oct 19, 2020
c2b0b46
Adding Recent frame.py changes
19thyneb Oct 19, 2020
1b08461
Adjusted SQL Executor Tests
19thyneb Oct 19, 2020
38c5e7e
Update Frame with new Action Registering
19thyneb Oct 22, 2020
14d2f90
Resolving Conflicts in frame.py
19thyneb Oct 22, 2020
78d8e10
Merge branch 'sql-engine' into Database-Executor
thyneb19 Oct 22, 2020
d783b4c
Commenting out local SQL Executor tests
19thyneb Oct 22, 2020
c03e001
Merge branch 'Database-Executor' of https://github.com/thyneb19/lux i…
19thyneb Oct 22, 2020
8f0e643
Update correlation.py
dorisjlee Oct 22, 2020
d365d52
Update frame.py
dorisjlee Oct 22, 2020
7da2992
Fixing Code Format
19thyneb Oct 22, 2020
f1b7c8b
Cleaning up Pandas Executor imports
19thyneb Oct 22, 2020
d97f0e4
Fix Validation Bug
19thyneb Oct 22, 2020
582b370
Changed metadata variable name
19thyneb Oct 23, 2020
bc87eb8
Moving Current SQL Executor changes to new branch (#119)
thyneb19 Oct 23, 2020
554c71f
Merge remote-tracking branch 'upstream/sql-engine' into Database-Exec…
19thyneb Oct 23, 2020
d65687a
Added script to generate Postgresql database
19thyneb Oct 25, 2020
7243b2f
Update upload_car_data.py
19thyneb Oct 25, 2020
2add76f
Updated script name in travis.yml
19thyneb Oct 25, 2020
cf74beb
Removed unnecessary import from travis.yml
19thyneb Oct 25, 2020
14d52b8
Added psycopg2 to requirements.txt
19thyneb Oct 25, 2020
379517d
Creating Postgres test database in travis
19thyneb Oct 25, 2020
a72f236
Fixed directory issue
19thyneb Oct 25, 2020
4a9db88
Added test environment for Postgresql Executor (#124)
thyneb19 Oct 26, 2020
e947fd6
Updated SQL Executor Tests
19thyneb Oct 26, 2020
1234009
Added sql_executor example notebook, minor bug fix
19thyneb Oct 31, 2020
a5eff98
Cleaned SQL Executor Example Notebook
19thyneb Nov 2, 2020
5e18fd8
Update custom action reference to executor
19thyneb Nov 2, 2020
6890408
Added example notebook, fixed variable reference (#130)
thyneb19 Nov 3, 2020
8029ac1
Updated Tests, Added benchmarking for SQL Executor
19thyneb Nov 6, 2020
bca1f0c
Merge remote-tracking branch 'upstream/master' into Database-Executor
19thyneb Nov 7, 2020
4046579
Merge with upstream branch, added preliminary benchmarking code
19thyneb Nov 7, 2020
907f215
Added 2D Binning functionality to SQL Executor
19thyneb Nov 15, 2020
2b24129
Merge remote-tracking branch 'upstream/master' into Database-Executor
19thyneb Nov 19, 2020
21ba332
Updated 2D Binning Functionality
19thyneb Nov 19, 2020
b7ac30c
Added Heatmap generation to SQL Executor, Bug fix in PandasExecutor
19thyneb Nov 22, 2020
ad32db7
Updated Code Formatting with Black
19thyneb Nov 22, 2020
8948538
Merge branch 'sql-engine' into Database-Executor
thyneb19 Nov 22, 2020
6f032f2
Update Requirements to include psycopg2
19thyneb Nov 22, 2020
06dbf14
Merge branch 'Database-Executor' of https://github.com/thyneb19/lux i…
19thyneb Nov 22, 2020
aff131f
Update upload_car_data.py
19thyneb Nov 22, 2020
0ca0d98
Update Compiler tests to use correct test DB
19thyneb Nov 22, 2020
776489f
Removed Benchmarking Code
19thyneb Nov 22, 2020
7d00742
Fixing Black Formatting
19thyneb Nov 22, 2020
bb77423
Updating SQL-Engine branch to main branch, Adding Heatmap Functional…
thyneb19 Nov 25, 2020
7101df1
Moved Executor Parameters to Global Config
19thyneb Nov 25, 2020
4e3ecb7
Black formatting
19thyneb Nov 25, 2020
a48cdc7
Moved table_name parameter to frame.py. Removed executor_type parameter
19thyneb Nov 27, 2020
bccd61a
Fixed reference to table_name parameter
19thyneb Nov 28, 2020
5794718
Adjusted Functions to Set SQL Connection
19thyneb Nov 29, 2020
fb61c27
Merge branch 'master' into Database-Executor
19thyneb Nov 29, 2020
727315b
Update SQLExecutor name parameter
19thyneb Nov 29, 2020
52e6be4
Merging master branch with sql engine. Moving executor parameters to …
19thyneb Nov 29, 2020
2cc5b10
Merge branch 'master' into Database-Executor
19thyneb Nov 29, 2020
10e9ea1
Merge remote-tracking branch 'upstream/master' into Database-Executor
19thyneb Nov 29, 2020
b3e1e88
Parameter Bug Fix
19thyneb Nov 29, 2020
d774a41
Reference Fix in Warning
19thyneb Nov 29, 2020
9747848
Black Formatting
19thyneb Nov 29, 2020
fbafb96
Merge branch 'sql-engine' into Database-Executor
19thyneb Nov 29, 2020
ecfb434
Black formatting
19thyneb Nov 29, 2020
47ece2b
Merge remote-tracking branch 'upstream/master'
19thyneb Nov 30, 2020
3e68db3
Fix Executor Reference
19thyneb Nov 30, 2020
3a5473e
Update frame.py
dorisjlee Dec 2, 2020
68d1600
Moved set functions to global config
19thyneb Dec 2, 2020
b6a5e99
Cleaned up executor imports, Fixed issue in AltairRenderer
19thyneb Dec 3, 2020
9a1748c
Black formatting
19thyneb Dec 3, 2020
77a390d
Merged changes from Master branch, Moved Executor Parameters to Confi…
thyneb19 Dec 9, 2020
cb22f5c
Merge remote-tracking branch 'upstream/master'
19thyneb Dec 20, 2020
5223a11
Fixed Index Issue in Pandas Executor
19thyneb Dec 20, 2020
ad89594
Added tests for set_index functions
19thyneb Dec 21, 2020
4718c05
Black formatting
19thyneb Dec 21, 2020
2a7d4b2
Merge remote-tracking branch 'upstream/master'
19thyneb Dec 22, 2020
269c062
Update Pandas Executor to handle NA values
19thyneb Dec 22, 2020
79b381a
Merge branch 'master' into Database-Executor
19thyneb Dec 27, 2020
fd93cd4
Update to Config, and Compiler/Interestingness Tests
19thyneb Dec 27, 2020
7324e29
Merge branch 'sql-engine' into Database-Executor
19thyneb Dec 27, 2020
674d828
Black formatting
19thyneb Dec 27, 2020
2931442
Update Requirements.txt
19thyneb Dec 27, 2020
2fa2dca
Update to Sql-Engine (#190)
thyneb19 Dec 28, 2020
7cc9626
Update SQL Executor Documentation
19thyneb Jan 4, 2021
1d517d8
Updated SQLExecutor Example Notebook
19thyneb Jan 4, 2021
f5a358c
Merge branch 'sql-engine' into Database-Executor
19thyneb Jan 4, 2021
3a27d76
Black Formatting
19thyneb Jan 4, 2021
d187b39
Update to SQL Executor Example Notebook (#193)
thyneb19 Jan 4, 2021
1229809
Update to SQL Executor Tests
19thyneb Jan 4, 2021
c405f67
Update Travis file and SQL Executor Tests
19thyneb Jan 5, 2021
7eb665e
Merge branch 'sql-engine' into Database-Executor
thyneb19 Jan 5, 2021
86a35fa
Update .travis.yml
19thyneb Jan 5, 2021
d539b83
Merge branch 'Database-Executor' of https://github.com/thyneb19/lux i…
19thyneb Jan 5, 2021
d3819b7
Merge pull request #196 from thyneb19/Database-Executor
thyneb19 Jan 5, 2021
06a8b9c
Merging changes
dj-khandelwal Jan 7, 2021
50a36e8
fixed merge conflict issues. vis.data shows None DF.
dj-khandelwal Jan 8, 2021
7479129
Merging changes
dj-khandelwal Jan 8, 2021
3966521
Merge master into sql-engine + minor mergeconflict fixes
dj-khandelwal Jan 8, 2021
6d1ad6c
Removing the PYNB
dj-khandelwal Jan 8, 2021
085f036
Cleaning up obsolete code
dj-khandelwal Jan 8, 2021
289f670
Merging the master branch changes into sql-engine (#208)
dj-khandelwal Jan 9, 2021
812a27e
Merging changes after the travis yml script fix
dj-khandelwal Jan 10, 2021
0ed2f0e
Merge conflicts resolved
dj-khandelwal Jan 10, 2021
af0e742
Updating sql-engine after merge with the travis build fix (#213)
dj-khandelwal Jan 10, 2021
96a3742
Merged in SQL-Engine changes, Cleaned up method to connect Lux to SQL…
19thyneb Jan 10, 2021
522c616
Merge pull request #214 from thyneb19/Database-Executor
thyneb19 Jan 10, 2021
973a93e
Fixed SQLExecutor's Variable Handling
19thyneb Jan 12, 2021
97d1281
Black Formatting
19thyneb Jan 19, 2021
395dfd6
Merge pull request #236 from thyneb19/Database-Executor
thyneb19 Jan 19, 2021
04e449f
Merge remote-tracking branch 'upstream/master' into Database-Executor
19thyneb Jan 19, 2021
0defa44
Updated data_type reference in SQLExecutor
19thyneb Jan 19, 2021
02906f9
Update Datetime Numeric Check
19thyneb Jan 21, 2021
56ff766
Merge pull request #237 from thyneb19/Database-Executor
thyneb19 Jan 21, 2021
9a96c55
Update travis file to generate Postgres Test DB
19thyneb Jan 21, 2021
daf9c83
Merge remote-tracking branch 'upstream/master' into Database-Executor
19thyneb Jan 21, 2021
5d9a999
Update test_vis.py
19thyneb Jan 21, 2021
105942a
Merge pull request #238 from thyneb19/Database-Executor
thyneb19 Jan 21, 2021
fabdf2e
Merging from lux/sql-engine
dj-khandelwal Jan 23, 2021
15f290e
Improved SQLExecutor Warning Handling, Bugfix with 2D Binning
19thyneb Feb 8, 2021
cfdbf16
Adjustment to AltairRenderer
19thyneb Feb 10, 2021
660655c
Merge remote-tracking branch 'upstream/master' into Database-Executor
19thyneb Feb 10, 2021
04b43dc
Merge branch 'sql-engine' into Database-Executor
19thyneb Feb 10, 2021
1b8b99d
Black Formatting
19thyneb Feb 10, 2021
4fd9a00
Merge pull request #261 from thyneb19/Database-Executor
thyneb19 Feb 10, 2021
ca8065a
Merge branch 'sql-engine' of https://github.com/lux-org/lux into sql-…
dj-khandelwal Feb 11, 2021
8327fda
Added Better Null Value Handling to SQLExecutor
19thyneb Feb 15, 2021
a7e2220
SQLExecutor execute_binning Fixes
19thyneb Feb 18, 2021
4394ad6
Update SQLExecutor Tests
19thyneb Feb 18, 2021
d0ed14a
Black Formatting
19thyneb Feb 18, 2021
3e649c2
Merge branch 'sql-engine' into Database-Executor
19thyneb Feb 18, 2021
b985393
Merge pull request #267 from thyneb19/Database-Executor
thyneb19 Feb 18, 2021
6134ce5
Merge branch 'sql-engine' of https://github.com/lux-org/lux into sql-…
dj-khandelwal Feb 18, 2021
43e0e63
Removing Test Print Statement
19thyneb Feb 18, 2021
b3ee36a
Optimizing SQLExecutor 2D Binning
19thyneb Feb 20, 2021
de1e0a1
Black Formatting
19thyneb Feb 20, 2021
4572dd2
Merge branch 'sql-engine' into Database-Executor
19thyneb Feb 21, 2021
5b0cd2b
Merge pull request #283 from thyneb19/Database-Executor
thyneb19 Feb 21, 2021
c1ea16d
Merge branch 'sql-engine' of https://github.com/lux-org/lux into sql-…
dj-khandelwal Feb 22, 2021
c8dfa12
Added Null Value Filtering to SQLExecutor
19thyneb Feb 23, 2021
cfa67b7
Merge remote-tracking branch 'upstream/master' into Database-Executor
19thyneb Feb 23, 2021
180556c
Fixed Handling of String Filter Values
19thyneb Feb 25, 2021
45d6c07
Added Better Handling for Lazy Execution
19thyneb Feb 26, 2021
46f441c
Merge pull request #288 from thyneb19/Database-Executor
thyneb19 Feb 26, 2021
06e0919
Merge branch 'sql-engine' of https://github.com/lux-org/lux into sql-…
dj-khandelwal Feb 26, 2021
4fa1534
Updated Heatmap Threshold
19thyneb Feb 26, 2021
18eb80d
Updated SQLExecutor to Not Include Null Values in Metadata
19thyneb Feb 27, 2021
4c19e4d
Black formatting
19thyneb Feb 27, 2021
84d824e
Merge remote-tracking branch 'upstream/master' into Database-Executor
19thyneb Mar 2, 2021
d4ee7da
Merge pull request #292 from thyneb19/Database-Executor
thyneb19 Mar 2, 2021
7027d2c
Merge remote-tracking branch 'upstream/master' into Database-Executor
19thyneb Mar 6, 2021
650355f
Fixed Issue with SQLExecutor and Custom Actions
19thyneb Mar 6, 2021
a1dea61
Merge pull request #300 from thyneb19/Database-Executor
thyneb19 Mar 10, 2021
2dab7ed
Created LuxSQLTable Object
thyneb19 Mar 12, 2021
e36c8a1
Merge pull request #303 from thyneb19/Database-Executor
thyneb19 Mar 12, 2021
04591af
Merge remote-tracking branch 'upstream/master' into Database-Executor
thyneb19 Mar 13, 2021
fa917fb
Added query parameter to Vis objects
thyneb19 Mar 13, 2021
2876de9
Some minor datatype detection changes to SQLExecutor
dj-khandelwal Mar 13, 2021
df3ee00
Fresh pull merge
dj-khandelwal Mar 13, 2021
c3de6c6
Revert "Added query parameter to Vis objects"
thyneb19 Mar 14, 2021
ce2017b
Merge pull request #305 from thyneb19/Database-Executor
thyneb19 Mar 14, 2021
4019070
Update python-app.yml to set up Postgres in test instance
thyneb19 Mar 14, 2021
bca4e5e
Removed Example Notebooks for SQLExecutor
thyneb19 Mar 14, 2021
d19d2c2
Update to Script Uploading Car dataset to Postgres
thyneb19 Mar 14, 2021
a7134b0
Update python-app.yml
thyneb19 Mar 14, 2021
af0fe9d
Merge pull request #307 from thyneb19/Database-Executor
thyneb19 Mar 14, 2021
626cc5f
Update Lux SQLTable Frontend
thyneb19 Mar 16, 2021
0d1c557
Delete sql_benchmarking.csv
thyneb19 Mar 16, 2021
3a0fbd2
Update CONTRIBUTING.md
dorisjlee Mar 16, 2021
87fe7d7
Update CONTRIBUTING.md
dorisjlee Mar 16, 2021
916f73a
cleaning up PR
dorisjlee Mar 16, 2021
4ad6ec4
fix flights data upload
dorisjlee Mar 16, 2021
6fe8433
Some changes for length calculation
dj-khandelwal Mar 17, 2021
2fcfb8f
Merging changes after the update for the setup script for GitHub test…
dj-khandelwal Mar 17, 2021
b1b56d7
Test Commit of SQLTable
sophiahhuang Mar 17, 2021
be2d1a2
Cleaned Up Test Suite
thyneb19 Mar 17, 2021
e8f6106
Some changes to datatype, test SQL Executor, count queries to count(1…
dj-khandelwal Mar 18, 2021
e2e1482
removing travis.tml and date_utils import from sqlexecutor
dj-khandelwal Mar 18, 2021
a6bd2fe
Merge pull request #311 from dj-khandelwal/sql-engine
thyneb19 Mar 18, 2021
799f47d
Merge branch 'sql-engine' into Database-Executor
thyneb19 Mar 18, 2021
8f1e21f
Clean up LuxSQLTable
thyneb19 Mar 18, 2021
0e760c2
Merge remote-tracking branch 'upstream/master' into Database-Executor
thyneb19 Mar 18, 2021
7f5c1b8
Update test_vis.py
thyneb19 Mar 18, 2021
247387d
Merge pull request #312 from thyneb19/Database-Executor
thyneb19 Mar 18, 2021
460e6e2
Black Formatting
thyneb19 Mar 18, 2021
c8a9471
Merge pull request #313 from thyneb19/Database-Executor
thyneb19 Mar 18, 2021
34b5c45
Updated LuxSQLTable notification
sophiahhuang Mar 19, 2021
39d8f41
Merge branch 'sql-engine' into sql-engine
thyneb19 Mar 19, 2021
d68b16d
Remove Out of Date LuxSQLTableNotice
sophiahhuang Mar 19, 2021
f190a55
Black Reformatting
thyneb19 Mar 19, 2021
fcad97a
Merge pull request #314 from sophiahhuang/sql-engine
thyneb19 Mar 19, 2021
ae26397
Remove redundantly added parameters
thyneb19 Mar 19, 2021
562be13
Update config error handling and LuxSQLTable description
thyneb19 Mar 22, 2021
e6b2f29
Clean up Vis.py Add greater connection visibility in LuxSQLTable
thyneb19 Mar 23, 2021
d9073af
Merge remote-tracking branch 'upstream/master' into Database-Executor
thyneb19 Mar 23, 2021
922f2a7
Update executor.rst
thyneb19 Mar 23, 2021
7b67f06
Refactor length parameter to _length
thyneb19 Mar 25, 2021
0323be0
Update test_interestingness.py
thyneb19 Mar 25, 2021
bea42cc
Added _length Parameter to LuxSQLTable
thyneb19 Mar 25, 2021
e9afa92
Black Reformatting, Reverting_length change in LuxDataFrame
thyneb19 Mar 25, 2021
d296c9b
Merge remote-tracking branch 'upstream/master' into Database-Executor
thyneb19 Mar 26, 2021
7c7dcd3
Update LuxSQLTable __len__() and metadata computation
thyneb19 Mar 26, 2021
8d6cf4b
Removed unnecessary __repr__() function
thyneb19 Mar 26, 2021
e350ab4
Updated LuxSQLTable repr
thyneb19 Mar 26, 2021
5d1a2f4
Revert "Updated LuxSQLTable repr"
thyneb19 Mar 27, 2021
48c1b57
Revert "Revert "Updated LuxSQLTable repr""
thyneb19 Mar 27, 2021
b5998c7
Revert "Update LuxSQLTable __len__() and metadata computation"
thyneb19 Mar 27, 2021
75c5cae
Merge pull request #327 from thyneb19/Database-Executor
thyneb19 Mar 27, 2021
6f597c2
Revert "Revert "Update LuxSQLTable __len__() and metadata computation""
thyneb19 Mar 27, 2021
7999ad6
Cleaned up datatype and SQLExecutor checks
thyneb19 Apr 3, 2021
db736d1
Update LuxSQLTable __len__() and metadata computation"" (#331)
thyneb19 Apr 7, 2021
e92dbd6
Merge branch 'sql-engine' into Database-Executor
thyneb19 Apr 7, 2021
5399097
Merge remote-tracking branch 'upstream/master' into Database-Executor
thyneb19 Apr 7, 2021
2d24a3b
Merge pull request #347 from thyneb19/Database-Executor
thyneb19 Apr 8, 2021
6922d3b
Black Reformatting
thyneb19 Apr 8, 2021
bf3cb0f
Merge pull request #348 from thyneb19/Database-Executor
thyneb19 Apr 8, 2021
40b85b1
minor changes to requirements and cleanup
dorisjlee Apr 11, 2021
2298f13
Merge branch 'master' into sql-engine
dorisjlee Apr 11, 2021
801f3cd
Removed psycopg2 from Lux requirements
thyneb19 Apr 11, 2021
284f5ba
Merge branch 'sql-engine' into Database-Executor
thyneb19 Apr 11, 2021
1e02ad6
Merge pull request #352 from thyneb19/Database-Executor
thyneb19 Apr 11, 2021
68c7747
Merge remote-tracking branch 'upstream/master' into Database-Executor
thyneb19 Apr 11, 2021
f11e772
Revert "Merge remote-tracking branch 'upstream/master' into Database-…
thyneb19 Apr 11, 2021
a309361
Merge branch 'Database-Executor' of https://github.com/thyneb19/lux i…
thyneb19 Apr 11, 2021
1632005
Merge pull request #353 from thyneb19/Database-Executor
thyneb19 Apr 11, 2021
c94b5a6
add back merged overridden changes
dorisjlee Apr 11, 2021
50f8562
merge conflict fixed
dorisjlee Apr 11, 2021
21 changes: 21 additions & 0 deletions .github/workflows/python-app.yml
@@ -15,6 +15,22 @@ jobs:

runs-on: ubuntu-latest

# Service containers to run with `container-job`
services:
# Label used to access the service container
postgres:
# Docker Hub image
image: postgres
# Provide the password for postgres
env:
POSTGRES_USER: postgres
POSTGRES_PASSWORD: lux
POSTGRES_DB: postgres
# Set health checks to wait until postgres has started
options: --health-cmd pg_isready --health-interval 10s --health-timeout 5s --health-retries 5
ports:
- 5432:5432

steps:
- uses: actions/checkout@v2
- name: Set up Python 3.7
@@ -28,6 +44,11 @@ jobs:
pip install wheel
pip install -r requirements.txt
pip install -r requirements-dev.txt
pip install sqlalchemy
- name: Upload data to Postgres
run: |
python lux/data/upload_car_data.py
python lux/data/upload_aug_test_data.py
- name: Lint check with black
run: |
black --target-version py37 --line-length 105 --check .
4 changes: 1 addition & 3 deletions CONTRIBUTING.md
@@ -37,7 +37,7 @@ lux/
```

# Code Formatting
In order to keep our codebase clean and readible, we are using PEP8 guidelines. To help us maintain and check code style, we are using [black](https://github.com/psf/black). Simply run `black .` before commiting. Failure to do so may fail the tests run on Travis. This package should have been installed for you as part of [requirements-dev](https://github.com/lux-org/lux/blob/master/requirements-dev.txt).
In order to keep our codebase clean and readible, we are using PEP8 guidelines. To help us maintain and check code style, we are using [black](https://github.com/psf/black). Simply run `black .` before commiting. Without running black, the checks on the continuous integration tests can fail. `black` should be installed for you as part of [requirements-dev](https://github.com/lux-org/lux/blob/master/requirements-dev.txt).

# Running the Test Suite

@@ -55,8 +55,6 @@ To run a single test file, run:
python -m pytest tests/<test_file_name>.py
```



# Submitting a Pull Request

You can commit your code and push to your forked repo. Once all of your local changes have been tested and formatted, you are ready to submit a PR. For Lux, we use the "Squash and Merge" strategy to merge in PR, which means that even if you make a lot of small commits in your PR, they will all get squashed into a single commit associated with the PR. Please make sure that comments and unnecessary file changes are not committed as part of the PR by looking at the "File Changes" diff view on the pull request page.
34 changes: 25 additions & 9 deletions doc/source/advanced/executor.rst
@@ -11,41 +11,57 @@ Please refer to :mod:`lux.executor.Executor`, if you are interested in extending
SQL Executor
=============

Lux extends its visualization exploration operations to data within SQL databases. By using the SQL Executor, users can specify a SQL database to connect a Lux Dataframe for generating all the visualizations recommended in Lux.
Lux extends its visualization exploration operations to data within SQL databases. By using the SQL Executor, users can specify a SQL database to connect a LuxSQLTable for generating all the visualizations recommended in Lux.

Connecting Lux to a Database
----------------------------

Before Lux can operate on data within a Postgresql database, users have to connect their Lux Dataframe to their database.
Before Lux can operate on data within a Postgresql database, users have to connect their LuxSQLTable to their database.
To do this, users first need to specify a connection to their SQL database. This can be done using the psycopg2 package's functionality.

.. code-block:: python

import psycopg2
connection = psycopg2.connect("dbname=example_database user=example_user, password=example_password")

Once this connection is created, users can connect their Lux Dataframe to the database using the Lux Dataframe's set_SQL_connection command.
Once this connection is created, users can connect the lux config to the database using the set_SQL_connection command.

.. code-block:: python

lux_df.set_SQL_connection(connection, "my_table")
lux.config.set_SQL_connection(connection)

When the set_SQL_connection function is called, Lux will then populate the Dataframe with all the metadata it needs to run its intent from the database table.
When the set_SQL_connection function is called, Lux will then populate the LuxSQLTable with all the metadata it needs to run its intent from the database table.

Connecting a LuxSQLTable to a Table/View
--------------------------

LuxSQLTables can be connected to individual tables or views created within your Postgresql database. This can be done by either specifying the table/view name in the constructor.

.. code-block:: python

sql_tbl = LuxSQLTable(table_name = "my_table")

You can also connect a LuxSQLTable to a table/view by using the set_SQL_table function.

.. code-block:: python

sql_tbl = LuxSQLTable()
sql_tbl.set_SQL_table("my_table")

Choosing an Executor
--------------------------

Once a user has created a connection to their Postgresql database, they need to change Lux's execution engine so that the system can collect and process the data properly.
By default Lux uses the Pandas executor to process local data in the Lux Dataframe, but users need to use the SQL executor when their Lux Dataframe is connected to a database.
Users can specify the executor that a Lux Dataframe will use via the set_executor_type function as follows:
By default Lux uses the Pandas executor to process local data in the LuxDataframe, but users will use the SQL executor when their LuxSQLTable is connected to a database.
Users can specify the executor that Lux will use via the set_executor_type function as follows:

.. code-block:: python

lux_df.set_executor_type("SQL")

Once a Lux Dataframe has been connected to a Postgresql table and set to use the SQL Executor, users can take full advantage of Lux's visual exploration capabilities as-is. Users can set their intent to specify which variables they are most interested in and discover insightful visualizations from their database.
Once a LuxSQLTable has been connected to a Postgresql table and set to use the SQL Executor, users can take full advantage of Lux's visual exploration capabilities as-is. Users can set their intent to specify which variables they are most interested in and discover insightful visualizations from their database.

SQL Executor Limitations
--------------------------

While users can make full use of Lux's functionalities on data within a database table, they will not be able to use any of Pandas' Dataframe functions to manipulate the data. Since the Lux SQL Executor delegates most data processing to the Postgresql database, it does not pull in the entire dataset into the Lux Dataframe. As such there is no actual data within the Lux Dataframe to manipulate, only the relevant metadata required to for Lux to manage its intent. Thus, if users are interested in manipulating or querying their data, this needs to be done through SQL or an alternative RDBMS interface.
While users can make full use of Lux's functionalities on data within a database table, they will not be able to use any of Pandas' Dataframe functions to manipulate the data in the LuxSQLTable object. Since the Lux SQL Executor delegates most data processing to the Postgresql database, it does not pull in the entire dataset into the Lux Dataframe. As such there is no actual data within the LuxSQLTable to manipulate, only the relevant metadata required to for Lux to manage its intent. Thus, if users are interested in manipulating or querying their data, this needs to be done through SQL or an alternative RDBMS interface.
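Taken together, the steps in the updated documentation amount to the following workflow. This is a minimal sketch, assuming a reachable Postgresql server and that `psycopg2` and Lux are installed. The function name `open_lux_sql_table` and its parameters are hypothetical, while `psycopg2.connect`, `lux.config.set_SQL_connection`, and `LuxSQLTable(table_name=...)` are the calls named in the documentation above. Imports are deferred into the function so that defining it does not require a database.

```python
def open_lux_sql_table(dbname, user, password, table_name):
    """Hypothetical helper: connect Lux to Postgres and wrap one table or view."""
    # Deferred imports: both packages are only needed when the helper is called.
    import psycopg2
    import lux

    # psycopg2 accepts connection parameters as keyword arguments.
    connection = psycopg2.connect(dbname=dbname, user=user, password=password)

    # Registering the connection also switches Lux to the SQL executor,
    # per the config.py change in this PR.
    lux.config.set_SQL_connection(connection)

    # Bind a LuxSQLTable to the named table or view.
    return lux.LuxSQLTable(table_name=table_name)
```

Calling `open_lux_sql_table("example_database", "example_user", "example_password", "my_table")` would return a table object ready for setting an intent, provided the database and table exist.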
1 change: 1 addition & 0 deletions lux/__init__.py
@@ -15,6 +15,7 @@
# Register the commonly used modules (similar to how pandas does it: https://github.com/pandas-dev/pandas/blob/master/pandas/__init__.py)
from lux.vis.Clause import Clause
from lux.core.frame import LuxDataFrame
from lux.core.sqltable import LuxSQLTable
from ._version import __version__, version_info
from lux._config import config
from lux._config.config import warning_format
6 changes: 5 additions & 1 deletion lux/_config/config.py
@@ -343,6 +343,7 @@ def set_SQL_connection(self, connection):
connection : SQLAlchemy connectable, str, or sqlite3 connection
For more information, `see here <https://docs.sqlalchemy.org/en/13/core/connections.html>`__
"""
self.set_executor_type("SQL")
self.SQLconnection = connection

def set_executor_type(self, exe):
@@ -358,10 +359,13 @@ def set_executor_type(self, exe):
from lux.executor.SQLExecutor import SQLExecutor

self.executor = SQLExecutor()
else:
elif exe == "Pandas":
from lux.executor.PandasExecutor import PandasExecutor

self.SQLconnection = ""
self.executor = PandasExecutor()
else:
raise ValueError("Executor type must be either 'Pandas' or 'SQL'")


def warning_format(message, category, filename, lineno, file=None, line=None):
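The executor selection logic changed in this file can be illustrated outside Lux with a minimal sketch. The `SketchConfig` class and the two stub executor classes below are hypothetical and only mirror the control flow shown in the diff (setting a connection switches to the SQL executor, selecting Pandas clears the stored connection, and an unknown executor name raises `ValueError`). They are not Lux's actual implementation.

```python
class PandasExecutorStub:
    """Hypothetical stand-in for lux.executor.PandasExecutor."""

    name = "PandasExecutor"


class SQLExecutorStub:
    """Hypothetical stand-in for lux.executor.SQLExecutor."""

    name = "SQLExecutor"


class SketchConfig:
    """Hypothetical config object mirroring the branching in the diff."""

    def __init__(self):
        self.SQLconnection = ""
        self.executor = PandasExecutorStub()

    def set_SQL_connection(self, connection):
        # Setting a connection also switches to the SQL executor.
        self.set_executor_type("SQL")
        self.SQLconnection = connection

    def set_executor_type(self, exe):
        if exe == "SQL":
            self.executor = SQLExecutorStub()
        elif exe == "Pandas":
            # Falling back to Pandas clears any stored connection.
            self.SQLconnection = ""
            self.executor = PandasExecutorStub()
        else:
            # Previously any non-"SQL" value silently selected Pandas.
            raise ValueError("Executor type must be either 'Pandas' or 'SQL'")
```

One consequence of the explicit `elif` branch is that a misspelled executor name now fails loudly instead of silently falling back to the Pandas executor.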
1 change: 0 additions & 1 deletion lux/action/correlation.py
@@ -73,7 +73,6 @@ def correlation(ldf: LuxDataFrame, ignore_transpose: bool = True):
)
msr1 = measures[0].attribute
msr2 = measures[1].attribute

if ignore_transpose:
check_transpose = check_transpose_not_computed(vlist, msr1, msr2)
else:
2 changes: 1 addition & 1 deletion lux/action/custom.py
@@ -64,7 +64,7 @@ def custom_actions(ldf):
recommendations : Dict[str,obj]
object with a collection of visualizations that were previously registered.
"""
if len(lux.config.actions) > 0 and len(ldf) > 0:
if len(lux.config.actions) > 0 and (len(ldf) > 0 or lux.config.executor.name != "PandasExecutor"):
recommendations = []
for action_name in lux.config.actions.keys():
display_condition = lux.config.actions[action_name].display_condition
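The revised condition in this hunk can be read as a small predicate. The function below is a hypothetical, standalone restatement of that check (its name and parameters are not part of Lux): custom actions run when at least one action is registered and either the frame has rows or the active executor is not the Pandas one. Going by the executor documentation in this PR, a `LuxSQLTable` holds no local data, so a row-count check alone would presumably skip custom actions for SQL-backed tables.

```python
def should_run_custom_actions(num_actions: int, num_rows: int, executor_name: str) -> bool:
    """Hypothetical restatement of the updated check in custom_actions."""
    has_actions = num_actions > 0
    has_local_rows = num_rows > 0
    uses_pandas = executor_name == "PandasExecutor"
    # With a non-Pandas executor the row count is not consulted at all.
    return has_actions and (has_local_rows or not uses_pandas)
```

Under this reading, an empty Pandas-backed frame still skips custom actions, while a SQL-backed table with no local rows does not.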
140 changes: 28 additions & 112 deletions lux/core/frame.py
@@ -18,6 +18,7 @@
from lux.vis.Vis import Vis
from lux.vis.VisList import VisList
from lux.history.history import History
from lux.utils.date_utils import is_datetime_series
from lux.utils.message import Message
from lux.utils.utils import check_import_lux_widget
from typing import Dict, Union, List, Callable
@@ -57,8 +58,6 @@ class LuxDataFrame(pd.DataFrame):
]

def __init__(self, *args, **kw):
from lux.executor.PandasExecutor import PandasExecutor

self._history = History()
self._intent = []
self._inferred_intent = []
@@ -70,7 +69,14 @@ def __init__(self, *args, **kw):
super(LuxDataFrame, self).__init__(*args, **kw)

self.table_name = ""
lux.config.executor = PandasExecutor()
if lux.config.SQLconnection == "":
from lux.executor.PandasExecutor import PandasExecutor

lux.config.executor = PandasExecutor()
else:
from lux.executor.SQLExecutor import SQLExecutor

lux.config.executor = SQLExecutor()

self._sampled = None
self._toggle_pandas_display = True
@@ -110,14 +116,25 @@ def data_type(self):
return self._data_type

def maintain_metadata(self):
is_sql_tbl = lux.config.executor.name == "SQLExecutor"
if lux.config.SQLconnection != "" and is_sql_tbl:
from lux.executor.SQLExecutor import SQLExecutor

lux.config.executor = SQLExecutor()

# Check that metadata has not yet been computed
if not hasattr(self, "_metadata_fresh") or not self._metadata_fresh:
# only compute metadata information if the dataframe is non-empty
if len(self) > 0:
lux.config.executor.compute_stats(self)
if is_sql_tbl:
lux.config.executor.compute_dataset_metadata(self)
self._infer_structure()
self._metadata_fresh = True
else:
if len(self) > 0:
lux.config.executor.compute_stats(self)
lux.config.executor.compute_dataset_metadata(self)
self._infer_structure()
self._metadata_fresh = True

def expire_recs(self):
"""
@@ -168,12 +185,14 @@ def _infer_structure(self):
# If the dataframe is very small and the index column is not a range index, then it is likely that this is an aggregated data
is_multi_index_flag = self.index.nlevels != 1
not_int_index_flag = not pd.api.types.is_integer_dtype(self.index)
small_df_flag = len(self) < 100
is_sql_tbl = lux.config.executor.name == "SQLExecutor"

small_df_flag = len(self) < 100 and is_sql_tbl
if self.pre_aggregated == None:
self.pre_aggregated = (is_multi_index_flag or not_int_index_flag) and small_df_flag
if "Number of Records" in self.columns:
self.pre_aggregated = True
self.pre_aggregated = "groupby" in [event.name for event in self.history]
self.pre_aggregated = "groupby" in [event.name for event in self.history] and not is_sql_tbl

@property
def intent(self):
@@ -317,110 +336,6 @@ def current_vis(self):
def current_vis(self, current_vis: Dict):
self._current_vis = current_vis

#######################################################
########## SQL Metadata, type, model schema ###########
#######################################################

def set_SQL_table(self, t_name):
self.table_name = t_name
self.compute_SQL_dataset_metadata()

def compute_SQL_dataset_metadata(self):
self.get_SQL_attributes()
for attr in list(self.columns):
self[attr] = None
self._data_type = {}
#####NOTE: since we aren't expecting users to do much data processing with the SQL database, should we just keep this
##### in the initialization and do it just once
self.compute_SQL_data_type()
self.compute_SQL_stats()

def compute_SQL_stats(self):
# precompute statistics
self.unique_values = {}
self._min_max = {}

self.get_SQL_unique_values()
# self.get_SQL_cardinality()
for attribute in self.columns:
if self._data_type[attribute] == "quantitative":
self._min_max[attribute] = (
self[attribute].min(),
self[attribute].max(),
)

def get_SQL_attributes(self):
if "." in self.table_name:
table_name = self.table_name[self.table_name.index(".") + 1 :]
else:
table_name = self.table_name
query = f"SELECT column_name FROM INFORMATION_SCHEMA.COLUMNS where TABLE_NAME = '{table_name}'"
attributes = list(pd.read_sql(query, lux.config.SQLconnection)["column_name"])
for attr in attributes:
self[attr] = None

def get_SQL_cardinality(self):
cardinality = {}
for attr in list(self.columns):
card_query = pd.read_sql(
f"SELECT Count(Distinct({attr})) FROM {self.table_name}",
lux.config.SQLconnection,
)
cardinality[attr] = list(card_query["count"])[0]
self.cardinality = cardinality

def get_SQL_unique_values(self):
unique_vals = {}
for attr in list(self.columns):
unique_query = pd.read_sql(
f"SELECT Distinct({attr}) FROM {self.table_name}",
lux.config.SQLconnection,
)
unique_vals[attr] = list(unique_query[attr])
self.unique_values = unique_vals

def compute_SQL_data_type(self):
data_type = {}
sql_dtypes = {}
self.get_SQL_cardinality()
if "." in self.table_name:
table_name = self.table_name[self.table_name.index(".") + 1 :]
else:
table_name = self.table_name
# get the data types of the attributes in the SQL table
for attr in list(self.columns):
query = f"SELECT DATA_TYPE FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = '{table_name}' AND COLUMN_NAME = '{attr}'"
datatype = list(pd.read_sql(query, lux.config.SQLconnection)["data_type"])[0]
sql_dtypes[attr] = datatype

for attr in list(self.columns):
if attr in self._type_override:
data_type[attr] = self._type_override[attr]
elif str(attr).lower() in ["month", "year"]:
data_type[attr] = "temporal"
elif sql_dtypes[attr] in [
"character",
"character varying",
"boolean",
"uuid",
"text",
]:
data_type[attr] = "nominal"
elif sql_dtypes[attr] in [
"integer",
"real",
"smallint",
"smallserial",
"serial",
]:
if self.cardinality[attr] < 13:
data_type[attr] = "nominal"
else:
data_type[attr] = "quantitative"
elif "time" in sql_dtypes[attr] or "date" in sql_dtypes[attr]:
data_type[attr] = "temporal"
self._data_type = data_type

def _append_rec(self, rec_infolist, recommendations: Dict):
if recommendations["collection"] is not None and len(recommendations["collection"]) > 0:
rec_infolist.append(recommendations)
@@ -470,6 +385,7 @@ def maintain_recs(self, is_series="DataFrame"):

# Check that recs has not yet been computed
if not hasattr(rec_df, "_recs_fresh") or not rec_df._recs_fresh:
is_sql_tbl = lux.config.executor.name == "SQLExecutor"
rec_infolist = []
from lux.action.row_group import row_group
from lux.action.column_group import column_group
@@ -479,7 +395,7 @@ def maintain_recs(self, is_series="DataFrame"):
if rec_df.columns.name is not None:
rec_df._append_rec(rec_infolist, row_group(rec_df))
rec_df._append_rec(rec_infolist, column_group(rec_df))
elif not (len(rec_df) < 5 and not rec_df.pre_aggregated) and not (
elif not (len(rec_df) < 5 and not rec_df.pre_aggregated and not is_sql_tbl) and not (
self.index.nlevels >= 2 or self.columns.nlevels >= 2
):
from lux.action.custom import custom_actions
Expand Down
2 changes: 1 addition & 1 deletion lux/core/series.py
@@ -43,13 +43,13 @@ class LuxSeries(pd.Series):
"_prev",
"_history",
"_saved_export",
"name",
"_sampled",
"_toggle_pandas_display",
"_message",
"_pandas_only",
"pre_aggregated",
"_type_override",
"name",
]

_default_metadata = {