Configuration for topk and sort order #206

dorisjlee · 2021-01-08T08:35:07Z

Adds configuration settings for Top-K and sort order that apply globally to all recommendations.
Skips series visualizations for series created via df.iterrows() or containing mixed types.
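
For readers unfamiliar with the idea, here is a minimal sketch of what a global Top-K and sort-order setting does to a list of scored recommendations. It is an illustration only, not the code in this PR: the function name `rank_recommendations`, the `(name, score)` tuples, and the accepted `sort` values are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Hypothetical stand-in for a recommendation: (name, interestingness score).
Scored = Tuple[str, float]


def rank_recommendations(
    candidates: List[Scored],
    topk: Optional[int] = None,
    sort: str = "descending",
) -> List[Scored]:
    """Order candidates by score and keep at most `topk` of them.

    `topk=None` means no cut-off; `sort="none"` keeps the input order.
    """
    if sort not in ("ascending", "descending", "none"):
        raise ValueError(f"unknown sort order: {sort!r}")
    ranked = list(candidates)  # copy so the caller's list is not mutated
    if sort != "none":
        ranked.sort(key=lambda item: item[1], reverse=(sort == "descending"))
    if topk is None:
        return ranked
    return ranked[:topk]


if __name__ == "__main__":
    vis = [("a", 0.2), ("b", 0.9), ("c", 0.5)]
    print(rank_recommendations(vis, topk=2, sort="descending"))
```

Because both settings are applied in one place, every action that produces a ranked list would pick up a change to them without its own parameters.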

codecov-io · 2021-01-08T08:40:43Z

Codecov Report

Merging #206 (efd8fd0) into master (3393b9f) will increase coverage by 0.26%.
The diff coverage is 95.12%.

@@            Coverage Diff             @@
##           master     #206      +/-   ##
==========================================
+ Coverage   77.27%   77.54%   +0.26%     
==========================================
  Files          40       40              
  Lines        2812     2841      +29     
==========================================
+ Hits         2173     2203      +30     
+ Misses        639      638       -1
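
The figures in the diff above can be cross-checked with a few lines of arithmetic. This sketch assumes coverage is hits divided by lines and that percentages are truncated (not rounded) to two decimals; the truncation is inferred from the reported numbers, not taken from Codecov's documentation.

```python
import math


def coverage_pct(hits: int, lines: int) -> float:
    """Coverage as a percentage of executable lines that were hit."""
    return 100.0 * hits / lines


def truncate2(value: float) -> float:
    """Truncate (not round) to two decimal places; an assumption, see above."""
    return math.floor(value * 100) / 100


before = coverage_pct(2173, 2812)  # master
after = coverage_pct(2203, 2841)   # this PR

print(truncate2(before), truncate2(after), truncate2(after - before))
# Misses are simply lines minus hits.
print(2812 - 2173, 2841 - 2203)
```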

Impacted Files	Coverage Δ
lux/action/univariate.py	`86.48% <ø> (ø)`
lux/_config/config.py	`79.04% <89.47%> (+1.77%)`	⬆️
lux/action/correlation.py	`85.00% <100.00%> (+0.38%)`	⬆️
lux/action/enhance.py	`100.00% <100.00%> (ø)`
lux/action/filter.py	`91.66% <100.00%> (+0.11%)`	⬆️
lux/action/generalize.py	`80.95% <100.00%> (+0.46%)`	⬆️
lux/core/series.py	`42.64% <100.00%> (+0.85%)`	⬆️
lux/vis/VisList.py	`51.33% <100.00%> (+3.00%)`	⬆️
lux/vislib/altair/Heatmap.py	`96.55% <0.00%> (+3.44%)`	⬆️


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 3393b9f...efd8fd0.

thyneb19

Looks good! I had a question about what some of the parameters in the Lux actions are used for, and whether we could streamline the process of setting lux.config parameters.

thyneb19 · 2021-01-08T17:07:00Z

doc/source/reference/config.rst

+            df = pd.read_csv("..")
+            df # recommendations already generated here
+
+            df.expire_recs()


Is there a way to call expire_recs() automatically when a config property is set, for example by adding setter functions that call expire_recs() first?

dorisjlee · 2021-01-09

Good point! The trouble is that we don't currently keep a list of the dataframes that exist in the session, so we have no way of expiring their recommendations. I've opened issue #209 to address this in the future. Thanks for raising this!
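
To make the suggestion in this thread concrete, here is a minimal sketch of a setter that expires recommendations on every live dataframe. It is one possible approach under stated assumptions, not what Lux does and not what issue #209 settled on: `Frame`, `Config`, the `_live` registry, and the `recs_fresh` flag are all hypothetical names. The registry uses `weakref.WeakSet` so that tracking a frame does not keep it alive.

```python
import weakref


class Frame:
    """Hypothetical stand-in for a dataframe that caches recommendations."""

    # Registry of every frame still alive in the session (weak references).
    _live = weakref.WeakSet()

    def __init__(self):
        self.recs_fresh = False
        Frame._live.add(self)

    def show(self):
        # Pretend recommendations were generated when the frame is displayed.
        self.recs_fresh = True

    def expire_recs(self):
        self.recs_fresh = False


class Config:
    """Hypothetical config object whose setter invalidates cached results."""

    def __init__(self):
        self._topk = 15  # arbitrary starting value for the example

    @property
    def topk(self):
        return self._topk

    @topk.setter
    def topk(self, value):
        self._topk = value
        # Copy to a list first: the WeakSet may shrink while we iterate.
        for frame in list(Frame._live):
            frame.expire_recs()


if __name__ == "__main__":
    config, df = Config(), Frame()
    df.show()
    config.topk = 5
    print(df.recs_fresh)
```

With a registry like this, the manual df.expire_recs() call shown in the doc snippet would no longer be needed after changing a setting.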

thyneb19 · 2021-01-08T17:22:12Z

doc/source/reference/gen/lux.vislib.altair.AltairChart.AltairChart.rst

@@ -19,6 +19,7 @@ lux.vislib.altair.AltairChart.AltairChart
      ~AltairChart.apply_default_config
      ~AltairChart.encode_color
      ~AltairChart.initialize_chart
+      ~AltairChart.sanitize_dataframe


What are the sanitize_dataframe parameters used for?

dorisjlee · 2021-01-09

sanitize_dataframe cleans vis.data into a form that Altair can accept as input (e.g., no special characters, no non-string columns, etc.).
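
As a rough illustration of the kind of cleaning described here, the sketch below maps column labels to plain string names. The exact rules Lux applies are not shown in this thread, so the function name `sanitize_column_names` and the choice to replace every character outside letters, digits, and underscore with `_` are assumptions for the example.

```python
import re
from typing import Dict, Hashable, Iterable


def sanitize_column_names(columns: Iterable[Hashable]) -> Dict[Hashable, str]:
    """Map each original label to a string label without special characters.

    Note: two different labels can collapse to the same clean name
    (e.g. "a.b" and "a b"); a real implementation would need to handle that.
    """
    mapping = {}
    for col in columns:
        # str() covers non-string labels such as integers.
        mapping[col] = re.sub(r"[^0-9A-Za-z_]", "_", str(col))
    return mapping


if __name__ == "__main__":
    print(sanitize_column_names(["a.b", 3, "ok"]))
```

With pandas, a mapping like this could be applied through `DataFrame.rename(columns=mapping)` before handing the data to the plotting library.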

* Similarity as a default action (#182)
* similarity formatting fixed
* added another similarity test case; fixed bug where colored heatmap dimension is temporal (invalidate all 2 msr 1 temporal case)
* filter and similarity together
* filter and similarity together
* remove filter
* black line length
* file reorg and clean; change sim metric
Co-authored-by: Caitlyn Chen <caitlynachen@berkeley.edu>
Co-authored-by: Doris Lee <dorisjunglinlee@gmail.com>
* bump numpy min version for travis
* Special character issue (#184)
* rename col
* broken
* fixed period replacement bug
* add tests
* refine tests
* refine tests
* remove cols
* fix tests
* add agg
* fixed tests
* clean up PR
Co-authored-by: Caitlyn Chen <caitlynachen@berkeley.edu>
Co-authored-by: Doris Lee <dorisjunglinlee@gmail.com>
* Colored bar interestingness bug (#189)
* rewrote chi2 contingency with pd.crosstab
* catching KeyError issue with chi2 contingency
* padding interestingness with warning instead of error
* interestingness now reuses ndim and nmsr computed in Compiler
* bug fix for parser with int values
* improve Vis repr to better display inferred intent when data is absent but fully compiled intent (all clauses)
* Add sampling parameters as a global config (#192)
* update export tutorial to add explanation for standalone argument
* minor fixes and remove cell output in notebooks
* added contributing doc
* fix bugs and uncomment some tests
* remove raise warning
* remove unnecessary import
* split up rename test into two parts
* fix setting warning, fix data_type bugs and add relevant tests
* remove ordinal data type
* add test for small dataframe resetting index
* add loc and iloc tests
* fix attribute access directly to dataframe
* add small changes to code
* added test for qcut and cut
* add check if dtype is Interval
* added qcut test
* fix Record KeyError
* add tests
* take care of reset_index case
* small edits
* add data_model to column_group Clause
* small edits for row_group
* fixes to row group
* add config for start and cap for samples
* finish sampling config and tests
* black formatting
* add documentation for sampling config
* remove small added issues
* minor changes to docs
* implement heatmap flag and add tests
* black formatting and documentation edits
Co-authored-by: Doris Lee <dorisjunglinlee@gmail.com>
* Coalesce all data_type attributes of frame into one (#185)
* coalesce data_types into data_type_lookup
* black reformat
* changed to better variable names
* lux not defined error
* fixed
* black format
* Update CONTRIBUTING.md
* Bug Fix: User-provided Index causes KeyError in Pandas Execution (#191)
* Moved Executor Parameters to Global Config
* Black formatting
* Moved table_name parameter to frame.py. Removed executor_type parameter executor_type parameter no longer necessary to maintain
* Fixed reference to table_name parameter table_name is now a parameter within frame.py
* Adjusted Functions to Set SQL Connection Moved set_SQL_connection function to config. Added set_SQL_table function within frame.py to let users specify which database table will be associated with their dataframe
* Update SQLExecutor name parameter
* Fix Executor Reference Update current_vis() to reference lux.config.executor
* Update frame.py
* Moved set functions to global config
* Fixed Index Issue in Pandas Executor Issue caused when user sets an index. The Pandas Executor was not correctly renaming this new index column to Record in execute_aggregate()
* Added tests for set_index functions
* Black formatting
* Update Pandas Executor to handle NA values Readded missing dropna parameter within execute_aggregate() groupby function call
* Updated Pandas Coverage Tests Commented out set_index case which has not been addressed yet
* Black Formatting
* Update to Pandas Executor Index Handling Cleaned up how execute_aggregrate renames index columns. Now retrieves the index name from vis.data instead of filtering out non-index columns. Created separate test function for when user specifies an index in read_csv.
Co-authored-by: 19thyneb <thyne.boonmark@gmail.com>
Co-authored-by: Doris Lee <dorisjunglinlee@gmail.com>
* Initialize Config once only during __init__ (#194)
* basic matplotlib chart example
* migrate register default action to init
* config class
* move actions
* fixed tests
* changes
* alright
* fix plot_config
* black reformat
* black reformat
Co-authored-by: Doris Lee <dorisjunglinlee@gmail.com>
Co-authored-by: Caitlyn Chen <caitlynachen@berkeley.edu>
Co-authored-by: Ujjaini Mukhopadhyay <ujjaini@berkeley.edu>
* Update README.md
* Series Bugfix for describe and convert_dtypes (#197)
* bugfix for describe and convert_dtypes
* added back metadata series test
* black
* default to pandas display when df.dtypes printed
* Update Lux Docs (#195)
* add black to travis
* reformat all code and adjust test
* remove .idea
* fix contributing doc
* small change in contributing
* update
* reformat, update command to fix version
* remove dev dependencies
* first pass -- inline comments
* _config/config.py
* delete test notebook
* action
* line length 105
* executor
* interestingness
* processor
* vislib
* tests, travis, CONTRIBUTING
* .format () changed
* replace tabs with escape chars
* update using black
* more rewrites and merges into single line
* update pyproject.toml and makefile
* coalesce data_types into data_type_lookup
* black reformat
* changed to better variable names
* lux not defined error
* fixed
* black format
* config doc updated
* fix link for executor
* more links
* fixed overview
* more links fixed
* pandas methods no longer included
* updates to some docstrings
* black reformat
* minor fixes
* minor fix
Co-authored-by: Doris Lee <dorisjunglinlee@gmail.com>
* Supporting dataframe with integer columns (#203)
* bugfix for describe and convert_dtypes
* added back metadata series test
* black
* default to pandas display when df.dtypes printed
* various fixes to support int columns
* fixed merge conflict issues. vis.data shows None DF.
* Override Pandas DataFrames created from I/O pandas operations (#207)
* update export tutorial to add explanation for standalone argument
* minor fixes and remove cell output in notebooks
* added contributing doc
* fix bugs and uncomment some tests
* remove raise warning
* remove unnecessary import
* split up rename test into two parts
* fix setting warning, fix data_type bugs and add relevant tests
* remove ordinal data type
* add test for small dataframe resetting index
* add loc and iloc tests
* fix attribute access directly to dataframe
* add small changes to code
* added test for qcut and cut
* add check if dtype is Interval
* added qcut test
* fix Record KeyError
* add tests
* take care of reset_index case
* small edits
* add data_model to column_group Clause
* small edits for row_group
* fixes to row group
* add config for start and cap for samples
* finish sampling config and tests
* black formatting
* add documentation for sampling config
* remove small added issues
* minor changes to docs
* implement heatmap flag and add tests
* black formatting and documentation edits
* add pd.io equalities for DataFrames
Co-authored-by: Doris Lee <dorisjunglinlee@gmail.com>
* Merge master into sql-engine + minor mergeconflict fixes
* Removing the PYNB
* Cleaning up obsolete code
* Configuration for topk and sort order (#206)
* bugfix for describe and convert_dtypes
* added back metadata series test
* black
* default to pandas display when df.dtypes printed
* various fixes to support int columns
* skip series vis for df.iterrows series element
* config setting for modifying top K and sorting
* note about regenerated config
* Version lock for jupyter-client (#211)
* move to single requirements-dev without lux-widget install manually
* pin jedi version
* pin jupyter-client version
* add back old travis and requirement-dev
* Mixed dtype issue (#205)
* coalesce data_types into data_type_lookup
* merge fixed
* merge conflicts
* add warning and suggestion on how to fix
* formatting for warnings version
* change to internal data
* legibility update
* test added
* update test
* test updated
* xlrd in dev reqs
* black
* update link
* changes to test logic, minor string format for warning
Co-authored-by: Doris Lee <dorisjunglinlee@gmail.com>
* Fixes issue where value_counts was not returning LuxSeries (#210)
* add series equality and value counts test
* black formatting
* fix old value counts test instead
* minor fix
Co-authored-by: Doris Lee <dorisjunglinlee@gmail.com>
* bump version
* update README
Co-authored-by: Caitlyn Chen <caitlynachen@gmail.com>
Co-authored-by: Caitlyn Chen <caitlynachen@berkeley.edu>
Co-authored-by: Doris Lee <dorisjunglinlee@gmail.com>
Co-authored-by: Kunal Agarwal <32151899+westernguy2@users.noreply.github.com>
Co-authored-by: jinimukh <46768380+jinimukh@users.noreply.github.com>
Co-authored-by: thyneb19 <thyneboonmark@berkeley.edu>
Co-authored-by: 19thyneb <thyne.boonmark@gmail.com>
Co-authored-by: Ujjaini Mukhopadhyay <ujjaini@berkeley.edu>

dorisjlee added 9 commits January 6, 2021 12:02

c1944a2 bugfix for describe and convert_dtypes
5c8b284 added back metadata series test
49daeec black
801b469 default to pandas display when df.dtypes printed
a8ab02e various fixes to support int columns
7a203df merge upstream/master
74f2c7e skip series vis for df.iterrows series element
298a87e config setting for modifying top K and sorting
efd8fd0 Merge remote-tracking branch 'upstream/master' into series

a495180 note about regenerated config

dorisjlee requested a review from thyneb19 January 8, 2021 12:13

thyneb19 reviewed Jan 8, 2021

dorisjlee merged commit 623fb51 into lux-org:master Jan 9, 2021

thyneb19 Jan 8, 2021

dorisjlee Jan 9, 2021

thyneb19 Jan 8, 2021

dorisjlee Jan 9, 2021
