Improve automatic bin determination for histograms #217
dorisjlee added the enhancement (New feature or request) and easy (Easy to fix; Good issues for newcomers) labels on Jan 11, 2021
micahtyong added a commit to micahtyong/lux that referenced this issue on Feb 22, 2021
dorisjlee pushed a commit that referenced this issue on Mar 3, 2021
…d step attributes (#285)
* Merge upstream
* Sync with master
* Fix bin size variance from #217
* Format and test
* Change labelOverlap to True
Co-authored-by: Dominik Moritz <domoritz@gmail.com>
* Modify markbar; currently questioning whether or not it's needed
* Remove markbar enitrely, rely on Altair automatic bin detection https://altair-viz.github.io/user_guide/generated/core/altair.BinParams.html
* Modify code snippet
* Revert "Remove markbar enitrely, rely on Altair automatic bin detection https://altair-viz.github.io/user_guide/generated/core/altair.BinParams.html" This reverts commit 9cb9418.
* Implement bin size estimation via Freedman Diaconis's Rule
* Use numpy to compute IQR for better performance (pandas too slow)
* Add tests
* Add test cases for histogram binning
* Address changes from @domoritz review (small optimizations)
* Black and format
* Move histogram bin width computation to pandas executor (execute_binning)
* Center bars between ticks in distribution setting
* Renaming in execute_binning
* Bin width computed accurately in execute_binning; no need for get_bin_size()
* Revert to Freedman rule; maintain correct ticks
Co-authored-by: Micah Yong <micahyong@Micahs-MacBook-Pro.local>
Co-authored-by: Dominik Moritz <domoritz@gmail.com>
Closed via #285.
dorisjlee pushed a commit that referenced this issue on Mar 15, 2021
* Merge upstream
* Sync with master
* Fix bin size variance from #217
* Format and test
* Change labelOverlap to True
Co-authored-by: Dominik Moritz <domoritz@gmail.com>
* Modify markbar; currently questioning whether or not it's needed
* Remove markbar enitrely, rely on Altair automatic bin detection https://altair-viz.github.io/user_guide/generated/core/altair.BinParams.html
* Modify code snippet
* Revert "Remove markbar enitrely, rely on Altair automatic bin detection https://altair-viz.github.io/user_guide/generated/core/altair.BinParams.html" This reverts commit 9cb9418.
* Implement bin size estimation via Freedman Diaconis's Rule
* Use numpy to compute IQR for better performance (pandas too slow)
* Add tests
* Add test cases for histogram binning
* Address changes from @domoritz review (small optimizations)
* Black and format
* Move histogram bin width computation to pandas executor (execute_binning)
* Center bars between ticks in distribution setting
* Renaming in execute_binning
* Bin width computed accurately in execute_binning; no need for get_bin_size()
* Revert to Freedman rule; maintain correct ticks
* Sync exported code with new histogram bin determination rules
* Sync exported code with new histogram bin determination rules
* Modify histogram code test case
Co-authored-by: Micah Yong <micahyong@Micahs-MacBook-Pro.local>
Co-authored-by: Dominik Moritz <domoritz@gmail.com>
Currently, the formula for histogram binning sometimes produces bins that are very "skinny" and sometimes bins that are very "wide". We need to improve how histogram bin widths are determined so that the plotted histograms represent the data more accurately.
The problem is especially noticeable in the "Filter" action.
Example:
The fix needs to be customized for both matplotlib and Altair.
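The commits referencing this issue mention estimating bin size with the Freedman-Diaconis rule and computing the IQR with numpy. As a rough illustration of that rule only (not the code merged in #285), a minimal sketch might look like the following. The function names, the `fallback` default, and the `max_bins` cap are hypothetical choices made for this example.

```python
import numpy as np


def freedman_diaconis_bin_width(values):
    """Freedman-Diaconis bin width: 2 * IQR / n^(1/3)."""
    data = np.asarray(values, dtype=float)
    data = data[~np.isnan(data)]  # ignore missing values
    if data.size < 2:
        return 0.0
    q1, q3 = np.percentile(data, [25, 75])
    return 2.0 * (q3 - q1) / np.cbrt(data.size)


def suggest_bin_count(values, fallback=10, max_bins=100):
    """Turn the bin width into a bin count.

    `fallback` and `max_bins` are assumptions for this sketch: the fallback
    covers empty or zero-IQR data, and the cap prevents very "skinny" bins.
    """
    data = np.asarray(values, dtype=float)
    data = data[~np.isnan(data)]
    width = freedman_diaconis_bin_width(data)
    if data.size == 0 or width <= 0:
        return fallback
    span = data.max() - data.min()
    return int(min(max(np.ceil(span / width), 1), max_bins))
```

NumPy ships the same estimator as `np.histogram_bin_edges(data, bins="fd")`, which may be a simpler starting point when only the bin edges are needed.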