Release v1.0 #957
Closed
New procedure: when we merge PRs we also request that a note is added to the changelog so that we can keep track of all the changes without having to do a full review of all commits on the release day!
Fixes a mistake in the Rank2D documentation stating that covariance is the default ranking algorithm instead of Pearson correlation score. Resolves #660
* Allow mpl 3.x once more
* Explicitly exclude mpl 3.0.0 due to a bug
* Added code review checklist. Broke down the contributing.rst file into multiple smaller files for easier editing. In the advanced development topics section, I added a section for common code, testing, and documentation conventions, listing some things that Nathan mentioned and that are part of the ongoing visualizer audit process. Fixes #345
The test for the DispersionPlot quickmethod was never created. I just overlooked its creation. This PR adds an 'assert_images_similar' image comparison for the quickmethod.
Fix subsection headings in the documentation that caused the "API Reference" subsection not to be displayed in the Table of Contents.
Created a text-specific visualizer to project a vectorized corpus in two dimensions using UMAP (Uniform Manifold Approximation and Projection). This implementation is very similar to the TSNE implementation, but is fast, scalable, and can be applied directly to sparse matrices without a preprocessing step such as SVD.
Updates v1.0 changelog to reflect new PRs (fixing last name type, dispersion plot quick method, target color type update)
Wraps up the enhancement to the FeatureImportances visualizer which added a stacked bar chart optional parameter in the case of multi-dimensional importances. Updated the documentation to reflect the stack=True situation as well as issued a warning if stack is False but should be True. Updated tests for better coverage. Closes #531
Implements a helper function that returns continuous or discrete depending on the type of target variable, `y`. This function is similar to the functionality in the Manifold visualizer but makes use of `sklearn.utils.multiclass.type_of_target` to make its determination, along with a limit to the number of discrete colors that can be drawn. Was undecided if this belonged in `yellowbrick.utils` or in `yellowbrick.target` -- am open to discussion on this topic. Fixes #73
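The decision described above can be sketched in plain Python. This is a simplified stand-in rather than the helper added in the PR: it does not call scikit-learn's `type_of_target`, and the name `target_color_type`, the constant `MAX_DISCRETE_CLASSES`, and the cap of 12 are all assumptions made for illustration.

```python
# Hypothetical cap on how many distinct colors can be drawn legibly.
MAX_DISCRETE_CLASSES = 12


def target_color_type(y, max_classes=MAX_DISCRETE_CLASSES):
    """Return "discrete" or "continuous" for the target values in y.

    Simplified rule (an assumption, not the library's logic): any
    fractional float makes the target continuous; otherwise the target
    is discrete only if it has few enough distinct values to color.
    """
    values = list(y)
    has_fractional = any(
        isinstance(v, float) and not v.is_integer() for v in values
    )
    if has_fractional:
        return "continuous"
    if len(set(values)) <= max_classes:
        return "discrete"
    return "continuous"
```

With this rule, a binary label vector is treated as discrete, while an integer target with more distinct values than the cap falls back to a continuous color map.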
…sot (#680)
* fixes resolve colors bug in TSNE visualizer identified by Jerome Massot
* adds test coverage for user-supplied color list in TSNE
* fixes analogous resolve colors bug in UMAP visualizer and adds symmetric test coverage
This PR implements a significant change in the way yellowbrick handles datasets, moving them from data that can be downloaded and loaded using example code to prime time members of the library that can be loaded into pandas data frames and series or into well-structured numpy arrays with correct data types. We have completely overhauled dataset management using the yellowbrick-datasets repository as our data management tool. Data is still stored on S3 but contains .csv.gz and meta.json files for loading into pandas if it's installed or .npz files for loading into valid numpy arrays. New `Dataset` and `Corpus` manage access to the data, downloading it if it's not already on disk and providing access to the contents in the source directory. We maintain our security checking with sha256 hashes and a new manifest.json method. Fixes #416
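The sha256 check mentioned above amounts to hashing the downloaded archive and comparing the digest with the expected value. The sketch below shows that idea with the standard library only; the function names `sha256sum` and `is_valid_download` are hypothetical, and the layout of manifest.json is not assumed here.

```python
import hashlib


def sha256sum(path, blocksize=65536):
    """Return the hex sha256 digest of the file at path, read in blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size blocks so large archives are not loaded at once.
        for block in iter(lambda: f.read(blocksize), b""):
            digest.update(block)
    return digest.hexdigest()


def is_valid_download(path, expected_hash):
    """Return True if the file's digest matches the expected sha256 hash."""
    return sha256sum(path) == expected_hash
```

A loader built along these lines would call `is_valid_download` after fetching a dataset and refuse to use the file when the digests differ.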
The RFECV visualizer had a bug when the hyperparameter step > 1. The step was correctly passed to the internal RFE estimator, which removed that number of features per iteration. However, the feature subsets that were tried for cross-validation did not match the step, resulting in a figure that looked as if no step had been applied. This patch fixes the bug and creates a test to ensure this works correctly. To manage the feature selection subspace, a new learned attribute, `n_feature_subsets_`, was added. Fixes #664
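To see why the plotted subsets must follow the step, it helps to list the feature counts that recursive elimination actually visits. The sketch below simulates only the counting, assuming features are dropped `step` at a time without going below the number to select; `feature_subset_sizes` is a hypothetical name and this is not necessarily how `n_feature_subsets_` is computed in the patch.

```python
def feature_subset_sizes(n_features, step, n_features_to_select=1):
    """List the feature counts visited when removing `step` features per round."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    sizes = [n_features]
    remaining = n_features
    while remaining > n_features_to_select:
        # Never remove more features than are left above the target count.
        remaining -= min(step, remaining - n_features_to_select)
        sizes.append(remaining)
    return sorted(sizes)
```

For 10 features and a step of 3, the visited sizes are 1, 4, 7, and 10, so a cross-validation curve drawn at every integer from 1 to 10 would not correspond to the models that were fitted.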
…th some small rewrites (#692) enhance: add link in readme to testing instructions add installation section
Repairs breaking tests to resolve the Travis pyflakes error and the AppVeyor value error, updates baseline images, and adds skips for some unresolved tests from the contrib package.
Resolves errors related to the release of pytest 4.2, which broke some custom YB test code that controlled how tests are printed.
* Documents the yellowbrick.download script. Adds documentation for the `yellowbrick.download` script that is included for dataset management. The documentation is located in both the README.md and in the contributor's guide. This documentation will hopefully also assist developers who are having trouble with older dataset versions on their computers. Closes #693
POC for auto-generation of images in the scikit-yb docs, focused on feature and cluster visualizers.
Updates code to import data in regression notebook following recent overhaul of datasets module
Repairs broken links for rank2d and jointplots in walkthrough docs
This PR was primarily intended to add plot directives to the quickstart guide, ensuring that the images in the tutorial are always up to date with the library. The quickstart guide separates the code blocks from the plot directives so that the code reads as a linear narrative rather than with the verbosity required for each independent code block. This duplicates the code a little bit but makes it more readable. To ensure that the code is correct I also created a notebook, examples/walkthrough.ipynb, with the code from the quickstart. Along the way there were some bugs with jointplot and rank2d that I also fixed. The JointPlotVisualizer needs some more work but it is now stable. Additionally, in poof() if the visualizer didn't have an axes object it would exit silently. This made it hard to find bugs. Now instead of exiting we simply issue a warning and carry on.
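The change to poof() follows a common pattern: warn about the unexpected state and keep going instead of returning early. A minimal sketch of that pattern is below; `SketchVisualizer`, its `finalize` stub, and the warning text are hypothetical and are not taken from the Yellowbrick source.

```python
import warnings


class SketchVisualizer:
    """Hypothetical stand-in for a visualizer that may lack an axes object."""

    def __init__(self, ax=None):
        self.ax = ax

    def finalize(self):
        # Placeholder for titles, legends, and other finishing touches.
        pass

    def poof(self):
        if self.ax is None:
            # Previously a silent exit; a warning makes the problem visible.
            warnings.warn(
                "poof called without an axes object; was draw() called?",
                UserWarning,
            )
        self.finalize()
        return self.ax
```

Because the call no longer exits silently, a missing axes object shows up in test output and logs instead of producing an unexplained blank result.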
Streamlines readme and adds gallery of visualizers
Adds tag and summary of hotfix v0.9.1 to changelog in docs
Updates documentation for DispersionPlot but leaves static image for use in the gallery. Part of #687
Makes some small modifications to PRCurve to allow users to specify the `iso_f1_values` rather than hardcoding them and to allow users to provide an optional X_test and y_test to the quick method. Part of #610
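For context, an iso-F1 curve is the set of precision and recall pairs that share one F1 score. Since F1 = 2pr / (p + r), solving for precision gives p = F1 * r / (2r - F1), which is defined for r > F1 / 2. The sketch below evaluates that formula for user-chosen F1 values; the function name `iso_f1_precision` and the sample values are illustrative assumptions, not part of the PRCurve API.

```python
def iso_f1_precision(f1, recall):
    """Return the precision that yields the given F1 score at this recall."""
    if recall <= f1 / 2:
        raise ValueError("recall must exceed f1 / 2 on an iso-F1 curve")
    return f1 * recall / (2 * recall - f1)


# Example values only; any F1 scores in (0, 1) could be supplied instead.
iso_f1_values = (0.2, 0.4, 0.6, 0.8)
recalls = [0.5, 0.7, 0.9]

# Map each F1 value to the (recall, precision) points where it is defined.
curves = {
    f1: [(r, iso_f1_precision(f1, r)) for r in recalls if r > f1 / 2]
    for f1 in iso_f1_values
}
```

Letting users pass their own values simply changes which of these reference curves are drawn behind the precision-recall curve.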
This PR is intended to refresh the contributor's docs a bit, adding in a few more of our tips and conventions, specifically around things like installing the library in editable mode, merging in PRs, and using feature branches. It follows up on #689 and a few of the conversations that have come up amongst the maintainers recently.
This PR closes #931 and updates yellowbrick's port of the kneed library.
* updates to headers and minor audit cleanup
* elbow fix when score doesn't exist
Added two helper utilities: `is_fitted` and `check_fitted` that control the fitted estimator checking. If the user supplies `is_fitted='auto'`, the default, the model visualizer will check if it's fitted using a mechanism recommended by the scikit-learn team and Stack Overflow and will not fit a fitted estimator. Otherwise, the model visualizer will accept the recommendation of the user to fit or not fit the wrapped estimator. Fixes #297
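The 'auto' behavior can be sketched as a small decision function. The fitted check below relies on the scikit-learn convention that learned attributes end with a trailing underscore; that heuristic is an assumption here, since the exact mechanism used by the PR is not stated, and the names `looks_fitted` and `should_fit` are hypothetical.

```python
def looks_fitted(estimator):
    """Heuristic: treat any instance attribute like `coef_` as a sign of fitting."""
    return any(
        name.endswith("_") and not name.startswith("__")
        for name in vars(estimator)
    )


def should_fit(estimator, is_fitted="auto"):
    """Decide whether the wrapped estimator still needs to be fitted."""
    if is_fitted == "auto":
        # Inspect the estimator and skip fitting if it already looks fitted.
        return not looks_fitted(estimator)
    # Otherwise trust the user's explicit True or False.
    return not is_fitted
```

Passing `is_fitted=True` or `is_fitted=False` bypasses the inspection entirely, which matches the described behavior of accepting the user's recommendation.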
* audit of the first half of the feature visualizers, moving rfecv and importances to model selection, still need to fix some broken tests
* set kwargs properly in pcoords and scatter and remove unused import
* remove unused mpl import in scatter
This PR fixes some minor bugs in Manifold, adjusts the proportions of PCA with respect to the feature strength heatmap, tweaks axis labels and tests.
This PR is towards #669 and #456 and #600 and #509 but focused on the `yellowbrick.features` module and completes the work started in #945.
- Performed general linting and applied black formatting to the files & made code header updates.
- Updated quick methods to return the visualizer rather than the axes
- Updated the docs to reflect the move of RFECV and FeatureImportances from the features module to the model_selection module

Closes #669
This bugfix closes #943, handling the case where there are no elbows detected by kneed.py.
This PR adds in a new page to the docs that illustrates the usage of the Yellowbrick quick methods.
Switches to using a markdown version of DESCRIPTION and moves to using the banner image throughout the docs and readme, removing the old individual files and adding in the affiliate images.
Shores up classifiers, introducing new base-level helpers for label encoding, test coverage for fitted and unfitted classification visualizers and label edge cases, super score calls in each subclass, and repaired some quick methods
This PR ensures poof always returns ax. Closes #375
Version 1.0 Release.