-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bootstrap fixes #250
Merged
Merged
Bootstrap fixes #250
Changes from 6 commits
Commits
Show all changes
7 commits
Select commit
Hold shift + click to select a range
86a0286
Manual style fix pass
tcare f113ec2
Bootstrap script: enforce letters and underscores only
tcare 944f764
Improve error handling and argument validation
tcare 96d4ad1
Avoid sklearn import error after bootstrap script runs
tcare f8f88ce
Update bootstrap README with standardized args
tcare 6c9cfa4
Linting fixes
tcare 9fdf97c
Factor out diabetes CSV creation
tcare File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't be so strong in this message. We don't know the reason of the error for sure. We just guess. I would go with something like "Failed to load diabetes dataset, perhaps the project has already ..."
The thing is that it will still rename load_diabetes into load_we_dont_call_his_name which introduces a buggy code (that we handle with this try-except). Perhaps it would make sense to move out all this load_diabetes dataset creation to a separate module (imported in this file) and exclude that module/file from "files" in replace_project_name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, separating dataset creation is the right way to go. A quick hotfix is to call replaceprojectname on the training script to rename the specific import.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought ImportError would be enough of a limited scope to avoid any weirdness. However you made me realize that this would actually be the first time that we encounter sklearn in the flow, so this could hide a dependency error. At the very least, we need to check that sklearn exists and the loading function doesn't.
Re: moving it out, if we're going to do the non-hacky fix, I feel that this should be generic enough that you don't need to rely on having a predefined dataset to export and rather can use a csv.