Stratify sampling when split train/test data #143
If the public API changes, we need a test to guard against accidental misuse.
I have corrected the change regressions & formatted the file.
I think the issue is that There are a few ideas that come to mind:
I am in favor of the first option which is only activated when we need |
Signed-off-by: Adam Li <adam2392@gmail.com>
I've implemented said changes. @YuxinB you must add yourself to the Assuming CIs work, I'll let @sampan501 and @PSSF23 review and merge if they are happy. |
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #143      +/-   ##
==========================================
- Coverage   89.73%   89.66%   -0.08%
==========================================
  Files          41       41
  Lines        3352     3367      +15
==========================================
+ Hits         3008     3019      +11
- Misses        344      348       +4
examples/hypothesis_testing/plot_MI_gigantic_hypothesis_testing_forest.py
I think the other thing that is missing is documentation for the stratify parameter.
Previously the tests for stratification were not working (they would always pass). I think I have corrected them and added the new ones @adam2392 requested.
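For reference, a minimal sketch of a stratification check that can actually fail: it compares the class fractions in the test split against the fractions in the full data. The helper names (stratified_split, class_fractions, check_stratified) are hypothetical and use only the standard library; this is not the code or the tests from this PR.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with per-class proportions preserved.

    Hypothetical helper for illustration only; not the function from this PR.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle within each class and move the same fraction of every class
    # into the test set.
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def class_fractions(labels, indices):
    """Fraction of each class among the given sample indices."""
    counts = Counter(labels[i] for i in indices)
    return {label: count / len(indices) for label, count in counts.items()}


def check_stratified(labels, test_fraction=0.2, tol=0.05):
    """Assert that test-set class fractions match the full data within tol."""
    _, test_idx = stratified_split(labels, test_fraction)
    overall = class_fractions(labels, range(len(labels)))
    in_test = class_fractions(labels, test_idx)
    for label, frac in overall.items():
        assert abs(in_test.get(label, 0.0) - frac) <= tol, label


labels = [0] * 90 + [1] * 10  # imbalanced data: 90% class 0, 10% class 1
check_stratified(labels)
```

The point of the comparison against the overall fractions is that a split which ignores the labels can violate it on imbalanced data, so the check is not trivially true.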
I removed duplicate checks as those would be covered by check_input anyway. Explicitly setting the parameter to False just to have new checks that check the same thing doesn't make sense to me.
@adam2392 I believe the tests should be excluded from codecov? And do you think we need more tests?
Can you add a docstring for stratify in the two classes?
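Something along these lines might work as a starting point, assuming the project follows the numpydoc layout. The class name ExampleForest and the wording of the description are placeholders, and the default shown is only illustrative; the real classes and their defaults are the ones in this PR.

```python
class ExampleForest:
    """Hypothetical estimator used only to illustrate the docstring layout.

    Parameters
    ----------
    stratify : bool, default=False
        Whether to stratify the samples by class label when splitting the
        data into train and test sets. (Assumed wording; adapt it to the
        actual behavior of the estimator.)
    """

    def __init__(self, stratify=False):
        self.stratify = stratify
```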
I set the base forest default value to False and removed it from the regressor class init.
What set the default of |
Yeah just to give options.
I only set the base forest value to False for the regressor; the classifier still defaults to True.
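To make the arrangement of defaults concrete, here is a rough sketch under assumed names (BaseForestSketch, ClassifierSketch, RegressorSketch are hypothetical, not the classes touched in this PR): the base class defaults to no stratification, the classifier overrides the default to True, and the regressor does not expose the parameter at all.

```python
class BaseForestSketch:
    """Hypothetical base class: stratification is off unless requested."""

    def __init__(self, stratify=False):
        self.stratify = stratify


class ClassifierSketch(BaseForestSketch):
    """Hypothetical classifier: keeps stratification on by default."""

    def __init__(self, stratify=True):
        super().__init__(stratify=stratify)


class RegressorSketch(BaseForestSketch):
    """Hypothetical regressor: no stratify argument in its init."""

    def __init__(self):
        # Falls back to the base default, stratify=False.
        super().__init__()
```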
@adam2392 Do you know how to exclude |
Experimental we can include, but the tests should not be part of it. I think this might be a bug on codecov cuz it wasn't showing up before.
The build-docs are failing because @YuxinB needs to add herself to the contributors doc:
The codecov/project doesn't need to pass.
LGTM once CI green
You'll need to rename the references to the example file, since you changed the name of the example.
Done
Thanks @YuxinB @PSSF23 and @sampan501
Fixes #
Changes proposed in this pull request:
Before submitting
section of the CONTRIBUTING docs.
Writing docstrings section of the CONTRIBUTING docs.
After submitting