FIX ignore nan values when summing posteriors #291
This seems like a simple and straightforward fix to me. I'll write unit tests for it, but first I want to see whether it causes other problems.
The test test_impute_posteriors is not checking the ignore setting properly, as it doesn't raise an error when there are too many NaNs.
The fix is also incomplete, as np.nansum gives zeros when all the tree posteriors are NaN. impute_missing is not working as it should. I'm forcing it to work on zero mask posteriors, as the other two leaf correction methods would not result in zeros.
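The np.nansum behavior described above can be reproduced in isolation. The following is only a minimal sketch, not the code from this PR: the array proba_per_tree, its shape, and its values are hypothetical. It shows that a sample whose posteriors are NaN in every tree sums to zero, and one possible way to tell such samples apart by counting the trees that actually contributed.

```python
import numpy as np

# Hypothetical per-tree posteriors, shape (n_trees, n_samples, n_classes).
# NaN marks a tree that has no posterior estimate for that sample.
proba_per_tree = np.array(
    [
        [[0.2, 0.8], [np.nan, np.nan]],
        [[0.6, 0.4], [np.nan, np.nan]],
        [[np.nan, np.nan], [np.nan, np.nan]],
    ]
)

# np.nansum treats NaN as zero, so the second sample (NaN in every tree)
# silently becomes a row of zeros instead of staying NaN.
summed = np.nansum(proba_per_tree, axis=0)

# Count, per sample, the trees whose posterior contains no NaN.
n_valid = np.sum(~np.isnan(proba_per_tree).any(axis=2), axis=0)

# Average over the contributing trees only; suppress the 0/0 warning.
with np.errstate(invalid="ignore", divide="ignore"):
    averaged = summed / n_valid[:, None]

# Samples without any valid tree are marked NaN explicitly, so a caller
# can decide how to handle them rather than receiving zeros.
averaged[n_valid == 0] = np.nan

print(summed)
print(n_valid)
print(averaged)
```

Whether an all-NaN sample should stay NaN, raise an error, or be imputed is a design decision for the estimator. The sketch only makes the case visible.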
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #291      +/-   ##
==========================================
- Coverage   78.55%   78.53%   -0.02%
==========================================
  Files          24       24
  Lines        2252     2250       -2
  Branches      414      413       -1
==========================================
- Hits         1769     1767       -2
  Misses        352      352
  Partials      131      131
Left a few comments.
Re CI failures:
FAILED sktree/stats/tests/test_forestht.py::test_small_dataset_dependent[0] - AssertionError
FAILED sktree/tests/test_extensions.py::test_predict_proba_per_tree[HonestForestClassifier-2] - AssertionError
FAILED sktree/tests/test_extensions.py::test_predict_proba_per_tree[HonestForestClassifier-3] - AssertionError
I would recommend:
- test_small_dataset_dependent: just add honest_prior='empirical' to make the test backwards compatible.
- test_predict_proba_per_tree: add an extra kwarg for HonestForestClassifier, honest_prior='empirical'. You can also add a case to test_predict_proba_per_tree for HonestForestClassifier with honest_prior='ignore'.
Co-authored-by: Adam Li <adam2392@gmail.com>
for more information, see https://pre-commit.ci
This reverts commit 3e2cda8.
@adam2392 do you think this is mergeable?
Thanks for the PR @PSSF23
Close #290
np.nan posteriors when honest_prior="ignore"