Fixes #1248 #1337

Merged
merged 2 commits into from
Oct 11, 2023

smastelini
Member

This was a rather cryptic bug that was really difficult to track down.

Thank you @FedericoMz for providing the reproducible example; I would never have found the culprit without it.

In the end, the problem was not in EFDT but in the splitter used for nominal attributes. It previously relied on collections.defaultdict objects in an attempt to simplify and speed up the code.

However, during prediction actions (such as predict_proba_one and debug_one), if the collections.defaultdict in the splitter was queried for a branch that had not been observed before, the missing entry would be inadvertently created.

This created a mismatch between the number of existing and tracked tree branches.
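The side effect described above can be reproduced with a minimal sketch in plain Python. It is not the splitter's actual code: the names `observed`, `lookup_by_index`, and `lookup_safely` are hypothetical, and the sketch assumes the lookup was done by indexing, since indexing a `collections.defaultdict` inserts a missing key while a `.get()` call or an `in` test does not.

```python
from collections import defaultdict


def lookup_by_index(stats, key):
    # Indexing a defaultdict calls the default factory for a missing key
    # and stores the result, so this "read" mutates the mapping.
    return stats[key]


def lookup_safely(stats, key):
    # .get() never triggers the default factory, so the mapping is unchanged.
    return stats.get(key, 0)


# Hypothetical per-category statistics, standing in for the splitter's state.
observed = defaultdict(int)
observed["red"] += 3
observed["blue"] += 1
size_before = len(observed)

# A read-only lookup of an unseen category leaves the mapping untouched.
lookup_safely(observed, "green")
size_after_safe = len(observed)

# The same lookup done by indexing silently adds the unseen category.
lookup_by_index(observed, "green")
size_after_index = len(observed)

print(size_before, size_after_safe, size_after_index)
```

Any code that counts branches from the length of such a mapping would disagree with a separately maintained branch count after the indexed lookup, which is the kind of mismatch described here.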

This bug could potentially affect other trees as well, but it was really hard to trigger.

@smastelini smastelini merged commit 23143c6 into main Oct 11, 2023
9 of 11 checks passed
@smastelini smastelini deleted the efdt-bug branch October 11, 2023 20:21