Initial implementation of the ordinal recoder. #11098
Open
This PR adds only the recoder; there is no integration yet. Retrieving and storing the data in XGBoost is more complicated than the recoder itself, so that work will be upstreamed in future PRs.

The recoder still uses some XGBoost utilities, such as the `Span` class and an iterator. If we want to extract it into a separate project, we can substitute different implementations of these utilities. Other things, like error handling and memory allocation, can be customized through the policy class.

The tests require a container class in XGBoost; we can merge the container into the encoder module if needed. At the moment, the encoder is view-only and doesn't own any memory. Larger and more sophisticated tests will follow once the Python and R integration is finished.
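
For readers unfamiliar with the idea, here is a minimal sketch of what an ordinal recoder with a pluggable error policy might look like. It is not the code in this PR: the names `ThrowPolicy`, `BuildMapping`, and `Recode` are hypothetical, and the sketch uses owning `std::vector` containers for brevity, whereas the encoder described above is view-only. It assumes the categories are strings and that negative codes mark missing values.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <string_view>
#include <unordered_map>
#include <vector>

// Hypothetical error policy: throws when the incoming data contains a
// category that the reference (training) dictionary does not know.
struct ThrowPolicy {
  static void UnknownCategory(std::string_view name) {
    throw std::invalid_argument("Unknown category: " + std::string{name});
  }
};

// Builds mapping[incoming_code] = reference_code by matching category names.
// `reference` must outlive this call because the lookup table only stores views.
template <typename Policy = ThrowPolicy>
std::vector<std::int32_t> BuildMapping(const std::vector<std::string>& reference,
                                       const std::vector<std::string>& incoming) {
  std::unordered_map<std::string_view, std::int32_t> index;
  for (std::size_t i = 0; i < reference.size(); ++i) {
    index.emplace(reference[i], static_cast<std::int32_t>(i));
  }
  std::vector<std::int32_t> mapping(incoming.size(), -1);
  for (std::size_t i = 0; i < incoming.size(); ++i) {
    auto it = index.find(incoming[i]);
    if (it == index.end()) {
      // Delegate the failure mode to the policy; a non-throwing policy
      // leaves the entry at -1.
      Policy::UnknownCategory(incoming[i]);
      continue;
    }
    mapping[i] = it->second;
  }
  return mapping;
}

// Rewrites ordinal codes in place. Negative codes (assumed to mean "missing")
// and codes outside the mapping are left untouched.
inline void Recode(const std::vector<std::int32_t>& mapping,
                   std::vector<std::int32_t>* codes) {
  for (auto& code : *codes) {
    if (code >= 0 && static_cast<std::size_t>(code) < mapping.size()) {
      code = mapping[static_cast<std::size_t>(code)];
    }
  }
}
```

Passing the policy as a template parameter keeps the core loop free of any particular error-handling or logging dependency, which is one way the extraction to a separate project mentioned above could be kept cheap.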