Neural Oblivious Decision Ensembles #151
Merged
This pull request introduces several significant changes to the mambular project, focusing on adding new modules, implementing utility functions, and simplifying the normalization layer selection in base models. The most important changes include the addition of the NODE model and the refactoring of normalization layer selection.

New Modules and Functions:
mambular/arch_utils/data_aware_initialization.py: Added the ModuleWithInit class, which provides a base class for PyTorch modules with data-aware initialization on the first batch.
mambular/arch_utils/layer_utils/sparsemax.py: Implemented the sparsemax function and its backward pass, providing a sparse alternative to softmax.
mambular/arch_utils/node_utils.py: Introduced the ODST class for differentiable decision tree models and the DenseBlock class for stacking layers of decision trees.
mambular/arch_utils/numpy_utils.py: Added the check_numpy function to ensure a tensor is converted to a NumPy array.

Refactoring:
mambular/base_models/ft_transformer.py: Simplified the normalization layer selection by using the get_normalization_layer function. [1] [2]
mambular/base_models/mlp.py: Refactored the normalization layer selection to use the get_normalization_layer function, reducing redundancy. [1] [2]
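
For reviewers unfamiliar with sparsemax, described in this PR as a sparse alternative to softmax, the forward pass can be sketched in plain Python. This is an illustrative, framework-free sketch of the usual sort-and-threshold formulation, not the code in mambular/arch_utils/layer_utils/sparsemax.py; the function name and the list-based signature are assumptions made for the example, and the backward pass is omitted.

```python
def sparsemax(scores):
    """Hypothetical helper: project a list of scores onto the probability simplex.

    Illustrative only; the PR's implementation operates on PyTorch tensors.
    """
    sorted_scores = sorted(scores, reverse=True)

    # Find the largest k such that 1 + k * z_(k) > sum of the top-k scores.
    running_sum = 0.0
    support_size, support_sum = 0, 0.0
    for k, value in enumerate(sorted_scores, start=1):
        running_sum += value
        if 1.0 + k * value > running_sum:
            support_size, support_sum = k, running_sum

    # Threshold shared by every entry that stays in the support.
    tau = (support_sum - 1.0) / support_size

    # Entries at or below the threshold become exactly zero.
    return [max(value - tau, 0.0) for value in scores]


if __name__ == "__main__":
    print(sparsemax([1.0, 2.0, 3.0]))
    print(sparsemax([0.1, 0.6]))
```

Unlike softmax, the output can contain exact zeros: for the scores [1.0, 2.0, 3.0] all of the probability mass lands on the largest entry, while closer scores such as [0.1, 0.6] share the mass between them.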