Include original mamba #145

Merged
merged 22 commits into from
Oct 24, 2024

@AnFreTh (Collaborator) commented on Oct 24, 2024

This pull request updates the Mambular package: it extends the README documentation, adds new features, and refactors code for better modularity and maintainability. The key changes are as follows:

Model Updates:

  • Added a new model, MambAttention, and included it in the list of available models in the README (README.md).
  • Provided additional installation instructions for using original mamba implementations and specified compatible torch and cuda versions (README.md).
  • Included Quantile as a new distribution class for quantile regression in the README (README.md).
  • Included the original mamba-ssm implementation and support for using both Mamba and Mamba2 in the Mambular and MambaTab models.
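
For readers unfamiliar with quantile regression, the objective usually minimized is the pinball (quantile) loss. The sketch below is a plain-Python illustration of that loss for a single quantile level. The function name and signature are hypothetical and are not taken from Mambular's Quantile class, whose internals are not described in this pull request.

```python
def pinball_loss(y_true, y_pred, quantile):
    """Mean pinball (quantile) loss for one quantile level in (0, 1).

    Illustrative only: this is the textbook loss, not Mambular's code.
    """
    if not 0.0 < quantile < 1.0:
        raise ValueError("quantile must lie strictly between 0 and 1")
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    total = 0.0
    for target, pred in zip(y_true, y_pred):
        error = target - pred
        # Under-prediction (error > 0) costs quantile * error;
        # over-prediction (error < 0) costs (1 - quantile) * |error|.
        total += max(quantile * error, (quantile - 1.0) * error)
    return total / len(y_true)


# With quantile=0.9, predicting too low is penalized more than too high.
too_low = pinball_loss([2.0], [1.0], 0.9)
too_high = pinball_loss([1.0], [2.0], 0.9)
```

The asymmetry is what pushes the fitted prediction toward the requested quantile rather than the mean.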

Feature Additions:

  • Introduced the get_normalization_layer function to dynamically select normalization layers based on the configuration (mambular/arch_utils/get_norm_fn.py).
  • Added a new _init_weights function for initializing weights in the mamba_utils module (mambular/arch_utils/mamba_utils/init_weights.py).

Code Refactoring:

  • Refactored the Mamba class to use a configuration object for initialization, improving readability and flexibility (mambular/arch_utils/mamba_utils/mamba_arch.py).
  • Updated the ResidualBlock and MambaBlock classes with detailed docstrings and additional parameters for clarity and functionality (mambular/arch_utils/mamba_utils/mamba_arch.py).
  • Conditionally incorporated the pscan function from the mambapy package in the MambaBlock class for the selective scan (mambular/arch_utils/mamba_utils/mamba_arch.py).
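
The refactoring items above (a configuration object for initialization, and conditional use of pscan) can be illustrated with a minimal standard-library sketch. All names here (`MambaSketchConfig`, `MambaBlockSketch`, `use_pscan`, `backend_available`) are hypothetical and chosen for illustration; they are assumptions, not the actual Mambular classes, fields, or flags.

```python
import importlib.util
from dataclasses import dataclass


def backend_available(package_name: str) -> bool:
    """Return True if an optional top-level package is installed."""
    return importlib.util.find_spec(package_name) is not None


@dataclass
class MambaSketchConfig:
    """Hypothetical config object; field names are assumptions."""
    d_model: int = 64
    n_layers: int = 4
    use_pscan: bool = False  # request the parallel scan if it is installed


class MambaBlockSketch:
    """Toy block that reads every setting from one config object."""

    def __init__(self, config: MambaSketchConfig):
        self.config = config
        # Use the optional parallel scan only when it is both requested
        # and importable; otherwise fall back to a sequential scan.
        if config.use_pscan and backend_available("mambapy"):
            self.scan_mode = "parallel"
        else:
            self.scan_mode = "sequential"


block = MambaBlockSketch(MambaSketchConfig(d_model=128))
```

Passing a single config object keeps constructor signatures stable when new options are added, and the availability check lets the optional dependency stay optional.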

These changes collectively improve the usability, documentation, and maintainability of the Mambular package.

@AnFreTh AnFreTh merged commit 95e87dd into develop Oct 24, 2024