MixMamba

This repository is the official implementation of the paper "MixMamba: Time Series Modeling with Adaptive Expertise".

Introduction

The heterogeneity and non-stationarity of time series data continue to challenge a single model's ability to capture complex temporal dynamics, especially in long-term forecasting. We therefore propose MixMamba, which:

  • Leverages the Mamba model as an expert within a mixture-of-experts (MoE) framework. This decomposes modeling into a pool of specialized experts, enabling the model to learn robust representations and capture the full spectrum of patterns present in time series data.
  • Introduces a dynamic gating network that adaptively allocates each data segment to the most suitable expert based on its characteristics, allowing the model to adjust to temporal changes in the underlying data distribution.
  • Incorporates a load-balancing loss to prevent bias toward a limited subset of experts. (A gating-and-balancing sketch follows this list.)
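
To make the last two points concrete, here is a minimal PyTorch sketch of top-k expert gating with an auxiliary load-balancing loss, in the spirit of standard mixture-of-experts routing (e.g., Switch Transformer). It is an illustration, not the code from this repository: plain MLP experts stand in for Mamba experts, and every name and size is an assumption.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    # Generic top-k mixture-of-experts layer with a load-balancing loss.
    # Hypothetical sketch: MLP experts stand in for Mamba experts.
    def __init__(self, d_model, n_experts=4, k=2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts))
        self.gate = nn.Linear(d_model, n_experts)  # dynamic gating network
        self.n_experts, self.k = n_experts, k

    def forward(self, x):                          # x: (batch, tokens, d_model)
        probs = self.gate(x).softmax(dim=-1)       # routing probabilities (B, T, E)
        topk_p, topk_i = probs.topk(self.k, dim=-1)
        topk_p = topk_p / topk_p.sum(-1, keepdim=True)   # renormalize over top-k
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            # per-token weight of expert e (zero where e is not in the top-k)
            w = (topk_p * (topk_i == e)).sum(-1, keepdim=True)
            out = out + w * expert(x)              # dense mixing, for clarity
        # Load-balancing term: pushes the fraction of tokens routed to each
        # expert toward a uniform split, so no expert is starved or overloaded.
        frac = F.one_hot(probs.argmax(-1), self.n_experts).float().mean(dim=(0, 1))
        aux = self.n_experts * (frac * probs.mean(dim=(0, 1))).sum()
        return out, aux

# The auxiliary loss is added to the task loss with a small coefficient:
layer = MoELayer(d_model=64)
y, aux = layer(torch.randn(8, 32, 64))
# total_loss = forecasting_loss + 0.01 * aux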

Schematic Architecture

MixMamba is a time series forecasting model built on a mixture-of-Mamba (MoM) approach. The model's architecture consists of four primary stages:

  • Pre-processing: Raw time series data is normalized and segmented into patches.
  • Embedding and Augmentation: Patches are embedded and augmented with positional information to provide context.
  • MoM Block: The central component, consisting of multiple Mamba experts coordinated by a gating network. Each Mamba expert employs a series of projections, convolutions, a selective SSM, and a skip connection to learn temporal dependencies.
  • Prediction Head: A linear prediction head generates the final outputs from the learned representations. (Minimal sketches of the pipeline and of an expert block follow this list.)
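
The four stages can be illustrated with a minimal, hypothetical PyTorch pipeline: instance normalization, segmentation into patches, patch embedding with learned positional information, an abstract mixing block where the MoM block would sit, and a linear prediction head. Dimensions and names are illustrative assumptions, not the repository's actual code.

import torch
import torch.nn as nn

class PatchPipeline(nn.Module):
    # Skeleton of the four stages for a single channel; the real MoM block
    # (Mamba experts + gating) would replace the nn.Identity placeholder.
    def __init__(self, seq_len=96, pred_len=192, patch_len=16, stride=8, d_model=128):
        super().__init__()
        n_patches = (seq_len - patch_len) // stride + 1
        self.patch_len, self.stride = patch_len, stride
        self.embed = nn.Linear(patch_len, d_model)                 # patch embedding
        self.pos = nn.Parameter(torch.zeros(n_patches, d_model))   # positional information
        self.mixer = nn.Identity()                                 # stand-in for the MoM block
        self.head = nn.Linear(n_patches * d_model, pred_len)       # linear prediction head

    def forward(self, x):                      # x: (batch, seq_len), one channel
        mean = x.mean(-1, keepdim=True)
        std = x.std(-1, keepdim=True) + 1e-5
        x = (x - mean) / std                   # normalization (pre-processing)
        patches = x.unfold(-1, self.patch_len, self.stride)   # (B, n_patches, patch_len)
        z = self.mixer(self.embed(patches) + self.pos)        # embed + position, then mix
        return self.head(z.flatten(1)) * std + mean           # forecast, de-normalized

model = PatchPipeline()
forecast = model(torch.randn(4, 96))   # (4, 192): predict 192 steps from 96 observed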

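A single expert can likewise be sketched along the lines described above: in-projection into value and gate branches, depthwise causal convolution, gating, out-projection, and a skip connection. The per-channel linear recurrence below is a toy stand-in for Mamba's selective SSM, which is input-dependent and more involved; treat this purely as a didactic approximation, not the repository's implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMambaExpert(nn.Module):
    # Simplified expert: in-projection into value/gate branches, depthwise causal
    # convolution, a toy per-channel linear recurrence in place of the selective
    # SSM, SiLU gating, out-projection, and a skip connection.
    def __init__(self, d_model=128, d_inner=256, conv_k=4):
        super().__init__()
        self.in_proj = nn.Linear(d_model, 2 * d_inner)
        self.conv = nn.Conv1d(d_inner, d_inner, conv_k,
                              padding=conv_k - 1, groups=d_inner)  # depthwise, causal
        self.decay = nn.Parameter(torch.zeros(d_inner))            # per-channel state decay
        self.out_proj = nn.Linear(d_inner, d_model)

    def forward(self, x):                                  # x: (batch, time, d_model)
        v, g = self.in_proj(x).chunk(2, dim=-1)            # value and gate branches
        v = self.conv(v.transpose(1, 2))[..., : x.size(1)].transpose(1, 2)
        a = torch.sigmoid(self.decay)                      # decay in (0, 1)
        h, ys = torch.zeros_like(v[:, 0]), []
        for t in range(v.size(1)):                         # h_t = a * h_{t-1} + v_t
            h = a * h + v[:, t]
            ys.append(h)
        y = torch.stack(ys, dim=1) * F.silu(g)             # gated output
        return x + self.out_proj(y)                        # skip connection

expert = ToyMambaExpert()
out = expert(torch.randn(2, 32, 128))   # same shape in and out: (2, 32, 128)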

Algorithms

Visualization

  • Long-term forecasting with 𝐿 = 96 and 𝑇 = 192 on ETTh1.
  • Long-term forecasting with 𝐿 = 96 and 𝑇 = 192 on Weather.
  • Short-term forecasting on M4 (Yearly).

Install

Please follow the steps below to prepare the environment on Linux.

  1. Clone this repository:
git clone https://github.com/KhaledAlkilane89/MixMamba.git
cd MixMamba
  2. Create the environment and install the dependencies:
conda create -n mixmamba python=3.10 -y
conda activate mixmamba
pip install -r requirements.txt
  3. Download the datasets from either Google Drive or Baidu Drive, then place the data in the ./dataset folder.

Usage

Train and evaluate the model using the scripts provided in the ./scripts/ directory. Refer to the following examples to reproduce the experimental results:

  • Long-term forecasting: bash ./scripts/long_term_forecast/ETT_script/mixmamba_ETTh1.sh
  • Short-term forecasting: bash ./scripts/short_term_forecast/mixmamba_M4.sh
  • Classification: bash ./scripts/classification/mixmamba.sh

Main Results

Multivariate long-term forecasting performance on various datasets.

Multivariate short-term forecasting.

Classification.

Model Analysis

  • MixMamba performance under varied look-back window lengths $L \in \{96, 192, 336, 720\}$ on the PEMS03 dataset ($T = 720$) (top left).
  • Comparison of memory usage (top) and computation time (bottom) on the ETTm2 dataset with batch size 32 (top right).
  • Comparison of the representations learned by different experts on the ETTm1 dataset with $L = 96$, $T = 720$ (bottom left).
  • Hyperparameter analysis on the Exchange and ILI datasets ($L = 96$, $T = 720$) (bottom right).

Citation

If you use this code or data in your research, please cite:

@article{ALKILANE2024102589,
  title   = {MixMamba: Time series modeling with adaptive expertise},
  author  = {Khaled Alkilane and Yihang He and Der-Horng Lee},
  journal = {Information Fusion},
  volume  = {112},
  pages   = {102589},
  year    = {2024},
  issn    = {1566-2535},
  doi     = {10.1016/j.inffus.2024.102589},
  url     = {https://www.sciencedirect.com/science/article/pii/S1566253524003671}
}

Contact Information

For inquiries or to discuss potential code usage, please reach out to the following researchers:

Acknowledgement

We'd like to express our gratitude to the following GitHub repositories for their exceptional codebases:
