wangck20/GlobalMamba

GlobalMamba: Global Image Serialization for Vision Mamba

This repository is the official implementation of GlobalMamba: Global Image Serialization for Vision Mamba

GlobalMamba: Global Image Serialization for Vision Mamba

Chengkun Wang, Wenzhao Zheng, Jie Zhou, Jiwen Lu

Motivation of GlobalMamba

[Figure: motivation of GlobalMamba, panels (a) to (d)]

Vim and VMamba adopt flattening strategies similar to (a) and (b), converting two-dimensional images into one-dimensional sequences by row or column, while LocalMamba (c) performs the corresponding flattening within local windows. Notably, these sequences lack the inherent causal ordering of tokens that Mamba's causal architecture assumes. In contrast, GlobalMamba (d) constructs a causal token sequence ordered by frequency, while ensuring that each token acquires global feature information.

Overall framework of GlobalMamba

[Figure: overall framework of GlobalMamba]

Environments of training

  • Python 3.10.13

    • conda create -n your_env_name python=3.10.13
  • torch 2.1.1 + cu118

    • pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118
  • Requirements: globalmamba_requirements.txt

    • pip install -r globalmamba/globalmamba_requirements.txt
  • Install causal_conv1d and mamba

    • pip install -e causal_conv1d>=1.1.0
    • pip install -e mamba-1p1p1
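After the steps above, it can help to confirm that the environment actually contains the pinned versions. The following is a minimal sketch using only the Python standard library. The helper names `installed_versions` and `mismatches` are hypothetical, and the distribution names `causal-conv1d` and `mamba-ssm` are assumptions about how the two local packages register themselves; adjust them if your install reports different names.

```python
from importlib import metadata

# Versions pinned in the setup steps above.
EXPECTED = {"torch": "2.1.1", "torchvision": "0.16.1", "torchaudio": "2.1.1"}
# Assumed distribution names for the locally installed packages.
EXTRA = ["causal-conv1d", "mamba-ssm"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    report = {}
    for name in names:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


def mismatches(report, expected):
    """Return {name: (found, wanted)} for packages that are missing or off-version."""
    bad = {}
    for name, wanted in expected.items():
        found = report.get(name)
        # CUDA wheels report versions such as "2.1.1+cu118", so compare by prefix.
        if found is None or not found.startswith(wanted):
            bad[name] = (found, wanted)
    return bad


if __name__ == "__main__":
    report = installed_versions(list(EXPECTED) + EXTRA)
    for name, version in report.items():
        print(f"{name}: {version or 'not installed'}")
    for name, (found, wanted) in mismatches(report, EXPECTED).items():
        print(f"warning: {name} is {found}, expected {wanted}")
```

The check reads package metadata only, so it runs even on a machine without a GPU and does not import torch.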

Train Your GlobalMamba

bash globalmamba/scripts/tiny.sh

bash globalmamba/scripts/small.sh

The scripts above train GlobalMamba on top of Vim. The frequency-based reorganization of the original token sequence is implemented in models_mamba.py, so to compare against other vision Mamba frameworks you only need to port that part.
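For readers porting the idea to another framework, here is a minimal NumPy sketch of frequency-ordered serialization. It is an illustration under stated assumptions, not the code in models_mamba.py: the function names `dct_matrix` and `frequency_band_images` are hypothetical, and so are the band rule (grouping DCT coefficients by normalized index sum) and the single-channel input. It takes a 2-D DCT, splits the coefficients into bands, and inverts each band back to the spatial domain, yielding images ordered from low to high frequency.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II matrix C: C @ x is the DCT of x, and C.T inverts it."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    i = np.arange(n)[None, :]  # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)  # the DC row uses a different scale factor
    return c


def frequency_band_images(image: np.ndarray, num_bands: int) -> list[np.ndarray]:
    """Split a 2-D image into `num_bands` spatial images, low to high frequency."""
    h, w = image.shape
    ch, cw = dct_matrix(h), dct_matrix(w)
    coeffs = ch @ image @ cw.T  # forward 2-D DCT

    # Assumed band rule: bucket each coefficient by its normalized index sum.
    u = np.arange(h)[:, None] / h
    v = np.arange(w)[None, :] / w
    radius = (u + v) / 2.0  # lies in [0, 1)
    band_index = np.minimum((radius * num_bands).astype(int), num_bands - 1)

    bands = []
    for b in range(num_bands):
        masked = np.where(band_index == b, coeffs, 0.0)  # keep one band only
        bands.append(ch.T @ masked @ cw)  # inverse 2-D DCT
    return bands


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((16, 16))
    bands = frequency_band_images(img, num_bands=4)
    # The bands partition the spectrum, so they sum back to the input.
    print(np.allclose(sum(bands), img))
```

Because every band image is produced by an inverse transform over the full spatial extent, each one carries information from the whole input, which matches the global-feature property described in the motivation. In a Vim-style model, each band image would presumably be patch-embedded and the resulting tokens concatenated in band order; that step is omitted here.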

Results

[Figure: results of GlobalMamba]

Acknowledgement

This project is based on Vision Mamba (code), Mamba (code), Causal-Conv1d (code), and DeiT (code). We thank the authors for their excellent work.

Citation

If you find this project helpful, please consider citing the following paper:

@article{wang2024globalmamba,
    title={GlobalMamba: Global Image Serialization for Vision Mamba},
    author={Chengkun Wang and Wenzhao Zheng and Jie Zhou and Jiwen Lu},
    journal={arXiv preprint arXiv:2410.10316},
    year={2024}
}
