MedCoSS

This is the official PyTorch implementation of our CVPR 2024 paper (Highlight) "Continual Self-supervised Learning: Towards Universal Multi-modal Medical Data Representation Learning".

MedCoSS illustration

Requirements

CUDA 11.5
Python 3.8
PyTorch 1.11.0
cuDNN 8.3.2.44

Data Preparation

Pre-processing

  • Pre-training data
    • Report: Follow MGCA's procedure to pre-process the MIMIC-CXR dataset.
    • X-ray: Use Preprocess/MIMIC_CXR_JPG_Preprocess.py to pre-process the MIMIC-CXR dataset.
    • CT:
      • Use Preprocess/DL_save_nifti.py (on the downloaded files) to convert the PNG images to nii.gz format.
      • Use Preprocess/re_spacing_ITK.py to resample the CT volumes.
      • Use Preprocess/splitting_to_patches.py to extract about 125k sub-volumes; the pre-processed data are saved in DL_patches_v2/.
      • Use Preprocess/DeepLesion_Resize.py to resize the images.
    • MRI:
      • Use Preprocess/ADNI_Resize.py to resize the images.
      • Use Preprocess/ADNI_split_slice.py to extract about 59k sub-volumes.
    • Pathological imaging: Use Preprocess/TCGA_Preprocess.py to pre-process the seven TCGA datasets.
  • Fine-tuning data
    • PubMed20k dataset: no pre-processing required.
    • ChestXR dataset: no pre-processing required.
    • QaTav2 dataset: Use Preprocess/QaTav2.py to pre-process.
    • RICORD dataset: Use Preprocess/RICORD.py to pre-process. The data splits are available in /Downstream/Dim_3/RICORD/data_split.
    • LiTS dataset:
      • (1) Resample all data to a common spacing of 1.5 mm × 0.8 mm × 0.8 mm;
      • (2) Pre-process with the nnU-Net v1 framework.
    • VS dataset:
    • LA dataset: no pre-processing required.
    • NCH dataset: no pre-processing required.
    • GlaS dataset: Use Preprocess/GlaS.py to pre-process.
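The LiTS resampling step above can be sketched in a few lines. This is a minimal illustration using `scipy.ndimage.zoom`, not the repository's actual pipeline (which relies on ITK/nnU-Net tooling); the function name and the `order=1` (trilinear) interpolation choice are assumptions, and only the 1.5 mm × 0.8 mm × 0.8 mm target spacing comes from the steps above:

```python
import numpy as np
from scipy.ndimage import zoom

def resample_to_spacing(volume, src_spacing, dst_spacing=(1.5, 0.8, 0.8), order=1):
    """Resample a (z, y, x) volume from src_spacing to dst_spacing, both in mm.

    The zoom factor per axis is src/dst: halving the spacing doubles
    the number of voxels along that axis.
    """
    factors = [s / d for s, d in zip(src_spacing, dst_spacing)]
    return zoom(volume, factors, order=order)

# Example: a small CT volume with 5.0 x 0.7 x 0.7 mm voxels
ct = np.random.rand(8, 64, 64).astype(np.float32)
resampled = resample_to_spacing(ct, src_spacing=(5.0, 0.7, 0.7))
print(resampled.shape)  # (27, 56, 56): z upsampled, in-plane slightly downsampled
```

For segmentation masks, nearest-neighbour interpolation (`order=0`) would be used instead, so label values are not blended at boundaries.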

Pre-training

Pre-trained Model

Fine-tuning

  • Run sh run_ds.sh for fine-tuning. (A single GPU with 11 GB memory is sufficient; before running, update the data and checkpoint paths in the script.)

To do

  • Dataset Links
  • Pre-processing Code
  • Pre-training Code Release
  • Pre-trained Model
  • Fine-tuning Code Release
  • Continual pre-training on new data

Citation

If this code is helpful for your research, please cite:

@inproceedings{ye2024medcoss,
  title={Continual Self-supervised Learning: Towards Universal Multi-modal Medical Data Representation Learning},
  author={Ye, Yiwen and Xie, Yutong and Zhang, Jianpeng and Chen, Ziyang and Wu, Qi and Xia, Yong},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={11114--11124},
  year={2024}
}

Acknowledgements

The whole framework is based on MAE, Uni-Perceiver, and MGCA.

Contact

Yiwen Ye (ywye@mail.nwpu.edu.cn)
