remove train.py
duducheng committed Aug 16, 2021
1 parent db36110 commit 5bb14cb
Showing 5 changed files with 479 additions and 409 deletions.
43 changes: 22 additions & 21 deletions README.md
@@ -21,50 +21,51 @@ Please note that this dataset is **NOT** intended for clinical use.
# Code Structure
* [`medmnist/`](medmnist/):
* [`dataset.py`](medmnist/dataset.py): PyTorch datasets and dataloaders of MedMNIST.
* [`models.py`](medmnist/models.py): *ResNet-18* and *ResNet-50* models.
* [`evaluator.py`](medmnist/evaluator.py): Standardized evaluation functions.
* [`info.py`](medmnist/info.py): Dataset information `dict` for each subset of MedMNIST.
* [`train.py`](train.py): The training and evaluation script to reproduce the baseline results in the paper.
* [`getting_started.ipynb`](getting_started.ipynb): Explore the MedMNIST dataset with jupyter notebook. It is **ONLY** intended for a quick exploration, i.e., it does not provide full training and evaluation functionalities (please refer to [`train.py`](train.py) instead).
* [`examples/`](examples/):
* [`getting_started.ipynb`](examples/getting_started.ipynb): Explore the MedMNIST dataset with a Jupyter notebook. It is **ONLY** intended for quick exploration, i.e., it does not provide full training and evaluation functionality. Please refer to our other repository [`MedMNIST/experiments`](https://github.com/MedMNIST/experiments) for all experiments, including PyTorch, auto-sklearn, AutoKeras and Google AutoML Vision, together with their weights!
* [`getting_started_without_PyTorch.ipynb`](examples/getting_started_without_PyTorch.ipynb): This notebook provides snippets about how to use MedMNIST data (the `.npz` files) without PyTorch.
* [`setup.py`](setup.py): The script to install medmnist as a module

# Requirements
The code requires only common Python environments for machine learning; basically, it was tested with
# Installation and Requirements
Set up the required environment and install `medmnist` as a standard Python package:

pip install git+https://github.com/MedMNIST/MedMNIST.git


The code requires only common Python environments for machine learning. Basically, it was tested with
* Python 3 (Anaconda 3.6.3 specifically)
* PyTorch\==0.3.1
* numpy\==1.18.5, pandas\==0.25.3, scikit-learn\==0.22.2, tqdm, Pillow
* numpy\==1.18.5, pandas\==0.25.3, scikit-learn\==0.22.2, Pillow\==8.0.1, fire

Higher (or lower) versions should also work (perhaps with minor modifications).

It is suggested to use our PyTorch [`dataset`](medmnist/dataset.py) code to parse the `.npz` files; however, you are free to parse them with your own code (without PyTorch or even without Python!), as they are only standard NumPy serialization files. Please refer to [`getting_started_without_PyTorch.ipynb`](examples/getting_started_without_PyTorch.ipynb), which provides snippets showing how to use MedMNIST data (the `.npz` files) without PyTorch.
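
As a rough illustration of why the `.npz` files can be read without PyTorch: a `.npz` file written by NumPy is a ZIP archive holding one `.npy` file per array. The sketch below writes a tiny placeholder archive (the file name `example.npz` and its all-zero contents are made up for this example, not real MedMNIST data) and lists its members with the standard `zipfile` module.

```python
import os
import tempfile
import zipfile

import numpy as np

# Write a tiny placeholder archive (hypothetical data, not a real MedMNIST subset).
tmp_dir = tempfile.mkdtemp()
npz_path = os.path.join(tmp_dir, "example.npz")
np.savez(
    npz_path,
    train_images=np.zeros((2, 28, 28), dtype=np.uint8),
    train_labels=np.zeros((2, 1), dtype=np.uint8),
)

# A .npz file is a ZIP archive: each array is stored as "<key>.npy".
with zipfile.ZipFile(npz_path) as archive:
    member_names = sorted(archive.namelist())

print(member_names)
```

Any tool that can unpack ZIP archives and decode the `.npy` format should therefore be able to read the data.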


# Dataset

Please download the dataset(s) via [`Zenodo`](https://doi.org/10.5281/zenodo.4269852). You could also use our code to download automatically.

It is suggested to use our [`dataset`](medmnist/dataset.py) code to parse the `.npz` files; however, you are free to parse them with your own code (including but not limited to Python), as they are only standard NumPy serialization files.

The dataset contains several subsets, and each subset (e.g., `pathmnist.npz`) is comprised of 6 keys: `train_images`, `train_labels`, `val_images`, `val_labels`, `test_images` and `test_labels`.
The MedMNIST dataset contains several subsets. Each subset (e.g., `pathmnist.npz`) is comprised of 6 keys: `train_images`, `train_labels`, `val_images`, `val_labels`, `test_images` and `test_labels`.
* `train_images` / `val_images` / `test_images`: `N` x 28 x 28 x 3 for RGB, `N` x 28 x 28 for gray-scale. `N` denotes the number of samples.
* `train_labels` / `val_labels` / `test_labels`: `N` x `L`. `N` denotes the number of samples. `L` denotes the number of task labels; for single-label (binary/multi-class) classification, `L=1`, and `{1,2,3,4,5,..,C}` denotes the category labels (`C=2` for binary); for multi-label classification `L!=1`, e.g., `L=14` for `chestmnist.npz`.
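
The key layout described above can be checked with plain NumPy. The following is a minimal sketch that builds a synthetic stand-in file with the six keys and the shapes listed in the README, then reads the shapes back. The helper names `make_fake_subset` and `describe_subset`, the file name, the sample counts, and the placeholder values are all assumptions made for illustration; to inspect a real subset, pass the path of a downloaded file such as `pathmnist.npz` to `describe_subset` instead.

```python
import os
import tempfile

import numpy as np


def make_fake_subset(path, n_train=8, n_val=2, n_test=4, rgb=True, num_label_cols=1):
    """Write a synthetic .npz with the six keys described in the README.

    Pixel values are random and labels are all zeros: placeholders only,
    meant to illustrate array shapes, not real MedMNIST content.
    """
    rng = np.random.default_rng(0)
    arrays = {}
    for split, n in (("train", n_train), ("val", n_val), ("test", n_test)):
        image_shape = (n, 28, 28, 3) if rgb else (n, 28, 28)
        arrays[f"{split}_images"] = rng.integers(0, 256, size=image_shape, dtype=np.uint8)
        arrays[f"{split}_labels"] = np.zeros((n, num_label_cols), dtype=np.uint8)
    np.savez(path, **arrays)


def describe_subset(path):
    """Return a {key: shape} mapping for every array stored in the .npz file."""
    with np.load(path) as data:
        return {key: data[key].shape for key in data.files}


tmp_dir = tempfile.mkdtemp()
fake_path = os.path.join(tmp_dir, "fake_subset.npz")
make_fake_subset(fake_path)

shapes = describe_subset(fake_path)
for key, shape in sorted(shapes.items()):
    print(key, shape)
```

For a multi-label subset the label arrays would have more than one column; calling `make_fake_subset(path, num_label_cols=14)` mimics the `L=14` case mentioned for `chestmnist.npz`.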

# How to run the experiments
# How to Use

* Download the dataset manually or automatically (by setting `download=True` in [`dataset.py`](medmnist/dataset.py)).
* List all available datasets

python -m medmnist list

* [recommended] Install medmnist as a module by using command `python setup.py install`
* Download the dataset manually or automatically (by setting `download=True` in [`dataset.py`](medmnist/dataset.py)).

* Run the demo code [`train.py`](./train.py) script in terminal.
* Please refer to our other repository [`MedMNIST/experiments`](https://github.com/MedMNIST/experiments) for all experiments, including PyTorch, auto-sklearn, AutoKeras and Google AutoML Vision, together with their weights!

First, change directory to where [`train.py`](./train.py) is located. Then run `python train.py --data_name xxxmnist --input_root input --output_root output --num_epoch 100 --download True`, where `xxxmnist` is a subset of MedMNIST (e.g., `pathmnist`), `input` is the path to the data files, `output` is the folder in which to save the results, `num_epoch` is the number of training epochs, and `download` is a boolean indicating whether to download the dataset.

For instance, to run PathMNIST

python train.py --data_name pathmnist --input_root <path/to/input/folder> --output_root <path/to/output/folder> --num_epoch 100 --download True

# Citation
If you find this project useful, please cite our paper as:

Jiancheng Yang, Rui Shi, Bingbing Ni. "MedMNIST Classification Decathlon: A Lightweight AutoML Benchmark for Medical Image Analysis," IEEE 18th International Symposium on Biomedical Imaging (ISBI), 2021.

or using bibtex:

File renamed without changes.