SSL-FL/data at main · rui-yan/SSL-FL

Data Preparation

In this paper, we conduct experiments on the Retina, Derm, and COVID-FL datasets. The data directory is organized as follows:

data
|-- Retina
    |-- central
    |-- 5_clients/
        |-- split_1/
            |-- client_1.csv
            |-- client_2.csv
            |-- client_3.csv
            |-- client_4.csv
            |-- client_5.csv
        |-- split_2
        |-- split_3
    |-- train
    |-- test
    |-- train.csv
    |-- test.csv
    |-- labels.csv
|-- COVID-FL
    |-- central
    |-- 12_clients
        |-- split_real
            |-- bimcv.csv  
            |-- cohen.csv  
            |-- eurorad.csv  
            |-- gz.csv  
            |-- ml-workgroup.csv  
            |-- ricord_c.csv  
            |-- rsna-0.csv  
            |-- rsna-1.csv  
            |-- rsna-2.csv  
            |-- rsna-3.csv  
            |-- rsna-4.csv  
            |-- sirm.csv
    |-- train
    |-- test
    |-- train.csv
    |-- test.csv
    |-- labels.csv
|-- tokenizer_weight
|-- ckpts

Each dataset folder contains an 'n_clients' subfolder (where n is the number of clients in the federated dataset, e.g. 5_clients for Retina and 12_clients for COVID-FL), which holds the data split information as .csv files. Each .csv file lists the filenames of the images belonging to one client in that data split.
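As a minimal sketch of how the per-client .csv files could be read, the helper below collects the filenames for every client in one split folder. The function name load_client_files is hypothetical and not part of this repository, and the sketch assumes that each row's first column is an image filename and that the files have no header row; adjust it if your .csv files differ.

```python
import csv
from pathlib import Path


def load_client_files(split_dir):
    """Map each client name (the .csv file stem) to its list of image filenames."""
    clients = {}
    for csv_path in sorted(Path(split_dir).glob("*.csv")):
        with open(csv_path, newline="") as f:
            # Assumption: the first column holds the filename and there is
            # no header row. Blank rows are skipped.
            clients[csv_path.stem] = [row[0] for row in csv.reader(f) if row]
    return clients
```

For example, load_client_files("data/Retina/5_clients/split_1") would return a dictionary with the keys client_1 through client_5.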

If you want to train the model using your own custom datasets, please ensure that your data is organized according to the directory structure mentioned above. Alternatively, you may modify the data loader in SSL-FL/code/util/data_utils.py. You can also customize the data augmentation techniques used during training by modifying SSL-FL/code/util/datasets.py.
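Before training on a custom dataset, it may help to check that the folder matches the layout above. The following is a small sketch, not a utility shipped with this repository: the name missing_entries and the list of expected entries are taken from the directory tree shown earlier and are otherwise assumptions.

```python
from pathlib import Path

# Entries that each dataset folder holds in the directory tree above.
EXPECTED_ENTRIES = ["central", "train", "test", "train.csv", "test.csv", "labels.csv"]


def missing_entries(dataset_dir, n_clients):
    """Return the expected entries that are absent from dataset_dir."""
    root = Path(dataset_dir)
    # The client folder name depends on the number of clients, e.g. 5_clients.
    expected = EXPECTED_ENTRIES + [f"{n_clients}_clients"]
    return [name for name in expected if not (root / name).exists()]
```

An empty list from missing_entries("data/Retina", 5) would indicate that all top-level entries are present; the sketch does not inspect the contents of the split folders.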

FL data construction

Here, data_split.py is used to simulate the IID and non-IID data partitions for the Retina and Derm datasets. We will provide more details about the construction of the COVID-FL dataset. You can visualize the generated data partitions in view_data_split.ipynb.
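To illustrate the idea of simulated IID and non-IID partitions, here is a sketch of one common approach: drawing per-class client proportions from a Dirichlet distribution. This is an assumption chosen for illustration and is not confirmed to be what data_split.py does; the function name dirichlet_partition and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def dirichlet_partition(labels, n_clients, alpha, seed=0):
    """Assign sample indices to clients, with label skew controlled by alpha."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    clients = [[] for _ in range(n_clients)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # A Dirichlet draw is a vector of Gamma(alpha, 1) samples
        # normalised to sum to 1.
        weights = [rng.gammavariate(alpha, 1.0) for _ in range(n_clients)]
        total = sum(weights) or 1.0  # guard against an all-zero draw
        start = 0
        cumulative = 0.0
        for c in range(n_clients):
            cumulative += weights[c] / total
            # The last client takes the remainder so every index is used once.
            if c == n_clients - 1:
                end = len(indices)
            else:
                end = round(cumulative * len(indices))
            clients[c].extend(indices[start:end])
            start = end
    return clients
```

In this sketch a large alpha yields class proportions that are close to uniform across clients (IID-like), while a small alpha concentrates each class on a few clients (non-IID).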

Download data used in the paper from Google Drive

Below are the download links for the Retina, COVID-FL, and Derm datasets.

	Retina	Derm	COVID-FL
Download Link	link	link	link

Use gdown to download data to your path (optional)

Step 1: pip install gdown

Step 2: gdown https://drive.google.com/uc?id=<the_file_id>, where <the_file_id> can be obtained from the download links above.
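The file ID can usually be read off the share link by hand. As a convenience, the sketch below extracts it and builds the gdown command for you. The helper names drive_file_id and gdown_command are hypothetical, and the sketch assumes the link uses one of two common Google Drive forms for single files (a /file/d/<id>/ path or an id= query parameter); folder links are not handled.

```python
import re
from urllib.parse import parse_qs, urlparse


def drive_file_id(share_url):
    """Return the file ID embedded in a Google Drive share link, or None."""
    parsed = urlparse(share_url)
    # Form 1: https://drive.google.com/file/d/<id>/view
    match = re.search(r"/file/d/([^/]+)", parsed.path)
    if match:
        return match.group(1)
    # Form 2: https://drive.google.com/open?id=<id> or .../uc?id=<id>
    ids = parse_qs(parsed.query).get("id")
    return ids[0] if ids else None


def gdown_command(share_url):
    """Build the gdown command line for a given share link."""
    file_id = drive_file_id(share_url)
    if file_id is None:
        raise ValueError(f"no file ID found in {share_url!r}")
    return f"gdown https://drive.google.com/uc?id={file_id}"
```

Calling gdown_command with a share link returns the command as a string, which you can then paste into a shell.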

Fed-BEiT and Fed-MAE pre-trained model checkpoints on target datasets:

To reproduce the results of Fed-BEiT and Fed-MAE as reported in the paper, you have two options.

First, you can perform pre-training using Fed-BEiT and Fed-MAE on the datasets, and then fine-tune the pre-trained models. Alternatively, you can fine-tune the pre-trained models with checkpoints provided below.

Federated pre-training with Retina

Method	Pre-training Data	Central	Split-1	Split-2	Split-3
Fed-BEiT	Retina	download	download	download	download
Fed-MAE	Retina	download	download	download	download

Federated pre-training with COVID-FL

Method	Pre-training Data	Central	Real-world Split
Fed-BEiT	COVID-FL	download	download
Fed-MAE	COVID-FL	download	download

Pre-trained model checkpoints on ImageNet

To obtain the results with models pre-trained on ImageNet, you can download the model checkpoints from supervised training, BEiT pre-training, and MAE pre-training. These checkpoints can be found on their official GitHub pages. We also provide the links below.

Model checkpoints from supervised training on ImageNet

Download the ViT-B/16 weights trained on ImageNet-21k (also known as ImageNet-22k):

wget https://storage.googleapis.com/vit_models/imagenet21k/ViT-B_16.npz

See more details in https://github.com/google-research/vision_transformer.

Model checkpoints from self-supervised pre-training on ImageNet

Download BEiT weights pre-trained on ImageNet-22k:

wget https://unilm.blob.core.windows.net/beit/beit_base_patch16_224_pt22k.pth

Download MAE weights pre-trained on ImageNet-1K:

wget https://dl.fbaipublicfiles.com/mae/pretrain/mae_pretrain_vit_base.pth

See more details in https://github.com/microsoft/unilm/tree/master/beit and https://github.com/facebookresearch/mae.
