This is an unofficial PyTorch implementation of Palette: Image-to-Image Diffusion Models. It is mainly inherited from its super-resolution counterpart, Image-Super-Resolution-via-Iterative-Refinement. The code template comes from another seed project of mine: distributed-pytorch-template.
Some implementation details, with reference to the paper descriptions:
- We adapted the U-Net architecture used in Guided-Diffusion, which gives a substantial boost to sample quality.
- We used the attention mechanism on low-resolution features (16×16), as in vanilla DDPM.
- We encode $\gamma$ rather than $t$, as in Palette, and embed it with an affine transformation.
- We fix the variance $\Sigma_\theta(x_t, t)$ to a constant during inference, as described in Palette.
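As a rough illustration of the $\gamma$ encoding mentioned above, a scalar noise level can be turned into a sinusoidal feature vector, and a scale and shift derived from it can then modulate a feature map. The sketch below uses plain Python lists instead of PyTorch tensors. The names `gamma_embedding` and `affine_modulate`, the `max_period` default, and the `(1 + scale) * h + shift` form are assumptions made for illustration, not the code of this repository; in a real network the scale and shift would come from a learned layer applied to the embedding.

```python
import math


def gamma_embedding(gamma, dim, max_period=10000.0):
    """Sinusoidal embedding of a scalar noise level (hypothetical helper)."""
    half = dim // 2
    # Geometrically spaced frequencies, starting at 1 and decaying toward 1/max_period.
    freqs = [math.exp(-math.log(max_period) * i / half) for i in range(half)]
    args = [gamma * f for f in freqs]
    emb = [math.cos(a) for a in args] + [math.sin(a) for a in args]
    if dim % 2 == 1:
        emb.append(0.0)  # pad with a zero when dim is odd
    return emb


def affine_modulate(features, scale, shift):
    """Feature-wise affine transformation: (1 + scale) * h + shift (assumed form)."""
    return [(1.0 + s) * h + b for h, s, b in zip(features, scale, shift)]


emb = gamma_embedding(0.0, 8)
modulated = affine_modulate([1.0, 2.0], scale=[0.0, 1.0], shift=[0.5, 0.0])
```

Conditioning on $\gamma$ instead of the integer step $t$ gives the network a continuous input that directly reflects how noisy the current sample is.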
- Diffusion Model Pipeline
- Train/Test Process
- Save/Load Training State
- Logger/Tensorboard
- Multiple GPU Training (DDP)
- EMA
- Metrics (now for FID, IS)
- Dataset (now for inpainting, uncropping, colorization)
I plan to finish the following tasks in order:
- Inpainting on CelebaHQ🚀 (available)
- Inpainting on Places2 with 128×128 centering mask🚀 (available)
- Uncropping on Places2🔥
- Colorization on ImageNet val set
Due to limited computational resources, we reduced the number of model parameters, and the model has not fully converged, which leaves a lot of room for optimization. Even so, the intermediate results already suggest the strong performance of this method.
Results after 200 epochs and 930K iterations; the first 100 samples with centering mask and irregular mask.
Results after 16 epochs and 660K iterations; several selected samples with centering mask.
Results after 8 epochs and 330K iterations; several selected samples for uncropping.
Tasks | Dataset | EMA | FID(-) | IS(+) |
---|---|---|---|---|
Inpainting with centering mask | Celeba-HQ | False | 5.7873 | 3.0705 |
Inpainting with irregular mask | Celeba-HQ | False | 5.4026 | 3.1221 |
Inpainting with centering mask | Places2 | False | | |
Uncropping | Places2 | True | | |
pip install -r requirement.txt
Dataset | Task | Iterations | URL |
---|---|---|---|
Celeba-HQ | Inpainting | 930K | Google Drive |
Places2 | Inpainting | 660K | Google Drive |
We obtained most of the datasets from Kaggle, so they may differ slightly from the official versions; you can also download them from the official websites.
We use the default division of these datasets for training and evaluation. The file lists we use can be found in Celeba-HQ, Places2.
After you have prepared your own data, modify the corresponding configuration file to point to it. Take the following as an example:
"which_dataset": { // import designated dataset using arguments
"name": ["data.dataset", "InpaintDataset"], // import Dataset() class
"args":{ // arguments to initialize dataset
"data_root": "your data path",
"data_len": -1,
"mask_mode": "hybrid"
}
},
More options for the dataloader and validation split can be found in the `datasets` part of the configuration file.
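The configuration fragment above contains `//` comments, which the standard `json` module does not accept, and its `name` entry is a module and class pair. A minimal sketch of how such a file could be read is shown here, assuming comments are stripped before parsing and the class is resolved with `importlib`. The helpers `strip_line_comments` and `resolve_class` are hypothetical, and `collections.OrderedDict` and the `data_root` value stand in for `data.dataset.InpaintDataset` and a real path so the example runs on its own.

```python
import importlib
import json


def strip_line_comments(text):
    """Remove // comments that appear outside of double-quoted strings."""
    cleaned = []
    for line in text.splitlines():
        in_string = False
        escaped = False
        cut = len(line)
        for i, ch in enumerate(line):
            if escaped:
                escaped = False          # this character was escaped, skip it
            elif ch == "\\":
                escaped = in_string      # backslashes only matter inside strings
            elif ch == '"':
                in_string = not in_string
            elif ch == "/" and not in_string and line[i:i + 2] == "//":
                cut = i                  # comment starts here
                break
        cleaned.append(line[:cut])
    return "\n".join(cleaned)


def resolve_class(name_pair):
    """Import the class named by a ["module.path", "ClassName"] pair."""
    module_name, class_name = name_pair
    return getattr(importlib.import_module(module_name), class_name)


raw = '''
{
    "which_dataset": { // import designated dataset using arguments
        "name": ["collections", "OrderedDict"], // stand-in for the dataset class
        "args": { // arguments to initialize dataset
            "data_root": "/data//celebahq",
            "data_len": -1,
            "mask_mode": "hybrid"
        }
    }
}
'''

config = json.loads(strip_line_comments(raw))
dataset_cls = resolve_class(config["which_dataset"]["name"])
dataset_args = config["which_dataset"]["args"]
```

Tracking whether the scanner is inside a string keeps a `//` in a path or URL from being mistaken for a comment.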
- Download the checkpoints from the given links.
- Set `resume_state` in the configuration file to the directory of the previous checkpoint. Take the following as an example; this directory contains the training states and the saved model:
"path": { //set every part file path
"resume_state": "experiments/inpainting_celebahq_220426_150122/checkpoint/100"
},
- Set your network label in the `load_everything` function of `model.py`; the default is Network. With the tutorial settings, the optimizers and models will be loaded from 100.state and 100_Network.pth respectively.
netG_label = self.netG.__class__.__name__
self.load_network(network=self.netG, network_label=netG_label, strict=False)
- Run the script:
python run.py -p train -c config/inpainting_celebahq.json
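The relationship between `resume_state`, the network label, and the two file names in the resume steps above can be sketched as follows. This assumes the file names are formed by appending a suffix to `resume_state`, which is what the example path and the names 100.state and 100_Network.pth suggest; `checkpoint_files` is a hypothetical helper, not a function of this repository.

```python
import os


def checkpoint_files(resume_state, network_label="Network"):
    """Return the (training state, model weights) paths for a checkpoint prefix."""
    state_path = f"{resume_state}.state"                 # optimizers and training state
    model_path = f"{resume_state}_{network_label}.pth"   # network weights
    return state_path, model_path


state_path, model_path = checkpoint_files(
    "experiments/inpainting_celebahq_220426_150122/checkpoint/100"
)
print(os.path.basename(state_path))  # 100.state
print(os.path.basename(model_path))  # 100_Network.pth
```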
We tested the U-Net backbones used in SR3 and Guided Diffusion; the Guided Diffusion one shows more robust performance in our current experiments. More options for the backbone, loss and scheduler can be found in the `which_networks` part of the configuration file.
- Modify the configuration file to point to your data, following the steps in the Data Prepare part.
- Set your model path, following the steps in the Resume Training part.
- Run the script:
python run.py -p test -c config/inpainting_celebahq.json
- Create two folders, one for ground truth images and one for sample images; the file names in the two folders need to correspond to each other.
- Run the script:
python eval.py -s [ground image path] -d [sample image path]
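Before running the evaluation, it may help to confirm that the two folders really do pair up. The sketch below assumes that "correspond" means identical file names in both folders, which may be stricter or looser than what `eval.py` actually requires; `unmatched_files` is a hypothetical helper.

```python
from pathlib import Path


def unmatched_files(gt_dir, sample_dir):
    """Return file names present in only one of the two folders."""
    gt_names = {p.name for p in Path(gt_dir).iterdir() if p.is_file()}
    sample_names = {p.name for p in Path(sample_dir).iterdir() if p.is_file()}
    only_gt = sorted(gt_names - sample_names)        # ground truth without a sample
    only_sample = sorted(sample_names - gt_names)    # sample without ground truth
    return only_gt, only_sample
```

If both returned lists are empty, every ground truth image has a sample with the same name and vice versa.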
Our work is based on the following theoretical works:
- Denoising Diffusion Probabilistic Models
- Palette: Image-to-Image Diffusion Models
- Diffusion Models Beat GANs on Image Synthesis
and we are benefiting a lot from the following projects: