StyleMC: Multi-Channel Based Fast Text-Guided Image Generation and Manipulation


Paper: https://arxiv.org/abs/2112.08493
Video: https://www.youtube.com/watch?v=ILm_5tvtzPI

Installation and Usage

The easiest way to get started with StyleMC is to run our Colab notebook or the Replicate demo.

Alternatively, you can run StyleMC locally as follows:

  • Install CLIP as a Python package following the official CLIP repository instructions (a reference snippet follows this list)
  • Clone this repo and install the dependencies
git clone [email protected]:catlab-team/stylemc.git
cd stylemc
pip install -r requirements.txt
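
For reference, installing CLIP typically comes down to the commands below, taken from the official CLIP README at the time of writing; install a matching PyTorch build first and check the CLIP repository in case the instructions have changed:

pip install ftfy regex tqdm
pip install git+https://github.com/openai/CLIP.git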

Finding a global manipulation direction with a target text prompt:
Running find_direction.py finds and saves a manipulation direction. You can then use the saved style direction to edit a randomly generated image or a real image.

python find_direction.py --text_prompt=[TEXT_PROMPT] --resolution=[RESOLUTION] --batch_size=[BATCH_SIZE] --identity_power=[ID_COEFF] --outdir=out --trunc=0.7 --seeds=1-129 --network=[PATH_TO_MODEL]

python find_direction.py --text_prompt="A man with mohawk hair" --resolution=256 --batch_size=4 --identity_power=high --outdir=out --trunc=0.7 --seeds=1-129 --network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/ffhq.pkl

(a) Manipulate a randomly generated image
Retrieve and save the style code of the image to be manipulated.

python generate_w.py --trunc=0.7 --seeds=8 --network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/ffhq.pkl
python w_s_converter.py --outdir=out --projected-w=encoder4editing/projected_w.npz --network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/ffhq.pkl
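
w_s_converter.py maps the intermediate latent w into style space: in StyleGAN2, each synthesis layer applies a learned affine transformation A(w) to produce that layer's style vector. A toy sketch of the mapping, with made-up layer dimensions (the real converter reads the learned affine layers from the network checkpoint):

import torch

# Toy stand-ins for the generator's learned per-layer affine maps A(w);
# the channel counts here are illustrative only.
affines = [torch.nn.Linear(512, ch) for ch in (512, 512, 256)]

w = torch.randn(512)              # one w latent, e.g. from generate_w.py
styles = [a(w) for a in affines]  # per-layer style vectors (S space)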

Perform the manipulation with the original text prompt and save the resulting image:

python generate_fromS.py --text_prompt="A man with mohawk hair" --change_power=50 --outdir=out --s_input=out/input.npz --network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/ffhq.pkl

(b) Manipulate a real image
Retrieve and save the style code of the image to be manipulated.

python encoder4editing/infer.py --input_image [PATH_TO_INPUT_IMAGE]
python w_s_converter.py --outdir=out --projected-w=encoder4editing/projected_w.npz --network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/ffhq.pkl

Perform the manipulation with the original text prompt and save the resulting image:

python generate_fromS.py --text_prompt="A man with mohawk hair" --change_power=50 --outdir=out --s_input=out/input.npz --network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/ffhq.pkl

Generating a video of the manipulation
Simply pass the --from_video flag to create a video of the manipulation steps.

python generate_fromS.py --from_video --text_prompt="A man with mohawk hair" --change_power=50 --outdir=out --s_input=out/input.npz --network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/ffhq.pkl
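
If you prefer still frames over the built-in video, one alternative is to sweep --change_power yourself with repeated invocations (this uses only the flags documented above; the per-strength output directories are our own convention):

for p in 10 20 30 40 50; do
  python generate_fromS.py --text_prompt="A man with mohawk hair" --change_power=$p --outdir=out_$p --s_input=out/input.npz --network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/ffhq.pkl
done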

Citation

This repo is built on the official StyleGAN2 repo; please refer to NVIDIA's repository for further details.

If you use this code for your research, please cite our paper:

@inproceedings{Kocasari2022StyleMCMB,
  title={StyleMC: Multi-Channel Based Fast Text-Guided Image Generation and Manipulation},
  author={Umut Kocasari and Alara Dirik and Mert Tiftikci and Pinar Yanardag},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
  year={2022},
  pages={3441--3450}
}
