Text-Image Alignment for Diffusion-based Perception (TADP) - CVPR 2024

damaggu/TADP

Text-Image Alignment for Diffusion-based Perception (TADP)


Official implementation of the paper Text-Image Alignment for Diffusion-based Perception (CVPR 2024).

Neehar Kondapaneni*, Markus Marks*, Manuel Knott*, Rogerio Guimaraes, Pietro Perona

[Figure: methods]

Setup

We provide two separate shell scripts for setting up the environment:

  • setup.sh for setting up the environment for Pascal VOC Semantic Segmentation and Watercolor2k and Comic2k Object Detection.
  • setup_mm.sh for setting up the environment for ADE20k Semantic Segmentation, NYUv2 Depth Estimation, Nighttime Driving, and Dark Zurich Semantic Segmentation (using MM libraries).

For example, to set up the first environment:

bash setup.sh
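
The mapping between tasks and setup scripts described above can be sketched as a small shell helper. This is a minimal sketch and not part of the repository: the function name tadp_setup_script and the short task keys (voc, ade20k, and so on) are hypothetical labels chosen for illustration. Only the two script names come from the list above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the TADP repository): print the setup
# script that matches a task, following the two bullets above.
# The task keys are illustrative labels, not names used by the repository.
tadp_setup_script() {
  case "$1" in
    voc|watercolor2k|comic2k)
      echo "setup.sh" ;;
    ade20k|nyuv2|nighttime_driving|dark_zurich)
      echo "setup_mm.sh" ;;
    *)
      echo "unknown task: $1" >&2
      return 1 ;;
  esac
}
```

With this helper defined, running bash "$(tadp_setup_script ade20k)" from the repository root would execute setup_mm.sh.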

Inference

There are two ways to run inference with our models:

Single image inference

We provide a simple interface to load our model checkpoints and run inference with custom image and text inputs. Please refer to the demo/ directory for examples.

export PYTHONPATH=$PYTHONPATH:$(pwd)
python demo/depth_inference.py
python demo/seg_inference.py
python demo/detection_inference.py
python demo/seg_inference_driving.py
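
The commands above appear to assume that they are run from the repository root, since PYTHONPATH is extended with the current directory and the script paths are relative. A minimal sketch of a wrapper that runs one demo by name follows. The function name tadp_demo, the short demo keys, and the DRY_RUN variable are hypothetical. Only the four script paths come from the list above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of the TADP repository) around the demo
# commands listed above. DRY_RUN is an illustrative variable: when set to 1,
# the command is printed instead of executed.
tadp_demo() {
  local script
  case "$1" in
    depth)       script="demo/depth_inference.py" ;;
    seg)         script="demo/seg_inference.py" ;;
    detection)   script="demo/detection_inference.py" ;;
    seg_driving) script="demo/seg_inference_driving.py" ;;
    *)
      echo "unknown demo: $1" >&2
      return 1 ;;
  esac
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "python $script"
  else
    # Same effect as the export above, but scoped to this one command.
    PYTHONPATH="${PYTHONPATH:+$PYTHONPATH:}$(pwd)" python "$script"
  fi
}
```

Setting DRY_RUN=1 before calling tadp_demo depth prints the command instead of running it, which is a quick way to check the mapping without loading a checkpoint.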

Whole dataset testing

To generate results for a whole dataset used in our study (e.g., ADE20k, NYUv2) with pre-generated captions, please refer to the test_tadp_mm.py and test_tadp_depth.py scripts.

Training

TODO

Experiments

All results reported in our paper can be reproduced with the scripts in the cvpr_experiments/ directory.

Acknowledgements

This code is based on VPD, diffusers, stable-diffusion, mmsegmentation, LAVT, and MIM-Depth-Estimation.

Citation

@inproceedings{kondapaneni2024tadp,
  title={Text-Image Alignment for Diffusion-Based Perception},
  author={Kondapaneni, Neehar and Marks, Markus and Knott, Manuel and Guimaraes, Rogerio and Perona, Pietro},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2024},
  month={June},
  pages={13883--13893}
}
