SeGAN: Segmenting and Generating the Invisible

This project was presented as a spotlight at CVPR 2018.

Abstract

Humans have a strong ability to infer the appearance of the invisible and occluded parts of scenes. For example, when we look at the scene on the left, we can predict what is behind the coffee table and can even complete the sofa, based on the visible parts of the sofa and the coffee table and on what we know in general about sofas, coffee tables, and how they occlude each other.

SeGAN can learn to:

Generate the appearance of the occluded parts of objects.
Segment the invisible parts of objects.
Reliably segment natural images, although it is trained on synthetic photorealistic images.
Infer depth layering by reasoning about occluder-occludee relations.

Citation

If you find this project useful in your research, please consider citing:

@inproceedings{ehsani2018segan,
  title={Segan: Segmenting and generating the invisible},
  author={Ehsani, Kiana and Mottaghi, Roozbeh and Farhadi, Ali},
  booktitle={CVPR},
  year={2018}
}

Prerequisites

Torch 7 and its dependencies, installed from this repository.
Linux OS
NVIDIA GPU + CUDA + CuDNN

Installation

Clone the repository using the command:

 git clone https://github.com/ehsanik/SeGAN
 cd SeGAN

Download the dataset from here and extract it.
Create a symbolic link to the dataset:
```
 ln -s /PATH/TO/DATASET dyce_data
```
Download the pretrained weights from here and extract them.
Create a symbolic link to the weights folder:
```
 ln -s /PATH/TO/WEIGHTS weights
```
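A quick sanity check can catch a mistyped path before a long training or test run. The sketch below assumes it is run from the repository root after both `ln -s` commands; `check_link` is a hypothetical helper and is not part of the SeGAN code.

```shell
#!/bin/sh
# check_link NAME: succeed only if NAME is a symbolic link
# that resolves to an existing directory.
# (Hypothetical helper for illustration; not shipped with SeGAN.)
check_link() {
    if [ -L "$1" ] && [ -d "$1" ]; then
        echo "ok: $1 -> $(readlink "$1")"
    else
        echo "missing or broken link: $1" >&2
        return 1
    fi
}

# Check the two links created in the installation steps.
for name in dyce_data weights; do
    check_link "$name" || true
done
```

A link whose target path does not exist is reported as broken, which is the usual symptom of a typo in `/PATH/TO/DATASET` or `/PATH/TO/WEIGHTS`.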

Dataset

We introduce DYCE, a synthetic dataset of occluded objects with photorealistic images and natural configurations of objects in scenes. All images in the dataset show indoor scenes and are obtained by taking snapshots of our 3D synthetic scenes. The annotations for each image contain the segmentation masks for the visible and invisible regions of objects.

Statistics

We use 11 synthetic scenes: 7 for training and validation and 4 for testing. Overall there are 5 living rooms and 6 kitchens, of which 2 living rooms and 2 kitchens are used for testing. On average, each scene contains 60 objects, and the number of visible objects per image is 17.5 (an object counts as visible if it has at least 10 visible pixels). The train and test scenes share no object instances.

The dataset can be downloaded from here.

Train

To train your own model:

th main.lua -baseLR 1e-3 -end2end -istrain "train"

See data_settings.lua for additional command-line options.

Test

To test using the pretrained model and reproduce the results in the paper:

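| Model | Segmentation: Visible ∪ Invisible | Segmentation: Visible | Segmentation: Invisible | Texture: L1 | Texture: L2 |
| --- | --- | --- | --- | --- | --- |
| Multipath | 47.51 | 48.58 | 6.01 | - | - |
| SeGAN (ours) w/ SV_predicted | 68.78 | 64.76 | 15.59 | 0.070 | 0.023 |
| SeGAN (ours) w/ SV_gt | 75.71 | 68.05 | 23.26 | 0.026 | 0.008 |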

th main.lua -weights_segmentation "weights/segment" -end2end -weights_texture "weights/texture" -istrain "test" -predictedSV

To test using the ground-truth visible mask as input instead of the predicted mask:

th main.lua -weights_segmentation "weights/segment_gt_sv" -end2end -weights_texture "weights/texture_gt_sv" -istrain "test"
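The two test configurations differ only in the weight folders and the `-predictedSV` flag. As a minimal sketch, the helper below prints the command for either setting without running it, so it can be inspected first. `segan_test_cmd` and its `predicted`/`gt` mode names are hypothetical; the flags and paths are copied verbatim from the commands above.

```shell
#!/bin/sh
# segan_test_cmd MODE: print (do not run) the test command for MODE.
#   predicted -> predicted visible mask as input (SV_predicted row)
#   gt        -> ground-truth visible mask as input (SV_gt row)
# (Hypothetical helper; flags are copied from the README commands.)
segan_test_cmd() {
    case "$1" in
        predicted)
            echo 'th main.lua -weights_segmentation "weights/segment" -end2end -weights_texture "weights/texture" -istrain "test" -predictedSV'
            ;;
        gt)
            echo 'th main.lua -weights_segmentation "weights/segment_gt_sv" -end2end -weights_texture "weights/texture_gt_sv" -istrain "test"'
            ;;
        *)
            echo "usage: segan_test_cmd predicted|gt" >&2
            return 2
            ;;
    esac
}

# Print both commands for inspection.
segan_test_cmd predicted
segan_test_cmd gt
```

To actually launch a run from the repository root, the printed command can be evaluated, for example `eval "$(segan_test_cmd gt)"`.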

Acknowledgments

The code for the GAN network borrows heavily from pix2pix.
