We explore the task of zero-shot semantic segmentation of 3D shapes by using large-scale off-the-shelf 2D image recognition models. Surprisingly, we find that modern zero-shot 2D object detectors are better suited for this task than contemporary text/image similarity predictors or even zero-shot 2D segmentation networks. Our key finding is that it is possible to extract accurate 3D segmentation maps from multi-view bounding box predictions by using the topological properties of the underlying surface. For this, we develop the Segmentation Assignment with Topological Reweighting (SATR) algorithm and evaluate it on ShapeNetPart and our proposed FAUST benchmarks. SATR achieves state-of-the-art performance and outperforms a baseline algorithm by 1.3% and 4% average mIoU on the FAUST coarse and fine-grained benchmarks, respectively, and by 5.2% average mIoU on the ShapeNetPart benchmark. Our source code and data will be publicly released. Project webpage: https://samir55.github.io/SATR/.
For additional detail, please see "SATR: Zero-Shot Semantic Segmentation of 3D Shapes" by Ahmed Abdelreheem, Ivan Skorokhodov, Maks Ovsjanikov, and Peter Wonka from KAUST and LIX, Ecole Polytechnique.
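To make the core idea concrete, here is a minimal sketch of how per-view bounding-box predictions could be aggregated into per-face labels. It is not the SATR algorithm: it uses plain score-weighted voting and omits the topological reweighting described in the paper. The function name `aggregate_box_votes`, the input layout (projected 2D face centers plus a visibility mask per view), and the `(label, score, box)` tuple format are all assumptions made for illustration.

```python
import numpy as np


def aggregate_box_votes(face_xy, visible, detections, num_labels):
    """Assign one label per face by score-weighted voting over views.

    face_xy:    (V, F, 2) projected 2D face centers for each of V views.
    visible:    (V, F) boolean mask, True where a face is visible in a view.
    detections: list of length V; each item is a list of
                (label, score, (x0, y0, x1, y1)) tuples for that view.
    Returns an (F,) integer array; -1 marks faces that received no votes.
    """
    num_views, num_faces, _ = face_xy.shape
    votes = np.zeros((num_faces, num_labels), dtype=np.float64)
    for v in range(num_views):
        x, y = face_xy[v, :, 0], face_xy[v, :, 1]
        for label, score, (x0, y0, x1, y1) in detections[v]:
            # A face votes for a label if it is visible and its projected
            # center falls inside the predicted box.
            inside = (x >= x0) & (x <= x1) & (y >= y0) & (y <= y1)
            votes[inside & visible[v], label] += score
    labels = votes.argmax(axis=1)
    labels[votes.sum(axis=1) == 0] = -1  # no box covered this face
    return labels


if __name__ == "__main__":
    # Toy input (hypothetical): one view, three faces, two part labels.
    face_xy = np.array([[[10, 10], [50, 50], [90, 90]]], dtype=float)
    visible = np.array([[True, True, False]])
    detections = [[(0, 0.9, (0, 0, 20, 20)), (1, 0.8, (40, 40, 100, 100))]]
    print(aggregate_box_votes(face_xy, visible, detections, num_labels=2))
```

In the toy input the third face lies inside the second box but is hidden in the only view, so it stays unlabeled (-1). A box also covers background and neighboring parts, which is one reason naive voting of this kind is only a starting point.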
conda create -n meshseg python=3.9
conda activate meshseg
conda install cudatoolkit=11.1 -c conda-forge
pip install torch==1.10.1+cu111 torchvision==0.11.2+cu111 torchaudio==0.10.1 -f https://download.pytorch.org/whl/cu111/torch_stable.html
First, try installing the pre-built Kaolin wheels found here. For example, the following command installs Kaolin 0.13.0 for CUDA 11.1 and PyTorch 1.10.0:
pip install kaolin==0.13.0 -f https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-1.10.0_cu111.html
If that does not work, build Kaolin from source using the steps below:
- Clone Kaolin in some directory outside the repo.
git clone --recursive https://github.com/NVIDIAGameWorks/kaolin
cd kaolin
- Then check out the release (optional) and build it.
git checkout v0.13.0 # optional
pip install -r tools/build_requirements.txt -r tools/viz_requirements.txt -r tools/requirements.txt
python setup.py develop
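When picking a pre-built Kaolin wheel, the CUDA tag in the index URL has to match the installed PyTorch build. The sketch below splits a PyTorch version string into its version and CUDA tag and rebuilds an index URL in the shape used by the pre-built wheel command. The helper names `parse_torch_build` and `kaolin_wheel_index` are hypothetical, and the assumption that other index pages follow the same `torch-<version>_<cuda>.html` pattern is unverified.

```python
KAOLIN_INDEX = "https://nvidia-kaolin.s3.us-east-2.amazonaws.com"


def parse_torch_build(version_string):
    """Split a string such as '1.10.1+cu111' into ('1.10.1', 'cu111').

    The CUDA tag is None when the string has no '+<tag>' suffix.
    """
    version, _, local = version_string.partition("+")
    return version, (local or None)


def kaolin_wheel_index(torch_version, cuda_tag):
    """Build an index URL following the pattern of the command above (assumed)."""
    return f"{KAOLIN_INDEX}/torch-{torch_version}_{cuda_tag}.html"


if __name__ == "__main__":
    # In a real environment the string would come from `torch.__version__`.
    version, cuda_tag = parse_torch_build("1.10.1+cu111")
    print(version, cuda_tag)
    print(kaolin_wheel_index("1.10.0", cuda_tag))
```

Note that the environment above installs PyTorch 1.10.1 while the wheel index is named for 1.10.0; if the pre-built wheel refuses to load, building from source as described is the fallback.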
git clone https://github.com/Samir55/SATR
cd SATR/
pip install -e .
cd GLIP/
python setup.py build develop --user
NOTE: Download the pretrained GLIP model from here, and put it in GLIP/MODEL/
- For FAUST, please download the FAUST benchmark dataset from this link and put it in data/FAUST.
- For the ShapeNetPart dataset, please download the labelled meshes from this link. We use the official test split provided here.
Please create a suitable config file to run on an input mesh (see the configs folder for examples). For instance, to run on a penguin example, use the following command from the repository root directory:
CUDA_VISIBLE_DEVICES=0 python scripts/single_dataset_example.py -cfg configs/demo/penguin.yaml -mesh_name penguin.obj -output_dir outputs/demo/penguin
To run on a single example (for instance, tr_scan_000) of the FAUST dataset with the coarse segmentation, use the following command:
CUDA_VISIBLE_DEVICES=0 python scripts/single_dataset_example.py -cfg configs/faust/coarse.yaml -mesh_name tr_scan_000.obj -output_dir path_to_output_dir
and for the fine-grained segmentation:
CUDA_VISIBLE_DEVICES=0 python scripts/single_dataset_example.py -cfg configs/faust/fine_grained.yaml -mesh_name tr_scan_000.obj -output_dir path_to_output_dir
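To cover the whole FAUST benchmark rather than a single scan, the same command can be generated once per mesh. The sketch below only builds the command lines and does not execute them. The function name `faust_commands` is hypothetical, and it assumes that the 100 scans are named `tr_scan_000.obj` through `tr_scan_099.obj` and that all runs can share one output directory; neither assumption is confirmed by the repository documentation.

```python
def faust_commands(config, output_dir, num_scans=100, gpu="0"):
    """Return (env, argv) pairs, one per scan, without executing anything."""
    commands = []
    for i in range(num_scans):
        mesh_name = f"tr_scan_{i:03d}.obj"  # assumed naming scheme
        argv = [
            "python", "scripts/single_dataset_example.py",
            "-cfg", config,
            "-mesh_name", mesh_name,
            "-output_dir", output_dir,
        ]
        commands.append(({"CUDA_VISIBLE_DEVICES": gpu}, argv))
    return commands


if __name__ == "__main__":
    cmds = faust_commands("configs/faust/coarse.yaml", "outputs/coarse_output_dir")
    env, argv = cmds[0]
    print(env, " ".join(argv))
```

Each pair can then be passed to `subprocess.run(argv, env={**os.environ, **env}, check=True)` from the repository root.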
For the ShapeNetPart models, please run scripts/single_dataset_example.py with the suitable config file for each category found in configs/shapenetpart.
Given an output dir (for example coarse_output_dir) containing the coarse or fine-grained predictions for the 100 scans, run the following:
python scripts/evaluate_faust.py -output_dir outputs/coarse_output_dir
or, for the fine-grained predictions:
python scripts/evaluate_faust.py --fine_grained -output_dir outputs/fine_grained_output_dir
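For reference, the mIoU metric reported by the evaluation can be sketched as follows for a single shape with per-face labels. This is a generic illustration and not the code in scripts/evaluate_faust.py: the function name `mean_iou` is hypothetical, and how the repository handles labels absent from a shape, or averages across shapes and categories, is not specified here.

```python
import numpy as np


def mean_iou(pred, gt, num_labels):
    """Mean intersection-over-union across labels for one shape.

    pred, gt: integer label per face, same length.
    Labels absent from both prediction and ground truth are skipped
    (an assumption of this sketch).
    """
    pred = np.asarray(pred)
    gt = np.asarray(gt)
    ious = []
    for label in range(num_labels):
        p, g = pred == label, gt == label
        union = np.logical_or(p, g).sum()
        if union == 0:
            continue  # label does not occur in either array
        ious.append(np.logical_and(p, g).sum() / union)
    return float(np.mean(ious)) if ious else float("nan")


if __name__ == "__main__":
    # Toy labels (hypothetical): four faces, two parts.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_labels=2))
```

In the toy case label 0 has IoU 1/2 and label 1 has IoU 2/3, so the mean is 7/12.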
This codebase builds on parts of the 3DHighlighter, GLIP HuggingFace demo, and Grounded-Segment-Anything repositories. Thanks to the authors for their awesome work!
If you find this work useful in your research, please consider citing:
@article{abdelreheem2023SATR,
author = {Abdelreheem, Ahmed and Skorokhodov, Ivan and Ovsjanikov, Maks and Wonka, Peter},
title = {SATR: Zero-Shot Semantic Segmentation of 3D Shapes},
journal = {Computing Research Repository (CoRR)},
volume = {abs/2304.04909},
year = {2023}
}