# Code of Conduct

Facebook has adopted a Code of Conduct that we expect project participants to adhere to.
Please read the [full text](https://code.fb.com/codeofconduct/)
so that you can understand what actions will and will not be tolerated.

# Contributing to votenet
We want to make contributing to this project as easy and transparent as
possible.

## Pull Requests
We actively welcome your pull requests.

1. Fork the repo and create your branch from `master`.
2. If you've added code that should be tested, add tests.
3. If you've changed APIs, update the documentation.
4. Ensure the test suite passes.
5. Make sure your code lints.
6. If you haven't already, complete the Contributor License Agreement ("CLA").

## Contributor License Agreement ("CLA")
In order to accept your pull request, we need you to submit a CLA. You only need
to do this once to work on any of Facebook's open source projects.

Complete your CLA here: <https://code.facebook.com/cla>

## Issues
We use GitHub issues to track public bugs. Please ensure your description is
clear and has sufficient instructions to be able to reproduce the issue.

Facebook has a [bounty program](https://www.facebook.com/whitehat/) for the safe
disclosure of security bugs. In those cases, please go through the process
outlined on that page and do not file a public issue.

## License
By contributing to votenet, you agree that your contributions will be licensed
under the LICENSE file in the root directory of this source tree.

MIT License

Copyright (c) Facebook, Inc. and its affiliates.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

# Deep Hough Voting for 3D Object Detection in Point Clouds
Created by <a href="http://charlesrqi.com" target="_blank">Charles R. Qi</a>, <a href="https://orlitany.github.io/" target="_blank">Or Litany</a>, <a href="http://kaiminghe.com/" target="_blank">Kaiming He</a> and <a href="https://geometry.stanford.edu/member/guibas/" target="_blank">Leonidas Guibas</a> from <a href="https://research.fb.com/category/facebook-ai-research/" target="_blank">Facebook AI Research</a> and <a href="http://www.stanford.edu" target="_blank">Stanford University</a>.

![teaser](https://github.com/facebookresearch/votenet/blob/master/doc/teaser.jpg)

## Introduction
This repository is the code release for our ICCV 2019 paper (arXiv report [here](https://arxiv.org/pdf/1904.09664.pdf)).

Current 3D object detection methods are heavily influenced by 2D detectors. To leverage architectures from 2D detectors, they often convert 3D point clouds to regular grids (i.e., to voxel grids or to bird’s eye view images), or rely on detection in 2D images to propose 3D boxes. Few works have attempted to directly detect objects in point clouds. In this work, we return to first principles to construct a 3D detection pipeline for point cloud data that is as generic as possible. However, due to the sparse nature of the data – samples from 2D manifolds in 3D space – we face a major challenge when directly predicting bounding box parameters from scene points: a 3D object centroid can be far from any surface point and is thus hard to regress accurately in one step. To address this challenge, we propose VoteNet, an end-to-end 3D object detection network based on a synergy of deep point set networks and Hough voting. Our model achieves state-of-the-art 3D detection on two large datasets of real 3D scans, ScanNet and SUN RGB-D, with a simple design, compact model size and high efficiency. Remarkably, VoteNet outperforms previous methods using purely geometric information, without relying on color images.

In this repository, we provide a VoteNet model implementation (in PyTorch) as well as data preparation, training and evaluation scripts for SUN RGB-D and ScanNet.

## Citation

If you find our work useful in your research, please consider citing:

    @inproceedings{qi2019deep,
      author = {Qi, Charles R and Litany, Or and He, Kaiming and Guibas, Leonidas J},
      title = {Deep Hough Voting for 3D Object Detection in Point Clouds},
      booktitle = {Proceedings of the IEEE International Conference on Computer Vision},
      year = {2019}
    }

## Installation

Install [PyTorch](https://pytorch.org/get-started/locally/) and [TensorFlow](https://github.com/tensorflow/tensorflow) (for TensorBoard). You will need access to GPUs. MATLAB is required to prepare data for SUN RGB-D. The code is tested with Ubuntu 18.04, PyTorch v1.1, TensorFlow v1.14, CUDA 10.0 and cuDNN v7.4.

Compile the CUDA layers for [PointNet++](http://arxiv.org/abs/1706.02413), which we use in the backbone network:

    cd pointnet2
    python setup.py install

To check that the compilation succeeded, run `python models/votenet.py` and verify that a forward pass works.

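If you prefer an explicit check from Python, the minimal smoke test below should exercise the compiled CUDA layers. It is a sketch based on the constructor arguments and the `(1, N, 4)` input layout (XYZ plus a height feature) used in `demo.py`, assuming you run it from the repository root with the SUN RGB-D dataset config available:

    import sys
    import torch

    # Make the model, utils and SUN RGB-D dataset config importable.
    for d in ['models', 'utils', 'sunrgbd']:
        sys.path.append(d)
    from votenet import VoteNet
    from sunrgbd_detection_dataset import DC  # dataset config (classes, mean sizes, ...)

    net = VoteNet(num_proposal=256, input_feature_dim=1, vote_factor=1,
                  sampling='seed_fps', num_class=DC.num_class,
                  num_heading_bin=DC.num_heading_bin,
                  num_size_cluster=DC.num_size_cluster,
                  mean_size_arr=DC.mean_size_arr).cuda().eval()

    # A random (1, 20000, 4) point cloud is enough to exercise the CUDA kernels.
    pc = torch.rand(1, 20000, 4).cuda()
    with torch.no_grad():
        end_points = net({'point_clouds': pc})
    print('Forward pass OK; output keys:', sorted(end_points.keys()))
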
Install the following Python dependencies (with `pip install`; note that the `cv2` module comes from the `opencv-python` package on PyPI):

    matplotlib
    opencv-python
    plyfile
    trimesh>=2.35.39,<2.35.40

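For example, as a single command (quoting the trimesh specifier so the shell does not interpret `>` and `<`):

    pip install matplotlib opencv-python plyfile 'trimesh>=2.35.39,<2.35.40'
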
## Run demo

You can download pre-trained models and sample point clouds [HERE](https://drive.google.com/file/d/1oem0w5y5pjo2whBhAqTtuaYuyBu1OG8l/view?usp=sharing).
Unzip the file under the project root path (`/path/to/project/demo_files`) and then run:

    python demo.py

The demo uses a pre-trained model (on SUN RGB-D) to detect objects in a point cloud of an indoor room containing a table and a few chairs (from the SUN RGB-D val set). You can use 3D visualization software such as [MeshLab](http://www.meshlab.net/) to open the dumped files under `demo_files/sunrgbd_results` (the folder `demo.py` writes to) to see the 3D detection output. Specifically, open `***_pc.ply` and `***_pred_confident_nms_bbox.ply` to see the input point cloud and the predicted 3D bounding boxes.

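If you prefer to inspect the dumped geometry programmatically, a short sketch with `trimesh` (already in the dependency list) could look like the following; the glob patterns assume the per-scene file naming and the `sunrgbd_results` dump folder described above:

    import glob
    import trimesh

    # Pick up the dumped input cloud and the confident, NMS-filtered boxes.
    pc_file = glob.glob('demo_files/sunrgbd_results/*_pc.ply')[0]
    bbox_file = glob.glob('demo_files/sunrgbd_results/*_pred_confident_nms_bbox.ply')[0]

    cloud = trimesh.load(pc_file)    # input scene as a point cloud
    boxes = trimesh.load(bbox_file)  # predicted boxes as mesh geometry
    print('Input points:', cloud.vertices.shape)
    print('Predicted box vertices:', boxes.vertices.shape)
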
You can also run the following command to use a pre-trained model on ScanNet:

    python demo.py --dataset scannet --num_point 40000

Detection results will be dumped to `demo_files/scannet_results`.

## Training and evaluating

### Data preparation

For SUN RGB-D, follow the [README](https://github.com/facebookresearch/votenet/blob/master/sunrgbd/README.md) under the `sunrgbd` folder.

For ScanNet, follow the [README](https://github.com/facebookresearch/votenet/blob/master/scannet/README.md) under the `scannet` folder.

### Train and test on SUN RGB-D

To train a new VoteNet model on SUN RGB-D data (depth images):

    CUDA_VISIBLE_DEVICES=0 python train.py --dataset sunrgbd --log_dir log_sunrgbd

You can use `CUDA_VISIBLE_DEVICES=0,1,2` to specify which GPU(s) to use. Without specifying CUDA devices, training will use all available GPUs with data parallelism (note that due to I/O load, the training speedup is not linear in the number of GPUs used). Run `python train.py -h` to see more training options.
While training, you can check the `log_sunrgbd/log_train.txt` file for progress, or use TensorBoard to see loss curves.

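For example, assuming TensorBoard was installed along with TensorFlow, the loss curves can be viewed with:

    tensorboard --logdir log_sunrgbd
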
To test the trained model with its checkpoint:

    python eval.py --dataset sunrgbd --checkpoint_path log_sunrgbd/checkpoint.tar --dump_dir eval_sunrgbd --cluster_sampling seed_fps --use_3d_nms --use_cls_nms --per_class_proposal

Example results will be dumped to the `eval_sunrgbd` folder (or any other folder you specify). You can run `python eval.py -h` to see the full options for evaluation. After the evaluation, you can use MeshLab to visualize the predicted votes and 3D bounding boxes (select wireframe mode to view the boxes).
Final evaluation results will be printed on screen and also written to the `log_eval.txt` file under the dump directory. By default we evaluate with both mAP@0.25 and mAP@0.5, using 3D IoU on oriented boxes.

### Train and test on ScanNet

To train a VoteNet model on ScanNet data (fused scans):

    CUDA_VISIBLE_DEVICES=0 python train.py --dataset scannet --log_dir log_scannet --num_point 40000

To test the trained model with its checkpoint:

    python eval.py --dataset scannet --checkpoint_path log_scannet/checkpoint.tar --dump_dir eval_scannet --num_point 40000 --cluster_sampling seed_fps --use_3d_nms --use_cls_nms --per_class_proposal

Example results will be dumped to the `eval_scannet` folder (or any other folder you specify). By default we evaluate with both mAP@0.25 and mAP@0.5, using 3D IoU on axis-aligned boxes.

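For reference, IoU between axis-aligned 3D boxes reduces to a per-axis overlap computation. The sketch below is a generic illustration of the metric (with boxes given as min/max corners), not the repository's evaluation code:

    import numpy as np

    def axis_aligned_iou_3d(box_a, box_b):
        """3D IoU of two boxes given as (xmin, ymin, zmin, xmax, ymax, zmax)."""
        # Intersection extent along each axis, clamped at zero.
        overlap = np.maximum(0.0, np.minimum(box_a[3:], box_b[3:]) - np.maximum(box_a[:3], box_b[:3]))
        inter = overlap.prod()
        vol_a = (box_a[3:] - box_a[:3]).prod()
        vol_b = (box_b[3:] - box_b[:3]).prod()
        return inter / (vol_a + vol_b - inter)

    # Two unit cubes offset by 0.5 along x: IoU = 0.5 / 1.5 = 1/3.
    print(axis_aligned_iou_3d(np.array([0, 0, 0, 1, 1, 1.0]),
                              np.array([0.5, 0, 0, 1.5, 1, 1.0])))
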
## Acknowledgements
We want to thank Erik Wijmans for his PointNet++ implementation in PyTorch ([original codebase](https://github.com/erikwijmans/Pointnet2_PyTorch)).

## License
votenet is released under the MIT License. See the [LICENSE file](https://github.com/facebookresearch/votenet/blob/master/LICENSE) for more details.

# Copyright (c) Facebook, Inc. and its affiliates.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

""" Demo of using VoteNet 3D object detector to detect objects from a point cloud.
"""

import os
import sys
import numpy as np
import argparse
import importlib
import time

parser = argparse.ArgumentParser()
parser.add_argument('--dataset', default='sunrgbd', help='Dataset: sunrgbd or scannet [default: sunrgbd]')
parser.add_argument('--num_point', type=int, default=20000, help='Point Number [default: 20000]')
FLAGS = parser.parse_args()

import torch
import torch.nn as nn
import torch.optim as optim

BASE_DIR = os.path.dirname(os.path.abspath(__file__))
ROOT_DIR = BASE_DIR
sys.path.append(os.path.join(ROOT_DIR, 'utils'))
sys.path.append(os.path.join(ROOT_DIR, 'models'))
from pc_util import random_sampling, read_ply
from ap_helper import parse_predictions

def preprocess_point_cloud(point_cloud):
    ''' Prepare the numpy point cloud (N,3) for forward pass '''
    point_cloud = point_cloud[:,0:3] # do not use color for now
    floor_height = np.percentile(point_cloud[:,2],0.99) # 0.99 percentile of z: a robust floor estimate
    height = point_cloud[:,2] - floor_height # height above the estimated floor
    point_cloud = np.concatenate([point_cloud, np.expand_dims(height, 1)],1) # (N,4): XYZ + height
    point_cloud = random_sampling(point_cloud, FLAGS.num_point)
    pc = np.expand_dims(point_cloud.astype(np.float32), 0) # (1,num_point,4)
    return pc

if __name__=='__main__':

    # Set file paths and dataset config
    demo_dir = os.path.join(BASE_DIR, 'demo_files')
    if FLAGS.dataset == 'sunrgbd':
        sys.path.append(os.path.join(ROOT_DIR, 'sunrgbd'))
        from sunrgbd_detection_dataset import DC # dataset config
        checkpoint_path = os.path.join(demo_dir, 'pretrained_votenet_on_sunrgbd.tar')
        pc_path = os.path.join(demo_dir, 'input_pc_sunrgbd.ply')
    elif FLAGS.dataset == 'scannet':
        sys.path.append(os.path.join(ROOT_DIR, 'scannet'))
        from scannet_detection_dataset import DC # dataset config
        checkpoint_path = os.path.join(demo_dir, 'pretrained_votenet_on_scannet.tar')
        pc_path = os.path.join(demo_dir, 'input_pc_scannet.ply')
    else:
        print('Unknown dataset %s. Exiting.'%(FLAGS.dataset))
        exit(-1)

    eval_config_dict = {'remove_empty_box': True, 'use_3d_nms': True, 'nms_iou': 0.25,
        'use_old_type_nms': False, 'cls_nms': False, 'per_class_proposal': False,
        'conf_thresh': 0.5, 'dataset_config': DC}

    # Init the model and optimizer
    MODEL = importlib.import_module('votenet') # import network module
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    net = MODEL.VoteNet(num_proposal=256, input_feature_dim=1, vote_factor=1,
        sampling='seed_fps', num_class=DC.num_class,
        num_heading_bin=DC.num_heading_bin,
        num_size_cluster=DC.num_size_cluster,
        mean_size_arr=DC.mean_size_arr).to(device)
    print('Constructed model.')

    # Load checkpoint
    optimizer = optim.Adam(net.parameters(), lr=0.001)
    checkpoint = torch.load(checkpoint_path)
    net.load_state_dict(checkpoint['model_state_dict'])
    optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
    epoch = checkpoint['epoch']
    print("Loaded checkpoint %s (epoch: %d)"%(checkpoint_path, epoch))

    # Load and preprocess input point cloud
    net.eval() # set model to eval mode (for bn and dp)
    point_cloud = read_ply(pc_path)
    pc = preprocess_point_cloud(point_cloud)
    print('Loaded point cloud data: %s'%(pc_path))

    # Model inference
    inputs = {'point_clouds': torch.from_numpy(pc).to(device)}
    tic = time.time()
    end_points = net(inputs)
    toc = time.time()
    print('Inference time: %f'%(toc-tic))
    end_points['point_clouds'] = inputs['point_clouds']
    pred_map_cls = parse_predictions(end_points, eval_config_dict) # per-scene list of (class, box, score) predictions
    print('Finished detection. %d objects detected.'%(len(pred_map_cls[0])))

    dump_dir = os.path.join(demo_dir, '%s_results'%(FLAGS.dataset))
    if not os.path.exists(dump_dir): os.mkdir(dump_dir)
    MODEL.dump_results(end_points, dump_dir, DC, True)
    print('Dumped detection results to folder %s'%(dump_dir))