92 changes: 73 additions & 19 deletions README.md
Website and video: <https://alexyu.net/plenoxels>

arXiv: <https://arxiv.org/abs/2112.05131>

[Featured at Two Minute Papers YouTube](https://youtu.be/yptwRRpPEBM) 2022-01-11

Despite the name, it's not strictly intended to be a successor of svox

Citation:
```
@inproceedings{yu2022plenoxels,
      title={Plenoxels: Radiance Fields without Neural Networks},
      author={Sara Fridovich-Keil and Alex Yu and Matthew Tancik and Qinhong Chen and Benjamin Recht and Angjoo Kanazawa},
      year={2022},
      booktitle={CVPR},
}
```
Note that the joint first-authors decided to swap the order of names between arXiv and CVPR proceedings.

This contains the official optimization code.
A JAX implementation is also available at <https://github.com/sarafridov/plenoxels>. Note, however, that the JAX version is currently feature-limited: it runs at about 1 hour per epoch and supports only bounded scenes.

![Overview](https://raw.githubusercontent.com/sxyu/svox2/master/github_img/pipeline.png)

### Example use cases

Check out PeRFCeption (Jeong, Shin, Lee, et al.), which uses Plenoxels with tuned parameters to generate a large
dataset of radiance fields:
https://github.com/POSTECH-CVLab/PeRFception

Artistic Radiance Fields by Kai Zhang et al.:
https://github.com/Kai-46/ARF-svox2

## Setup

**Windows is not officially supported, and we have only tested with Linux. Adding support would be welcome.**

First create the virtualenv; we recommend using conda:
```sh
conda env create -f environment.yml
conda activate plenoxel
```

Then clone the repo and install the library at the root (svox2), which includes a CUDA extension.

**If and only if** your CUDA toolkit is older than 11, you will need to install CUB as follows:
`conda install -c bottler nvidiacub`.
Since CUDA 11, CUB ships with the toolkit, and installing it separately may lead to build errors.

To install the main library, simply run
```
pip install -e . --verbose
```
in the repo root directory.

Note: this data should be identical to that in NeRF++.
Finally, the real Lego capture can be downloaded from:
https://drive.google.com/file/d/1PG-KllCv4vSRPO7n5lpBjyTjlUyT8Nag/view?usp=sharing

**Note: we currently do not support instant-ngp format data (the project was released before NGP). Using it will trigger the nerf-synthetic (Blender) data loader due to the similar format, but it will not train properly. For real data we use the NSVF format.**

To convert instant-ngp data, please try our script
```
cd opt/scripts
python ingp2nsvf.py <ingp_data_dir> <output_data_dir>
```
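
For reference, based on what the `ingp2nsvf.py` script included later in this diff writes, the converted dataset should look roughly like this (file names are illustrative):

```
<output_data_dir>/
    intrinsics.txt    # 4x4 intrinsics matrix built from fx, fy, cx, cy
    images/
        0_00000.png   # prefix 0_ = train, 1_ = val, 2_ = test
        ...
    pose/
        0_00000.txt   # 4x4 OpenCV camera-to-world pose (np.savetxt)
        ...
```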

## Optimization

For training a single scene, see `opt/opt.py`. The launch script makes this easier.

Usage,

By default this saves all frames, which is very slow. Add `--no_imsave` to avoid this.


## Rendering a spiral

Use `opt/render_imgs_circle.py`
Please take images all around the object and try to take images at different elevations.
First make sure you have colmap installed. Then

(in opt/scripts)
`bash proc_colmap.sh <img_dir> --noradial`

Where `<img_dir>` should be a directory directly containing png/jpg images from a
normal perspective camera.
UPDATE: `--noradial` is recommended, since otherwise the script performs undistortion, which does not seem to work well and makes results blurry.
Support for the complete OPENCV camera model, which has been used by more recent projects, would be welcome:
https://github.com/google-research/multinerf/blob/1c8b1c552133cdb2de1c1f3c871b2813f6662265/internal/camera_utils.py#L477
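
For context, the OPENCV camera model mentioned above applies radial and tangential distortion to normalized camera coordinates before the intrinsics are applied. A minimal sketch of the forward distortion (coefficient names follow the usual `k1, k2, p1, p2` convention; this is not code from this repo):

```python
def distort_opencv(x, y, k1=0.0, k2=0.0, p1=0.0, p2=0.0):
    # x, y: normalized image coordinates (X/Z, Y/Z in the camera frame)
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2 * r2
    # Radial term scales points outward/inward; tangential terms (p1, p2)
    # model a slightly tilted lens
    x_d = x * radial + 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
    y_d = y * radial + p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
    return x_d, y_d

# With all coefficients zero the mapping is the identity
print(distort_opencv(0.1, -0.2))  # → (0.1, -0.2)
```

Supporting this model would mean skipping the undistortion step entirely and accounting for the coefficients during ray generation instead.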
For custom datasets we adopt a data format similar to that in NSVF
<https://github.com/facebookresearch/NSVF>


You should be able to use this dataset directly afterwards. The format will be auto-detected.

To view the data (and check the scene normalization) use:
`python view_data.py <img_dir>`

You will need nerfvis: `pip install nerfvis`

This should launch a server at localhost:8889


Now follow the "Optimization" section above to train:

`./launch.sh <exp_name> <GPU_id> <data_dir> -c configs/custom.json`

custom.json was used for the real lego bulldozer scene.
You can also try `configs/custom_alt.json`, which has some minor differences; **in particular, near_clip is eliminated**. If the scene's central object is totally messed up, this may be due to the aggressive near clip, and the alt config fixes it.

You may need to tune the TV and sparsity loss for best results.


To render a video, please see the "rendering a spiral" section.
To convert to a svox1-compatible PlenOctree (quality will not be perfect, since interpolation is not implemented),
you can try `to_svox1.py <ckpt>`.


Example result with the mip-nerf-360 garden data (using the custom_alt config as provided):
![Garden](https://raw.githubusercontent.com/sxyu/svox2/master/github_img/garden.png)

Fox data (converted with the script `opt/scripts/ingp2nsvf.py`)
![Fox](https://raw.githubusercontent.com/sxyu/svox2/master/github_img/fox.png)

### Common Capture Tips

Floaters and poor-quality surfaces can be caused by the following:

- Dynamic objects. Dynamic object modelling is not supported in this repo; if anything moves, it will probably lead to floaters
- Specularity. Very shiny surfaces will lead to floaters and/or poor surfaces
- Exposure variations. Please lock the exposure when recording a video if possible
- Lighting variations. Sometimes the clouds move when capturing outdoors; try to capture within a short time frame
- Motion blur and DoF blur. Try to move slowly and make sure the object is in focus. For small objects, DoF tends to be a substantial issue
- Image quality. Images may have severe JPEG compression artifacts, for example

## Potential extensions

Due to limited time, we did not implement the following extensions, which should improve quality and speed.

- Use exp activation instead of ReLU. May help with the semi-transparent look issue
- Add mip-nerf 360 distortion loss to reduce floaters. PeRFCeption also tuned some parameters to help with the quality
- Exposure modelling
- Use FP16 training. This codebase still uses FP32; FP16 should improve speed and memory use
- Add a GUI viewer

## Random tip: how to make pip install faster for native extensions

You may notice that this CUDA extension takes forever to install.
One suggestion is to use ninja. On Ubuntu,
install it with `sudo apt install ninja-build`.
Then set the environment variable `MAX_JOBS` to the number of CPUs to use in parallel (e.g. 12) in your shell startup script.
This will enable parallel compilation and significantly improve iteration speed.
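
As a concrete sketch (the job count of 12 is just an example; pick roughly your CPU core count):

```shell
# One-time: install ninja (e.g. `sudo apt install ninja-build` on Ubuntu).
# Then, in your shell startup script (e.g. ~/.bashrc):
export MAX_JOBS=12   # number of parallel compile jobs for the CUDA extension build
echo "parallel compile jobs: $MAX_JOBS"
```

PyTorch's extension builder picks up ninja automatically when it is on the PATH.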
2 changes: 1 addition & 1 deletion environment.yml
dependencies:
  - moviepy
  - matplotlib
  - scipy>=1.6.0
  - pytorch=1.11.0
  - torchvision
  - cudatoolkit
  - tqdm
Binary file added github_img/fox.png
Binary file added github_img/garden.png
4 changes: 2 additions & 2 deletions opt/configs/custom_alt.json
    "background_nlayers": 64,
    "background_reso": 1024,
    "cam_scale_factor": 0.95,
    "upsamp_every": 38400,
    "lr_sigma": 3e1,
    "lr_sh": 1e-2,
    "lr_sigma_delay_steps": 35000,
    "lr_fg_begin_step": 50,
    "thresh_type": "weight",
    "weight_thresh": 1.28,
    "lambda_tv": 5e-5,
141 changes: 141 additions & 0 deletions opt/scripts/ingp2nsvf.py
"""
Convert NeRF-iNGP data to NSVF
python ingp2nsvf.py <ngp_data_dir> <our_data_dir>
"""
import os
import shutil
from glob import glob
import json

import numpy as np
from PIL import Image
import argparse

def convert(data_dir: str, out_data_dir: str):
    """
    Convert Instant-NGP (modified NeRF) data to NSVF

    :param data_dir: the dataset dir (NeRF-NGP format) to convert
    :param out_data_dir: output NSVF dataset directory
    """

    images_dir_name = os.path.join(out_data_dir, "images")
    pose_dir_name = os.path.join(out_data_dir, "pose")

    os.makedirs(images_dir_name, exist_ok=True)
    os.makedirs(pose_dir_name, exist_ok=True)

    def get_subdir(name):
        if name.endswith("_train.json"):
            return "train"
        elif name.endswith("_val.json"):
            return "val"
        elif name.endswith("_test.json"):
            return "test"
        return ""

    def get_out_prefix(name):
        if name.endswith("_train.json"):
            return "0_"
        elif name.endswith("_val.json"):
            return "1_"
        elif name.endswith("_test.json"):
            return "2_"
        return ""

    jsons = {
        x: (get_subdir(x), get_out_prefix(x))
        for x in glob(os.path.join(data_dir, "*.json"))
    }

    # OpenGL -> OpenCV
    cam_trans = np.diag(np.array([1.0, -1.0, -1.0, 1.0]))

    # fmt: off
    world_trans = np.array(
        [
            [0.0, -1.0, 0.0, 0.0],
            [0.0, 0.0, -1.0, 0.0],
            [1.0, 0.0, 0.0, 0.0],
            [0.0, 0.0, 0.0, 1.0],
        ]
    )
    # fmt: on

    assert len(jsons) > 0, f"No jsons found in {data_dir}, can't convert"
    cnt = 0

    example_fpath = None
    tj = {}
    for tj_path, (tj_subdir, tj_out_prefix) in jsons.items():
        with open(tj_path, "r") as f:
            tj = json.load(f)
        if "frames" not in tj:
            print(f"No frames in json {tj_path}, skipping")
            continue

        for frame in tj["frames"]:
            # Try direct relative path (used in newer NGP datasets)
            fpath = os.path.join(data_dir, frame["file_path"])
            if not os.path.isfile(fpath):
                # Legacy path (NeRF)
                fpath = os.path.join(
                    data_dir, tj_subdir, os.path.basename(frame["file_path"]) + ".png"
                )
            example_fpath = fpath
            if not os.path.isfile(fpath):
                print("Could not find image:", frame["file_path"], "(this may be ok)")
                continue

            ext = os.path.splitext(fpath)[1]

            c2w = np.array(frame["transform_matrix"])
            c2w = world_trans @ c2w @ cam_trans  # To OpenCV

            image_fname = tj_out_prefix + f"{cnt:05d}"

            pose_path = os.path.join(pose_dir_name, image_fname + ".txt")

            # Save 4x4 OpenCV C2W pose
            np.savetxt(pose_path, c2w)

            # Copy images
            new_fpath = os.path.join(images_dir_name, image_fname + ext)
            shutil.copyfile(fpath, new_fpath)
            cnt += 1

    assert len(tj) > 0, f"No valid jsons found in {data_dir}, can't convert"

    w = tj.get("w")
    h = tj.get("h")

    if w is None or h is None:
        assert example_fpath is not None
        # Size not available in the json, so load an image and get it
        w, h = Image.open(example_fpath).size

    fx = float(0.5 * w / np.tan(0.5 * tj["camera_angle_x"]))
    if "camera_angle_y" in tj:
        fy = float(0.5 * h / np.tan(0.5 * tj["camera_angle_y"]))
    else:
        fy = fx

    cx = tj.get("cx", w * 0.5)
    cy = tj.get("cy", h * 0.5)

    intrin_mtx = np.array([
        [fx, 0.0, cx, 0.0],
        [0.0, fy, cy, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ])
    # Write intrinsics
    np.savetxt(os.path.join(out_data_dir, "intrinsics.txt"), intrin_mtx)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("data_dir", type=str, help="NeRF-NGP data directory")
    parser.add_argument("out_data_dir", type=str, help="Output NSVF data directory")
    args = parser.parse_args()
    convert(args.data_dir, args.out_data_dir)
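
The intrinsics computation in `convert` above uses the standard pinhole relation `fx = 0.5 * w / tan(0.5 * fov_x)`. A small standalone check (the numbers are illustrative, not from any dataset):

```python
import numpy as np

def fov_to_focal(size_px: float, fov_rad: float) -> float:
    # Pinhole camera: half the image (size_px / 2) subtends half the field of view
    return 0.5 * size_px / np.tan(0.5 * fov_rad)

# An 800 px wide image with a 90 degree horizontal FOV:
# tan(45 deg) = 1, so fx should be ~400 px
fx = fov_to_focal(800.0, np.pi / 2)
print(round(fx, 6))
```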
5 changes: 4 additions & 1 deletion opt/scripts/run_colmap.py
def run_colmap(vid_root, args, factor, overwrite=False):
    os.system(mapper_cmd)

    if not args.noradial:
        print("Warning: I've found the undistorter to work very poorly, substantially reducing quality.")
        print("A potential (fairly easy) improvement is to support the OPENCV camera model in the codebase, "
              "and skip undistortion altogether.")
        undist_dir = os.path.join(vid_root, args.undistorted_output)
        if not os.path.exists(undist_dir) or overwrite:
            os.makedirs(undist_dir, exist_ok=True)
def preprocess(vid_root, args):
parser.add_argument('--known-intrin', action='store_true', default=False, help='use intrinsics in <root>/intrinsics.txt if available')
parser.add_argument('--fix-intrin', action='store_true', default=False, help='fix intrinsics in bundle adjustment, only used if --known-intrin is given and intrinsics.txt exists')
parser.add_argument('--debug', action='store_true', default=False, help='render debug video')
parser.add_argument('--noradial', action='store_true', default=True, help='do not use radial distortion')
parser.add_argument('--use-masks', action='store_true', default=False, help='use automatic masks')
parser.add_argument(
'--images-resized', default='images_resized', help='location for resized/renamed images')
2 changes: 1 addition & 1 deletion opt/scripts/view_data.py
def get_transform(c2w):
out_dir = path.join(args.data_dir, "visual")
scene.add_axes(length=1.0, visible=False)
scene.add_sphere("Unit Sphere", visible=False)
scene.add_wireframe_cube("Unit Cube", scale=2, visible=False)
print('WRITING', out_dir)
scene.display(out_dir, world_up=world_up, cam_origin=origin, cam_center=center, cam_forward=vforward)

4 changes: 4 additions & 0 deletions opt/util/nerf_dataset.py
class NeRFDataset(DatasetBase):
    """
    NeRF dataset loader

    WARNING: this is only intended for use with NeRF Blender data!!!!
    """

    focal: float
    def __init__(

        data_json = path.join(root, "transforms_" + split_name + ".json")

        print("LOAD DATA", data_path)
        print("WARNING: This data loader is ONLY intended for use with NeRF-synthetic Blender data!!!!")
        print("If you want to try running this code on Instant-NGP data please use scripts/ingp2nsvf.py")

        j = json.load(open(data_json, "r"))
