Vision-and-language navigation (VLN) requires an agent to navigate to a remote location in a 3D environment by following natural language instructions. At each navigation step, the agent selects from possible candidate locations and then makes the move. For better navigation planning, the lookahead exploration strategy aims to effectively evaluate the agent's next action by accurately anticipating the future environment of candidate locations. To this end, some existing works predict RGB images for future environments, but this strategy suffers from image distortion and high computational cost. To address these issues, we propose the pre-trained hierarchical neural radiance representation model (HNR), which produces multi-level semantic features for future environments; these features are more robust and efficient than pixel-wise RGB reconstruction. Furthermore, with the predicted future environmental representations, our lookahead VLN model is able to construct the navigable future path tree and select the optimal path branch via efficient parallel evaluation. Extensive experiments on the VLN-CE datasets confirm the effectiveness of our method.
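As a conceptual illustration of the parallel branch evaluation described above, the sketch below scores a set of candidate path branches against encoded instruction features and picks the best one. All names, shapes, and the scoring function here are hypothetical simplifications for illustration; they are not this repository's API.

```python
import torch
import torch.nn.functional as F

def score_branches(branch_features: torch.Tensor,
                   instruction_features: torch.Tensor) -> torch.Tensor:
    """Hypothetical branch scorer (not the repo's actual model).

    branch_features: (num_branches, dim) pooled features predicted for each
        candidate path branch by an HNR-like model.
    instruction_features: (dim,) encoded instruction.
    Returns one similarity score per branch, computed in parallel.
    """
    branch_features = F.normalize(branch_features, dim=-1)
    instruction_features = F.normalize(instruction_features, dim=-1)
    return branch_features @ instruction_features  # (num_branches,)

# Example: choose the best of 5 candidate branches with 512-d features.
scores = score_branches(torch.randn(5, 512), torch.randn(512))
best_branch = scores.argmax().item()
```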
- Release the pre-training code of the Hierarchical Neural Radiance Representation Model.
- Release the checkpoints of the Hierarchical Neural Radiance Representation Model.
- Tidy the pre-training code for easy execution.
- Release the fine-tuning code of the Lookahead VLN Model. See the improved model g3D-LF.
- Release the checkpoints of the Lookahead VLN Model. See the improved model g3D-LF.

For training speed, see Issue #7.

To load only a few scenes for efficient debugging, see Issue #4.
- Install the Habitat simulator: follow the instructions from ETPNav and VLN-CE.
- Download the Habitat-Matterport 3D Research Dataset (HM3D) from habitat-matterport-3dresearch (hm3d-train-habitat-v0.2.tar and hm3d-val-habitat-v0.2.tar).
- Download annotations (PointNav, VLN-CE) and trained models from Baidu Netdisk or TeraBox.
- Download the pre-trained waypoint predictor from link.
- Install torch_kdtree for K-nearest feature search from torch_kdtree:

  ```bash
  git clone https://github.com/thomgrand/torch_kdtree
  cd torch_kdtree
  git submodule init
  git submodule update
  pip3 install .
  ```

- Install tinycudann for faster multi-layer perceptrons (MLPs) from tiny-cuda-nn (a post-install sanity check is sketched after this list):

  ```bash
  pip3 install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
  ```
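As a quick sanity check that both extensions built correctly, the snippet below runs a K-nearest-neighbor query with torch_kdtree and instantiates a small fused MLP with tinycudann. The call signatures follow the upstream READMEs of those two projects at the time of writing; treat them as assumptions and consult the linked repositories if anything fails.

```python
import torch
import tinycudann as tcnn
from torch_kdtree import build_kd_tree

# torch_kdtree: GPU K-nearest-neighbor query (API per thomgrand/torch_kdtree README).
points = torch.randn(1000, 3, device="cuda")
queries = torch.randn(10, 3, device="cuda")
kdtree = build_kd_tree(points)
dists, inds = kdtree.query(queries, nr_nns_searches=8)  # 8 neighbors per query
print(dists.shape, inds.shape)  # (10, 8), (10, 8)

# tinycudann: a small fully fused MLP (API per NVlabs/tiny-cuda-nn torch bindings).
mlp = tcnn.Network(
    n_input_dims=3,
    n_output_dims=16,
    network_config={
        "otype": "FullyFusedMLP",
        "activation": "ReLU",
        "output_activation": "None",
        "n_neurons": 64,
        "n_hidden_layers": 2,
    },
)
out = mlp(queries)
print(out.shape)  # (10, 16)
```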
Pre-train the HNR model:

```bash
bash run_r2r/nerf.bash train 2345
```
Evaluate the cosine similarity between the HNR model's predicted features and the CLIP model's ground-truth features:

```bash
bash run_r2r/nerf.bash eval 2345
```
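For reference, the metric itself is straightforward: cosine similarity between each predicted feature and its CLIP target, averaged over the batch. The snippet below is a minimal, self-contained sketch of that computation, not the repository's evaluation code; the tensor shapes are illustrative.

```python
import torch
import torch.nn.functional as F

# Illustrative shapes: a batch of predicted region features and their CLIP targets.
predicted = torch.randn(32, 512)  # HNR-predicted features (hypothetical batch)
clip_gt = torch.randn(32, 512)    # CLIP-encoded ground-truth features

# Cosine similarity per feature pair, then averaged over the batch.
cos_sim = F.cosine_similarity(predicted, clip_gt, dim=-1)  # (32,)
print(f"mean cosine similarity: {cos_sim.mean().item():.4f}")
```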
Set Visualization to True in line 68 of HNR-VLN/NeRF/ss_trainer_ETP.py to visualize and save the images predicted by the HNR model.
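If you want to dump predicted views manually instead, a common pattern is to convert the predicted RGB tensor to a PIL image. The helper below is hypothetical and not part of this repository; the repo's own saving logic may differ.

```python
import numpy as np
import torch
from PIL import Image

def save_predicted_view(rgb: torch.Tensor, path: str) -> None:
    """Hypothetical helper: rgb is an (H, W, 3) float tensor in [0, 1]."""
    array = (rgb.clamp(0, 1).cpu().numpy() * 255).astype(np.uint8)
    Image.fromarray(array).save(path)

# Example with a random image in place of an HNR prediction.
save_predicted_view(torch.rand(224, 224, 3), "predicted_view.png")
```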
```bibtex
@InProceedings{Wang_lookahead,
    author    = {Wang, Zihan and Li, Xiangyang and Yang, Jiahao and Liu, Yeqi and Hu, Junjie and Jiang, Ming and Jiang, Shuqiang},
    title     = {Lookahead Exploration with Neural Radiance Representation for Continuous Vision-Language Navigation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {13753-13762}
}
```
Our code is based on ETPNav, nerf-pytorch, and torch_kdtree. Thanks for their great work!