Next Best View (NBV) algorithms aim to maximize 3D scene acquisition quality using minimal resources, e.g., the number of acquisitions, time taken, or distance traversed. Prior methods often rely on coverage maximization as a proxy for reconstruction quality, but for complex scenes with occlusions and finer details this proxy is not always sufficient and leads to poor reconstructions. Our key insight is to train an acquisition policy that directly optimizes for reconstruction quality rather than coverage alone. To achieve this, we introduce the View Introspection Network (VIN): a lightweight neural network that predicts the Relative Reconstruction Improvement (RRI) of a potential next viewpoint without making any new acquisitions. We use this network to power a simple yet effective sequential sampling-based greedy NBV policy. Our approach, VIN-NBV, generalizes to unseen object categories, operates without prior scene knowledge, adapts to resource constraints, and handles occlusions. We show that our RRI fitness criterion leads to a ~30% gain in reconstruction quality over a coverage-based criterion using the same greedy strategy. Furthermore, VIN-NBV also outperforms the deep reinforcement learning methods Scan-RL and GenNBV by ~40%.
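For intuition, one natural way to formalize RRI (our reading, not necessarily the paper's exact definition) is as the relative drop in a reconstruction error E, e.g., Chamfer distance against ground truth, that a candidate view v would bring given the current acquisition set V:

    \mathrm{RRI}(v \mid \mathcal{V}) \;=\; \frac{E(\mathcal{V}) - E(\mathcal{V} \cup \{v\})}{E(\mathcal{V})}

The VIN's role is to regress this quantity from the acquisitions made so far, so a candidate can be scored without actually being captured.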
Overview of the VIN-NBV policy and the VIN architecture. The VIN is trained to predict the reconstruction improvement of a query view given a set of prior acquisitions. The VIN-NBV policy uses the VIN to select the next best view to acquire. The design of our policy makes it easy to modify with custom termination criteria and decision-making logic.
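To make the policy structure concrete, here is a minimal Python sketch of a sampling-based greedy loop of this kind. All names (predict_rri, sample_candidate_views, should_stop, scene.acquire) are hypothetical placeholders for illustration, not the released implementation.

    # Hypothetical sketch of a sampling-based greedy NBV loop.
    # `vin` is assumed to be a trained network that scores a candidate
    # view's predicted reconstruction improvement (RRI) given the
    # acquisitions made so far; none of these names come from the paper's code.
    def vin_nbv_policy(vin, scene, sample_candidate_views, should_stop,
                       max_acquisitions=20):
        acquisitions = [scene.acquire(scene.initial_view())]
        while not should_stop(acquisitions) and len(acquisitions) < max_acquisitions:
            candidates = sample_candidate_views(acquisitions)
            # Score every candidate with the VIN; no new images are captured here.
            scores = [vin.predict_rri(acquisitions, v) for v in candidates]
            best = candidates[max(range(len(candidates)), key=scores.__getitem__)]
            # Only the highest-scoring candidate is actually acquired.
            acquisitions.append(scene.acquire(best))
        return acquisitions

Because termination is isolated in the should_stop callback, constraints such as a capture budget or a motion-time limit can be swapped in without changing the selection logic.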
We report the final average Chamfer distance of our method compared to prior works, evaluated on the OmniObject3D houses category over 20 captures, and plot how the average Chamfer distance evolves as more acquisitions are made. Our method outperforms all prior works, and its Chamfer distance continues to improve with each additional acquisition.
An interactive comparison of the final reconstruction after 10 total acquisitions using our method (VIN-NBV) and our coverage baseline (Cov-NBV). Click the different object names to visualize more objects.
We provide an interactive comparison of the final reconstruction under different time-in-motion limits, which force the robot to complete all acquisitions within a fixed motion budget of 15, 30, 45, or 60 seconds. We compare our method (VIN-NBV) with the coverage baseline (Cov-NBV) and show the final results.
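As an illustration of how such a motion budget could plug into the policy sketch above, a hypothetical should_stop criterion might look like the following (robot.time_in_motion is an assumed interface, not part of the paper's code):

    # Hypothetical time-in-motion termination criterion for the sketch above.
    def make_time_budget_stop(robot, budget_seconds):
        def should_stop(acquisitions):
            # Stop once the robot's accumulated motion time exhausts the budget
            # (assumes the robot tracks time spent moving between viewpoints).
            return robot.time_in_motion() >= budget_seconds
        return should_stop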
@misc{frahm2025vinnbvviewintrospectionnetwork,
  title={VIN-NBV: A View Introspection Network for Next-Best-View Selection for Resource-Efficient 3D Reconstruction},
  author={Noah Frahm and Dongxu Zhao and Andrea Dunn Beltran and Ron Alterovitz and Jan-Michael Frahm and Junier Oliva and Roni Sengupta},
  year={2025},
  eprint={2505.06219},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2505.06219},
}