Accurate reconstruction of indoor environments is crucial for applications in augmented reality, virtual reality, and robotics. However, existing indoor datasets are often limited in scale, lack ground-truth point clouds, and provide insufficient viewpoints, which impedes the development of robust novel view synthesis (NVS) techniques. To address these limitations, we introduce a new large-scale indoor dataset featuring diverse and challenging scenes, including basements and long corridors. The dataset offers panoramic image sequences for comprehensive coverage; high-resolution point clouds, meshes, and textures as ground truth; and a novel benchmark specifically designed to evaluate NVS algorithms in complex indoor environments. Our dataset and benchmark aim to advance indoor scene reconstruction and facilitate the creation of more effective NVS solutions for real-world applications.
The IVGM dataset encompasses a diverse array of environments, meticulously captured by our custom-designed data acquisition vehicle across three distinct scenes: two office floors from a school building and one underground garage. It provides:
Sequence Name | Area Size (m²) | Point Count | Insta Images | Titan Images |
---|---|---|---|---|
Office Area1 | 2,989.63 | 76,488,066 | 1,610 | 12,872 |
Office Area2 | 2,651.00 | 86,233,513 | 2,669 | 21,608 |
Underground Garage | 3,797.11 | 153,185,271 | 1,816 | 14,528 |
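One way to read the statistics above is as ground-truth point density per scene. The short sketch below derives points per square metre from the table; the figures are copied from the table, while the helper function and scene keys are ours for illustration.

```python
# Per-scene ground-truth point density, derived from the dataset statistics table.
# Area sizes (m^2) and point counts are taken verbatim from the table above.
scenes = {
    "Office Area1": (2_989.63, 76_488_066),
    "Office Area2": (2_651.00, 86_233_513),
    "Underground Garage": (3_797.11, 153_185_271),
}

def point_density(area_m2: float, n_points: int) -> float:
    """Ground-truth points per square metre of floor area."""
    return n_points / area_m2

for name, (area, points) in scenes.items():
    print(f"{name}: {point_density(area, points):,.0f} pts/m^2")
```

By this measure the underground garage is the densest scan (roughly 40k points/m²), followed by Office Area2 and Office Area1.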
To evaluate the applicability, versatility, and difficulty of our dataset for novel view synthesis, we benchmarked several popular NVS methods developed in recent years.
Key results show: