BlockGaussian: Efficient Large-Scale Scene Novel View Synthesis via Adaptive Block-Based Gaussian Splatting

Beihang University (BUAA)
Teaser
BlockGaussian reconstructs city-scale scenes from massive multi-view images and enables high-quality novel view synthesis from arbitrary viewpoints, as illustrated in the surrounding images. Compared to existing methods, our approach reduces reconstruction time from hours to minutes while achieving superior rendering quality in most scenes.

Abstract

The recent advancements in 3D Gaussian Splatting (3DGS) have demonstrated remarkable potential in novel view synthesis tasks. The divide-and-conquer paradigm has enabled large-scale scene reconstruction, but significant challenges remain in the scene partitioning, optimization, and merging processes. This paper introduces BlockGaussian, a novel framework incorporating a content-aware scene partition strategy and visibility-aware block optimization to achieve efficient and high-quality large-scale scene reconstruction. Specifically, our approach accounts for the variation in content complexity across different regions and balances the computational load during scene partitioning, enabling efficient scene reconstruction. To tackle the supervision mismatch issue during independent block optimization, we introduce auxiliary points during individual block optimization to align with the ground-truth supervision, which enhances the reconstruction quality. Furthermore, we propose a pseudo-view geometry constraint that effectively mitigates rendering degradation caused by airspace floaters during block merging. Extensive experiments on large-scale scenes demonstrate that our approach achieves state-of-the-art results in both reconstruction efficiency and rendering quality.

Challenges

Our goal is to address the existing challenges in the large-scale scene novel view synthesis task under the divide-and-conquer paradigm. a) Imbalanced reconstruction complexity across blocks: the density of content varies significantly across scene regions. Content-dense areas require finer subdivision granularity to ensure reconstruction fidelity, while content-sparse regions benefit from coarser partitioning to improve computational efficiency. b) Supervision mismatch in block-wise optimization: after scene partitioning, the content of a single training view may span multiple blocks. Due to visibility constraints, the full training image no longer provides ideal supervision when optimizing an individual block. c) Quality degradation in fusion results: floaters in the airspace are a major cause of quality degradation in the fused results. Since each block is optimized independently, these floaters fit the training views well but degrade the quality of synthesized novel views, especially in boundary regions.
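The supervision mismatch in (b) can be made concrete with a toy example: when part of a training view depicts content belonging to a neighboring block, a full-image photometric loss wrongly penalizes the current block for not reproducing that content. The NumPy sketch below is illustrative only; the visibility mask and `masked_photometric_loss` are hypothetical constructs for exposition, not the auxiliary-point mechanism the paper actually proposes.

```python
import numpy as np

def masked_photometric_loss(render, gt, visible_mask):
    """L1 loss restricted to pixels whose content lies inside the block.

    render, gt: (H, W) float images; visible_mask: (H, W) bool array,
    True where the pixel's content belongs to the current block.
    """
    diff = np.abs(render - gt)
    return diff[visible_mask].mean()

# Toy view: the right half of the image shows a neighboring block's content.
gt = np.zeros((4, 8))
gt[:, 4:] = 1.0                    # out-of-block content is bright
render = np.zeros((4, 8))          # this block correctly renders nothing there
mask = np.zeros((4, 8), dtype=bool)
mask[:, :4] = True                 # only the left half is in-block

full_loss = np.abs(render - gt).mean()                    # penalizes missing content
masked_loss = masked_photometric_loss(render, gt, mask)   # spurious penalty removed
```

Here the full-image loss is 0.5 even though the block's own content is rendered perfectly, while the masked loss is zero, which is exactly the mismatch that independent block optimization must resolve.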

Overview

Method overview
Overview of our proposed method. We first divide the entire scene and allocate viewpoints with Content-Aware Scene Partition, which jointly considers the complexity of the scene content and the distribution of computational load across blocks. We then optimize each block independently, either sequentially on a single GPU or in parallel across multiple GPUs. During block optimization, we introduce auxiliary point clouds (aux pts) to address the supervision mismatch issue, and apply the Pseudo-View Geometry Constraint to supervise airspace regions and mitigate floater artifacts. Finally, the optimized results from all blocks are merged into a comprehensive Gaussian representation of the entire scene, enabling interactive novel view synthesis.
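The intuition behind content-aware partitioning can be sketched as a density-adaptive recursive split: regions with many sparse reconstruction points (a proxy for content complexity) are subdivided further, while empty regions stay coarse, so each block carries a comparable workload. The following minimal 2D sketch is a simplification under assumed inputs; `partition`, `max_pts`, and the longest-axis split criterion are illustrative choices, not the paper's exact algorithm.

```python
import numpy as np

def partition(points, bounds, max_pts):
    """Recursively split a 2D region until each block holds <= max_pts points.

    points: (N, 2) array of sparse scene points (content-density proxy);
    bounds: (xmin, ymin, xmax, ymax), half-open in both axes.
    Dense regions receive finer blocks; sparse regions remain coarse.
    """
    xmin, ymin, xmax, ymax = bounds
    inside = points[
        (points[:, 0] >= xmin) & (points[:, 0] < xmax)
        & (points[:, 1] >= ymin) & (points[:, 1] < ymax)
    ]
    if len(inside) <= max_pts:
        return [bounds]
    # Split along the longer axis to keep blocks roughly square.
    if xmax - xmin >= ymax - ymin:
        xm = 0.5 * (xmin + xmax)
        return (partition(inside, (xmin, ymin, xm, ymax), max_pts)
                + partition(inside, (xm, ymin, xmax, ymax), max_pts))
    ym = 0.5 * (ymin + ymax)
    return (partition(inside, (xmin, ymin, xmax, ym), max_pts)
            + partition(inside, (xmin, ym, xmax, ymax), max_pts))

rng = np.random.default_rng(0)
dense = rng.uniform(0.0, 0.25, size=(400, 2))   # a content-dense corner
sparse = rng.uniform(0.25, 1.0, size=(40, 2))   # mostly empty airspace
pts = np.vstack([dense, sparse])
blocks = partition(pts, (0.0, 0.0, 1.0, 1.0), max_pts=100)
```

Running this yields many small blocks around the dense corner and a few large ones elsewhere, mirroring how variable subdivision granularity balances per-block optimization cost.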

Comparison With SOTA

Quality comparison with other methods on the Mill19 and UrbanScene3D datasets
Quantitative comparison of novel view synthesis results on Mill19 and UrbanScene3D datasets. The best, the second best, and the third best results are highlighted in red, orange and yellow.
Quantitative comparison of novel view synthesis results on Mill19 and UrbanScene3D datasets. We present the optimization time OptTime (hh:mm), the number of final points ($10^6$) and the allocated VRAM (GB) during evaluation.

Rendering Comparison

Dynamic Comparison

BibTeX

@misc{wu2025blockgaussianefficientlargescalescene,
      title={BlockGaussian: Efficient Large-Scale Scene Novel View Synthesis via Adaptive Block-Based Gaussian Splatting}, 
      author={Yongchang Wu and Zipeng Qi and Zhenwei Shi and Zhengxia Zou},
      year={2025},
      eprint={2504.09048},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2504.09048}, 
}