Zhiwen Yang and Yuxin Peng, “GaLa-2.5D: global-local alignment with 2.5D semantic guidance for camera-based 3D semantic scene completion in autonomous driving,” Chinese Journal of Electronics, vol. x, no. x, pp. 1–12, xxxx. DOI: 10.23919/cje.2025.00.297

GaLa-2.5D: Global-Local Alignment with 2.5D Semantic Guidance for Camera-based 3D Semantic Scene Completion in Autonomous Driving

Camera-based 3D Semantic Scene Completion (SSC) offers a cost-effective way to infer the semantic occupancy and instance geometry of surrounding scenes from image input. Current SSC methods predominantly rely on intricate voxel-based 3D models to refine the projected voxel features, but overlook the inconsistency of fine-grained semantics that arises during view transformation. To address this issue, we introduce the Global-Local Alignment with 2.5D Semantic Guidance (GaLa-2.5D) framework, which keeps fine-grained semantics consistent through a 2.5D semantic bank that provides global and local guidance for aligned view transformation and precise scene completion. Specifically, to preserve fine-grained semantics through view transformation, we start with a Hybrid Semantic Fusion module that maintains a dynamically updated 2.5D semantic bank, querying and clustering aligned semantics from 2D and 3D features. Then, we design a Global Proposal Alignment module that dynamically filters seed voxels under the global guidance of the 2.5D semantics, assigning fine-grained semantics throughout the scene. Finally, we propose a Local Context Alignment module that aligns contextual geometric structures and semantic correlations under the local guidance of the 2.5D semantics, reducing ambiguity in scene completion. Extensive experiments and analyses on the SemanticKITTI and SSCBench-KITTI-360 datasets show that GaLa-2.5D outperforms existing state-of-the-art methods. The code is available at https://github.com/PKU-ICST-MIPL/CJE_GaLa-2.5D.
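The abstract does not specify how the semantic bank is updated or how seed voxels are filtered. As a rough intuition only, the sketch below assumes a bank of per-class prototype vectors updated by an exponential moving average, and a filter that keeps voxels whose cosine similarity to some prototype exceeds a threshold. The names `SemanticBank`, `select_seed_voxels`, `momentum`, and `threshold` are hypothetical and are not taken from the paper or its repository.

```python
import numpy as np


class SemanticBank:
    """Hypothetical per-class prototype bank (an assumption, not the paper's design)."""

    def __init__(self, num_classes, dim, momentum=0.9):
        self.prototypes = np.zeros((num_classes, dim), dtype=np.float64)
        self.initialized = np.zeros(num_classes, dtype=bool)
        self.momentum = momentum

    def update(self, features, labels):
        # features: (N, dim) array; labels: (N,) integer class ids.
        for c in np.unique(labels):
            mean_c = features[labels == c].mean(axis=0)
            if self.initialized[c]:
                # Exponential moving average of the class prototype.
                self.prototypes[c] = (
                    self.momentum * self.prototypes[c]
                    + (1.0 - self.momentum) * mean_c
                )
            else:
                # First observation of this class: copy the mean directly.
                self.prototypes[c] = mean_c
                self.initialized[c] = True

    def query(self, features, eps=1e-8):
        # Cosine similarity of every feature to every prototype: (N, num_classes).
        f = features / (np.linalg.norm(features, axis=1, keepdims=True) + eps)
        p = self.prototypes / (
            np.linalg.norm(self.prototypes, axis=1, keepdims=True) + eps
        )
        return f @ p.T


def select_seed_voxels(bank, voxel_features, threshold=0.5):
    """Keep voxels whose best prototype similarity reaches the threshold.

    Returns the indices of the kept voxels and their assigned class ids.
    """
    sim = bank.query(voxel_features)
    keep = sim.max(axis=1) >= threshold
    return np.flatnonzero(keep), sim.argmax(axis=1)[keep]
```

In this toy version the bank is fed labeled features, whereas the abstract describes querying and clustering aligned semantics from 2D and 3D features; the sketch only illustrates the general idea of a shared prototype store guiding voxel selection.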
