METHODS article

Front. Comput. Sci.
Sec. Computer Vision
Volume 6 - 2024 | doi: 10.3389/fcomp.2024.1382080

Monocular 3D Object Detection for Occluded Targets Based on Spatial Relationships and Decoupled Depth Predictions

Provisionally accepted
Yanfei Gao 1*, Xiongwei Miao 2, Guoye Zhang 3
  • 1 Shanxi Finance and Taxation College, Taiyuan, China
  • 2 Shanxi Intelligent Big Data Industry Technology Innovation Research Institute, Taiyuan, Shanxi Province, China
  • 3 Shanxi Provincial Digital Government Service Center, Taiyuan, Shanxi Province, China

The final, formatted version of the article will be published soon.

    Autonomous driving is a future trend, and accurate 3D object detection is a prerequisite for achieving it. Currently, 3D object detection relies on three main sensors: monocular cameras, stereo cameras, and lidar. Compared with methods based on stereo cameras and lidar, monocular 3D object detection offers advantages such as a broad detection field and low deployment cost. However, the accuracy of existing monocular 3D object detection methods is not ideal, especially for occluded targets. To tackle this challenge, this paper introduces a novel approach to monocular 3D object detection, denoted SRDDP-M3D, which aims to improve detection by considering the spatial relationships between targets and by refining depth predictions through a decoupled approach. By considering how objects are positioned relative to each other in the environment and encoding the spatial relationships between neighboring objects, the method enhances detection performance, especially for occluded targets. Furthermore, the prediction of target depth is decoupled into two components, target visual depth and target attribute depth, to improve the accuracy of the overall predicted depth of the target. Experimental results on the KITTI dataset demonstrate that this approach substantially enhances the detection accuracy of occluded targets.

    Keywords: autonomous driving, computer vision, monocular 3D object detection, object detection (OD), spatial relationships, decoupled depth predictions

    Received: 16 Feb 2024; Accepted: 20 Dec 2024.

    Copyright: © 2024 Gao, Miao and Zhang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Yanfei Gao, Shanxi Finance and Taxation College, Taiyuan, China

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.