ORIGINAL RESEARCH article

Front. Neurorobot.
Volume 18 - 2024 | doi: 10.3389/fnbot.2024.1484276
This article is part of the Research Topic Advancing Autonomous Robots: Challenges and Innovations in Open-World Scene Understanding View all articles

Improved object detection method for autonomous driving based on DETR

Provisionally accepted
Huaqi Zhao 1*, Songnan Zhang 1, Xiang Peng 1, Zhengguang Lu 1*, Guojing Li 2*
  • 1 School of Information and Electronics Technology, Jiamusi University, Jiamusi, China
  • 2 School of Materials Science and Engineering, Jiamusi University, Jiamusi, China

The final, formatted version of the article will be published soon.

    Object detection is a critical component of autonomous driving technology and has demonstrated significant growth potential. To address the limitations of current techniques, this paper presents an improved object detection method for autonomous driving based on the Detection Transformer (DETR). First, we introduce a multi-scale feature and location information extraction method, which addresses the model's inadequate localization and detection of multi-scale objects. Second, we develop a Transformer encoder based on a group axial attention mechanism, which efficiently controls the attention range in the horizontal and vertical directions while reducing computation, thereby improving inference speed. Third, we propose a dynamic hyperparameter tuning method based on Pareto efficiency, which coordinates the training of the loss terms through dynamic weights, avoiding the drawbacks of manually fixed weights and improving convergence speed and accuracy. Experimental results demonstrate that the proposed method surpasses existing methods, improving Average Precision by 3.3%, 4.5%, and 3.0% on the COCO, PASCAL VOC, and KITTI datasets, respectively, while increasing FPS by 84%.
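The abstract does not give the exact Pareto-efficiency formulation used to balance the loss terms. As a rough illustration of the general idea of replacing fixed loss weights with dynamic ones, the sketch below (the function name `dynamic_loss_weights`, the ratio heuristic, and the `temperature` parameter are assumptions in the spirit of dynamic weight averaging, not the authors' method) assigns larger weights to loss terms that are decreasing more slowly between training epochs:

```python
import math

def dynamic_loss_weights(prev_losses, curr_losses, temperature=2.0):
    """Illustrative dynamic weighting for multiple loss terms.

    Loss terms whose values decrease slowly between epochs (ratio near
    or above 1) receive larger weights, steering training toward
    under-optimized objectives. Weights are softmax-normalized so they
    sum to the number of loss terms, mimicking equal fixed weights on
    average.
    """
    # Relative descent rate of each loss term (guard against division by 0).
    ratios = [c / max(p, 1e-12) for c, p in zip(curr_losses, prev_losses)]
    # Softmax over the ratios; higher temperature -> more uniform weights.
    exps = [math.exp(r / temperature) for r in ratios]
    total = sum(exps)
    k = len(ratios)
    return [k * e / total for e in exps]

# The total loss for the next epoch would then be the weighted sum
# sum(w_i * L_i) instead of a manually fixed combination.
```

In a DETR-style detector, the per-term values would be the classification, bounding-box, and GIoU losses averaged over the previous two epochs; the point of the sketch is only that the weights adapt to the training state rather than being hand-tuned constants.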

    Keywords: object detection, feature extraction, Transformer encoder, loss function, parameter tuning

    Received: 21 Aug 2024; Accepted: 30 Dec 2024.

    Copyright: © 2024 Zhao, Zhang, Peng, Lu and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence:
    Huaqi Zhao, School of Information and Electronics Technology, Jiamusi University, Jiamusi, China
    Zhengguang Lu, School of Information and Electronics Technology, Jiamusi University, Jiamusi, China
    Guojing Li, School of Materials Science and Engineering, Jiamusi University, Jiamusi, China

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.