The final, formatted version of the article will be published soon.
ORIGINAL RESEARCH article
Front. Robot. AI
Sec. Robot Vision and Artificial Perception
Volume 11 - 2024
doi: 10.3389/frobt.2024.1424036
This article is part of the Research Topic Computer Vision Mechanisms for Resource-Constrained Robotics Applications.
A Fast Monocular 6D Pose Estimation Method for Textureless Objects Based on Perceptual Hashing and Template Matching
Provisionally accepted
- 1 Mercedes-Benz (Germany), Stuttgart, Germany
- 2 Technical University of Berlin, Berlin, Germany
- 3 Aalborg University, Aalborg, Denmark
Object pose estimation is essential for computer vision applications such as quality inspection, robotic bin picking, and warehouse logistics. However, this task often requires expensive equipment such as 3D cameras or LiDAR sensors, as well as significant computational resources. Many state-of-the-art methods for 6D pose estimation depend on deep neural networks, which are computationally demanding and require GPUs for real-time performance. Moreover, they usually involve collecting and labeling large training datasets, which is costly and time-consuming.

We propose a template-based matching algorithm that uses a novel perceptual hashing method for binary images, enabling fast and robust pose estimation. This approach allows the automatic preselection of a subset of templates, significantly reducing inference time while maintaining similar accuracy. Our solution runs efficiently on multiple devices without GPU support, offering reduced runtime and high accuracy on cost-effective hardware.

We benchmarked our approach on a body-in-white automotive part relevant to the automotive industry, and on a widely used, publicly available dataset. Our experiments on a synthetically generated dataset reveal a superior trade-off between accuracy and computation time compared to previous work evaluated on the same automotive-production use case. Additionally, our algorithm efficiently utilizes all CPU cores and includes adjustable parameters for balancing computation time and accuracy, making it suitable for a wide range of applications where hardware cost and power efficiency are critical. For instance, with a rotation step of 10° in the template database, we achieve an average rotation error of 10°, matching the template quantization level, and an average translation error of 14% of the object's size, with an average processing time of 0.3 s per image on a small form-factor Nvidia AGX Orin device.
We also evaluate robustness under partial occlusions (up to 10% occlusion) and noisy inputs (SNRs up to 10 dB), with only minor losses in accuracy. Additionally, we compare our method to state-of-the-art deep learning models on a public dataset. While our algorithm does not outperform them in absolute accuracy, it provides a more favorable trade-off between accuracy and processing time, which is especially relevant to applications employing resource-constrained devices.
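To illustrate the general idea of preselecting templates via a perceptual hash of binary images, the sketch below downsamples a binary mask to a coarse occupancy grid, thresholds each cell into one bit, and ranks template hashes by Hamming distance. This is a minimal, hypothetical illustration of the technique class, not the paper's actual hashing method; the grid size, threshold, and function names are assumptions.

```python
import numpy as np

def binary_image_hash(mask: np.ndarray, grid: int = 8) -> np.ndarray:
    """Illustrative perceptual hash for a binary mask: downsample to a
    grid x grid occupancy map and threshold each cell to one bit,
    yielding a grid*grid bit vector."""
    h, w = mask.shape
    ys = np.linspace(0, h, grid + 1, dtype=int)
    xs = np.linspace(0, w, grid + 1, dtype=int)
    bits = np.zeros((grid, grid), dtype=np.uint8)
    for i in range(grid):
        for j in range(grid):
            cell = mask[ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
            # A cell is "occupied" if more than half its pixels are set.
            bits[i, j] = 1 if cell.mean() > 0.5 else 0
    return bits.ravel()

def hamming(a: np.ndarray, b: np.ndarray) -> int:
    """Number of differing bits between two hash vectors."""
    return int(np.count_nonzero(a != b))

def preselect(query_hash: np.ndarray, template_hashes, k: int = 5):
    """Indices of the k templates closest to the query in Hamming distance;
    only these would be passed to the (slower) full matching stage."""
    dists = [hamming(query_hash, t) for t in template_hashes]
    return sorted(range(len(dists)), key=lambda i: dists[i])[:k]
```

Because each hash is a short bit vector, the preselection step scans the whole template database in linear time with cheap integer operations, which is what makes this kind of filtering attractive on CPU-only, resource-constrained devices.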
Keywords: 6D pose estimation, perceptual hashing, IoU, Hamming distance, automotive production
Received: 27 Apr 2024; Accepted: 02 Dec 2024.
Copyright: © 2024 Araya Martinez, Matthiesen, Bøgh, Lambrecht and Pimentel de Figueiredo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence:
Jose Moises Araya Martinez, Mercedes-Benz (Germany), Stuttgart, Germany
Vinicius Soares Matthiesen, Aalborg University, Aalborg, 9220, Denmark
Simon Bøgh, Aalborg University, Aalborg, 9220, Denmark
Jens Lambrecht, Technical University of Berlin, Berlin, 10623, Germany
Rui Pimentel de Figueiredo, Aalborg University, Aalborg, 9220, Denmark
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.