The final, formatted version of the article will be published soon.
ORIGINAL RESEARCH article
Front. Artif. Intell.
Sec. Machine Learning and Artificial Intelligence
Volume 8 - 2025
doi: 10.3389/frai.2025.1454488
MPAR-RCNN: A MultiTask Network for Multiple Person Detection with Attribute Recognition
Provisionally accepted
- 1 Manipal Institute of Technology, Manipal, India
- 2 KPIT Technologies, Pune, India
- 3 Sleep Medicine Center, Stanford Healthcare, Stanford, California, United States
Multi-label attribute recognition is a critical task in computer vision, with applications across diverse fields. The problem often involves detecting objects that carry multiple attributes, which calls for models capable of both high-level differentiation and fine-grained feature extraction. Combining object detection with attribute recognition typically relies on approaches such as dual-stage networks, in which accurate predictions depend on feature extraction techniques such as Region of Interest (RoI) pooling. Meeting these demands requires an efficient method that achieves reliable detection and attribute classification in a unified framework. This paper introduces a multi-task learning (MTL) framework that incorporates Multi-Person Attribute Recognition (MPAR) within a single model architecture. Named MPAR-RCNN, the framework unifies the object detection and attribute recognition tasks through a spatially aware, shared backbone, enabling efficient and accurate multi-label prediction. Unlike the traditional Fast Region-based Convolutional Neural Network (Fast R-CNN), which handles person detection and attribute classification separately in a dual-stage network, the MPAR-RCNN architecture optimizes both tasks within a single structure. Validated on the WIDER (Web Image Dataset for Event Recognition) dataset, the proposed model demonstrates an improvement over current state-of-the-art (SOTA) architectures, showing its potential for advancing multi-label attribute recognition.
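The idea of two task heads sharing one set of RoI features can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the tiny linear heads, the toy feature vector, and the `attr_weight` balancing term are all hypothetical, chosen only to show how a detection score and independent per-attribute sigmoid outputs might be computed from the same features and combined into one multi-task loss.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bce(prob, target, eps=1e-12):
    """Binary cross-entropy for a single probability/label pair."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1 - target) * math.log(1.0 - prob))

def linear(features, weights, bias):
    """One output unit: dot(features, weights) + bias."""
    return sum(f * w for f, w in zip(features, weights)) + bias

def multitask_forward(roi_features, det_head, attr_heads):
    """Run both heads on the same shared RoI feature vector (assumed given)."""
    person_prob = sigmoid(linear(roi_features, *det_head))
    # Multi-label: one independent sigmoid per attribute, not a softmax.
    attr_probs = [sigmoid(linear(roi_features, w, b)) for w, b in attr_heads]
    return person_prob, attr_probs

def multitask_loss(person_prob, attr_probs, is_person, attr_labels,
                   attr_weight=1.0):
    """Detection loss plus a weighted attribute loss (hypothetical weighting)."""
    det_loss = bce(person_prob, is_person)
    if is_person and attr_labels:
        # Attributes are only supervised on RoIs that contain a person.
        attr_loss = sum(bce(p, t) for p, t in zip(attr_probs, attr_labels))
        attr_loss /= len(attr_labels)
    else:
        attr_loss = 0.0
    return det_loss + attr_weight * attr_loss

# Toy example: a 3-dimensional pooled feature and two attributes.
roi_features = [0.5, -1.0, 2.0]
det_head = ([1.0, 0.0, 0.5], 0.0)
attr_heads = [([0.0, 1.0, 0.0], 0.0), ([0.0, 0.0, 1.0], -1.0)]

person_prob, attr_probs = multitask_forward(roi_features, det_head, attr_heads)
loss_person = multitask_loss(person_prob, attr_probs, 1, [0, 1])
loss_background = multitask_loss(person_prob, attr_probs, 0, [])
print(person_prob, attr_probs, loss_person, loss_background)
```

In this sketch the attribute term is skipped for background regions, so only the detection head receives a training signal for them; whether MPAR-RCNN does the same is not stated in the abstract.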
Keywords: Attribute recognition, Convolutional Neural Network, Human attribute recognition, Multi-task learning, Object detection
Received: 25 Jun 2024; Accepted: 22 Jan 2025.
Copyright: © 2025 Raghavendra, Abhilash, Nookala, Shetty and Bharathi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence:
Jayashree Shetty, Manipal Institute of Technology, Manipal, India
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.