ORIGINAL RESEARCH article

Front. Neurorobot.
Volume 18 - 2024 | doi: 10.3389/fnbot.2024.1427786
This article is part of the Research Topic Multi-modal Learning with Large-scale Models

Multi-Modal Remote Perception Learning for Object Sensory Data

Provisionally accepted
Nouf A. Almujally 1, Adnan A. Rafique 2, Naif Al Mudawi 3, Abdulwahab Alazeb 3, Mohammed Alonazi 4, Asaad Algarni 5, Ahmad Jalal 6*, Hui Liu 7*
  • 1 Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
  • 2 University of Poonch Rawalakot, Rawalakot, Azad Kashmir, Pakistan
  • 3 Najran University, Najran, Saudi Arabia
  • 4 Prince Sattam Bin Abdulaziz University, Al-Kharj, Saudi Arabia
  • 5 Northern Border University, Arar, Northern Borders, Saudi Arabia
  • 6 Air University, Islamabad, Pakistan
  • 7 University of Bremen, Bremen, Germany

The final, formatted version of the article will be published soon.

    Introduction: When interpreting visual input, intelligent systems rely on contextual scene learning, which significantly improves both resilience and context awareness. The need to manage enormous amounts of data is a driving force behind the growing interest in computational frameworks, particularly in the context of autonomous cars. This study introduces a novel approach, Deep Fused Networks (DFN), which improves contextual scene comprehension by merging multi-object detection and semantic analysis. To enhance accuracy and comprehension in complex scenes, DFN combines deep learning with feature-fusion techniques, achieving a minimum gain of 6.4% in accuracy on the SUN RGB-D dataset and 3.6% on the NYU-Dv2 dataset. Discussion: The findings demonstrate considerable improvements in object detection and semantic analysis compared with existing methodologies.
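    To illustrate the kind of fusion the abstract describes, below is a minimal sketch, not the authors' implementation, of a network that merges a detection-oriented branch and a semantic branch by channel-wise concatenation. The branch architectures, channel sizes, and the 37-class output (one common SUN RGB-D labeling) are assumptions made for illustration only.

```python
# Illustrative sketch (not the published DFN code): fuse features from a
# detection branch and a semantic branch, assuming both branches yield
# spatial feature maps at the same resolution as the input.
import torch
import torch.nn as nn


class DeepFusedNetwork(nn.Module):
    """Hypothetical DFN-style fusion of detection and semantic features."""

    def __init__(self, det_channels=256, sem_channels=256, num_classes=37):
        super().__init__()
        # Stand-ins for the detection and semantic-analysis backbones.
        self.det_branch = nn.Sequential(
            nn.Conv2d(3, det_channels, kernel_size=3, padding=1), nn.ReLU())
        self.sem_branch = nn.Sequential(
            nn.Conv2d(3, sem_channels, kernel_size=3, padding=1), nn.ReLU())
        # Fusion: concatenate channel-wise, then mix with a 1x1 convolution.
        self.fuse = nn.Conv2d(det_channels + sem_channels, 256, kernel_size=1)
        self.classifier = nn.Conv2d(256, num_classes, kernel_size=1)

    def forward(self, rgb):
        det_feat = self.det_branch(rgb)   # detection-oriented features
        sem_feat = self.sem_branch(rgb)   # semantic features
        fused = torch.relu(self.fuse(torch.cat([det_feat, sem_feat], dim=1)))
        return self.classifier(fused)     # per-pixel class scores


# Example: one 480x640 RGB frame, the resolution used by SUN RGB-D / NYU-Dv2.
model = DeepFusedNetwork()
scores = model(torch.randn(1, 3, 480, 640))
print(scores.shape)  # torch.Size([1, 37, 480, 640])
```

    In practice the two branches would be pretrained detection and segmentation backbones rather than single convolutions; the sketch only shows where the fusion step sits.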

    Keywords: multi-modal, sensory data, object recognition, visionary sensor, simulation environment

    Received: 04 May 2024; Accepted: 26 Aug 2024.

    Copyright: © 2024 Almujally, Rafique, Al Mudawi, Alazeb, Alonazi, Algarni, Jalal and Liu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence:
    Ahmad Jalal, Air University, Islamabad, Pakistan
    Hui Liu, University of Bremen, 28359 Bremen, Germany

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.