ORIGINAL RESEARCH article

Front. Artif. Intell.
Sec. Natural Language Processing
Volume 7 - 2024 | doi: 10.3389/frai.2024.1413020

Development of Morphological Analyzer for Somali Language via Deep Learning Approach

Provisionally accepted
Tahir Ibrahim Gedi 1, Tessfu Geteye Fantaye 2*
  • 1 Jigjiga University, Jijiga, Somali Region, Ethiopia
  • 2 Dire Dawa University, Dire Dawa, Ethiopia

The final, formatted version of the article will be published soon.

    Morphological analysis, which decomposes words into their constituent morphemes, is a vital step in developing many natural language processing applications. The task is challenging for morphologically rich languages such as Somali. To address this challenge and advance previous studies on Somali morphological analysis, this study proposes a deep learning-based morphological analyzer for the Somali language, built on a larger corpus. Several recurrent neural network models, with and without attention mechanisms, were developed and trained on a 19,839-word training dataset using 10-fold cross-validation. The trained models were evaluated on a separate 2,204-word testing dataset. In these experiments, bidirectional long short-term memory (BILSTM) without attention, BILSTM with attention, bidirectional gated recurrent unit (BIGRU) without attention, and BIGRU with attention achieved accuracies of 92.77%, 95.04%, 93.34%, and 95.92%, respectively. BIGRU with attention was therefore the most accurate model, followed by BILSTM with attention, and adding attention improved both architectures. Overall, recurrent neural network models with attention are well suited to building a high-performing morphological analyzer for the Somali language.
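
To make the 10-fold cross-validation setup concrete, the following is a minimal sketch of how a 19,839-word training set could be divided into ten folds. The abstract does not say how the folds were drawn, so the shuffling, the seed, and the helper name `k_fold_indices` are illustrative assumptions rather than the authors' procedure.

```python
import random


def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train_idx, val_idx) index lists for k-fold cross-validation.

    Hypothetical helper: shuffling with a fixed seed is an assumption,
    not something stated in the abstract.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Spread the remainder over the first folds so sizes differ by at most 1.
    base, extra = divmod(n_items, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


TRAIN_WORDS = 19_839  # training-set size reported in the abstract

folds = list(k_fold_indices(TRAIN_WORDS, k=10))
fold_sizes = [len(val_idx) for _, val_idx in folds]
```

Under these assumptions, each model would be trained ten times, each time holding out one fold for validation, while the separate 2,204-word testing dataset stays untouched until the final evaluation.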

    Keywords: Natural Language Processing, Morphological analyzer, Somali language, BIGRU with attention, BILSTM with attention

    Received: 06 Apr 2024; Accepted: 29 Jul 2024.

    Copyright: © 2024 Gedi and Fantaye. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Tessfu Geteye Fantaye, Dire Dawa University, Dire Dawa, Ethiopia

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.