ORIGINAL RESEARCH article

Front. Earth Sci.

Sec. Geoinformatics

Volume 13 - 2025 | doi: 10.3389/feart.2025.1530004

Assessing Named Entity Recognition by using Geosciences Domain Schemas: The case of Mineral Systems

Provisionally accepted
Sandra Paula Villacorta Chambi 1*, Mark Lindsay 1, Jens Klump 1, Klaus Gessner 2, Erin Gray 2, Helen McFarlane 1
  • 1 Mineral Resources - CSIRO, Kensington, Australia
  • 2 Geological Survey of Western Australia, East Perth, Western Australia, Australia

The final, formatted version of the article will be published soon.

    Named Entity Recognition (NER) is essential for extracting and classifying specialized domain terms from textual data. Schemas provide structured frameworks by defining relevant entity classes and relationships, enabling computer systems to process discipline-specific terminology accurately. This study introduces the Schema for Mineral Systems (SMS), developed to evaluate NER accuracy in geoscientific texts on mineral systems, with nine geological and five general entity classes. SMS was created through domain characterization, word disambiguation, taxonomy development, and expert input to address the complexity of geological terminology. Domain-specific dictionaries and schema-linked annotations facilitated the identification of unique terms in mineral systems, while expert validation highlighted the importance of iterative verification to improve NER model performance. Applied to corpora on iron and lithium deposits in Western Australia, the use of SMS demonstrates the challenges that context-specific schemas face in enhancing specialized knowledge extraction and accurate entity recognition in complex domains.
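
The idea of schema-linked dictionaries and expert validation described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the class names (`MINERAL`, `ROCK`, `LOCATION`), the dictionary terms, and the helper functions `tag_entities` and `score` are hypothetical and are not the actual SMS classes or the authors' pipeline. The sketch tags terms by dictionary lookup and then compares the result with an expert-annotated reference using precision, recall, and F1.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical schema fragment. These class names and terms are illustrative
# assumptions, NOT the nine geological and five general classes of SMS.
SCHEMA_DICTIONARY: Dict[str, List[str]] = {
    "MINERAL": ["hematite", "magnetite", "spodumene"],
    "ROCK": ["banded iron formation", "pegmatite"],
    "LOCATION": ["Pilbara", "Western Australia"],
}

Entity = Tuple[int, int, str]  # (start offset, end offset, entity class)


def tag_entities(text: str) -> List[Entity]:
    """Return non-overlapping dictionary matches, preferring longer terms."""
    candidates: List[Entity] = []
    for label, terms in SCHEMA_DICTIONARY.items():
        for term in terms:
            pattern = r"\b" + re.escape(term) + r"\b"
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                candidates.append((match.start(), match.end(), label))
    # Consider the longest spans first so multi-word terms win over sub-terms.
    candidates.sort(key=lambda e: (-(e[1] - e[0]), e[0]))
    chosen: List[Entity] = []
    for start, end, label in candidates:
        if all(end <= s or start >= e for s, e, _ in chosen):
            chosen.append((start, end, label))
    return sorted(chosen)


def score(predicted: List[Entity], gold: List[Entity]) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 against expert annotations."""
    pred_set, gold_set = set(predicted), set(gold)
    true_positives = len(pred_set & gold_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    sentence = "Hematite occurs in banded iron formation of the Pilbara."
    for start, end, label in tag_entities(sentence):
        print(sentence[start:end], label)
```

Exact-span matching is the strictest scoring choice; a real evaluation might also credit partial overlaps or correct spans with the wrong class, which is one place where iterative expert review would feed back into the dictionaries.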

    Keywords: Knowledge Management, NLP, NER, geological terminology, ontologies

    Received: 18 Nov 2024; Accepted: 31 Mar 2025.

    Copyright: © 2025 Villacorta Chambi, Lindsay, Klump, Gessner, Gray and McFarlane. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Sandra Paula Villacorta Chambi, Mineral Resources - CSIRO, Kensington, Australia

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
