ORIGINAL RESEARCH article

Front. Artif. Intell.
Sec. Natural Language Processing
Volume 7 - 2024 | doi: 10.3389/frai.2024.1463164

Sequence Labeling via Reinforcement Learning with Aggregate Labels

Provisionally accepted
Marcel Geromel*, Philipp Cimiano
  • Bielefeld University, Bielefeld, Germany

The final, formatted version of the article will be published soon.

    Sequence labeling is pervasive in natural language processing, encompassing tasks such as Named Entity Recognition, Question Answering, and Information Extraction. Traditionally, these tasks are addressed with supervised machine learning. Despite their success, supervised approaches are constrained by two key limitations: a common mismatch between the training and evaluation objectives, and the resource-intensive acquisition of ground-truth token-level annotations. In this work, we introduce a novel reinforcement learning approach to sequence labeling that addresses both limitations by training on aggregate annotations: counts of entity mentions are used to generate the feedback signal. To validate the approach, we conduct experiments on Named Entity Recognition, comparing various combinations of aggregate feedback and reward functions. The results suggest that sequence labeling can be learned from purely count-based labels, even at the sequence level. Overall, this count-based method has the potential to significantly reduce annotation costs and annotation variance, as counting entity mentions is more straightforward than determining exact boundaries.
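
The idea of deriving feedback from entity counts can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the reward functions, so the BIO tagging scheme, the helper names `count_entities` and `count_reward`, and the negative absolute-difference reward are all assumptions chosen for illustration.

```python
from collections import Counter


def count_entities(tags):
    """Count entity mentions per type in a BIO tag sequence (assumed scheme)."""
    counts = Counter()
    prev = "O"
    for tag in tags:
        if tag.startswith("B-"):
            # A B- tag always opens a new mention.
            counts[tag[2:]] += 1
        elif tag.startswith("I-") and prev[2:] != tag[2:]:
            # Lenient handling: an I- tag that does not continue a mention
            # of the same type is treated as the start of a new mention.
            counts[tag[2:]] += 1
        prev = tag
    return counts


def count_reward(pred_tags, gold_counts):
    """Hypothetical sequence-level reward: 0 when the predicted counts match
    the aggregate label, increasingly negative as they diverge."""
    pred = count_entities(pred_tags)
    types = set(pred) | set(gold_counts)
    return -sum(abs(pred[t] - gold_counts.get(t, 0)) for t in types)


# A predicted tagging for a six-token sentence.
predicted = ["B-PER", "I-PER", "O", "B-LOC", "O", "B-PER"]

# The aggregate label only states how many mentions of each type occur,
# not where they occur.
print(count_reward(predicted, {"PER": 2, "LOC": 1}))
print(count_reward(predicted, {"PER": 1, "ORG": 1}))
```

In a reinforcement learning setup, a scalar of this kind could serve as the return for a sampled tag sequence. Note that the reward never inspects token positions, which is what would allow training without boundary annotations under these assumptions.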

    Keywords: reinforcement learning, reward functions, annotations, sequence labeling, information extraction

    Received: 11 Jul 2024; Accepted: 28 Oct 2024.

    Copyright: © 2024 Geromel and Cimiano. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Marcel Geromel, Bielefeld University, Bielefeld, Germany

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.