SYSTEMATIC REVIEW article

Front. Artif. Intell.
Sec. Machine Learning and Artificial Intelligence
Volume 7 - 2024 | doi: 10.3389/frai.2024.1456486
This article is part of the Research Topic Hybrid Human Artificial Intelligence: Augmenting Human Intelligence with AI View all articles

Human-Centered Evaluation of Explainable AI Applications: A Systematic Review

Provisionally accepted
  • 1 HU University of Applied Sciences Utrecht, Utrecht, Netherlands
  • 2 Jheronimus Academy of Data Science, 's-Hertogenbosch, Netherlands

The final, formatted version of the article will be published soon.

    Explainable Artificial Intelligence (XAI) aims to provide insights into the inner workings and the outputs of AI systems. Recently, there has been growing recognition that explainability is inherently human-centric, tied to how people perceive explanations. Despite this, there is no consensus in the research community on whether user evaluation is crucial in XAI, and if so, what exactly needs to be evaluated and how. This systematic literature review addresses this gap by providing a detailed overview of the current state of affairs in human-centered XAI evaluation. We reviewed 73 papers across various domains in which XAI was evaluated with users. These studies assessed what makes an explanation "good" from a user's perspective, i.e., what makes an explanation meaningful to a user of an AI system. We identified 30 components of meaningful explanations that were evaluated in the reviewed papers and categorized them into a taxonomy of human-centered XAI evaluation, based on: (a) the contextualized quality of the explanation, (b) the contribution of the explanation to human-AI interaction, and (c) the contribution of the explanation to human-AI performance. Our analysis also revealed a lack of standardization in the methodologies applied in XAI user studies: only 19 of the 73 papers applied an evaluation framework used by at least one other study in the sample. These inconsistencies hinder cross-study comparisons and broader insights. Our findings contribute to understanding what makes explanations meaningful to users and how to measure this, guiding the XAI community toward a more unified approach to human-centered explainability.

    Keywords: Explainable AI, XAI, human-centered evaluation, Meaningful Explanations, XAI Evaluation, Systematic review, Human-AI interaction, Human-AI Performance

    Received: 28 Jun 2024; Accepted: 25 Sep 2024.

    Copyright: © 2024 Kim, Maathuis and Sent. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence:
    Jenia Kim, HU University of Applied Sciences Utrecht, Utrecht, Netherlands
    Henry Maathuis, HU University of Applied Sciences Utrecht, Utrecht, Netherlands

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.