ORIGINAL RESEARCH article

Front. Comput. Sci.
Sec. Human-Media Interaction
Volume 6 - 2024 | doi: 10.3389/fcomp.2024.1436341

Voice accentedness, but not gender, affects social responses to a computer tutor

Provisionally accepted
  • University of California, Davis, Davis, United States

The final, formatted version of the article will be published soon.

    The current study had two goals. First, we aimed to conduct a conceptual replication and extension of a classic study by Nass and colleagues (1997), who found that participants display voice-gender bias when completing a tutoring session with a computer. In the present study, we used a more modern paradigm (i.e., app-based tutoring) and commercially available TTS voices. Second, we asked whether participants provided different social evaluations of machines speaking non-native-accented versus native-accented American English. Eighty-five American participants completed a tutoring session with a system designed to look like a device application (which we called a "TutorBot"). Participants were presented with facts on two topics: 'love and relationships' and 'computers and technology'. Tutoring was provided by either a female or a male TTS voice, speaking with either a native English accent or a non-native (here, Castilian Spanish) accent. Overall, we found no effect of voice gender on any dependent measure: listeners recalled facts and rated female and male voices equivalently across topics and conditions. However, participants rated non-native-accented TTS voices as less competent, less knowledgeable, and less helpful after completing the tutoring session. Finally, when participants were tutored on facts related to 'love and relationships', they showed better recall accuracy and gave higher ratings for app competence, likeability, and helpfulness (and knowledgeability, but only for native-accented voices). These results are relevant to theoretical accounts of human-computer interaction, particularly the extent to which human-based social biases transfer to machines, as well as to the design and use of voice-AI systems.

    Keywords: voice gender, accentedness, human-computer interaction, social evaluation, learning

    Received: 21 May 2024; Accepted: 26 Aug 2024.

    Copyright: © 2024 Jones and Zellou. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Allison Jones, University of California, Davis, Davis, United States

    Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.