AUTHOR=Addlesee Angus, Eshghi Arash
TITLE=You have interrupted me again!: making voice assistants more dementia-friendly with incremental clarification
JOURNAL=Frontiers in Dementia
VOLUME=3
YEAR=2024
URL=https://www.frontiersin.org/journals/dementia/articles/10.3389/frdem.2024.1343052
DOI=10.3389/frdem.2024.1343052
ISSN=2813-3919
ABSTRACT=
In spontaneous conversation, speakers seldom have a full plan of what they are going to say in advance: they need to conceptualise and plan incrementally as they articulate each word in turn. This often leads to long pauses mid-utterance. Listeners either wait out the pause, offer a possible completion, or respond with an incremental clarification request (iCR), intended to recover the rest of the truncated turn. The ability to generate iCRs in response to pauses is therefore important in building natural and robust everyday voice assistants (EVAs) such as Amazon Alexa. This becomes crucial with people with dementia (PwDs) as a target user group since they are known to pause longer and more frequently, with current state-of-the-art EVAs interrupting them prematurely, leading to frustration and breakdown of the interaction. In this article, we first use two existing corpora of truncated utterances to establish the generation of clarification requests as an effective strategy for recovering from interruptions. We then proceed to report on, analyse, and release SLUICE-CR: a new corpus of 3,000 crowdsourced, human-produced iCRs, the first of its kind. We use this corpus to probe the incremental processing capability of a number of state-of-the-art large language models (LLMs) by evaluating (1) the quality of the model's generated iCRs in response to incomplete questions and (2) the ability of said LLMs to respond correctly after the user's response to the generated iCR. For (1), our experiments show that the ability to generate contextually appropriate iCRs only emerges at larger LLM sizes and only when prompted with example iCRs from our corpus. For (2), our results are in line with (1): larger LLMs interpret incremental clarificational exchanges more effectively.
Overall, our results indicate that autoregressive language models (LMs) are, in principle, able to both understand and generate language incrementally and that LLMs can be configured to handle speech phenomena more commonly produced by PwDs, mitigating frustration with today's EVAs by improving their accessibility.