AUTHOR=Greco Matteo , Cometa Andrea , Artoni Fiorenzo , Frank Robert , Moro Andrea TITLE=False perspectives on human language: Why statistics needs linguistics JOURNAL=Frontiers in Language Sciences VOLUME=2 YEAR=2023 URL=https://www.frontiersin.org/journals/language-sciences/articles/10.3389/flang.2023.1178932 DOI=10.3389/flang.2023.1178932 ISSN=2813-4605 ABSTRACT=

There is a sharp disagreement about the nature of human language between two opposing camps: those who hold that statistical surface distributions, in particular measures such as surprisal, provide a better understanding of language processing, and those who hold that discrete hierarchical structures encoding linguistic information, such as syntactic structures, are the better tool. In this paper, we show that this dichotomy is a false one. Relying on the fact that statistical measures can be defined on the basis of either structural or non-structural models, we provide empirical evidence that only models of surprisal that reflect syntactic structure are able to account for language regularities.

One-sentence summary

Language processing does not rely on statistical surface distributions alone; these must be integrated with syntactic information.