AUTHOR=Najnin Shamima, Banerjee Bonny

TITLE=Pragmatically Framed Cross-Situational Noun Learning Using Computational Reinforcement Models

JOURNAL=Frontiers in Psychology

VOLUME=Volume 9 - 2018

YEAR=2018

URL=https://www.frontiersin.org/journals/psychology/articles/10.3389/fpsyg.2018.00005

DOI=10.3389/fpsyg.2018.00005

ISSN=1664-1078

ABSTRACT=Cross-situational learning and social pragmatic theories are prominent accounts of how word meanings (i.e., word-object pairs) are learned. In this paper, the role of reinforcement in early word learning by an artificial agent is investigated. When exposed to a group of speakers, the agent comes to understand an initial set of vocabulary items belonging to the language used by the group. Both cross-situational learning and social pragmatic theory are taken into account. As social cues, joint attention and prosodic cues in the caregiver's speech are considered. During agent-caregiver interaction, the agent selects a word from the caregiver's utterance and learns the relations between that word and the objects in its visual environment. The 'novel words to novel objects' language-specific constraint is assumed for computing rewards. The models are learned by maximizing the expected reward using reinforcement learning algorithms: the table-based algorithms Q-learning, SARSA, and SARSA-λ, and the neural network-based algorithms Q-learning for neural network (Q-NN), neural-fitted Q-network (NFQ), and deep Q-network (DQN). Neural network-based reinforcement learning models are chosen over table-based models for better generalization and quicker convergence. Simulations are carried out using the mother-infant interaction CHILDES dataset for learning word-object pairings. Reinforcement is modeled in two cross-situational learning cases: (1) with joint attention (attentional models), and (2) with joint attention and prosodic cues (attentional-prosodic models). Attentional-prosodic models outperform attentional ones on the word learning task. The attentional-prosodic DQN outperforms existing word learning models on the same task.
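
The table-based setting mentioned in the abstract can be illustrated with a minimal sketch of one-step Q-learning over (object, word) pairs. This is not the authors' implementation: the state encoding (the jointly attended object), the function names, the toy reward, and all parameter values are assumptions made for illustration. In particular, `toy_reward` is only a hypothetical stand-in for the 'novel words to novel objects' constraint.

```python
import random
from collections import defaultdict


def select_word(q, obj, words, epsilon, rng):
    """Epsilon-greedy choice of one word from the utterance for the attended object."""
    if rng.random() < epsilon:
        return rng.choice(words)
    return max(words, key=lambda w: q[(obj, w)])


def toy_reward(lexicon, obj, word):
    """Hypothetical reward: +1 unless the word is already tied to another object."""
    return 1.0 if lexicon.get(word, obj) == obj else -1.0


def q_update(q, obj, word, reward, next_values, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update on the (object, word) table."""
    best_next = max(next_values, default=0.0)
    q[(obj, word)] += alpha * (reward + gamma * best_next - q[(obj, word)])
    return q[(obj, word)]


def train(situations, epochs=50, epsilon=0.1, seed=0):
    """Run Q-learning over a list of (attended_object, utterance_words) situations."""
    rng = random.Random(seed)
    q = defaultdict(float)   # Q-table keyed by (object, word)
    lexicon = {}             # assumed bookkeeping: word -> object it was rewarded with
    for _ in range(epochs):
        for i, (obj, words) in enumerate(situations):
            word = select_word(q, obj, words, epsilon, rng)
            reward = toy_reward(lexicon, obj, word)
            if reward > 0:
                lexicon[word] = obj
            # Values of the next situation's candidate words (empty at the end).
            nxt = situations[i + 1] if i + 1 < len(situations) else None
            next_values = [q[(nxt[0], w)] for w in nxt[1]] if nxt else []
            q_update(q, obj, word, reward, next_values)
    return q, lexicon
```

A call such as `train([("ball", ["the", "ball"]), ("dog", ["the", "dog"])])` returns the learned table and the toy lexicon. Which pairings emerge depends on the assumed reward and the random seed, so the sketch shows the update mechanics only, not the results reported in the paper.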