METHODS article

Front. Psychol.
Sec. Quantitative Psychology and Measurement
Volume 16 - 2025 | doi: 10.3389/fpsyg.2025.1506320
This article is part of the Research Topic Methodological and Statistical Advances in Educational Assessment

A Multidimensional Bayesian IRT Method for Discovering Misconceptions from Concept Test Data

Provisionally accepted
  • 1 Department of Mechanical Engineering, School of Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States
  • 2 Department of Physics, School of Science, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States
  • 3 Department of Physics and Astronomy, West Virginia University, Morgantown, West Virginia, United States
  • 4 Department of Electrical Engineering and Computer Science, School of Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States

The final, formatted version of the article will be published soon.

    We present an exploratory method for discovering likely misconceptions from multiple-choice concept test data, as well as preliminary evidence that this method recovers known misconceptions from real student responses. Our procedure is based on a Bayesian implementation of the Multidimensional Nominal Categories IRT model (MNCM) combined with standard factor-analytic rotation methods; by analyzing student responses at the level of individual distractors rather than at the level of entire questions, this approach is able to highlight multiple likely misconceptions for subsequent investigation without requiring any manual labeling of test content. We explore the performance of the Bayesian MNCM on synthetic data and find that it is able to recover multidimensional item parameters consistently at achievable sample sizes. These studies demonstrate the method’s robustness to overfitting and its ability to perform automatic dimensionality assessment and selection. The method also compares favorably to existing IRT software implementing marginal maximum likelihood estimation, which we use as a validation benchmark. We then apply our method to approximately 10,000 students’ responses to a research-designed concept test: the Force Concept Inventory. In addition to a broad first dimension strongly correlated with overall test score, we discover thirteen additional dimensions which load on smaller sets of distractors; we discuss two as examples, showing that these are consistent with already-known misconceptions in Newtonian mechanics.
While work remains to validate our findings, our hope is that future applications of this method could aid in the refinement of existing concept inventories or the development of new ones, enable the discovery of previously unknown student misconceptions across a variety of disciplines, and, by leveraging the method's ability to quantify the prevalence of particular misconceptions, provide opportunities for targeted instruction at both the individual and classroom level.
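
The abstract does not give the model equations, so the following is only a minimal sketch of how a multidimensional nominal categories model is commonly parameterized: each response option of an item gets its own slope vector and intercept, and option probabilities come from a softmax over the resulting scores. The function names, the two-dimensional setup, and all numeric values are hypothetical and are not taken from the article. The sketch also illustrates why a rotation step may be needed: rotating the slopes and the latent traits together leaves every probability unchanged, so the fitted axes are not uniquely determined by the data.

```python
import math

def mncm_probs(theta, slopes, intercepts):
    """Option probabilities for one item under a softmax (nominal) model.

    theta:      list of D latent trait values for one student
    slopes:     list of K slope vectors (each of length D), one per option
    intercepts: list of K intercepts, one per option
    """
    logits = [sum(a * t for a, t in zip(a_k, theta)) + c_k
              for a_k, c_k in zip(slopes, intercepts)]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def rotate(vec, angle):
    """Rotate a 2-D vector by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    x, y = vec
    return [c * x - s * y, s * x + c * y]

# Hypothetical item: 4 response options, 2 latent dimensions (illustrative values)
slopes = [[1.2, 0.0], [-0.4, 1.5], [-0.5, -0.2], [-0.3, -1.3]]
intercepts = [0.5, -0.2, 0.0, -0.3]
theta = [0.8, -0.5]

p = mncm_probs(theta, slopes, intercepts)

# Rotating slopes and traits by the same angle preserves every dot product,
# so the option probabilities are identical in the rotated coordinates.
angle = 0.7
p_rot = mncm_probs(rotate(theta, angle),
                   [rotate(a, angle) for a in slopes],
                   intercepts)
```

In this parameterization a distractor with a large slope on some dimension becomes more likely as a student's trait value on that dimension increases, which is the sense in which a dimension can "load on" a set of distractors.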

    Keywords: item response theory, student misconceptions, multiple-choice questions, distractor analysis, multidimensional nominal categories model, mean-field variational inference, hierarchical priors

    Received: 04 Oct 2024; Accepted: 06 Jan 2025.

    Copyright: © 2025 Segado, Adair, Stewart, Ma, Drury and Pritchard. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

    * Correspondence: Martin Segado, Department of Mechanical Engineering, School of Engineering, Massachusetts Institute of Technology, Cambridge, 02139-4307, Massachusetts, United States
