JALEX: Japanese version of lexical decision database

Ota, Naoto; Mochizuki, Masaya

doi:10.3389/flang.2024.1506509

DATA REPORT article

Front. Lang. Sci., 13 January 2025

Sec. Psycholinguistics

Volume 3 - 2024 | https://doi.org/10.3389/flang.2024.1506509

JALEX: Japanese version of lexical decision database

Naoto Ota¹^*

Masaya Mochizuki²

¹Department of Psychology, Aichi Shukutoku University, Nagakute, Japan
²College of Humanities and Sciences, Nihon University, Setagaya, Tokyo, Japan

1 Introduction

Several lexical databases have been developed in both English-speaking countries and other countries, leading to numerous studies using these resources. A prominent example is the English Lexicon Project (ELP; Balota et al., 2007), a large-scale database containing behavioral data on English word processing. The ELP provides data for two main tasks: the lexical decision task (LDT) and the speeded naming task. Among these, the LDT is the most commonly utilized in word-processing research, largely because (1) it is easy to implement, and (2) it can be conducted online with relative ease (Lieber et al., 2014).

In the LDT, participants are asked to decide as quickly and accurately as possible whether a visually presented string of letters forms a real word or a non-word. By analyzing the response time from when the string is presented until the participant makes a decision, researchers can evaluate the speed of word access and semantic processing. The LDT has been employed not only to assess word processing efficiency and cognitive load but also to investigate the structure of the mental lexicon and concept representation. For example, researchers have examined the relationship between LDT response times and various word properties, including the frequency effect, where more frequent words are processed faster and reexamined using the LDT data (Brysbaert et al., 2011).

Numerous psycholinguistic studies have explored the semantic properties of word recognition using LDT data. Recently, LDT databases have expanded beyond English, with resources available in languages such as Chinese (Tse et al., 2017), French (Ferrand et al., 2018, 2010), and Spanish (Aguasvivas et al., 2018), allowing for more efficient research across languages. For example, researchers have tested hypotheses involving grounded cognition and embodied cognition (Barsalou, 2008) in word recognition and explored the relationship between word recognition and sensorimotor information across various languages, e.g., English (Pexman et al., 2019; Sidhu et al., 2014), French (Lalancette et al., 2024), and Spanish (Alonso et al., 2018). They further examined theoretical predictions with large-scale survey data, often using lexical decision task (LDT) reaction times as the dependent variable in regression analyses. While earlier findings have supported these theories by showing consistent trends across languages, recent discussions have highlighted cross-linguistic variability in these effects (Alonso et al., 2018; Lalancette et al., 2024). Such hypothesis testing using a database reduces stimulus bias by incorporating many words (see Dymarska et al., 2023) and enables new discoveries through cross-linguistic comparisons.

In Japanese, several databases are available, as will be discussed later. For example, databases exist for attributes such as word imageability (Sakuma et al., 2005) and familiarity (Asahara, 2020), each containing evaluative data for tens of thousands of words. These databases have long been used in various ways, such as serving as control variables in numerous Japanese word recognition studies (e.g., Mizuno and Matsui, 2018; Mochizuki and Ota, 2020, 2024). However, no LDT database currently exists for Japanese, posing a challenge to psycholinguistic research on the Japanese language as a result of limited resources. Of course, lexical decision tasks have been widely used in Japanese word recognition studies (e.g., Kawakami, 2002; Kusunose et al., 2013). However, the number of stimulus words used in these studies is significantly smaller compared to databases such as the ELP (Balota et al., 2007). Furthermore, the data are not always publicly available, which limits their utility as resources. Given the increasing emphasis on cross-linguistic validation—particularly in studies of abstract concepts shaped by language and culture (Dove, 2018)—developing a large-scale Japanese LDT database would not only aid Japanese researchers but also contribute to the broader field. Therefore, this study aimed to construct a Japanese version of LDT database.

It is important to note that individual differences in LDT response times exist (e.g., Hawker and Ferraro, 2007; Yates and Slattery, 2019; Lim et al., 2020). To enhance the database, we collected data on participants' individual characteristics following the LDT. Specifically, participants completed the ENDCOREs, which measures interpersonal communication skills (Fujimoto and Daibo, 2007), and the Japanese version of the Plymouth Sensory Imagery Questionnaire (Psi-Q; Fukui and Aoki, 2022). The ENDCOREs assesses six dimensions of communication: self-control, expressiveness, comprehension, assertiveness, acceptance of others, and relational adjustment. The Psi-Q evaluates the vividness of mental imagery across sensory modalities (i.e., vision, sound, smell, taste, touch, body, and emotion), capturing individual differences in multisensory imagery.

Although we do not hypothesize a direct relationship between these individual difference variables and simple LDT response times (e.g., the higher/lower a score, the slower/faster the response time), they may serve as possible predictors for validating certain content. For instance, the grounded or embodied cognition framework (Barsalou, 2020, 2008) posits that processing words or concepts involves simulating the sensory modalities through which they are acquired. Consistent with this, processing words rich in sensorimotor information tends to be more efficient (Lynott et al., 2020; Siakaluk et al., 2008; Sidhu et al., 2014; Sidhu and Pexman, 2016; Tillotson et al., 2008). Individual differences in sensitivity to sensory and motor modalities may interact with word characteristics and influence LDT performance. Furthermore, the “Words as Social Tools” (WAT) perspective (Borghi and Binkofski, 2014) posits that simulating social and linguistic information is crucial for understanding abstract concepts (Borghi et al., 2019). Therefore, words with a stronger social nature may be processed more efficiently (Diveica et al., 2023), and the interaction between verbal sociality and individual sociality may affect LDT response times. Since ENDCOREs reflect an individual's communication skills, individuals with high social interaction skills may find it easier to simulate socially relevant words. Consequently, they might be more efficient in processing abstract words with strong social characteristics. While the present study did not specifically examine the relationship between individual differences and LDT response times, future research could benefit from incorporating these variables into the database.

This report introduces the Japanese LDT database (JALEX), which incorporates individual differences among participants. The response time and accuracy data can be used for future psycholinguistic studies involving Japanese participants. Additionally, while no hypotheses were tested, future research may explore the role of individual differences as needed.

2 Method

2.1 Participants

In the development of psycholinguistic norms, ~30–40 observations per word are typically required (Balota et al., 2007; Ferrand et al., 2018). However, we recruited a relatively large number of participants to account for potential dropouts, as this was an online study, and to develop more reliable norms.

Participants were recruited through a crowdsourcing service Yahoo! Crowdsourcing (https://crowdsourcing.yahoo.co.jp/). A total of 2,689 individuals accessed the task. However, 1,037 either did not start, failed to complete the task, or provided no responses. Ultimately, 1,652 participants completed the task. All participants self-reported as native Japanese speakers. Among them, 1,226 were men, 407 were women, two identified as other genders, and 17 chose not to respond. The mean age was 51.07 years (SD = 11.91), with a range from 18 to 85 years. The participants' highest levels of education were as follows: 26 had completed doctoral programs, 119 had master's degrees, 1,069 were college graduates, 21 had finished high school, 28 had completed junior high school, and 26 chose not to respond. As detailed below, the words were divided into 38 lists. With 1,652 participants, this resulted in ~43 participants per list.¹

2.2 Stimuli

To develop JALEX databases, we selected words with semantic properties listed in multiple extant databases (DBs). This approach ensured consistency with previous word recognition studies and supported continuity in future research. We followed a specific selection procedure. First, we used the Word List by Semantic Principles, revised and enlarged edition (WLSP, National Institute for Japanese Language Linguistics, 2004) as the master list. From this, we selected words that appeared in all eight of the following DBs: the word familiarity DB (Asahara, 2020), an alternate word familiarity DB (Fujita and Kobayashi, 2020), the word frequency DB (Amano and Kondo, 2000), the NINJAL-LWP for TWC word frequency DB (University of Tsukuba, 2013), the word difficulty DB (Kajiwara et al., 2020), the imageability DB for visual words (Sakuma et al., 2005), the semantic orientations DB (Takamura et al., 2005), and the abstractness DB for Japanese words (The Social Computing Laboratory, 2021). Following this procedure, we selected 5,736 Japanese words as stimuli. These included 4,977 nouns, 648 verbs, and 111 adjectives.

For each word, linguistic characteristics such as orthographic neighborhood size (ONS), phonological neighborhood size (PNS), orthographic Levenshtein distance 20 (OLD20, Yarkoni et al., 2008), the number of letters, and the number of morae (a rhythmic unit of sound) were calculated. The PNS was computed by decomposing the “phonetic” (読み) variable in the WLSP (National Institute for Japanese Language Linguistics, 2004) by mora and calculating how many words in the WLSP had one mora replaced. Similarly, the ONS was calculated by decomposing the “letter (見出し本体)” variable in the WLSP into individual characters and determining how many words had one letter replaced. OLD20 was calculated using the old20 function in the vwr package (Keuleers, 2013) in R (R Core Team, 2022), based on the “letter (見出し本体)” variable in the WLSP.

In addition, non-words were constructed as filler items for the LDT. First, from the WLSP, we excluded words with one mora, words containing spaces, symbols, or particles, homophones, and items with repetitive morae (e.g., ha-ha-ha [ha/ha/ha]), as these could not be transformed into non-words using the procedure described below. The remaining items were then decomposed into morae, and each mora was randomly shuffled. If the resulting item was not found in the WLSP, it was considered a non-word candidate. This process yielded 63,305 non-word candidates, from which we randomly selected 5,736 to serve as fillers for the LDT. The authors reviewed these candidates, and those deemed too similar to real words were replaced with different non-word candidates. All non-word stimuli are available for reference on Open Science Framework (OSF).

The words and non-words were randomly divided into 38 lists, each containing 150 or 151 words (150 × 2 + 151 × 36 = 5,736) with an equal number of non-words.

2.3 Procedure

The LDT task was conducted online, and participants accessed the LDT program via their own PCs. The program was created using PsychoPy (Peirce et al., 2019) and hosted on Pavlovia (https://pavlovia.org/). After obtaining informed consent from the participants, they were instructed to begin the task. In the LDT, a blank screen appeared for 200 ms, followed by a fixation point in the center of the screen for 300 ms. A string of characters was then presented, and participants had to decide as quickly and accurately as possible whether the string represented a real Japanese word. The string remained on the screen until a response was made or for up to 2,000 ms. Participants pressed the “L” key for words and the “S” key for non-words. If the response was correct, the task proceeded to the next trial; if incorrect, a feedback message (“Wrong”) appeared in red. If no response was given within 2,000 ms, the feedback message (“Too late”) was displayed in red for 300 ms. Words and non-words were presented in random order. Participants completed 20 practice trials before starting the actual task. The practice trials used different stimuli from those in the actual task.

During the task, participants were allowed to take a break for a maximum of 60 s between the 100 and 200th trials. During the break, their percentage of correct answers was displayed to encourage them to continue. Upon completing the LDT, participants answered the ENDCOREs (Fujimoto and Daibo, 2007) and Psi-Q (Fukui and Aoki, 2022) questionnaires.² Additionally, they provided demographic information, including gender, age, dominant hand, highest level of education, and native language. Data were collected on May 20 and May 21, 2024.

3 Data processing

We calculated the accuracy rate for each participant, and the lowest percentage of correct responses exceeded 75%. Since no participants demonstrated an exceptionally low accuracy rate, data from all participants were retained for analysis.

The procedure for processing the response time data followed that employed in ELP (Balota et al., 2007). First, we extracted only correct trials, where the “L” key was pressed for word stimuli, and excluded any trials with response times below 200 ms. Second, we removed trials that deviated by ±3 SD from the participant's mean response time. This resulted in the exclusion of 1.96% of trials as outliers.

4 Dataset overview

The distribution of response times averaged by item is showed in Figure 1. We presented the partial correlations with existing DB variables referenced in stimulus selection to examine the convergent validity of the response time data (Figure 2). These findings confirmed the phenomena predicted in prior studies. Specifically, we confirmed the frequency effect (Rubenstein et al., 1971) and familiarity effect (Connine et al., 1990), where lexical decision times decrease as word frequency and familiarity increase. We also observed the imageability effect (Balota et al., 2004), where higher imageability leads to faster lexical decisions, and the orthographic similarity effect (Yarkoni et al., 2008), where greater Levenshtein distance results in longer response times. While few studies have reported simple or partial correlations with these variables in Japanese, several experimental studies using Japanese words as stimuli have observed effects similar to those identified in the present study. For instance, Japanese word recognition research has reported faster word processing for words with higher imageability (Ogawa and Nittono, 2018) and higher frequency (Mizuno and Matsui, 2015). The relationship between response time and difficulty rating (Kajiwara et al., 2020) has yet to be investigated. However, it is reasonable to predict that more difficult words would require longer processing times for comprehension. The current analysis identified a slight positive correlation between word difficulty and response time. These findings suggest that JALEX is valid to a considerable extent.

Figure 1

Figure 1. Histogram of response time in lexical decision task (LDT).

Figure 2

Figure 2. Partial correlation between measures in LDT and each variable. Crosses in the figure indicate non-significant correlations. LDT, Lexical Decision Task; meanRT, Mean response time in the LDT; ACC, Mean accuracy in the LDT; ONS, Orthographic Neighborhood Size; PNS, Phonological Neighborhood Size; OLD20. Orthographic Levenshtein Distance 20; were calculated by the authors. The following variables are derived from previous studies: difficulty (Kajiwara et al., 2020), frequency (log-transformed, University of Tsukuba, 2013), familiarity (Fujita and Kobayashi, 2020), imageability (Sakuma et al., 2005), orientation (semantic orientations, Takamura et al., 2005), and abstractness (The Social Computing Laboratory, 2021). The diagonal elements in the matrix shown in the figure represent the mean and standard deviation for each variable, while the values at the bottom indicate the range of each variable.

The partial correlations between familiarity, semantic orientation, abstractness, and response time were significant, but the effects were small. Of these, the zero-order correlation for familiarity was r = −0.42, suggesting that higher familiarity facilitates responses when not adjusted for covariates. Zero-order correlations for abstractness revealed a small effect (r = 0.11), indicating that processing was slightly suppressed for more abstract words, consistent with the representativeness effect (Cortese and Balota, 2012). This study also found that words with high ONS had shorter lexical decision times. The results showed that high ONS words took less time to judge than low ONS words when using Kanji words (Mizuno and Matsui, 2014), which is consistent with the current results. However, when using Katakana words, the inhibitory effect was observed, indicating that low ONS words took less time to judge than high ONS words (Kawakami, 2002). Furthermore, an interaction between ONS and PNS has also been observed in lexical decision performance for katakana words (Hino et al., 2011). The difference in these results may be caused by the limited number of words used in the experiment and the factors of the orthographic form. In studies using word norms, it is particularly important to consider the extent of word coverage and the absence of bias (Dymarska et al., 2023). The failure to replicate the effects observed in previous studies in the present analysis of a relatively large database may be attributed to biases in the stimulus sets used in those studies, which could have significantly influenced their results. Future research should assess the reproducibility of findings from previous studies by leveraging large databases, such as JALEX, and conducting comprehensive analyses.

5 Theoretical implications and future studies

In the present study, the imageability effect (Balota et al., 2004) was replicated even after controlling for linguistic statistical variables, such as the frequency of neighboring words. The effects of psycholinguistic variables, such as the imageability effect, are often discussed in relation to semantic richness (Pexman et al., 2013). Semantic richness refers to the idea that words associated with more semantic information have richer semantic representations, enabling them to be processed more quickly and accurately. In other words, our study replicates in Japanese the finding that the ease of forming a mental image is an important semantic variable in word representations. As discussed in the introduction, the relationship between sensorimotor information and word recognition is explained by the concept of semantic richness—specifically, the richness of the semantic dimension of sensorimotor information facilitates word recognition. In future studies, it will be important to investigate the nature of concept representations by examining psycholinguistic variables influencing word recognition beyond imageability.

Furthermore, this study is the first DB of LDT to include individual difference variables for respondents, paving the way for future research on individual differences using JALEX. In word recognition research, it has been observed that certain words exhibit significant individual differences and high variability in ratings of psychological variables (Paisios et al., 2023). A key limitation of the previous DB of LDT is that they did not provide individual difference data for participants, making them unsuitable for studying individual differences in words with high variability in ratings among individuals. Future research using JALEX is expected to refine further grounded cognition theory (Barsalou, 2020, 2008) and advance WAT theory (Borghi and Binkofski, 2014), particularly by promoting individual difference studies on the simulation of sensorimotor information and those related to social communication.

6 Strengths and limitations

This database represents the most comprehensive dataset on the efficiency of Japanese visual word processing and stands as a powerful resource for future research in psychology and linguistics. A unique feature of this database is its inclusion of individual difference variables for participants, allowing researchers to analyze these differences in future studies.

However, it is important to note that some words in the dataset had lower accuracy rates. For example, at least 15 items had a correct response rate below 70%, with fewer than 20 observations. Items with fewer observations may exhibit lower reliability and reproducibility compared to others. While we did not exclude these items in the current analysis, researchers should be mindful of their presence when using the database.

Data availability statement

The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found at: https://osf.io/qr2sg, Open Science Framework.

Ethics statement

The studies involving humans were approved by Ethics Committee of College of Humanities and Sciences, Nihon University. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.

Author contributions

NO: Conceptualization, Formal analysis, Validation, Visualization, Writing – original draft. MM: Conceptualization, Data curation, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Writing – review & editing.

Funding

The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This work was supported by JSPS KAKENHI Grant Number: JP24K15684.

Acknowledgments

The authors would like to thank all the participants of the study.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Gen AI was used in the creation of this manuscript.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Footnotes

1. ^Note that, due to program constraints, not exactly the same number of participants per list was allocated.

2. ^As previously mentioned, we do not assume a direct relationship between individual difference variables and reaction time. Consequently, we did not investigate the association between these variables in this study.

References

Aguasvivas, J. A., Carreiras, M., Brysbaert, M., Mandera, P., Keuleers, E., and Duñabeitia, J. A. (2018). SPALEX: a Spanish lexical decision database from a massive online data collection. Front. Psychol. 9:2156. doi: 10.3389/fpsyg.2018.02156

PubMed Abstract | Crossref Full Text | Google Scholar

Alonso, M. Á., Díez, E., Díez-Álamo, A. M., and Fernandez, A. (2018). Body-object interaction ratings for 750 Spanish words. Appl. Psycholingust. 39, 1239–1252. doi: 10.1017/S0142716418000309

Crossref Full Text | Google Scholar

Amano, S., and Kondo, K. (2000). NTT Database Series. Lexical Properties of Japanese, No. 7. Frequency (Tokyo: Sanseido).