- 1Department of Special Education and Communication Disorders, University of Nebraska–Lincoln, Lincoln, NE, United States
- 2Department of Speech and Hearing Science, University of Illinois at Urbana-Champaign, Champaign, IL, United States
Standardized, norm-referenced language assessment tools are used for a variety of purposes in education, clinical practice, and research. Unfortunately, norm-referenced language assessment tools can demonstrate floor effects (i.e., a large percentage of individuals scoring at or near the lowest limit of the assessment tool) when used with some groups with neurodevelopmental disorders (NDDs), such as individuals with intellectual disability and neurogenetic syndromes. Without variability at the lower end of these assessment tools, professionals cannot accurately measure language strengths and difficulties within or across individuals. This lack of variability may be tied to poor representation of individuals with NDDs in normative samples. Therefore, the purpose of this study was to identify and examine common standardized, norm-referenced language assessment tools to report the representation of individuals with NDDs in normative samples and the range of standard/index scores provided. A systematic search identified 57 assessment tools that met inclusion criteria. Coding of the assessment manuals revealed that most assessment tools included a “disability” or “exceptionality” group in their normative sample. However, the total number of individuals in these groups and the number of individuals with specific NDDs were small. Further, the characteristics of these groups (e.g., demographic information, disability type) were often poorly defined. The floor standard/index scores of most assessment tools were in the 40s or 50s. Only four assessment tools provided a standard score lower than 40. Findings of this study can assist clinicians, educators, and researchers in their selections of norm-referenced assessment tools when working with individuals with NDDs.
Introduction
Because language development is critical to meeting the demands of everyday life, accurate language assessment is essential for diagnosing primary language disorders, identifying secondary language difficulties across other neurodevelopmental disorders (NDDs), and ultimately developing effective, targeted intervention and treatment plans that include monitoring progress over time. However, most commonly used assessment tools were not developed specifically for individuals with NDDs, and professionals who work with this population are often left with little guidance as to which tools to select. Therefore, the purpose of this study is to identify and examine common standardized, norm-referenced assessment tools of language to report the representation of individuals with NDDs in normative samples and the range of standard/index scores provided. This information can assist clinicians, educators, and researchers in their selections of norm-referenced assessment tools when working with individuals with NDDs.
Neurodevelopmental disorders are common in the United States, with birth cohort data (n > 3.3 million children) reporting that by 8 years of age, 23.9% of publicly insured children and 11% of privately insured children had a diagnosis of one or more NDDs (Straub et al., 2022). NDDs include a range of conditions resulting from either a genetic or multifactorial etiology (i.e., a combination of genetic and environmental factors) that occur during the developmental period and that are characterized by delays in cognition, communication, behavior, and/or motor skills (American Psychiatric Association [APA], 2013; Van Herwegen et al., 2015; World Health Organization [WHO], 2019). These conditions impact personal, social, academic, and/or occupational functioning (American Psychiatric Association [APA], 2013; Van Herwegen et al., 2015; World Health Organization [WHO], 2019). Specific NDDs include intellectual disability, communication disorders, autism, attention-deficit/hyperactivity disorder (ADHD), neurodevelopmental motor disorders (e.g., developmental coordination disorder, stereotypic movement disorder, and tic disorders), specific learning disorders, and some neurogenetic syndromes (e.g., Down syndrome, fragile X syndrome, and Williams syndrome). Different NDDs often co-occur; for example, individuals with autism may also have an intellectual disability or ADHD, and individuals with Down syndrome typically also have an intellectual disability (American Psychiatric Association [APA], 2013; Van Herwegen et al., 2015).
Many individuals with NDDs experience difficulties with language, though the exact pattern of these difficulties varies across diagnoses and individuals (e.g., Luyster et al., 2011). Some individuals have an NDD in which the primary diagnosis is specific to language. For example, developmental language disorder (under the umbrella of communication disorders) is linked to difficulties with pragmatics and structural aspects of language (e.g., Reed, 2018). Other individuals may have a different primary NDD but still also have language difficulties. For example, ADHD is often associated with secondary language difficulties in pragmatics (e.g., Geurts and Embrechts, 2008; Hawkins et al., 2016; Helland et al., 2016). Individuals with intellectual disability and neurogenetic syndromes also experience a range of difficulties in spoken language (Abbeduto et al., 2016; McDuffie et al., 2017), but the exact patterns of strength and difficulty often vary across different etiologies. For example, individuals with Down syndrome typically have relative strengths in vocabulary but more significant difficulties in grammar and syntax, whereas individuals with Williams syndrome tend to have relative strengths in concrete vocabulary but difficulties with relational vocabulary and pragmatics (Abbeduto et al., 2019).
One of the most common ways to measure language abilities is via standardized, norm-referenced language assessment tools. Norm-referenced assessment tools refer to those that have been tested (i.e., “normed”) on a large group of individuals meant to represent the age and demographic makeup of those for whom the test is intended to be used (Peña et al., 2006). When the assessment is administered in a standardized way, as outlined in the administration manual, an individual’s performance can then be compared to that of the normative sample to see how the individual compares to peers of a similar age and demographic makeup. However, the exact makeup of the normative sample can influence the scores of a norm-referenced assessment tool and its outcomes for the individual who is assessed (Peña et al., 2006; Spaulding et al., 2006). Thus, which norm-referenced language assessment tool a professional should use depends on the purpose of the assessment and the individual being assessed.
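To make the scoring logic above concrete, the following is a minimal sketch, in Python, of how a raw score is located within a normative distribution and re-expressed on the common mean-100/SD-15 standard-score metric. The normative raw-score statistics in the example are invented for illustration and do not come from any of the assessment tools reviewed here.

```python
# Minimal sketch of norm-referenced scoring: a raw score is converted to a
# standard score by locating it within the normative sample's distribution.
# Assumes the common standard-score metric (mean = 100, SD = 15); the
# normative raw-score statistics below are invented for illustration.

def standard_score(raw, norm_mean, norm_sd, metric_mean=100.0, metric_sd=15.0):
    """Convert a raw score to a standard score via a z-score transformation."""
    z = (raw - norm_mean) / norm_sd
    return metric_mean + metric_sd * z

# Example: a raw score of 34 compared to a hypothetical normative group
# (for the examinee's age band) with raw-score mean 40 and SD 6.
print(standard_score(34, norm_mean=40, norm_sd=6))  # 85.0, i.e., 1 SD below the mean
```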
Two primary purposes of language assessment are to (1) diagnose language disorders and (2) describe language profiles. When the primary purpose of a language assessment is for diagnosis, a professional may want to select a norm-referenced assessment tool that did not include individuals with disabilities in the normative sample. Individuals with a primary language disorder may exhibit subtle, yet meaningful, language delays in which scores fall close to the diagnostic cut-off. In these cases, if individuals with disorders were included in the normative sample of the assessment tool being used, the normative group mean would be lower, with an increased standard deviation, resulting in decreased classification accuracy for identifying language impairment (i.e., a missed diagnosis), as demonstrated in a simulation study by Peña et al. (2006). On the other hand, for individuals with NDDs whose primary diagnosis is something other than a communication disorder (e.g., intellectual disability), the purpose of a language assessment is not typically for diagnosis but rather to describe their language profile and/or to identify their areas of strength or difficulty. This information can be used to guide intervention and academic planning. In these instances, it is important that norm-referenced assessment tools are not only reliable and valid for use in this population but that they also capture a wide range of skill levels, including at the lower-performing end where individuals with intellectual disability often fall.
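The classification consequence described by Peña et al. (2006) can be illustrated with a toy simulation. The sketch below is not their procedure; all parameters (group means, sample sizes, cut-off) are invented. It shows how adding a lower-scoring disorder group to the normative sample lowers the normative mean, inflates the SD, and reduces the proportion of truly affected individuals who fall below a fixed cut-off.

```python
# Toy simulation in the spirit of Peña et al. (2006): "contaminated" norms
# reduce sensitivity at a fixed cut-off. All parameters are invented.
import numpy as np

rng = np.random.default_rng(0)
typical = rng.normal(100, 15, 9_000)         # unaffected children
affected = rng.normal(78, 15, 1_000)         # children with a language disorder

clean = typical                              # disorder-free normative sample
mixed = np.concatenate([typical, affected])  # norms that include affected children

cutoff_z = -1.25                             # a common eligibility cut-off
for label, norms in [("clean norms", clean), ("mixed norms", mixed)]:
    z = (affected - norms.mean()) / norms.std()
    sensitivity = (z < cutoff_z).mean()      # proportion of affected children identified
    print(f"{label}: mean={norms.mean():5.1f}, sd={norms.std():4.1f}, "
          f"sensitivity={sensitivity:.2f}")
```

Under these invented parameters, the mixed norms identify noticeably fewer affected children at the same cut-off, which is the missed-diagnosis risk described above.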
Unfortunately, many standardized, norm-referenced assessment tools are not normed beyond three or four standard deviations below the normative mean, causing many individuals with NDDs, such as individuals with intellectual disability, to score at the floor on standard/index scores (e.g., Spaulding et al., 2006; Kasari et al., 2013; DiStefano et al., 2020). Floor effects occur when a large percentage of individuals have standard scores at or near the lowest limit of an assessment tool because its measurement range does not extend low enough to capture low levels of skill/performance (Hessling et al., 2004; McBee, 2010; Zhu and Gonzalez, 2017). Floor effects limit variability or separation in standard scores at the lower end of the assessment tool, and information regarding true differences across individuals is lost. These compressed scores, in turn, prevent researchers, clinicians, and educators from accurately capturing language strengths and difficulties within or across individuals and from tracking whether individuals make clinically meaningful gains over time (Hessl et al., 2009; Sansone et al., 2014; Esbensen et al., 2017).
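A floor effect can be thought of as left-censoring. The minimal sketch below, under invented parameters, shows how reporting every true score below the floor as the floor value piles individuals up at the minimum and erases real differences among them.

```python
# Floor effect as left-censoring: true scores below the floor are all
# reported as the floor value, so low-end variability is lost.
# Parameters are invented for illustration.
import numpy as np

rng = np.random.default_rng(1)
true_scores = rng.normal(55, 12, 500)       # hypothetical low-performing group
floor = 40
reported = np.maximum(true_scores, floor)   # anything below 40 becomes 40

print(f"{(reported == floor).mean():.0%} of the group scores at the floor")
print(f"true SD = {true_scores.std():.1f}, reported SD = {reported.std():.1f}")
# Two individuals with true scores of 25 and 38 both receive a reported
# score of 40, so the real difference between them is invisible.
```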
This issue of compressed scores is reflected in recent calls for the development of appropriate outcome measures for individuals with intellectual disability and neurogenetic syndromes (Esbensen et al., 2017; Hendrix et al., 2020). Floor effects have even been linked to recent failed pharmacological clinical trials for individuals with neurogenetic syndromes (Berry-Kravis et al., 2013; Budimirovic et al., 2017; Esbensen et al., 2017; see Abbeduto et al., 2020 for an overview; Baumer et al., 2022). Thus, many researchers, clinicians, and educators working with individuals with NDDs are pushing to develop more sensitive measures for use with these populations. Although there has been research addressing floor effects in cognitive/IQ tests (Hessl et al., 2009; Sansone et al., 2014), this line of research has not yet been extended to norm-referenced language assessment tools. At the same time, there is, and will continue to be, a need to use norm-referenced language assessment tools with this population, especially in clinical practice. Therefore, professionals who are assessing individuals who have an NDD that is not a primary language disorder and who are likely to score at the lower level of the assessment (e.g., intellectual disability) should select a norm-referenced assessment tool that has a low floor, to improve their ability to identify areas of strength and difficulty.
Given the variability in language skills across and within individuals with NDDs, and the various purposes of norm-referenced language assessment tools, researchers, educators, and clinicians need to be able to make informed decisions to select assessment tools that best meet their needs. Some may need norm-referenced assessment tools that did not include individuals with NDDs in their normative samples for better classification/diagnostic accuracy. Others may need norm-referenced assessment tools that have included individuals with NDDs in their normative samples, or at least that demonstrate variability at lower-performing ends of the assessment tool. Unfortunately, information on normative samples and psychometric properties is often not easily accessible before purchase, making it difficult to identify if a specific norm-referenced assessment tool meets one’s needs. Therefore, the purpose of this study was to:
1) Identify common standardized, norm-referenced assessment tools of language.
2) Report the representation of individuals with NDDs in their normative samples and the range of standard/index scores available.
Materials and Methods
Inclusion Criteria
To be included in our review, language assessment tools had to be a direct measure of oral language (e.g., the assessment tool could focus on any aspect of oral language, including phonology, but could not focus exclusively on articulation/speech or mostly on academics), have been published in the last 20 years (i.e., in or after 2002), and have been normed in the United States for individuals 22 years or younger (i.e., covering the developmental period; Schalock et al., 2021). In addition, the measure had to be published in English and be commercially available for purchase from a main publishing house in the United States. Five main publishing houses in the United States were identified for review: Brookes Publishing, PARInc, Pearson, ProEd, and WPS. Screeners and caregiver-, teacher-, or self-report measures were not included.
Procedures
Identification of Assessment Tools
Each of the five publishing houses’ websites was reviewed by two independent research assistants. The research assistants reviewed all assessment tools listed or tagged on the website as “speech and language” (or similar). Using the search function, they also searched for each of the following terms: “language,” “grammar,” “syntax,” “morphology,” “vocabulary,” “phonology,” “pragmatics,” “listening comprehension,” and “auditory processing.” Research assistants excluded any assessment tools that clearly did not meet the inclusion criteria but defaulted to including any language assessment tools for which it was unclear whether they met the study’s inclusion criteria. The first three authors made the final decisions on which assessment tools to include when reviewers disagreed or when all reviewers were unsure. This process resulted in the identification of 55 assessment tools.
The assessment tool list was then reviewed by one university speech-language clinic director and one speech-language pathology clinical assistant professor with expertise in school-age language disorders. The clinicians were asked to review the list of assessment tools to determine if any language assessment tools were missed in the review. This process resulted in the inclusion of two additional assessment tools for a total of 57 assessment tools.
Coding of Assessment Tools
Following the identification of assessment tools, each assessment tool’s administration or technical manual was independently reviewed and coded by two research assistants for the variables listed below. Discrepancies were identified and resolved by the first and fourth authors, with assistance from research assistants, by consulting the assessment tool manual.
Variables
Full Normative Samples
The full normative sample of each assessment tool was coded for the total sample size and demographic information, including sex/gender, race/ethnicity, and geographic region. Each assessment tool was also coded for the chronological ages it was normed for and whether the socioeconomic status of its normative sample was considered/reported.
Standard/Index Scores
We also documented the minimum and maximum standard/index scores provided by each assessment tool.
Inclusion of Individuals With Disabilities and Specific Neurodevelopmental Disorders in the Normative Sample
Many assessment tools included individuals with “disabilities” or “exceptionalities” in their normative samples without clearly differentiating disability type. For this reason, each assessment tool was coded for the total number of individuals with disabilities included in the normative sample. When possible, this information was also reported by disability type, including specific NDDs (e.g., number with intellectual disability, autism spectrum disorder, ADHD, learning disability). Demographic information was also coded for disability groups.
Results
Full Normative Samples
Demographic information on the full normative samples is reported in Table 1. A majority (n = 45/57) of assessment tools had normative sample sizes of over 1,000 individuals with relatively equal numbers of males and females. These samples included individuals from all regions of the United States, though five assessment manuals did not specify where their participants were from, and one did not have participants representing all regions of the United States. Sample diversity (defined in terms of race and ethnicity) was reported for all but one assessment tool [i.e., the Bilingual English-Spanish Assessment, BESA (Peña et al., 2018)] and varied across assessment tools. Most assessment tools (n = 49/57) considered some aspect of socioeconomic status (e.g., maternal education, income, and/or percentage receiving free or reduced lunch).
Standard/Index Scores
The floor standard/index score of most assessment tools was in the 40s (n = 27/57) or 50s (n = 23/57). Three assessment tools had floor scores in the 60s [i.e., Clinical Assessment of Pragmatics, CAPs (Lavi, 2019); Communication and Symbolic Behavior Scales, Normed Edition, CSBS (Wetherby and Prizant, 2002); Listening Comprehension Test, LCT-2 (Bowers et al., 2006)]. Only four measures from our list provide a standard score lower than 40. The Phonological Awareness Test, Second Edition: Normative Update (PAT-2:NU; Robertson and Salter, 2018) provides standard scores down to 39. The WORD Test 3 Elementary (WORD-3; Bowers et al., 2014) provides scores down to 35. The Test of Adolescent and Adult Language (TOAL-4; Hammill et al., 2007) provides scores down to 34, and the Test of Early Communication and Emerging Language (TECEL; Huer and Miller, 2011) provides standard scores down to 25.
Inclusion of Individuals With Disabilities and Specific Neurodevelopmental Disorders in the Normative Sample
Number of Individuals With Disabilities and Neurodevelopmental Disorders in Normative Samples
Of the 57 assessment tools, 52 indicated that they included individuals with disabilities in the normative sample (numbers and percentages of individuals with disabilities and specific NDDs are reported in Table 2). However, five assessment tools did not include or report on any individuals with disabilities in their normative sample: the BESA (Peña et al., 2018), the Test of Integrated Language and Literacy Skills (TILLS; Nelson et al., 2016), the Test of Phonological Awareness, Second Edition Plus (TOPA-2+; Torgesen and Bryant, 2004), the Test of Semantic Skills Primary (TOSS-P; Huisingh et al., 2002), and the Vocabulary Assessment Scales – Expressive (VAS-E) and Receptive (VAS-R; Gerhardstein Nader, 2013). These assessment tools are therefore not included in Table 2.
Nine assessment tools indicated that they may have included some individuals with disabilities, or alternatively did not exclude all individuals with disabilities. However, they did not track or report whether or how many individuals with disabilities were included. To be as inclusive as possible, these assessment tools are reported in Table 2.
For the remaining 43 assessment tools that clearly included individuals with disabilities in their normative samples, the total number varied across assessment tools. However, in most cases, this was a low percentage of the normative sample (ranging from 3% to <26%). Only six assessment tools had normative samples in which 20% or more of the normative sample had a disability or an “exceptionality status”: the Clinical Evaluation of Language Fundamentals, Fifth Edition (CELF-5; Wiig et al., 2013), Comprehensive Receptive and Expressive Vocabulary Test, Third Edition (CREVT-3; Wallace and Hammill, 2013), Khan-Lewis Phonological Assessment, Third Edition (KLPA-3; Khan and Lewis, 2015), Social Language Development Test – Adolescent: Normative Update (SLDT-A:NU; Bowers et al., 2017a), Test of Language Development – Intermediate: Fifth Edition (TOLD-I:5; Hammill and Newcomer, 2019), and Test of Pragmatic Language, Second Edition (TOPL-2; Phelps-Terasaki and Phelps-Gunn, 2007). Another 21 assessment tools had normative samples in which 10–19% had a disability. Fifteen assessment tools had normative samples in which less than 10% had disabilities, and one assessment tool [i.e., the Auditory Processing Abilities Test, APAT (Ross-Swain and Long, 2004)] had between 9 and 16%, though the exact percentage was unclear. Further, the overall sample size (n) of individuals with disabilities was not reported for all assessment tools. When possible, we estimated the overall percentage of individuals with disabilities based on the available information (e.g., reported n’s for specific NDDs). This method does not account for dual diagnoses, however, so the true number of individuals may be smaller than our estimate, as illustrated below.
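To clarify why this estimation method can only overstate the count, here is a toy worked example with invented numbers: summing the reported n’s of specific NDD subgroups double-counts anyone who appears in two diagnostic groups.

```python
# Toy example of the estimation method described above. Counts are invented
# and do not correspond to any reviewed assessment tool.
subgroup_ns = {"intellectual disability": 30, "autism": 25, "ADHD": 45}
normative_n = 1_000

estimated = sum(subgroup_ns.values())   # 100 individuals -> 10% of the sample
print(f"estimated: {estimated / normative_n:.0%}")

# If, say, 15 of the children with autism also have ADHD, they appear in two
# subgroup counts, so the number of unique individuals is smaller:
unique_individuals = estimated - 15     # 85 -> 8.5%
print(f"unique:    {unique_individuals / normative_n:.1%}")
```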
Descriptions and Demographic Information of Individuals With Disabilities and Neurodevelopmental Disorders in Normative Samples
The makeup (i.e., disability type and demographic information) of individuals with disabilities was often poorly defined for these 43 assessment tools. Ten assessment manuals did not specify what type(s) of disabilities were represented in their normative sample (i.e., the number of individuals with specific disabilities or NDDs such as intellectual disability or learning disabilities). Another 10 assessment tools only reported 2–3 disability groups (e.g., language impairment and “other/special education”). Further, no assessment tools reported race, ethnicity, or socioeconomic status information specifically for subgroups of individuals with disabilities in their normative samples.
Discussion
The purpose of this study was to identify the number and characteristics of individuals with NDDs in the normative samples of commonly used and commercially available standardized, norm-referenced language assessment tools. Our findings indicate that many of these assessment tools, though not all, did include some individuals with “disabilities” in their normative sample. However, the number of individuals with specific types of disabilities or NDDs was often very low, and minimal demographic information was provided about groups with disabilities.
We identified 43 assessment tools that included individuals with disabilities in their normative samples. These “disability” groups typically included individuals with any disabilities, not just NDDs, and the groups were often not broken down by disability type. Therefore, the number of individuals with NDDs, specifically, in the normative samples was often unclear. There was high variability in the percentages of individuals with “disabilities” in the normative samples, ranging from 3% to <26%. These rates align with some available prevalence data on individuals with disabilities in the United States. For example, 2019 United States census data reveal that 4.3% of children under 18 have a disability (Young, 2021), and 2020–2021 United States special education data indicate that 15% of 3-to-21-year-olds receive services under the Individuals with Disabilities Education Act (National Center for Education Statistics, 2022). However, the percentage of children with NDDs reported from public or private insurance data is even higher [i.e., 23.9 and 11%, respectively (Straub et al., 2022)]. It would be helpful if our results could be easily interpreted within the available prevalence data, but similar to the reporting of disabilities within the assessment tools we reviewed, these data are also difficult to interpret and vary based on how disabilities are defined. This presents a barrier to the selection of standardized assessment tools for these populations.
Another barrier is the lack of demographic information (i.e., race, ethnicity, and socioeconomic status) provided about the individuals with disabilities or NDDs in the normative samples of the assessment tools. Without this information, the diversity of the individuals in these groups is unknown. It is possible, for example, that there were no Black individuals with autism included in some normative samples or no Hispanic individuals with intellectual disability. Thus, it is unknown if the individuals with disabilities who were included are representative of these groups as a whole, including across race, ethnicity, and socioeconomic status. Together with the lack of definition of “disability” provided by many of the assessment tools we reviewed, it is unclear if their normative samples are representative of the population of individuals with disabilities or NDDs in the United States.
Considerations for the Selection of Norm-Referenced Language Assessment Tools
There are several scenarios in which clinicians, educators, and researchers must use standardized, norm-referenced language assessment tools with individuals who have, or who are suspected of having, NDDs. The information extracted from this study can be used by these professionals to guide both the selection of such assessment tools and the interpretation of their scores.
The decision about which language assessment tools are most suitable depends on the specific population of interest and the intended purpose of the assessment. Professionals using norm-referenced assessment tools to identify if an individual has a primary diagnosis of a communication disorder may want to choose an assessment tool that does not include individuals with “disabilities” in the normative sample because they are trying to determine if an NDD (e.g., a communication disorder) is present or absent. To make this determination, an individual’s score should be compared to a normative sample of peers who do not have an NDD. Relatedly, professionals using norm-referenced assessment tools to determine if clients qualify for services (e.g., special education services) may also wish to use assessment tools that do not include individuals with disabilities in their samples, because, as Peña et al. (2006) demonstrated, the presence of individuals with disabilities in the normative sample can lower the level of performance that falls within the average range. Consequently, it becomes less likely that an individual who has a disability will score below the average range and thus be eligible for services. This is particularly important when evaluating an individual with relatively mild delays. In our review, this included the BESA (Peña et al., 2018), the TILLS (Nelson et al., 2016), the TOPA-2+ (Torgesen and Bryant, 2004), the TOSS-P (Huisingh et al., 2002), and the VAS-E and VAS-R (Gerhardstein Nader, 2013).
In contrast, professionals working with individuals with NDDs may select a standardized assessment tool for the purpose of identifying areas of strength and need to support intervention and educational planning. This may be common when an individual has a primary diagnosis of an NDD other than a communication disorder, in which language is also affected (e.g., intellectual disability). In such cases, it is ideal to select an assessment tool that has been developed and normed with others who have a similar NDD and who are demographically similar to their client. This is especially important when working with clients who have more severe disabilities and who are at risk of performing at the floor level of an assessment tool. Thus, yet another important consideration is the range and floor level of the standard/index scores. Those with lower floors may allow for more separation between scores at the lower-performing end. This, in turn, allows for greater differentiation across specific skills or language domains. These assessment tools are also better options for professionals who are using norm-referenced assessment tools to monitor progress over time (e.g., clinical gains or intervention success). In our review, we identified that the floor score of some language assessment tools is in the 60s, most others have floors in the 40s or 50s, and only four provide standard scores lower than 40: the PAT-2:NU (Robertson and Salter, 2018), the WORD-3 (Bowers et al., 2014), the TOAL-4 (Hammill et al., 2007), and the TECEL (Huer and Miller, 2011).
Similarly, researchers who are documenting patterns of strength and difficulty to inform the field about different NDD phenotypes should also consider selecting norm-referenced assessment tools that have included individuals with NDDs and that have a wide range of standard/index scores with lower floors. This allows for more nuance in understanding the variability among participant samples, especially at the lower-performing end. The ability to include participant samples with more diverse language profiles can lead to more precise phenotyping that can ultimately be applied to develop evidence-based language interventions. This could also improve the likelihood of intervention success because intervention studies and clinical trials often fail to demonstrate response to treatment due in part to poor outcome measures (see Esbensen et al., 2017; Abbeduto et al., 2020). If language assessment tools can better differentiate among different language profiles, it may be possible for researchers to specify who does and does not respond to certain interventions. When researchers select measures that do not include individuals with NDDs in the normative sample, the interpretation of skills and abilities is reduced to comparisons with neurotypical peers. Instead, if individuals with NDDs are compared to individuals with other NDDs (e.g., Down syndrome vs. intellectual disability), areas of unique strength and need can be identified and used in treatment planning.
Future Directions and Recommendations for Holistic Language Assessment for Individuals With Neurodevelopmental Disorders
Several researchers have noted the limited utility of standardized, norm-referenced assessment tools for individuals with certain NDDs (e.g., intellectual disability and neurogenetic syndromes) and have started developing more sensitive measures for these populations (e.g., Berry-Kravis et al., 2013; Budimirovic et al., 2017; Esbensen et al., 2017; Abbeduto et al., 2020; Baumer et al., 2022). For example, Brady et al. (2012, 2018, 2020) developed the Communication Complexity Scale to assess communication skills in individuals who have intellectual disabilities and are minimally speaking, and Abbeduto et al. (2020) and Thurman et al. (2021) developed an expressive language sampling procedure for use with individuals with intellectual disability and neurogenetic syndromes. These measures capture more variability in language and communication skills in individuals with intellectual and developmental disabilities, with demonstrated evidence of their usefulness as outcome measures. Thus, professionals working with individuals with NDDs should consider these measures when tracking progress over time. These assessment tools can also be used by professionals working with neurotypical individuals; for example, Channell et al. (2018) documented that expressive language sampling in the context of narration showed age-related increases in syntactic complexity and lexical diversity from 4 to 18.5 years of age. As these language assessment tools continue to be tested and examined, professionals may have more options for assessing clients with NDDs.
In addition to these language/communication sampling assessment tools, there will continue to be a need for norm-referenced language assessment tools for use with individuals with NDDs. Thus, in the future, test developers should consider including a more representative number of individuals with NDDs not only in their normative samples but also in the iterative test development and standardization processes. Test developers should also consider the possibility of including separate norms for individuals with NDDs and/or for individuals who perform at the lower ends of their assessment tool (e.g., Hendrix et al., 2020). Importantly, test developers should better define the characteristics of the individuals with disabilities who are included and seek to include diverse samples of individuals with disabilities. Information about normative sample composition is a critical part of assessment tool selection; therefore, the inclusion of this information would aid professionals when determining the best assessment tool for an individual client or student.
Until then, standardized, norm-referenced assessment tools that do not include individuals with disabilities broadly and/or NDDs specifically can still be used when working with this population. In particular, professionals can examine item-level performance and/or use growth or deviation scores to track change over time (e.g., Sansone et al., 2014). Professionals should also continue to use holistic approaches to assessment when working with individuals with NDDs by supplementing norm-referenced assessment tools with additional non-standardized assessment tools and dynamic assessment methods (Haywood and Tzuriel, 2002; Grigorenko, 2009).
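As one concrete, hypothetical illustration of the deviation-score idea, in the spirit of Sansone et al. (2014) rather than their exact procedure, a professional can express a raw score as its distance from the age-based normative mean in normative SD units; unlike a floored standard score, this quantity continues to change as the individual grows. All normative values below are invented.

```python
# Hedged sketch of a deviation-score approach (in the spirit of Sansone
# et al., 2014): track raw-score distance from the age-based normative mean,
# in normative SD units, rather than the floored standard score.
# The normative values below are invented for illustration.

norms = {8: (42.0, 7.0), 10: (50.0, 8.0)}   # age -> (normative raw mean, SD)

def deviation_z(raw_score, age):
    mean, sd = norms[age]
    return (raw_score - mean) / sd

# A child at the test floor at both ages shows no change in standard scores,
# but the deviation score reveals growth relative to age expectations:
print(round(deviation_z(20, 8), 2))    # -3.14
print(round(deviation_z(31, 10), 2))   # -2.38: closer to age expectations at 10
```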
Study Limitations
There are several limitations to note in the current study. First, this review focused on normative samples, specifically. Many of the reported assessment tools have conducted follow-up validity or clinical research studies to test their measure on small groups of individuals with disabilities or NDDs. Although these participants are not included in the normative sample, the information can still be helpful for understanding if an assessment tool is appropriate for use with individuals with NDDs (e.g., if it will capture variability at lower ends, if items are appropriate, and/or if it yields valid and reliable scores in these populations). Future studies could review these validation studies to provide a comprehensive summary of the additional testing that has been conducted. Another limitation of the current study was that it excluded norm-referenced academic assessment tools that include a language subtest, as well as screeners and caregiver-, teacher-, and self-report measures. Therefore, we are unable to comment on their normative samples. Similarly, our review was limited to language, and we therefore cannot comment on norm-referenced measures of speech, other communication skills, or cognition more broadly. Lastly, although all discrepancies were resolved, we did not track the percentage of agreement across reviewers for the identification and coding of assessment tools and therefore cannot report inter-rater reliability.
Conclusion
Researchers, clinicians, and educators who work with individuals with NDDs must often use standardized, norm-referenced language assessment tools. Unfortunately, many norm-referenced assessment tools have floor effects when used with individuals with intellectual disability or neurogenetic syndromes. We proposed that these floor effects may be due, in part, to the limited inclusion of individuals with NDDs in normative samples. However, even if some professionals wanted to use norm-referenced assessment tools that included individuals with NDDs in their normative samples, or at least that demonstrate variability at lower-performing ends of the assessment tool, this information can be difficult to access. Therefore, we reviewed and reported the representation of individuals with disabilities and NDDs in the normative samples of standardized, norm-referenced language assessment tools, as well as the range of standard/index scores provided. This information can be used to guide professionals’ selections of assessment tools, based on the individual or sample of individuals they are working with and the purpose of the assessment.
Data Availability Statement
The original contributions presented in this study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.
Author Contributions
SL, MC, and LM conceptualized the study and drafted and edited the manuscript. SL and AB reviewed and coded assessments and drafted and edited tables for the Results section. All authors contributed to the article and approved the submitted version.
Funding
This project was funded by startup funds from the Department of Special Education and Communication Disorders and the College of Education and Human Sciences at the University of Nebraska–Lincoln.
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s Note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Acknowledgments
We would like to thank Dr. Kristy Weissling and Ms. Jennifer Dahman for their reviews and recommendations for which assessments to include. We would also like to thank our research assistants: Hailey Droge, Claire Kubicek, Katelyn Pick, Anna Suppes, and Abbie Zoucha.
References
Abbeduto, L., Berry-Kravis, E., Sterling, A., Sherman, S., Edgin, J. O., McDuffie, A., et al. (2020). Expressive language sampling as a source of outcome measures for treatment studies in fragile X syndrome: feasibility, practice effects, test-retest reliability, and construct validity. J. Neurodev. Disord. 12:10. doi: 10.1186/s11689-020-09313-6
Abbeduto, L., McDuffie, A., Thurman, A., and Kover, S. T. (2016). Language development in individuals with intellectual and developmental disabilities: from phenotypes to treatments. Int. Rev. Res. Dev. Disabil. 50, 71–118. doi: 10.1016/bs.irrdd.2016.05.006
Abbeduto, L., Thurman, A. J., Bullard, L., Nelson, S., and McDuffie, A. (2019). “Genetic syndromes associated with intellectual disabilities,” in Handbook of Medical Neuropsychology, eds C. Armstrong and L. Morrow (Cham: Springer), 263–299. doi: 10.1007/978-3-030-14895-9_13
American Psychiatric Association [APA] (2013). Neurodevelopmental Disorders Diagnostic and Statistical Manual of Mental Disorders: DSM-5, Vol. 5. Washington, DC: American Psychiatric Association. doi: 10.1176/appi.books.9780890425596
Bankson, N. W., and Bernthal, J. E. (2020). Bankson-Bernthal Test of Phonology, 2nd Edn. Austin, TX: Pro-Ed, Inc. [Examiner’s Manual].
Bankson, N. W., Mentis, M., and Jagielko, J. R. (2018). Bankson Expressive Language Test, 3rd Edn. Austin, TX: Pro-Ed. [Examiner’s Manual].
Baumer, N. T., Becker, M. L., Capone, G. T., Egan, K., Fortea, J., Handen, B. L., et al. (2022). Conducting clinical trials in persons with Down syndrome: summary from the NIH INCLUDE Down syndrome clinical trials readiness working group. J. Neurodev. Disord. 14:22. doi: 10.1186/s11689-022-09435-z
Berry-Kravis, E., Hessl, D., Abbeduto, L., Reiss, A. L., Beckel-Mitchener, A., Urv, T. K., et al. (2013). Outcome measures for clinical trials in fragile X syndrome. J. Dev. Behav. Pediatr. 34, 508–522. doi: 10.1097/DBP.0b013e31829d1f20
Blank, M., Rose, S. A., and Berlin, L. J. (2003). Preschool Language Assessment Instrument, 2nd Edn. Austin, TX: Pro-Ed. [Examiner’s Manual].
Bowers, L. B., Huisingh, R., and LoGiudice, C. (2006). Listening Comprehension Test, 2nd Edn. Austin, TX: Pro-Ed. [Examiner’s Manual].
Bowers, L. B., Huisingh, R., and LoGiudice, C. (2017a). Social Language Development Test – Adolescent: Normative Update. Austin, TX: Pro-Ed. [Examiner’s Manual].
Bowers, L. B., Huisingh, R., and LoGiudice, C. (2017b). Social Language Development Test – Elementary: Normative Update. Austin, TX: Pro-Ed. [Examiner’s Manual].
Bowers, L. B., Huisingh, R., LoGiudice, C., and Orman, J. (2018b). Expressive Language Test, 2nd Edn Normative Update. Austin, TX: Pro-Ed. [Examiner’s Manual].
Bowers, L. B., Huisingh, R., and LoGiudice, C. (2018a). Listening Comprehension Test – Adolescent: Normative Update. Austin, TX: Pro-Ed. [Examiner’s Manual].
Bowers, L. B., Huisingh, R., LoGiudice, C., and Orman, J. (2014). The WORD Test: Elementary, 3rd Edn. Austin, TX: Pro-Ed. [Examiner’s Manual].
Brady, N. C., Fleming, K., Romine, R. S., Holbrook, A., Muller, K., and Kasari, C. (2018). Concurrent validity and reliability for the communication complexity scale. Am. J. Speech Lang. Pathol. 27, 237–246. doi: 10.1044/2017_AJSLP-17-0106
Brady, N. C., Fleming, K., Thiemann-Bourque, K., Olswang, L., Dowden, P., Saunders, M. D., et al. (2012). Development of the communication complexity scale. Am. J. Speech Lang. Pathol. 21, 16–28. doi: 10.1044/1058-0360(2011/10-0099)
Brady, N. C., Romine, R. E. S., Holbrook, A., Fleming, K. K., and Kasari, C. (2020). Measuring change in the communication skills of children with autism spectrum disorder using the communication complexity scale. Am. J. Intellect. Dev. Disabil. 125, 481–492. doi: 10.1352/1944-7558-125.6.481
Budimirovic, D. B., Berry-Kravis, E., Erickson, C. A., Hall, S. S., Hessl, D., Reiss, A. L., et al. (2017). Updated report on tools to measure outcomes of clinical trials in fragile X syndrome. J. Neurodev. Disord. 9:14. doi: 10.1186/s11689-017-9193-x
Carrow-Woolfolk, E. (2011). Oral and Written Language Scales, 2nd Edn. Torrance, CA: Western Psychological Services. [Manual].
Carrow-Woolfolk, E. (2014). Test for Auditory Comprehension of Language, 4th Edn. Austin, TX: Pro-Ed. [Examiner’s Manual].
Carrow-Woolfolk, E. (2017). Comprehensive Assessment of Spoken Language, 2nd Edn. Torrance, CA: Western Psychological Services. [Manual].
Carrow-Woolfolk, E., and Allen, E. A. (2014). Test of Expressive Language. Pro-Ed: Austin, TX. [Examiner’s Manual].
Carrow-Woolfolk, E., and Klein, A. M. (2017). Oral Passage Understanding Scale. Bloomington, MN: NCS Pearson. [Examiner’s Manual].
Channell, M. M., Loveall, S. J., Conners, F. A., Harvey, D. J., and Abbeduto, L. (2018). Narrative language sampling in typical development: implications for clinical trials. Am. J. Speech Lang. Pathol. 27, 123–135. doi: 10.1044/2017_AJSLP-17-0046
Dawson, J. I., Stout, C. E., and Eyer, J. A. (2003). Structured Photographic Expressive Language Test, 3rd Edn. Dekalb, IL: Janelle Publications. [Manual].
DiStefano, C., Sadhwani, A., and Wheeler, A. C. (2020). Comprehensive assessment of individuals with significant levels of intellectual disability: challenges, strategies, and future directions. Am. J. Intellect. Dev. Disabil. 125, 434–448. doi: 10.1352/1944-7558-125.6.434
Dodd, B., Hua, Z., Crosbie, S., Holm, A., and Ozanne, A. (2006). Diagnostic Evaluation of Articulation and Phonology. San Antonio, TX: Pearson Education.
Dunn, D. M. (2019). Peabody Picture Vocabulary Test, 5th Edn. Bloomington, MN: NCS Pearson. [Manual].
Esbensen, A. J., Hooper, S. R., Fidler, D., Hartley, S. L., Edgin, J., d’Ardhuy, X. L., et al. (2017). Outcome measures for clinical trials in Down syndrome. Am. J. Intellect. Dev. Disabil. 122, 247–281. doi: 10.1352/1944-7558-122.3.247
Fudala, J. B., and Stegall, S. (2017). Arizona Articulation and Phonology Scale, 4th Edn. Torrance, CA: Western Psychological Services. [Examiner’s Manual].
Gerhardstein Nader, R. (2013). Vocabulary Assessment Scales – Expressive & Receptive. Lutz, FL: PARInc.
Geurts, H. M., and Embrechts, M. (2008). Language profiles in ASD, SLI, and ADHD. J. Autism Dev. Disord. 38, 1931–1943. doi: 10.1007/s10803-008-0587-1
Gilliam, R. B., and Pearson, N. A. (2017). Test of Narrative Language, 2nd Edn. Austin, TX: Pro-Ed. [Examiner’s Manual].
Goldman, R., and Fristoe, M. (2015). Goldman-Fristoe Test of Articulation, 3rd Edn. Bloomington, MN: NCS Pearson. [Manual].
Grigorenko, E. L. (2009). Dynamic assessment and response to intervention: two sides of one coin. J. Learn. Disabil. 42, 111–132. doi: 10.1177/0022219408326207
Hamaguchi, P., and Ross-Swain, D. (2015). Receptive, Expressive, and Social Communication Assessment - Elementary. Novato, CA: Academic Therapy Publications. [Technical and Administration Manuals].
Hammill, D. D., Brown, V., Larsen, S., and Wiederholt, L. (2007). Test of Adolescent & Adult Language, 4th Edn. Austin, TX: Pro-Ed. [Examiner’s Manual].
Hammill, D. D., and Newcomer, P. L. (2019). Test of Language Development – Intermediate, 5th Edn. Austin, TX: Pro-Ed. [Examiner’s Manual].
Hawkins, E., Gathercole, S., Astle, D., and Calm Team, and Holmes, J. (2016). Language problems and ADHD symptoms: how specific are the links? Brain Sci. 6:50. doi: 10.3390/brainsci6040050
Haywood, H. C., and Tzuriel, D. (2002). Applications and challenges in dynamic assessment. Peabody J. Educ. 77, 40–63. doi: 10.1207/S15327930PJE7702_5
Helland, W. A., Posserud, M. B., Helland, T., Heimann, M., and Lundervold, A. J. (2016). Language impairments in children with ADHD and in children with reading disorder. J. Atten. Disord. 20, 581–589. doi: 10.1177/1087054712461530
Hendrix, J. A., Amon, A., Abbeduto, L., Agiovlasitis, S., Alsaied, T., Anderson, H. A., et al. (2020). Opportunities, barriers, and recommendations in Down syndrome research. Transl. Sci. Rare Dis. 5, 99–129. doi: 10.3233/TRD-200090
Hessl, D., Nguyen, D. V., Green, C., Chavez, A., Tassone, F., Hagerman, R. J., et al. (2009). A solution to limitations of cognitive testing in children with intellectual disabilities: the case of fragile X syndrome. J. Neurodev. Disord. 1, 33–45. doi: 10.1007/s11689-008-9001-8
Hessling, R. M., Schmidt, T. J., and Traxel, N. M. (2004). “Floor effect,” in Encyclopedia of Social Science Research Methods, eds M. S. Lewis-Beck, A. Bryman, and T. F. T. Liao (Thousand Oaks, CA: SAGE Publications, Inc), 390–391.
Hodson, B. W. (2004). Hodson Assessment of Phonological Patterns, 3rd Edn. Austin, TX: Pro-Ed. [Examiner’s Manual].
Hresko, W., Reid, K., and Hammill, D. (2018). Test of Early Language Development, 4th Edn. Austin, TX: Pro-Ed. [Examiner’s Manual].
Huer, M. B., and Miller, L. (2011). Test of Early Communication and Emerging Language. Austin, TX: Pro-Ed. [Examiner’s Manual].
Huisingh, R., Bowers, L. B., LoGiudice, C., and Orman, J. (2002). Test of Semantic Skills – Primary. Austin, TX: Pro-Ed. [Examiner’s Manual].
Huisingh, R., Bowers, L. B., LoGiudice, C., and Orman, J. (2019). Test of Semantics Skills – Intermediate: Normative Update. Austin, TX: Pro-Ed. [Examiner’s Manual].
Kasari, C., Brady, N., Lord, C., and Tager-Flusberg, H. (2013). Assessing the minimally verbal school-aged child with autism spectrum disorder. Autism Res. 6, 479–493. doi: 10.1002/aur.1334
Khan, L. M., and Lewis, N. (2015). Khan-Lewis Phonological Assessment, 3rd Edn. Bloomington, MN: NCS Pearson. [Manual].
Lavi, A. (2019). Clinical Assessment of Pragmatics (CAPs). Torrance, CA: Western Psychological Services. [Manual].
Lawrence, B., and Seifert, D. (2016). Test of Semantic Reasoning. Novato, CA: Academic Therapy Publications. [Manual].
Luyster, R. J., Seery, A., Talbott, M. R., and Tager-Flusberg, H. (2011). Identifying early-risk markers and developmental trajectories for language impairment in neurodevelopmental disorders. Dev. Disabil. Res. Rev. 17, 151–159. doi: 10.1002/ddrr.1109
Martin, N., Brownell, R., and Hamaguchi, P. (2018). A Language Processing Skills Assessment, 4th Edn. Novato, CA: Academic Therapy Publications. [Manual].
Martin, N. A. (2013). Expressive One-Word Picture Vocabulary Test, 4th Edn Spanish-Bilingual. Novato, CA: Academic Therapy Publications. [Manual].
Martin, N. A., and Brownell, R. (2011a). Expressive One-Word Picture Vocabulary Test, 4th Edn. Novato, CA: Academic Therapy Publications. [Examiner’s Manual].
Martin, N. A., and Brownell, R. (2011b). Receptive One-Word Picture Vocabulary Test, 4th Edn. Novato, CA: Academic Therapy Publications. [Examiner’s Manual].
Mathews, S., and Miller, L. (2015). Test of Preschool Vocabulary. Austin, TX: Pro-Ed. [Examiner’s Manual].
McBee, M. (2010). Modeling outcomes with floor or ceiling effects: an introduction to the Tobit model. Gifted Child Q. 54, 314–320. doi: 10.1177/0016986210379095
McDuffie, A., Thurman, A. J., Channell, M. M., and Abbeduto, L. (2017). “Language disorders in children with intellectual disability of genetic origin,” in Handbook of Child Language Disorders, 2nd Edn, ed. R. Schwartz (Milton Park: Taylor & Francis), 52–81. doi: 10.4324/9781315283531-2
Montgomery, J. (2008a). Montgomery Assessment of Vocabulary Acquisition (Expressive Vocab Test). Austin, TX: Pro-Ed. [Examiner’s Manual].
Montgomery, J. (2008b). Montgomery Assessment of Vocabulary Acquisition (Receptive Vocab Test). Austin, TX: Pro-Ed. [Examiner’s Manual].
National Center for Education Statistics (2022). Students with Disabilities. Condition of Education. Washington, DC: U.S. Department of Education.
Nelson, N., Plante, E., Helm-Estabrooks, N., and Hotz, G. (2016). Test of Integrated Language and Literacy Skills. Baltimore, MD: Brookes Publishing Company. [Examiner’s Manual].
Newcomer, P. L., and Hammill, D. D. (2019). Test of Language Development – Primary, 5th Edn. Austin, TX: Pro-Ed. [Examiner’s Manual].
Peña, E. D., Gutiérrez-Clellen, V. F., Iglesias, A., Goldstein, B. A., and Bedore, L. M. (2018). Bilingual English-Spanish Assessment. Baltimore, MD: Brookes Publishing Company. [Test Manual].
Peña, E. D., Spaulding, T. J., and Plante, E. (2006). The composition of normative groups and diagnostic decision making: shooting ourselves in the foot. Am. J. Speech Lang. Pathol. 15, 247–254. doi: 10.1044/1058-0360(2006/023)
Phelps-Terasaki, D., and Phelps-Gunn, T. (2007). Test of Pragmatic Language, 2nd Edn. Austin, TX: Pro-Ed. [Examiner’s Manual].
Reed, R. (2018). “Toddlers and preschoolers with specific language impairment,” in An Introduction to Children with Language Disorders, 5th Edn, ed. V. Reed (New York, NY: Pearson Education), 77–129.
Richard, G. J., and Hanner, M. A. (2005). Language Processing Test 3: Elementary. Austin, TX: Pro-Ed. [Examiner’s Manual].
Robertson, C., and Salter, W. (2018). Phonological Awareness Test, 2nd Edn Normative Update. Austin, TX: Pro-Ed. [Examiner’s Manual].
Ross-Swain, D., and Long, N. (2004). Auditory Processing Abilities Test. Novato, CA: Academic Therapy Publications. [Manual].
Sansone, S. M., Schneider, A., Bickel, E., Berry-Kravis, E., Prescott, C., and Hessl, D. (2014). Improving IQ measurement in intellectual disabilities using true deviation from population norms. J. Neurodev. Disord. 6:16. doi: 10.1186/1866-1955-6-16
Schalock, R. L., Luckasson, R., and Tassé, M. J. (2021). Intellectual Disability: Definition, Diagnosis, Classification, and Systems of Supports. Silver Spring: American Association on Intellectual and Developmental Disabilities. doi: 10.1352/1944-7558-126.6.439
Secord, W. A., and Donohue, J. S. (2014). Clinical Assessment of Articulation and Phonology, 2nd Edn. Austin, TX: Pro-Ed. [Examiner’s Manual].
Spaulding, T. J., Plante, E., and Farinella, K. A. (2006). Eligibility criteria for language impairment: is the low end of normal always appropriate? Lang. Speech Hear. Serv. Schl. 37, 61–72. doi: 10.1044/0161-1461(2006/007)
Straub, L., Bateman, B. T., Hernandez-Diaz, S., York, C., Lester, B., Wisner, K. L., et al. (2022). Neurodevelopmental disorders among publicly or privately insured children in the United States. JAMA Psychiatry 79, 232–242. doi: 10.1001/jamapsychiatry.2021.3815
Thurman, A. J., Edgin, J. O., Sherman, S. L., Sterling, A., McDuffie, A., Berry-Kravis, E., et al. (2021). Spoken language outcome measures for treatment studies in Down syndrome: feasibility, practice effects, test-retest reliability, and construct validity of variables generated from expressive language sampling. J. Neurodev. Disord. 13:13. doi: 10.1186/s11689-021-09361-6
Tomblin, J. B., Records, N. L., Buckwalter, P., Zhang, X., Smith, E., and O’Brien, M. (1997). Prevalence of specific language impairment in kindergarten children. J. Speech Lang. Hear. Res. 40, 1245–1260.
Torgesen, J. K., and Bryant, B. R. (2004). Test of Phonological Awareness, 2nd Edn Plus. Austin, TX: Pro-Ed. [Examiner’s Manual].
U.S. Department of Education (2000). Annual Report to Congress on the Implementation of IDEA. Washington, DC: U.S. Government Printing Office.
Van Herwegen, J., Riby, D., and Farran, E. K. (2015). “Neurodevelopmental disorders: definitions and issues,” in Neurodevelopmental Disorders: Research Challenges and Solutions, eds J. Van Herwegen and D. Riby (London: Psychology Press), 3–18. doi: 10.4324/9781315735313
Wagner, R. K., Torgesen, J. K., Rashotte, C. A., and Pearson, N. A. (2013). Comprehensive Test of Phonological Processing, 2nd Edn. Austin, TX: Pro-Ed. [Examiner’s Manual]. doi: 10.1037/t52630-000
Wallace, G., and Hammill, D. D. (2013). Comprehensive Receptive and Expressive Vocabulary Test, 3rd Edn. Austin, TX: Pro-Ed. [Examiner’s Manual].
Wetherby, A. M., and Prizant, B. M. (2002). Communication and Symbolic Behavior Scales, Normed Edition. Baltimore, MD: Brookes Publishing Company. [Manual]. doi: 10.1037/t11527-000
Wiig, E. H., and Secord, W. A. (2006). Emerging Literacy & Language Assessment. Greenville, SC: Super Duper Publications. [Manual].
Wiig, E. H., Secord, W. A., and Semel, E. (2020). Clinical Evaluation of Language Fundamentals Preschool, 3rd Edn. Bloomington, MN: NCS Pearson. [Manual].
Wiig, E. H., Semel, E., and Secord, W. A. (2013). Clinical Evaluation of Language Fundamentals, 5th Edn. Bloomington, MN: NCS Pearson. [Manual].
Williams, K. T. (2014). Phonological and Print Awareness Scale. Torrance, CA: Western Psychological Services. [Manual].
Williams, K. T. (2018). Expressive Vocabulary Test, 3rd Edn. Bloomington, MN: NCS Pearson. [Manual].
World Health Organization [WHO] (2019). International Classification of Diseases for Mortality and Morbidity Statistics (11th Revision). Geneva: World Health Organization.
Young, N. A. E. (2021). “Childhood Disability in the United States: 2019,” ACSBR-006, American Community Survey Briefs. Washington, DC: U.S. Census Bureau.
Zhu, L., and Gonzalez, J. (2017). Modeling floor effects in standardized vocabulary test scores in a sample of low ses Hispanic preschool children under the multilevel structural equation modeling framework. Front. Psychol. 8:2146. doi: 10.3389/fpsyg.2017.02146
Keywords: language assessment, neurodevelopmental disorders (NDDs), norm-referenced assessments, language, standardized assessment
Citation: Loveall SJ, Channell MM, Mattie LJ and Barkhimer AE (2022) Inclusion of Individuals With Neurodevelopmental Disorders in Norm-Referenced Language Assessments. Front. Psychol. 13:929433. doi: 10.3389/fpsyg.2022.929433
Received: 26 April 2022; Accepted: 09 June 2022;
Published: 12 August 2022.
Edited by:
Marisa Filipe, Faculdade de Letras da Universidade de Lisboa, Portugal
Reviewed by:
Andreia Salé Veloso, University of Porto, Portugal
Ana Paula Vale, University of Trás-os-Montes and Alto Douro, Portugal
Copyright © 2022 Loveall, Channell, Mattie and Barkhimer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Susan J. Loveall, sloveall-hague2@unl.edu