- Department of Endocrinology, Diabetes and Metabolism, Virginia Commonwealth University Health, Richmond, VA, United States
Introduction
Thyroid nodules are common and are among the most frequent reasons for endocrinology clinic encounters. The widespread use of various imaging modalities and improved healthcare access have resulted in a significant increase in the discovery of incidental thyroid nodules. About half of the population develops a thyroid nodule by age 60, detectable either on physical examination or on imaging. Fortunately, 85% to 90% prove benign (1–3). However, over 500 000 fine-needle aspirations (FNAs) are performed in the United States every year, and about 200 000 of them are unnecessary. Thus, identifying the nodules at the highest risk of malignancy is critical.
Evaluation of patients with a suspected thyroid nodule must include a thorough medical history, a physical examination, a thyroid-stimulating hormone (TSH) level, and an ultrasound (US) evaluation. The sonographic characteristics of these nodules are used to better assess the risk of malignancy (RoM). Large studies have identified US features associated with an increased risk of malignancy (hypoechogenicity, solid composition, microcalcifications/punctate echogenic foci, irregular margins, taller than wide shape) and with a decreased risk of malignancy (isoechoic nodules, spongiform appearance, simple cystic nodules, comet tail artifacts) (4–6). No single US feature satisfactorily identifies malignant nodules. Over the years, several risk stratification systems (RSSs) that combine these features have been developed to help clinicians identify high-risk nodules. An ideal RSS would minimize the number of unnecessary FNAs while identifying all clinically significant thyroid cancers, leading to lower healthcare costs and morbidity.
Ultrasound scoring systems
Currently available tools to help clinicians risk-stratify thyroid nodules are:
1) Clinical practice guidelines (CPG) from various professional societies,
2) Scoring systems (qualitative or quantitative),
3) Web-based calculators and
4) An interactive algorithm.
In recent years, artificial intelligence (AI) has shown significant promise in the evaluation of thyroid ultrasounds and in stratifying thyroid nodules.
Several professional organizations have developed ultrasound-based RSSs and management guidelines for thyroid nodules, namely, the American College of Radiology Thyroid Imaging Reporting and Data System (ACR TI-RADS), the American Thyroid Association (ATA) guidelines, the European Thyroid Association (ETA, EU-TIRADS), the Korean Society of Thyroid Radiology/Korean Thyroid Association (KSThR/KTA, K-TIRADS), the Chinese Medical Association (C-TIRADS), the American Association of Clinical Endocrinology (AACE), the American College of Endocrinology (ACE), and the Associazione Medici Endocrinologi (AME) (7–13). There are additional RSSs developed by groups of investigators who do not represent professional organizations.
The characteristics of the commonly used RSSs are outlined in Table 1. The most commonly used ultrasound RSSs are based on the presence of one or more discrete features, with one exception: the ATA system uses both discrete features and patterns composed of combinations of these features.
Table 1 Characteristics of major ultrasound risk stratification systems [adapted from reference (14)].
Risk calculators and computer-interpretable guidelines (CIG) are interactive tools that issue unambiguous, sequential recommendations and can be used to engage patients. Table 2 summarizes the various risk calculators available.
Table 2 Summary of thyroid nodule risk calculators [adapted from reference (14)].
Comparison of risk stratification systems
There are considerable differences between the various RSSs. They differ in their formats (pattern recognition versus point systems), risk categories, FNA size thresholds, and in the recommended surveillance intervals (if present). Multiple studies have compared various risk stratification tools, most of them retrospective. No single system has consistently demonstrated superiority over the others (possibly due to differences in the patient populations, inclusion and exclusion criteria, and analytic methods).
A meta-analysis compared five major RSSs, namely, AACE/ACE/AME, ATA, K-TIRADS, ACR TI-RADS, and EU-TIRADS. It included 12 studies with 28,750 nodules (15.2% malignant). To avoid the bias arising from the different methodologies of the published studies, the authors used summary operating measures that are assumed to be independent of disease prevalence, such as the diagnostic odds ratio (DOR). The DOR is the odds of a positive test in those with disease relative to the odds of a positive test in those without disease. The DOR ranged from 2.2 to 4.9 among the different RSSs. A head-to-head comparison showed a higher relative DOR (RDOR) [1.9, 95% CI (1.3–2.9); P = .002] for ACR-TIRADS [DOR: 5.6, 95% CI (3.4–9.0)] versus ATA [DOR: 2.9, 95% CI (1.3–6.5)] due to a higher relative likelihood ratio for positive results. Similarly, a comparison between ACR-TIRADS [DOR: 4.5, 95% CI (2.5–7.9)] and K-TIRADS [DOR: 2.5, 95% CI (1.1–5.6)] showed a higher RDOR [1.8, 95% CI (1.2–2.6); P = .002] (15).
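The DOR definition above can be made concrete with a short calculation. The following is a minimal sketch, not the meta-analysis method: the 2×2 counts are hypothetical, the function names are illustrative, and the simple ratio of two reported DORs is shown only because it happens to agree with the published RDORs, which were presumably estimated from a pooled model rather than by plain division.

```python
def diagnostic_odds_ratio(tp, fn, fp, tn):
    """DOR from 2x2 counts: odds of a positive test in the diseased
    divided by odds of a positive test in the non-diseased."""
    if min(tp, fn, fp, tn) <= 0:
        raise ValueError("all four cells must be positive")
    return (tp / fn) / (fp / tn)


def dor_from_sens_spec(sensitivity, specificity):
    """Equivalent form using sensitivity and specificity (both in (0, 1))."""
    odds_pos_diseased = sensitivity / (1.0 - sensitivity)
    odds_pos_healthy = (1.0 - specificity) / specificity
    return odds_pos_diseased / odds_pos_healthy


# Hypothetical counts (not from any cited study):
# 100 malignant nodules, 80 flagged; 900 benign nodules, 300 flagged.
example_dor = diagnostic_odds_ratio(tp=80, fn=20, fp=300, tn=600)

# Plain ratios of the DORs quoted in the text, for orientation only.
ratio_acr_vs_ata = 5.6 / 2.9
ratio_acr_vs_ktirads = 4.5 / 2.5

print(f"example DOR: {example_dor:.1f}")
print(f"ACR/ATA ratio: {ratio_acr_vs_ata:.1f}")
print(f"ACR/K-TIRADS ratio: {ratio_acr_vs_ktirads:.1f}")
```

Because the DOR combines sensitivity and specificity into one number, two systems with very different trade-offs can share the same DOR; it is a summary of discrimination, not a guide to where the FNA threshold should sit.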
Ha et al. studied a total of 2000 consecutive thyroid nodules (≥ 1 cm) in 1802 patients and compared seven society guidelines. Overall, the ACR TI-RADS recommended the fewest “unnecessary” (benign) thyroid nodule FNAs at 25.3%, followed by the 2016 AACE/ACE/AME guidelines (32.5%), ATA (51.7%), and K-TIRADS (56.9%). While the K-TIRADS (94.5%) and ATA (89.6%) guidelines were more sensitive compared with the AACE/ACE/AME (80.4%) and ACR (74.7%), the latter were more specific (ACR 67.3%, AACE/ACE/AME 58%, and ATA 33.2%) (16).
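The specificity and “unnecessary” FNA figures quoted above can be cross-checked against each other. The sketch below assumes that the unnecessary FNA rate is the share of all nodules that are benign yet meet FNA criteria; that definition is an assumption made here for illustration and is not stated in the text. Under it, the rate equals (1 − specificity) × (benign fraction), so each system’s pair of numbers implies a benign fraction for the cohort.

```python
# Figures quoted in the text for the three systems where both values are given.
reported = {
    "ACR TI-RADS": {"specificity": 0.673, "unnecessary_fna": 0.253},
    "AACE/ACE/AME": {"specificity": 0.580, "unnecessary_fna": 0.325},
    "ATA": {"specificity": 0.332, "unnecessary_fna": 0.517},
}


def implied_benign_fraction(specificity, unnecessary_fna):
    """Benign share of the cohort implied by the assumed relation
    unnecessary_fna = (1 - specificity) * benign_fraction."""
    return unnecessary_fna / (1.0 - specificity)


implied = {
    name: implied_benign_fraction(**values) for name, values in reported.items()
}

for name, fraction in implied.items():
    print(f"{name}: implied benign fraction {fraction:.3f}")
```

If the assumed definition holds, the three implied fractions should agree closely, since they describe the same 2000 nodules; a large disagreement would suggest the rate was defined differently.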
Another meta-analysis compared four RSSs, namely, ACR-TIRADS, EU-TIRADS, ATA, and K-TIRADS. This analysis included 29 different studies with a total of 33,748 nodules with pathological or imaging follow-up. The respective pooled sensitivity and specificity of the various RSSs were:
- ACR-TIRADS: 66% and 91% for category 5 and 95% and 55% for category 4 or 5
- ATA: 74% and 88% for category 5 and 91% and 64% for category 4 or 5
- K-TIRADS: 55% and 95% for category 5 and 89% and 64% for category 4 or 5
- EU-TIRADS: 82% and 90% for category 5 and 96% and 52% for category 4 or 5.
When high-risk categories (categories 4-5) were evaluated, no difference was found between the RSSs (17).
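One way to read the pooled figures above is to convert each sensitivity and specificity pair into likelihood ratios. The sketch below uses the standard definitions LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity; the data structure and function name are illustrative, and the results are simple arithmetic on the pooled point estimates, not values reported by the meta-analysis.

```python
# Pooled (sensitivity, specificity) pairs as quoted in the text.
pooled = {
    "ACR-TIRADS": {"category 5": (0.66, 0.91), "category 4 or 5": (0.95, 0.55)},
    "ATA": {"category 5": (0.74, 0.88), "category 4 or 5": (0.91, 0.64)},
    "K-TIRADS": {"category 5": (0.55, 0.95), "category 4 or 5": (0.89, 0.64)},
    "EU-TIRADS": {"category 5": (0.82, 0.90), "category 4 or 5": (0.96, 0.52)},
}


def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


results = {
    system: {cutoff: likelihood_ratios(*pair) for cutoff, pair in cutoffs.items()}
    for system, cutoffs in pooled.items()
}

for system, cutoffs in results.items():
    for cutoff, (lr_pos, lr_neg) in cutoffs.items():
        print(f"{system:11s} {cutoff:16s} LR+ {lr_pos:5.2f}  LR- {lr_neg:4.2f}")
```

Lowering the threshold from category 5 to category 4 or 5 trades a smaller LR+ for a smaller LR−, which is the sensitivity-versus-specificity trade-off the pooled figures describe.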
A prospective, observational study from a single thyroid cancer unit of a large hospital analyzed 832 thyroid nodules referred for FNA and compared the performance of five RSSs (ATA, AACE/ACE/AME, ACR TIRADS, EU-TIRADS, and K-TIRADS). All the nodules were classified based on US features and stratified using each of the five RSSs, and each system’s recommendation for FNA was evaluated against the final pathologic diagnosis. After excluding nodules with indeterminate cytology, a total of 502 nodules were included in the final cohort. The authors concluded that consistently adhering to any of the RSS guidelines would have reduced the number of FNAs by at least 17.1% and that ACR-TIRADS allowed the largest reduction (268 of 502, or 53.4%) with the lowest false-negative rate of 2.2% (negative predictive value 97.8%; 95% CI, 95.2% to 99.2%). Although the discriminatory capacities of all the RSSs (except for K-TIRADS) were comparable to that of ACR-TIRADS, they recommended more FNAs (18).
Discussion
With multiple risk stratification tools available, clinicians choose their tools based largely on their geography and specialization. Both factors select for involvement with particular professional societies, many of which have their own validated risk stratification systems. As discussed above, studies comparing the performance of various RSSs have had inconsistent results, which makes it difficult for clinicians to implement any one RSS consistently. The lack of uniformity across so many systems can confuse both patients and physicians. This is especially relevant in the era of “open notes”, where patients can access their health records: it can be puzzling when radiologists and clinicians use multiple RSSs with differing management recommendations. It can also be time-consuming for clinicians to re-evaluate all the nodules using a different RSS, particularly in fast-paced clinics.
This also poses a challenge to endocrinologists and other clinicians in training. During clinical training, trainees work with several teaching attendings, many of whom take a different approach to thyroid nodule evaluation, the biggest difference being the RSS in use. Some senior clinicians do not use any specific RSS but rely on their intuition, while others use different RSSs, reflecting differences in their training and experience. Some radiologists include the ACR-TIRADS classification of nodules in their reports, while others do not. Although this arrangement lets clinicians in training learn and use one of several RSSs to justify a specific recommendation based on the patient’s medical history, comorbidities, and preferences, it can be an overwhelming and confusing experience.
Another challenge of US-based RSSs is inter- and intra-observer variability (19). When comparing various RSSs, studies have shown that inter-observer agreement is better for intermediate- and high-suspicion nodules than for low-suspicion nodules (20). In another blinded, multi-center study, 100 electronically recorded thyroid nodule US images were analyzed by radiologists and endocrinologists, and the evaluation was repeated four months later after randomization. The nodules were also classified according to the ATA, AACE/ACE/AME, EU-TIRADS, and ACR-TIRADS classifications. The aim of this study was to assess inter- and intra-observer agreement between different thyroid centers and different specialists. The authors concluded that while the intra-observer reproducibility for thyroid nodule US classification appears fairly adequate, the inter-observer agreement between the different centers is lower than in single-center trials (21). Inconsistencies remain in how thyroid US examiners report and rate nodules. A potential solution to this problem is a unified lexicon of thyroid US features combined with dedicated training. This may increase inter-observer agreement and improve the predictive value of the classification system.
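Inter-observer agreement of the kind discussed above is commonly summarized with a chance-corrected statistic. The sketch below computes Cohen’s kappa for two readers; this choice is an assumption made for illustration, since the text does not say which agreement measure the cited studies used, and the ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(reader_a, reader_b):
    """Cohen's kappa for two readers rating the same items.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    """
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("both readers must rate the same non-empty set of items")
    n = len(reader_a)
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    counts_a = Counter(reader_a)
    counts_b = Counter(reader_b)
    categories = set(counts_a) | set(counts_b)
    # Agreement expected if each reader assigned categories independently
    # at their own marginal rates.
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical risk categories assigned to four nodules by two readers.
reader_1 = ["high", "high", "low", "low"]
reader_2 = ["high", "low", "low", "low"]

kappa = cohens_kappa(reader_1, reader_2)
print(f"kappa = {kappa:.2f}")
```

Raw percent agreement here is 75%, but half of that would be expected by chance given how often each reader uses each category, which is why a chance-corrected measure gives a more conservative picture.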
There is a compelling need for a universal risk stratification system that would help not only clinicians but also patients to understand ultrasound reports and to identify the nodules that require further evaluation, including a biopsy. A grassroots initiative, managed by the steering committee of the International Thyroid Nodule Ultrasound Working Group (ITNUWG), is currently working to develop an international RSS, termed I-TIRADS, that integrates the leading RSSs (22). A recent multidisciplinary international survey conducted by the ITNUWG on RSS-use patterns and practitioner characteristics and preferences confirmed this need. There were 875 respondents from 52 countries across more than seven specialties. About one-third of the respondents indicated the use of more than one RSS in their practice, potentially leading to confusion, and another third reported not using an RSS for various reasons. Most respondents supported a comprehensive points-based RSS with no more than five risk categories (23). A majority (62% of the respondents) indicated that a universal lexicon paired with illustrative images of ultrasound features would reduce inter-observer variability. They also supported the idea of a comprehensive atlas of thyroid US images and videos and dedicated training on the universal lexicon.
There is a strong need for a universal RSS with a lexicon to harmonize all the current systems and standardize the evaluation of thyroid nodules with the aim of reducing unnecessary thyroid biopsies without jeopardizing the detection of clinically significant malignancies. The development of I-TIRADS is a step towards this vision, but we would still need to wait for validation in large population studies.
Author contributions
The author confirms being the sole contributor of this work and has approved it for publication.
Conflict of interest
The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
References
1. Yassa L, Cibas ES, Benson CB, Alexander EK, Krane JF, Barletta JA, et al. Long-term assessment of a multidisciplinary approach to thyroid nodule diagnostic evaluation. Cancer (2007) 111(6):508–16. doi: 10.1002/cncr.23116
2. Siegel R, Naishadham D, Jemal A. Cancer statistics, 2012. CA Cancer J Clin (2012) 62(1):10–29. doi: 10.3322/caac.20138
3. Popoveniuc G, Jonklaas J. Thyroid nodules. Med Clin North Am (2012) 96(2):329–49. doi: 10.1016/j.mcna.2012.02.002
4. Angell TE, Maurer R, Wang Z, Robbins J, Sosa BE, Gofnung Y, et al. A cohort analysis of clinical and ultrasound variables predicting cancer risk in 20,001 consecutive thyroid nodules. J Clin Endocrinol Metab (2019) 104(11):5665–72. doi: 10.1210/jc.2019-00664
5. Cappelli C, Castellano M, Pirola I, Gandossi L, Agosti L, Cimino L, et al. The predictive value of ultrasound findings in the management of thyroid nodules. QJM (2007) 100(1):29–35. doi: 10.1093/qjmed/hcl121
6. Sipos JA. Advances in ultrasound for the diagnosis and management of thyroid cancer. Thyroid (2009) 19(12):1363–72. doi: 10.1089/thy.2009.1608
7. Tessler FN, Middleton WD, Grant EG, Hoang JK, Berland LL, Eberhardt JDH, et al. ACR thyroid imaging, reporting and data system (TI-RADS): white paper of the ACR TI-RADS committee. J Am Coll Radiol (2017) 14(5):587–95. doi: 10.1016/j.jacr.2017.01.046
8. Haugen BR, Alexander EK, Bible KC, Doherty GM, Mandel SJ, Nikiforov YE, et al. 2015 American Thyroid Association management guidelines for adult patients with thyroid nodules and differentiated thyroid cancer: the American Thyroid Association guidelines task force on thyroid nodules and differentiated thyroid cancer. Thyroid (2016) 26(1):1–133. doi: 10.1089/thy.2015.0020
9. Russ G, Bonnema SJ, Erdogan MF, Durante C, Ngu R, Leenhardt L. European Thyroid Association guidelines for ultrasound malignancy risk stratification of thyroid nodules in adults: the EU-TIRADS. Eur Thyroid J (2017) 6(5):225–37. doi: 10.1159/000478927
10. Shin JH, Baek JH, Chung J, Ha EJ, Na DG, Jung SL, et al. Ultrasonography diagnosis and imaging-based management of thyroid nodules: revised Korean Society of Thyroid Radiology consensus statement and recommendations. Korean J Radiol (2016) 17(3):370–95. doi: 10.3348/kjr.2016.17.3.370
11. Zhou J, Yin L, Wei X, Xue S, Zhang X, Liu C, et al. 2020 Chinese guidelines for ultrasound malignancy risk stratification of thyroid nodules: the C-TIRADS. Endocrine (2020) 70(2):256–79. doi: 10.1007/s12020-020-02441-y
12. Gharib H, Papini E, Garber JR, Duick DS, Hamilton RL, Harrell RM, et al. American Association of Clinical Endocrinologists, American College of Endocrinology, and Associazione Medici Endocrinologi medical guidelines for clinical practice for the diagnosis and management of thyroid nodules–2016 update. Endocr Pract (2016) 22(5):622–39. doi: 10.4158/EP161208.GL
13. Garber JR, Papini E, Frasoldati A, Bartalena L, Hegedüs L, Hansen JC, et al. American Association of Clinical Endocrinology and Associazione Medici Endocrinologi thyroid nodule algorithmic tool. Endocr Pract (2021) 27(7):649–60. doi: 10.1016/j.eprac.2021.04.007
14. Majety P, Garber JR. Ultrasound scoring systems, clinical risk calculators, and emerging tools. In: Handbook of thyroid and neck ultrasonography: an illustrated case compendium with clinical and pathologic correlation. (Cham, Switzerland: Springer International Publishing) (2023). p. 25–52. doi: 10.1007/978-3-031-18448-2_2
15. Castellana M, Castellana C, Treglia G, Giovanella L, Bruno R, Trimboli P, et al. Performance of five ultrasound risk stratification systems in selecting thyroid nodules for FNA. J Clin Endocrinol Metab (2020) 105(5):dgz170. doi: 10.1210/clinem/dgz170
16. Ha EJ, Na DG, Baek JH, Sung JY, Kim JH, Kang SY. US fine-needle aspiration biopsy for thyroid malignancy: diagnostic performance of seven society guidelines applied to 2000 thyroid nodules. Radiology (2018) 287(3):893–900. doi: 10.1148/radiol.2018171074
17. Hoang JK, Middleton WD, Langer JE, Tabár L, Zhang Z, Haas BR, et al. Comparison of thyroid risk categorization systems and fine-needle aspiration recommendations in a multi-institutional thyroid ultrasound registry. J Am Coll Radiol (2021) 18(12):1605–13. doi: 10.1016/j.jacr.2021.07.019
18. Grani G, Lamartina L, Ascoli V, Filetti S, Elisei R, Durante C, et al. Reducing the number of unnecessary thyroid biopsies while improving diagnostic accuracy: toward the “Right” TIRADS. J Clin Endocrinol Metab (2019) 104(1):95–102. doi: 10.1210/jc.2018-01674
19. Russ G, Trimboli P, Buffet C. The new era of TIRADSs to stratify the risk of malignancy of thyroid nodules: strengths, weaknesses and pitfalls. Cancers (Basel) (2021) 13(17):4316. doi: 10.3390/cancers13174316
20. Yim Y, Na DG, Ha EJ, Lim HK, Kim JH, Shin JH, et al. Concordance of three international guidelines for thyroid nodules classified by ultrasonography and diagnostic performance of biopsy criteria. Korean J Radiol (2020) 21(1):108–16. doi: 10.3348/kjr.2019.0215
21. Persichetti A, Di Stasio E, Coccaro C, Mirabella R, Campanella L, Giacomelli L, et al. Inter- and intraobserver agreement in the assessment of thyroid nodule ultrasound features and classification systems: a blinded multicenter study. Thyroid (2020) 30(2):237–42. doi: 10.1089/thy.2019.0360
22. Tessler F. I-TIRADS (International thyroid imaging, reporting, and data system) project: roadmap and status. American Thyroid Association, Chicago, IL, USA. (2019).
Keywords: thyroid nodule, ultrasound scoring systems, risk stratification, sonographic features, risk calculators
Citation: Majety P (2023) Thyroid nodules: need for a universal risk stratification system. Front. Endocrinol. 14:1209631. doi: 10.3389/fendo.2023.1209631
Received: 21 April 2023; Accepted: 30 June 2023;
Published: 21 July 2023.
Edited by:
Andrea Frasoldati, Endocrine Unit ASMN, Italy
Reviewed by:
Magdalena Stasiak, Polish Mother’s Memorial Hospital Research Institute, Poland
Copyright © 2023 Majety. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Priyanka Majety, priyanka.majety@vcuhealth.org