Recent topics on thyroid cytopathology: reporting systems and ancillary studies
Abstract
As fine-needle aspiration techniques and diagnostic methodologies for thyroid nodules have continued to evolve and reporting systems have been updated accordingly, we need to be up to date with the latest information to achieve accurate diagnoses. However, the diagnostic approaches and therapeutic strategies for thyroid nodules vary across laboratories and institutions. Several differences exist between Western and Eastern practices regarding thyroid fine-needle aspiration. This review describes the reporting systems for thyroid cytopathology and ancillary studies. Updated reporting systems enhance the accuracy, consistency, and clarity of cytology reporting, leading to improved patient outcomes and management strategies. Although a single global reporting system is optimal, reporting systems tailored to each country are acceptable. In such cases, compatibility must be ensured to facilitate data sharing. Ancillary methods include liquid-based cytology, immunocytochemistry, biochemical measurements, flow cytometry, molecular testing, and artificial intelligence, all of which improve diagnostic accuracy. These methods continue to evolve, and cytopathologists should actively adopt the latest methods and information to achieve more accurate diagnoses. We believe this review will be useful to practitioners of routine thyroid cytology.
INTRODUCTION
Fine-needle aspiration (FNA) cytology is widely used as the most effective preoperative diagnostic tool for thyroid nodules. Its primary purpose is to triage patients with thyroid nodules into appropriate treatment plans, which has contributed to reducing the rate of unnecessary thyroid surgery in patients with benign nodules. FNA techniques and diagnostic methodologies for thyroid nodules have continued to evolve, and the corresponding reporting systems have been updated accordingly. Therefore, medical systems need to be updated with the latest information to achieve more accurate diagnoses. However, the diagnostic approaches and therapeutic strategies for thyroid nodules vary across laboratories and hospitals.
Several differences exist between Western and Eastern practices regarding thyroid FNA. Various factors may play a role, including ethnicity, lifestyle, medical environment, surgical indication, and interpretation of pathological criteria. This review describes the reporting systems for thyroid cytopathology and ancillary techniques used to improve diagnostic accuracy. Updated reporting systems enhance the accuracy, consistency, and clarity of cytology reporting, ultimately leading to improved patient outcomes and management strategies. Ancillary methods include liquid-based cytology, immunocytochemistry, biochemical measurements, flow cytometry, molecular testing, and artificial intelligence (AI), all of which improve diagnostic accuracy. Not all institutions worldwide have access to the ancillary methods presented in this review; nevertheless, the information will be valuable for future implementation. In addition, this review explains the differences in practices related to thyroid FNA between Western and Eastern countries. This review will be useful to readers who practice routine thyroid cytology.
REPORTING SYSTEMS
The Bethesda System for Reporting Thyroid Cytopathology (TBSRTC) was developed in 2007 to facilitate communication between cytopathologists and clinicians [1]. It is now a global reporting system for thyroid FNA cytology. However, interpretations and applications vary among individuals, institutions, and countries. Different reporting systems have been used across countries such as England [2], Italy [3], and Japan [4,5]. Although it would be best to establish a single global reporting system, systems adapted by country are acceptable. In such cases, compatibility must be ensured to facilitate data sharing.
The Bethesda System for Reporting Thyroid Cytopathology
The Thyroid Fine Needle Aspiration State of the Science Conference was held on October 22 and 23, 2007, in Bethesda, MD, USA, and a framework for TBSRTC was formed [1]. TBSRTC consists of six categories for reporting thyroid nodules. Each category has an implied risk of malignancy (ROM) and the usual clinical management strategy. The third edition of TBSRTC was proposed in 2023 [6], and its main revisions are listed in Table 1. In this edition, diagnostic categories were unified under a single name, i.e., nondiagnostic for nondiagnostic/unsatisfactory, atypia of undetermined significance (AUS) for AUS/follicular lesion of undetermined significance, and follicular neoplasm (FN) for FN/suspicious for FN. The ROM was revised based on the data published after the second edition (Table 2). Herein, AUS was subcategorized as AUS with nuclear atypia or AUS-other because the ROM for resected AUS with nuclear atypia (36%–44%) was higher than that in AUS-other (15%–23%) [6].

Table 2. ROM and usual management for adult patients in The Bethesda System for Reporting Thyroid Cytopathology
A definitive diagnosis of noninvasive follicular thyroid neoplasm with papillary-like nuclear features (NIFTP) requires an extensive histological examination of the entire tumor capsule. Therefore, NIFTP cannot be diagnosed by cytology. Most NIFTPs are cytologically classified into intermediate diagnostic categories (AUS and FN). Since NIFTP is now considered non-carcinomatous, the ROM for each diagnostic category decreased by 1.3% to 9.1% [6]. One goal of the third edition was to increase awareness of the potential diagnostic clues for NIFTP. Follicular lesions with mild or focal nuclear alterations associated with papillary thyroid carcinoma (PTC) were categorized as FN.
The same diagnostic categories are applied to pediatric patients (Table 3). However, the ROM and management recommendations for these patients are not the same as those for adults. The ROMs in pediatric patients are higher than those in adult patients in all categories. This difference is especially large for FN (50% in pediatric cases, 30% in adults) [6]. Molecular testing, diagnostic lobectomy, and surveillance are not recommended in pediatric patients.
Japanese system
In 2005, the Japanese reporting system for thyroid aspiration cytology was initially proposed by the Japanese Society of Thyroid Surgery, classifying cytological findings into five categories: (I) inadequate, (II) normal or benign, (III) intermediate, (IV) malignancy suspected, and (V) malignant [4]. In 2013, the Japan Thyroid Association introduced an original diagnostic system that incorporated malignancy risk estimates for each category, and it uniquely subclassified FN into “likely benign,” “borderline,” and “likely malignant” subtypes [7]. However, the system was not widely adopted because general cytopathologists were unable to accurately differentiate among classes. In 2015, the current Japanese reporting system was proposed by the Japanese Society of Thyroid Surgery [4]. It was revised by the Japan Association of Endocrine Surgery and the Japanese Society of Thyroid Pathology in 2019 [4] and again in 2023 [5]. The Japanese system comprises seven categories: unsatisfactory, cyst fluid, benign, undetermined significance, FN, suspicious for malignancy, and malignant. The diagnostic criteria for each category are identical to those used for TBSRTC; the only difference is that cyst fluid is handled as an independent category. Data showed that the ROM in cyst fluid only (CFO) nodules was 0.2% [8], which was lower than that in nondiagnostic nodules excluding CFO (5.6%) and in benign nodules (1.2%) [9,10]. In addition, Japanese endocrinologists did not accept a report of nondiagnostic when a large amount of material had been aspirated.
In Asian countries, the prevalence of NIFTP among cases previously diagnosed as PTCs was relatively low, ranging from 0% to 4.7% [11-13]. In Japan, many cases of NIFTP had previously been classified histologically as follicular thyroid adenomas due to insufficient nuclear features of PTC and had been categorized cytologically as FN [14,15]. Among patients originally diagnosed with follicular thyroid adenoma, 16.7% to 41.3% were subsequently identified as NIFTP [15]. Thus, adopting the NIFTP category has had a limited impact on the cytological classification and treatment strategies in Japan compared to Western countries [16].
The Japanese system does not include the ROM or clinical management for the different diagnostic categories due to insufficient data, and there is no standardized consensus on clinical management. As described in other parts of this article, both the ROM and clinical management in Japan differ from those in the West. Therefore, we need to determine the ROM and provide the recommended clinical management strategy in Japan. Recently, we examined the frequency, re-aspiration rate, resection rate, ROM, and clinical management of each of the seven categories using multi-institutional data from Japanese institutions. Based on these results, we propose a ROM and recommended clinical management for each category (Table 4) [17], which differ from those in the West. For example, the ROM of FN in Japan (11.4%) is considerably lower than that in TBSRTC. Ultrasound findings are important for clinical management, so a more conservative strategy is recommended. As unsatisfactory and undetermined significance nodules with benign ultrasound findings were followed without re-aspiration, their re-aspiration rates were low, at 17.8% and 12.6%, respectively [17]. Active surveillance has been accepted as an option for low-risk papillary thyroid microcarcinoma [18,19]. Therefore, the resection rates of suspicious for malignancy and malignant nodules were relatively low, at 77.8% and 70.8%, respectively. Table 5 shows papillary thyroid microcarcinomas for which active surveillance is not recommended. Active surveillance has emerged as a strategy to address the concern of overdiagnosis and overtreatment of low-risk thyroid cancer [20]. However, successful implementation requires accurate cytological diagnosis, high-quality imaging, a clearly defined management strategy, and comprehensive informed consent from the patient. It is also important to avoid aspirating small nodules (<10 mm in Western countries [6] and <5 mm in Japan [21]), even when malignancy is suspected on imaging.
Therefore, we concluded that the same diagnostic categories and criteria should be used, but the ROM and recommended clinical management strategies should be adapted by country.
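The per-category figures discussed above (re-aspiration rate, resection rate, and ROM) are simple proportions. The following is a minimal Python sketch of how such figures could be tabulated. It assumes that the ROM is calculated among resected nodules only, which may not match the denominator used in [17]. The class name, field names, and counts are hypothetical and are not data from any cited study.

```python
from dataclasses import dataclass


@dataclass
class CategoryCounts:
    """Hypothetical tallies for one cytological category."""
    total: int        # nodules reported in this category
    reaspirated: int  # nodules that underwent repeat FNA
    resected: int     # nodules that were surgically resected
    malignant: int    # resected nodules with malignant histology


def percent(numerator: int, denominator: int) -> float:
    """Return a percentage rounded to one decimal place (0.0 if the denominator is 0)."""
    if denominator == 0:
        return 0.0
    return round(100.0 * numerator / denominator, 1)


def summarize(counts: CategoryCounts) -> dict:
    """Compute the three rates for one category."""
    # Assumption: ROM is taken over resected nodules, not over all nodules.
    return {
        "reaspiration_rate": percent(counts.reaspirated, counts.total),
        "resection_rate": percent(counts.resected, counts.total),
        "rom_resected": percent(counts.malignant, counts.resected),
    }


# Illustrative numbers only.
example = CategoryCounts(total=200, reaspirated=25, resected=80, malignant=10)
print(summarize(example))
```

Because resected nodules are selected on clinical and sonographic grounds, a ROM computed this way generally overestimates the risk among all nodules in the category, which is one reason why resection rates should be reported alongside the ROM when comparing countries.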
ANCILLARY STUDIES
FNA is the most accurate and cost-effective procedure for initial evaluation of patients with thyroid nodules. The sensitivity and specificity of FNA are reported to be 68%–98% and 56%–100%, respectively [22]. Ancillary studies using aspirated materials, including liquid-based cytology (LBC), immunocytochemistry, biochemical measurements, flow cytometry, molecular testing, and AI, can improve the diagnostic accuracy of FNA. Herein, we describe the indications, methods, and characteristics of ancillary studies based on our experience.
Liquid-based cytology
LBC refers to the preparation of cytological samples suspended in preservative liquids. Table 6 shows the advantages and disadvantages of LBC over direct smears [23-26]. The most important clinical implication of this procedure is a reduction in the number of inadequate specimens. This is due to the higher cell collection rate, removal of red blood cells, reduction in colloid, and avoidance of degeneration via smearing.
The differences in cytological findings of LBCs from those of direct smears will affect the results (Table 7) [26-28]. The LBC preservative has proteolytic and hemolytic effects that remove colloids and red blood cells, simplifying observation of cellular components. However, several diagnostic clues may become less visible, such as lymphocytes in Hashimoto thyroiditis, overlapping and ground glass nuclei in PTC, and lymphoglandular bodies in lymphoma. On the other hand, convoluted nuclei are considered a diagnostic clue for PTC. Nuclear size generally decreases in LBC specimens, but the nuclei of lymphoma cells are enlarged and show a meshed chromatin pattern. As the cell shape is better preserved in LBC specimens than in direct smears, tall cell and hobnail subtypes of PTC are more easily recognized. Similarly, the tail-like cytoplasmic characteristics of medullary thyroid carcinomas are well-preserved.

Table 7. Comparison of the cytological characteristics of liquid-based cytology specimens with those of direct smears
Because the cytological findings of LBC specimens differ from those of direct smears, experience and knowledge are required for proper analysis [26,28]. Therefore, a combined method is recommended when implementing the LBC method (Fig. 1). After preparing direct smears, needle washout using the LBC preservative fluid is performed, and the resulting specimen is used for LBC. In our experience, the number of cellular components in LBC specimens is often greater than in conventional smears.
Immunocytochemistry
Immunocytochemistry (ICC) is primarily employed to determine the origin of tumor cells, including follicular epithelial cells, C cells, lymphocytes, parathyroid cells, thymic cells, and metastatic cancer. Ideally, both positive and negative antibodies should be tested together. To prepare multiple ICC slides from a single cytological sample, we recommend the cell transfer method [29]. This method can be performed even if the decision to perform ICC is made after observing Papanicolaou-stained specimens. Table 8 shows the immunocytochemical panels used in routine practice at our institution. Three patterns can be recognized as positive localization: nuclear, cytoplasmic, and cell membrane. Among these, the preferred antibodies are those with nuclear reactivity. Cell membranous and cytoplasmic reactivities are frequently weak and cannot be identified in naked cells [30]. ICC can also be used to determine the subtype of tumors [31-33], differentiate and grade carcinomas [34-36], and detect genetic abnormalities [37,38]. However, ICC cannot be used to differentiate follicular adenomas from follicular carcinomas.
ICC can reduce the number of AUS cases, repeat FNAs, and diagnostic surgeries. However, compared with immunohistochemistry, ICC poses greater difficulties in quality control, the use of several antibodies, the assessment of staining results, and the development of manuals for staining methods. We should actively utilize ICC for diagnosis while fully understanding its limitations and pitfalls.
Biochemical measurements
Biochemical measurements using the washout fluid of the aspiration needle can be highly useful in certain scenarios [39-46]. When metastatic thyroid carcinoma of the lymph node is suspected on ultrasound, measurement of thyroglobulin levels in the washout fluid of the lymph node aspirates can improve diagnostic sensitivity [44-46]. PTCs frequently exhibit cystic metastases, even when the primary lesion is not cystic. The materials aspirated from such lesions contain foamy histiocytes but may not include carcinoma cells (Fig. 2). In such cases, high fluid thyroglobulin levels can confirm metastasis. Notably, thyroglobulin measurement is not recommended for central lymph node samples because the thyroid gland or thyroid bed may lie along the needle route (Fig. 3). Calcitonin measurement is useful in suspected cases of medullary thyroid carcinoma [41,42]. This method can be used for both thyroid and metastatic lesions. In our institution, calcitonin measurement is always performed in cases with increased serum carcinoembryonic antigen levels but no carcinoma in the gastrointestinal tract or hepato-biliary-pancreatic region. Measuring parathyroid hormone (PTH) levels in the washout fluid of parathyroid lesions may also be useful. However, preoperative parathyroid FNA is not recommended, except in unusual and difficult cases of primary hyperparathyroidism, and should not be performed if parathyroid carcinoma is suspected [47,48]. We perform this procedure when a parathyroid cyst is suspected, in which case the aspirates are usually colorless. Because these specimens do not contain any cells, cytological diagnosis of parathyroid cyst is not possible [43]. PTH measurement is the only method to confirm the diagnosis of parathyroid cysts.

Fig. 2. Metastatic papillary thyroid carcinoma in a lateral lymph node after thyroidectomy. (A) The lymph node is cystic (ultrasound B-mode). (B) Aspirated material showing histiocytes but no carcinoma cells (Papanicolaou stain).
Flow cytometry
Almost all primary thyroid lymphomas are of B cell origin and are divided into two main categories: diffuse large B cell lymphoma and mucosa-associated lymphoid tissue (MALT) lymphoma [49,50]. MALT lymphomas are frequently confused with Hashimoto thyroiditis, which is characterized by high lymphocytic infiltration. To confirm the diagnosis of lymphoma and distinguish between lymphoma and Hashimoto thyroiditis, a repeat FNA for flow cytometry is desirable [6]. Theoretically, B cell lymphomas express either the κ or the λ light chain (light chain restriction). In contrast, lymphocytes and plasma cells observed in patients with Hashimoto thyroiditis are polyclonal. We defined light chain restriction as a κ-to-λ ratio less than 0.5 or greater than 3.0 [50,51]. The positivity rate for light chain restriction in lymphoma cases was 69.2%–75.0% [50,51]. This rate was almost the same as that obtained using resected materials (69.2%) [52]. Lymphoma cases without light chain restriction showed low light chain positivity rates (<25%) and a B cell-to-T cell ratio >2.0. Given this background, a diagnostic algorithm using flow cytometry to evaluate primary thyroid lymphomas has been proposed (Fig. 4) [49]. However, it is unclear whether this algorithm is applicable to cases of extrathyroidal lymphoma. Currently, flow cytometry using aspirated samples is not widely used in Asian countries [49] but should be used more actively to improve the preoperative diagnosis of lymphoma.
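The numerical criteria stated above can be written as a short check. The following Python sketch encodes only the thresholds given in the text (κ-to-λ ratio <0.5 or >3.0, light chain positivity <25%, and B cell-to-T cell ratio >2.0). It is not the algorithm shown in Fig. 4, and the function names, parameter names, and handling of zero values are assumptions made for illustration.

```python
def has_light_chain_restriction(kappa_pct: float, lambda_pct: float,
                                low: float = 0.5, high: float = 3.0) -> bool:
    """Return True if the kappa-to-lambda ratio is below `low` or above `high`."""
    if kappa_pct < 0 or lambda_pct < 0:
        raise ValueError("percentages must be non-negative")
    if lambda_pct == 0:
        # Assumption: with no lambda-positive cells, any kappa-positive
        # population is treated as restricted; no light chain at all is not.
        return kappa_pct > 0
    ratio = kappa_pct / lambda_pct
    return ratio < low or ratio > high


def matches_restriction_negative_pattern(light_chain_positive_pct: float,
                                         b_cell_pct: float,
                                         t_cell_pct: float) -> bool:
    """Return True for the pattern described in lymphoma cases lacking
    light chain restriction: light chain positivity <25% and B/T ratio >2.0."""
    if t_cell_pct <= 0:
        # Assumption: treat an absent T cell population as an unbounded ratio.
        b_to_t_exceeds = b_cell_pct > 0
    else:
        b_to_t_exceeds = (b_cell_pct / t_cell_pct) > 2.0
    return light_chain_positive_pct < 25.0 and b_to_t_exceeds


# Illustrative values only.
print(has_light_chain_restriction(kappa_pct=60.0, lambda_pct=10.0))
print(has_light_chain_restriction(kappa_pct=30.0, lambda_pct=20.0))
print(matches_restriction_negative_pattern(15.0, b_cell_pct=70.0, t_cell_pct=20.0))
```

A result from either function is only a flag for further evaluation. The final interpretation depends on the cytological findings and on the full algorithm proposed in [49].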
Molecular testing
In Western countries, molecular testing is an option for the management of clinically and/or cytologically indeterminate thyroid nodules [6,53,54]. Previously, repeat FNA and diagnostic surgery were recommended for AUS and FN nodules, respectively [1]. Because molecular testing has a high negative predictive value for ruling out carcinoma, low-risk nodules without gene mutations should be followed. Consequently, unnecessary thyroid resections have been avoided and healthcare costs have been reduced. However, in most Asian countries, molecular testing is not performed in medical practice [55]. This can be attributed to the lack of insurance coverage, the absence of commercial availability, and the extremely high cost. Even though such testing is not performed, the repeat FNA rate (AUS, 7.5%–42.2%; FN, 0%–11.1%) and resection rate (AUS, 14.1%–66.7%; FN, 25.0%–67.7%) are not high [17]. In Asian countries, follow-up is the preferred approach for intermediate thyroid nodules, and surgical resection is performed for nodules suspected to be malignant based on clinical and sonographic findings [17,56,57]. Nishino [53] explained the differences between the Western and Asian treatment strategies for intermediate nodules. A lower tolerance for uncertainty has historically favored diagnostic surgery for intermediate nodules in Western countries, and molecular testing with a high negative predictive value to rule out carcinoma has been a priority. In contrast, in Asian countries, follow-up is the preferred approach, and findings with a high positive predictive value may suffice for identification of nodules that warrant immediate thyroidectomy.
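Negative and positive predictive values depend on the prevalence of malignancy in the tested population as well as on the test itself. The following Python sketch applies the standard definitions to a hypothetical test. The sensitivity and specificity are invented for illustration and do not describe any commercial assay, whereas the two prevalence values are the FN ROMs quoted earlier in this review (30% in TBSRTC for adults and 11.4% in Japan).

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple:
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence.

    All inputs are fractions between 0 and 1.
    """
    for value in (sensitivity, specificity, prevalence):
        if not 0.0 <= value <= 1.0:
            raise ValueError("inputs must be fractions between 0 and 1")

    # Expected fractions of the tested population in each cell of the 2x2 table.
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)

    if true_pos + false_pos == 0 or true_neg + false_neg == 0:
        raise ValueError("predictive value undefined for these inputs")

    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical test performance; prevalence values taken from the FN ROMs in the text.
for label, prevalence in (("TBSRTC FN", 0.30), ("Japan FN", 0.114)):
    ppv, npv = predictive_values(sensitivity=0.90, specificity=0.80,
                                 prevalence=prevalence)
    print(f"{label}: PPV={ppv:.3f}, NPV={npv:.3f}")
```

Under these assumptions, the lower prevalence raises the negative predictive value and lowers the positive predictive value, which is consistent with the point that a rule-out test adds less where the baseline ROM is already low.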
Artificial intelligence
AI was first applied to the analysis of thyroid cytology specimens in 1996, with an overall accuracy of 90.61% [58]. It has since been used in several studies to analyze thyroid cytology images. Various input data have been used for training and evaluation, such as direct smears or LBC specimens, Papanicolaou- or Giemsa-stained specimens, and patch or whole-slide imaging data [59-63]. Lee et al. [63] reported that the accuracy of AI models has become higher than that of pathologists (99.71% vs. 88.91%, respectively). However, current AI methods are inadequate for analyzing thyroid FNA cytology cases in routine practice. Although AI analyses mainly focus on differentiating between two lesions (e.g., benign lesions and PTC), AI models that can be adopted in clinical practice have been developed [64,65]. The precision-recall area under the curve (PR AUC) of one such model was >0.95, except for poorly differentiated thyroid carcinoma (PR AUC, 0.49) and medullary thyroid carcinoma (PR AUC, 0.91) [64]. These two carcinomas are difficult to diagnose, even for experts in thyroid cytology. The results showed that the accuracy of AI trained on images annotated by cytopathologists resembled that of the cytopathologists. This can be improved using unsupervised machine learning. Two reports indicated that AI allowed distinction of follicular thyroid adenoma and follicular thyroid carcinoma, previously thought to be impossible [62,64]. AI technology is steadily improving, and its outcomes will be used in clinical practice, including as a secondary screening tool for benign specimens (Fig. 5A). Similar to molecular testing, AI may also be involved in clinical management of AUS and FN nodules. Online AI platforms can be used for consultation purposes (Fig. 5B).
CONCLUSION
In this review, we described reporting systems and ancillary studies for thyroid FNA cytology. Though a single global reporting system would be optimal, modifications by country are acceptable. Moreover, compatibility is necessary for data comparisons and clinical/academic advancement. As ancillary studies continue to evolve, cytopathologists should actively adopt the latest methods and information to provide more accurate diagnoses of FNA specimens.
Notes
Ethics Statement
Not applicable.
Availability of Data and Material
The datasets generated or analyzed during the study are available from the corresponding author on reasonable request.
Code Availability
Not applicable.
Author Contributions
Conceptualization: MH. Data curation: MH, AS. Formal analysis: MH. Investigation: MH, AS. Methodology: MH. Visualization: MH, AS. Writing—original draft: MH. Writing—review & editing: MH, AS. Approval of the final manuscript: all authors.
Conflicts of Interest
The authors declare that they have no potential conflicts of interest.
Funding Statement
No funding to declare.
Acknowledgments
The content of this review was presented as a plenary lecture at the 21st Korea-Japan Joint Meeting for Diagnostic Cytopathology (Saturday, September 28, 2024; 3F Crystal Ballroom, Lotte Hotel, Busan, Korea).