Interobserver Reproducibility of PD-L1 Biomarker in Non-small Cell Lung Cancer: A Multi-Institutional Study by 27 Pathologists

Sunhee Chang; Hyung Kyu Park; Yoon-La Choi; Se Jin Jang

doi:10.4132/jptm.2019.09.29


Original Article. Sunhee Chang*, Hyung Kyu Park¹,*, Yoon-La Choi², Se Jin Jang³, Cardiopulmonary Pathology Study Group of the Korean Society of Pathologists. Journal of Pathology and Translational Medicine 2019;53(6):347-353.
DOI: https://doi.org/10.4132/jptm.2019.09.29
Published online: October 28, 2019

Department of Pathology, Inje University Ilsan Paik Hospital, Goyang, Korea

¹Department of Pathology, Konkuk University Medical Center, Konkuk University School of Medicine, Seoul, Korea

²Department of Pathology and Translational Genomics, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea

³Department of Pathology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea

Corresponding Author: Yoon-La Choi, MD, Department of Pathology, Samsung Medical Center, Sungkyunkwan University School of Medicine, 81 Irwon-ro, Gangnam-gu, Seoul 06351, Korea Tel: +82-2-3410-2797, Fax: +82-2-3410-6396, E-mail: ylachoi@skku.edu

*Sunhee Chang and Hyung Kyu Park contributed equally to this work.

• Received: July 15, 2019 • Revised: September 1, 2019 • Accepted: September 26, 2019

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.


Abstract

Background
Assessment of programmed cell death-ligand 1 (PD-L1) immunohistochemical staining is used for treatment decisions in non-small cell lung cancer (NSCLC) regarding use of PD-L1/programmed cell death protein 1 (PD-1) immunotherapy. The reliability of the PD-L1 22C3 pharmDx assay is critical in guiding clinical practice. The Cardiopulmonary Pathology Study Group of the Korean Society of Pathologists investigated the interobserver reproducibility of PD-L1 staining with 22C3 pharmDx in NSCLC samples.
Methods
Twenty-seven pathologists individually assessed the tumor proportion score (TPS) for 107 NSCLC samples. Each case was divided into three levels based on TPS: <1%, 1%–49%, and ≥50%.
Results
The intraclass correlation coefficient for TPS was 0.902±0.058. Weighted κ coefficient for 3-step assessment was 0.748±0.093. The κ coefficients for 1% and 50% cut-offs were 0.633 and 0.834, respectively. There was a significant association between interobserver reproducibility and experience (formal PD-L1 training, more experience for PD-L1 assessment, and longer practice duration on surgical pathology), histologic subtype, and specimen type.
Conclusions
Our results indicate that PD-L1 immunohistochemical staining provides a reproducible basis for decisions on anti–PD-1 therapy in NSCLC.
Keywords: Programmed cell death-ligand 1; Reproducibility; Observer variation; Immunohistochemistry

Immunotherapies with checkpoint inhibitor programmed death-ligand 1 (PD-L1)/programmed cell death protein 1 (PD-1) antibodies have shown encouraging results in patients with advanced non-small cell lung cancer (NSCLC) [1-4]. Assessment of PD-L1 immunohistochemical staining has been developed for pathology laboratories to aid in selecting patients who will benefit from PD-1/PD-L1–targeted therapy [1,5]. PD-L1 immunohistochemistry is currently the most useful biomarker because of the wide availability of formalin-fixed, paraffin-embedded tissues, the relatively low cost, and widespread use in pathology laboratories, particularly in contrast to molecular pathology-based methods [1,6].

Pembrolizumab (Keytruda, Merck & Co., Inc., Kenilworth, NJ, USA), approved by the U.S. Food and Drug Administration (FDA) and the Korea Ministry of Food and Drug Safety (MFDS), is a PD-1 inhibitor used as first-line therapy for patients with advanced NSCLC [1]. Dako PD-L1 immunohistochemical (IHC) 22C3 pharmDx was used to determine PD-L1 expression in patients with advanced NSCLC during the phase 1 clinical trial (KEYNOTE-001) [2,7]. The PD-L1 IHC 22C3 pharmDx assay is the first companion diagnostic assay for PD-L1 approved by the FDA and MFDS [1,5]. Currently, four PD-L1 assays using four different PD-L1 antibodies (22C3, 28-8, SP263, and SP142) on two different IHC platforms (Dako and Ventana) are approved by the FDA and MFDS. Each assay has its own scoring system. In the era of immunotherapy, reliable assay results are critical for predicting the likelihood of response to anti–PD-1 or anti–PD-L1 therapy. A few studies have investigated the reproducibility of PD-L1 assessment in NSCLC tissue samples [8-10].

This study aimed to investigate the interobserver reproducibility of assessing PD-L1 expression in NSCLC tissue samples. The association between observer factors and the reproducibility of PD-L1 assessment was also investigated.

Study material: A total of 107 cases of NSCLC were selected from the archives of the Department of Pathology of Samsung Medical Center from October 2016 to December 2016. Of these, 22 tissue samples were from resections, 66 tissue samples were from computed tomography-guided core biopsy, and 19 tissue samples were cell blocks of endobronchial ultrasound-guided transbronchial needle aspirate (EBUS-TBNA). The study material comprised 66 adenocarcinomas, 33 squamous cell carcinomas, and eight other non-small cell lung cancers. Selected tumor samples contained more than 100 cells per sample.
Immunohistochemical staining and evaluation: Tissue samples were stained for PD-L1 with the 22C3 pharmDx Kit (Agilent Technologies, Santa Clara, CA, USA) on the Dako Autostainer Link 48 platform (Agilent Technologies). Deparaffinization, rehydration, and target retrieval procedures were performed using EnVision FLEX Target Retrieval solution (1×, low pH) and EnVision FLEX wash buffer (1×). The tissue samples were then placed on the Autostainer Link 48. This instrument performed the staining process by applying the appropriate reagents, monitoring incubation times, and rinsing slides between reagents. The reagent times were preprogrammed in the Dako Link software. A sample with the primary antibody omitted was used as a negative control. Samples were subsequently counterstained with hematoxylin and mounted in non-aqueous, permanent mounting media. The stained slides were scanned with an Aperio scanner (Leica Biosystems, Buffalo Grove, IL, USA). Pathologists scored the virtual images using ImageScope software (Leica Biosystems).
Tumor proportion score (TPS) was defined as the percentage of viable tumor cells with any perceptible membrane staining irrespective of staining intensity. Normal cells and tumor-associated immune cells were excluded from scoring. Each case was divided into three levels based on TPS: <1% (no PD-L1 expression), 1%–49% (PD-L1 expression), or ≥50% (high PD-L1 expression) (Fig. 1).
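The TPS definition and the three-level grouping can be illustrated with a minimal Python sketch. It is not part of the study workflow, in which pathologists estimated TPS visually on virtual slides. The function names, the cell-count inputs, and the `min_cells` default of 100 (modeled loosely on the sample selection criterion) are assumptions made for illustration.

```python
def tps_percent(stained_tumor_cells, viable_tumor_cells, min_cells=100):
    """Return the tumor proportion score as a percentage.

    Only viable tumor cells are counted; normal cells and tumor-associated
    immune cells must already be excluded from both counts.
    min_cells is an illustrative adequacy threshold (assumption).
    """
    if viable_tumor_cells < min_cells:
        raise ValueError("too few viable tumor cells to score")
    if not 0 <= stained_tumor_cells <= viable_tumor_cells:
        raise ValueError("stained count must lie between 0 and the viable count")
    return 100.0 * stained_tumor_cells / viable_tumor_cells


def tps_level(tps):
    """Map a TPS percentage to the three levels used in the study."""
    if not 0 <= tps <= 100:
        raise ValueError("TPS must be between 0 and 100")
    if tps < 1:
        return "<1%"      # no PD-L1 expression
    if tps < 50:
        return "1%-49%"   # PD-L1 expression
    return ">=50%"        # high PD-L1 expression


# Hypothetical counts: 10 of 200 viable tumor cells show membrane staining.
example_tps = tps_percent(10, 200)
example_level = tps_level(example_tps)
```

Because the level boundaries sit at exactly 1% and 50%, a case with a TPS near either boundary can change level when only a few cells are interpreted differently.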
Participating pathologists: All slides were independently evaluated by 20 pulmonary pathology specialists and seven surgical pathology fellows. Eleven experts participated in a 1-day training course by Dako (Agilent Technologies, Santa Clara, CA, USA) relating to evaluation of the PD-L1 22C3 assay. Participant practice duration ranged from 1 to 37 years with a median of 13 years. Sixteen of the 27 pathologists gained experience in PD-L1 assessment by practicing daily with two to 500 cases per day for 2 months before starting this study, and four of the 16 pathologists had assessed more than 100 cases.
Statistical analysis: The gold standard PD-L1 TPS was defined as the TPS assessed by highly trained and experienced experts. To assess the concordance of TPS between the 27 pathologists and the gold standard, intraclass correlation coefficients (ICCs) were calculated. The weighted kappa (κ) coefficient was calculated to evaluate concordance of the 3-step assessment between pathologists and the gold standard. Agreement for 1% and 50% cut-offs was assessed using overall percent agreement (OPA), positive and negative percent agreement, and Cohen’s κ coefficient. Correlations between TPS and PD-L1 test experience or practice duration were investigated using Spearman’s rank correlation coefficient. Correlations between the 3-step assessment and PD-L1 test experience or practice duration were also investigated using Spearman’s rank correlation coefficient. Differences in the concordance of TPS, 3-step assessment, and 1% and 50% cut-offs were compared between the expert group and fellow group using Wilcoxon rank sum tests. Correlations between TPS or 3-step assessment and training were assessed using the Wilcoxon rank sum test. Correlations between TPS or 3-step assessment and histologic subtype or specimen type were assessed using the Kruskal-Wallis test.
An ICC is interpreted as follows: below 0.3 indicates poor agreement, 0.5 indicates moderate agreement, 0.7 indicates strong agreement, and 0.85 or more indicates almost perfect agreement. A κ coefficient of 0.4 or less is poor to fair agreement, greater than 0.4 to 0.6 is moderate agreement, greater than 0.6 to 0.8 is substantial agreement, and greater than 0.8 is almost perfect agreement. Spearman’s rank correlation coefficient is interpreted as follows: below 0.3 indicates little if any (linear) correlation, between 0.3 and 0.5 indicates low correlation, between 0.5 and 0.7 indicates moderate correlation, between 0.7 and 0.9 indicates high correlation, and 0.9 or more indicates very high correlation. A p-value of <.05 was considered statistically significant. Statistical analyses were performed using SAS ver. 9.4 (SAS Institute Inc., Cary, NC, USA).
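The cut-off agreement statistics named above (OPA, positive and negative percent agreement, and Cohen's κ) can be sketched for one pathologist against the gold standard as follows. This is an illustrative Python sketch, not the SAS code used in the study. The function names and the six example scores are hypothetical, and the sketch assumes that positive and negative percent agreement are computed relative to the gold standard calls.

```python
import math


def binarize(tps_values, cutoff):
    """Mark each case positive when its TPS (in percent) reaches the cut-off."""
    return [tps >= cutoff for tps in tps_values]


def agreement_at_cutoff(observer_tps, reference_tps, cutoff):
    """Return OPA, PPA, NPA, and Cohen's kappa for one observer vs. the reference."""
    obs = binarize(observer_tps, cutoff)
    ref = binarize(reference_tps, cutoff)
    if len(obs) != len(ref) or not obs:
        raise ValueError("need two non-empty score lists of equal length")
    tp = sum(o and r for o, r in zip(obs, ref))          # both positive
    tn = sum(not o and not r for o, r in zip(obs, ref))  # both negative
    fp = sum(o and not r for o, r in zip(obs, ref))      # observer positive only
    fn = sum(not o and r for o, r in zip(obs, ref))      # reference positive only
    n = len(obs)
    opa = (tp + tn) / n
    ppa = tp / (tp + fn) if tp + fn else math.nan  # relative to reference positives
    npa = tn / (tn + fp) if tn + fp else math.nan  # relative to reference negatives
    # Agreement expected by chance from the marginal positive/negative rates.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    kappa = (opa - p_chance) / (1 - p_chance) if p_chance < 1 else math.nan
    return {"OPA": opa, "PPA": ppa, "NPA": npa, "kappa": kappa}


def interpret_kappa(kappa):
    """Apply the kappa bands stated in the Statistical analysis paragraph."""
    if kappa <= 0.4:
        return "poor to fair"
    if kappa <= 0.6:
        return "moderate"
    if kappa <= 0.8:
        return "substantial"
    return "almost perfect"


# Hypothetical TPS values (percent) for six cases.
reference = [0, 0, 5, 30, 60, 80]
observer = [0, 2, 5, 30, 40, 80]
result_1 = agreement_at_cutoff(observer, reference, cutoff=1)
result_50 = agreement_at_cutoff(observer, reference, cutoff=50)
```

In this toy example a single discordant case at each cut-off gives the same OPA (5/6) and κ (4/7) at both cut-offs, but the discordance appears as reduced negative percent agreement at 1% and reduced positive percent agreement at 50%.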
Ethics statement: This study was approved by the Institutional Review Board of Inje University Ilsan Paik Hospital with a waiver of informed consent (2018-08-009).

Interobserver reproducibility: Among 107 samples, 22 samples (20.6%) had a TPS <1%, 40 samples (37.4%) had a TPS between 1% and 49%, and 45 samples (42.1%) had a TPS ≥50%. The ICC for TPS was 0.902±0.058, indicating almost perfect agreement (Table 1). Weighted κ coefficient for the 3-step assessment was 0.748±0.093, indicating substantial agreement. Using ≥1% stained tumor cells as the cut-off for a positive test, the κ coefficient was calculated as 0.633±0.111, and OPA was 86.2±5.5% (Table 2). The κ coefficient was 0.834±0.095, and the OPA was 92.1±4.4% for cut-off ≥50%. These results indicate substantial agreement for 1% and almost perfect agreement for 50% cut-offs. The interobserver reproducibility was greater at the 50% cut-off than at the 1% cut-off.
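A weighted κ for the 3-step assessment gives partial credit when two ratings differ by one level rather than two. The following Python sketch shows one common formulation. The weighting scheme used in the study is not stated, so the linear weights used by default here (with quadratic weights as an option) are an assumption, as are the function name and the ten example ratings.

```python
import math


def weighted_kappa(observer, reference, n_levels=3, quadratic=False):
    """Weighted kappa for ordered levels coded 0 .. n_levels - 1.

    Disagreement weights are |i - j| / (n_levels - 1), optionally squared.
    With n_levels=2 this reduces to the unweighted Cohen's kappa.
    """
    if len(observer) != len(reference) or not observer:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(observer)
    # Cross-tabulate observer level (row) against reference level (column).
    table = [[0] * n_levels for _ in range(n_levels)]
    for o, r in zip(observer, reference):
        table[o][r] += 1
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(n_levels)) for j in range(n_levels)]
    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(n_levels):
        for j in range(n_levels):
            weight = abs(i - j) / (n_levels - 1)
            if quadratic:
                weight *= weight
            observed_disagreement += weight * table[i][j] / n
            expected_disagreement += weight * row_totals[i] * col_totals[j] / n**2
    if expected_disagreement == 0:
        return math.nan  # every rating falls in a single level
    return 1.0 - observed_disagreement / expected_disagreement


# Hypothetical levels for ten cases: 0 = <1%, 1 = 1%-49%, 2 = >=50%.
reference_levels = [0, 0, 1, 1, 1, 2, 2, 2, 2, 2]
observer_levels = [0, 1, 1, 1, 2, 2, 2, 2, 2, 2]
kappa_linear = weighted_kappa(observer_levels, reference_levels)
```

Here both discordant cases differ by a single level, so each carries a weight of 0.5 and the linearly weighted κ is 1 − 0.10/0.38 ≈ 0.737.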
Factors influencing interobserver reproducibility: There was a significant association between interobserver reproducibility and specimen type (Table 3). The ICC for TPS was significantly higher in the resection group (0.926±0.050) than in the EBUS-TBNA group (0.887±0.099). The κ coefficient for the 3-step assessment was significantly higher in the biopsy group (0.776±0.104) than in the EBUS-TBNA group (0.669±0.089). The EBUS-TBNA group showed almost perfect agreement for the 50% cut-off, but fair agreement for the 1% cut-off.
ICCs for TPS were influenced by histologic subtype (Table 4). The squamous cell carcinoma group showed higher agreement than the adenocarcinoma group. There was no significant difference between κ coefficients for the 3-step assessment.
There was a significant association between interobserver reproducibility and experience (formal PD-L1 training, more experience with the PD-L1 test, and longer practice duration in surgical pathology). The ICC for TPS was significantly higher in the trained group (0.922±0.034) than in the untrained group (0.875±0.074) (Table 5). The κ coefficient for the 3-step assessment was also significantly higher in the trained group (0.776±0.079) than in the untrained group (0.709±0.101). There was low linear correlation between ICC for TPS and experience in PD-L1 assessment (Spearman correlation coefficient=0.422) (Table 6). There was no correlation between the κ coefficient for the 3-step assessment and experience in PD-L1 assessment. The ICC for TPS was significantly higher in the expert group (0.919±0.034) than in the fellow group (0.859±0.088) (Table 1). The κ coefficient for the 3-step assessment was also significantly higher in the expert group (0.772±0.073) than in the fellow group (0.680±0.115). The κ coefficient for the 3-step assessment was significantly different in biopsy specimens between the expert group (0.807±0.034) and the fellow group (0.693±0.129) (p=0.0013). In EBUS-TBNA and resection specimens, the κ coefficient of the expert group was slightly higher than that of the fellow group, but the difference between the two groups was not statistically significant. There were no differences in κ coefficients at 1% and 50% cut-offs between the expert group and fellow group (Table 2).
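The experience analysis relates each pathologist's concordance to a continuous observer factor using Spearman's rank correlation. A minimal Python sketch is given below, computing the coefficient as the Pearson correlation of average ranks so that ties are handled. The function names and the six example data points are hypothetical and are not study data. Applying the interpretation bands to the magnitude of the coefficient is also an assumption.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared_rank
        pos = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two items")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return math.nan  # one variable is constant
    return cov / math.sqrt(var_x * var_y)


def interpret_spearman(rho):
    """Apply the bands from the Statistical analysis paragraph to |rho|."""
    size = abs(rho)
    if size < 0.3:
        return "little if any"
    if size < 0.5:
        return "low"
    if size < 0.7:
        return "moderate"
    if size < 0.9:
        return "high"
    return "very high"


# Hypothetical data: practice duration (years) and per-pathologist ICC.
practice_years = [1, 3, 5, 13, 20, 37]
icc_values = [0.80, 0.86, 0.84, 0.91, 0.93, 0.92]
rho = spearman_rho(practice_years, icc_values)
```

With the bands above, the reported coefficients of 0.422 and 0.477 fall in the low range and 0.527 in the moderate range.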

DISCUSSION

We investigated the interobserver reproducibility of assessment of PD-L1 expression in NSCLC and observed good interobserver agreement for PD-L1 scoring. Pathologists were highly concordant for TPS with an ICC of 0.902. Rimm et al. [8] examined the interobserver reproducibility for 22C3, 28-8, SP142, and E1L3N PD-L1 assays. Ninety samples were assessed by 13 pathologists. ICC was 0.882 for the 22C3 assay. The ICCs for 28-8, SP142, and E1L3N assays were 0.832, 0.869, and 0.859, respectively, showing high concordance. The findings suggest that PD-L1 assay is a reliable method for assessing PD-L1 expression in tumor cells.

We report good interobserver agreement for 1% and 50% cut-offs with κ coefficients of 0.633 and 0.834, respectively (Table 7). Cooper et al. [9] investigated the interobserver reproducibility for assessment of the 22C3 PD-L1 assay. Two separate sample sets of 60 samples each were designed for the 1% and 50% cut-offs that contained equally distributed PD-L1 positive and negative samples. The sample set for the 1% cut-off contained 10 positive samples close to the cut-off, and the sample set for the 50% cut-off contained 20 negative or positive samples close to the cut-off. Ten pathologists assessed a sample set of 108 samples obtained after pooling the 1% cut-off and the 50% cut-off sample sets together. The κ coefficient was 0.68 for the 1% cut-off and 0.58 for the 50% cut-off. Brunnström et al. [10] examined interobserver reproducibility for the 28-8, 22C3, SP142, and SP263 assays. Seven pathologists assessed 55 samples. For the 22C3 assay, 2%–20% of cases were differently classified by any one pathologist compared to the consensus at the 1% cut-off, and 0%–2% of the cases were differently classified by any one pathologist compared to the consensus at the 50% cut-off. For all four assays, there were 0%–20% and 0%–5% differently classified cases at 1% and 50% cut-offs, respectively. Variation in the number of differently classified cases by any one pathologist compared to the consensus was statistically significant between cut-offs. The number of differently classified cases was significantly lower for the SP142 assay compared to that for the other three assays. This difference was probably because there were many obviously negative cases for SP142 [10]. Rimm et al. [8] reported that κ coefficients for the mean of all four assays were 0.537 and 0.749 at 1% and 50% cut-offs, respectively. Interobserver concordance for the PD-L1 assay was higher at the 50% cut-off than at the 1% cut-off, except in the results of Cooper et al. [9]. A possible explanation is that Cooper et al. [9] used sample sets artificially enriched with samples close to the cut-offs, which is not possible in clinical practice.

The interobserver concordance for TPS and the 3-step assessment correlated with practice duration and was higher in the expert group than in the fellow group. This is probably because pulmonary pathologists were familiar with the varied morphologies of cancer and cancer-associated immune cells and had experience assessing other immunohistochemistry biomarkers, such as anaplastic lymphoma kinase, human epidermal growth factor receptor 2, and estrogen receptor [9]. There was no difference in concordance between the expert group and fellow group for 1% and 50% cut-offs, similar to the results of Brunnström et al. [10]. Experience in conducting the PD-L1 test impacted interobserver concordance for TPS but not for 3-step assessment. These analyses suggested that the currently used 1% and 50% cut-offs are relatively reliable regardless of pathologist experience.

In our study, 1-day training improved interobserver reproducibility, whereas Cooper et al. [9] reported that 1-hour training had little or no impact on interobserver reproducibility. Therefore, 1-day training may be more effective than 1-hour training.

Tumor histologic subtype and specimen type influenced interobserver reproducibility, which has not previously been reported. EBUS-TBNA showed lower interobserver agreement than resection and biopsy specimens at the 1% cut-off. Although squamous cell carcinoma showed higher agreement for TPS than adenocarcinoma, no significant difference was observed for the 3-step assessment. Strong membranous staining of macrophages, non-specific cytoplasmic staining of tumor cells, weak and/or partial membranous staining of tumor cells, heterogeneous staining intensities, and patchy staining are well-known interpretation pitfalls in assessing any PD-L1 assay (Fig. 2) [8-10]. At the 1% cut-off, misinterpretation of very few or even single cells may lead to false positive or false negative results. Training and an external quality assessment program should be organized with special focus on difficult cases and on assessing the 1% cut-off. Guidelines including examples and strategies for difficult cases should be developed.

The limitations of our study include the lack of a “true gold standard” and of outcome data for therapy. However, the gold standard assessment was undertaken by highly trained, experienced specialists. The strengths of the present study are as follows: it is more representative of real clinical practice than previous studies; whole sections from many samples were used to evaluate the reproducibility of PD-L1 assays; more observers participated than in previous studies; and the participating pathologists worked in different centers and had different levels of experience.

In conclusion, our results indicate that PD-L1 staining provides a reliable basis for decisions regarding anti-PD-1 therapy in NSCLC. Although interobserver agreement for the 1% cut-off was relatively lower, it was substantial and acceptable. Better training, longer assay experience, and an external quality assessment program could improve interobserver reproducibility for the 1% cut-off.

Author contributions

Conceptualization: YLC.

Data curation: SC, HKP, YLC.

Formal analysis: HKP, YLC.

Funding acquisition: SC, SJJ.

Investigation: SC, HKP, YLC.

Methodology: SC, HKP, YLC.

Project administration: YLC.

Resources: YLC.

Supervision: YLC, SJJ.

Validation: SC, HKP, YLC.

Visualization: SC, HKP, YLC.

Writing—original draft: SC.

Writing—review & editing: SC, YLC, SJJ.

Conflicts of Interest

The authors declare that they have no potential conflicts of interest.

Funding

This research was supported by 2017 The Korean Society of Pathologists Grant.

Fig. 1.

Programmed cell death-ligand 1 (PD-L1) immunohistochemistry results in non-small cell lung cancer patients using 22C3 antibody on fully automated Dako Autostainer Link 48 platform. (A) Negative staining for PD-L1. (B) PD-L1 tumor proportion score (TPS) of 10%. (C) PD-L1 TPS of 70%. (D) PD-L1 TPS of 100%.

Fig. 2.

(A) Few tumor cells show weak and partial membrane staining for programmed cell death-ligand 1 (PD-L1) antibody. (B) Tumor associated immune cells show strong staining with lack of PD-L1 staining in tumor cells. (C) Tumor cells show heterogeneous membrane staining pattern with various staining intensities. (D) Tumor shows patchy membrane staining pattern.

Table 1.

Intraclass correlation coefficient of the tumor proportion score

	ICC for TPS	p-value	Weighted Kappa for 3 tiered assessment	p-value
Total (n=26)	0.902 ± 0.058	.037	0.748 ± 0.093	.043
Expert (n=19)	0.919 ± 0.034		0.772 ± 0.073
Fellow (n=7)	0.859 ± 0.088		0.680 ± 0.115

ICC, Intraclass correlation coefficient; TPS, tumor proportion score.

Table 2.

Interobserver reproducibility of the cut-off

	1% Cut-off	p-value	50% Cut-off	p-value
Cohen’s κ coefficient
Total (n = 26)	0.633 ± 0.111	.068	0.834 ± 0.095	.082
Expert (n = 19)	0.656 ± 0.104		0.858 ± 0.072
Fellow (n = 7)	0.570 ± 0.113		0.768 ± 0.123
OPA (%)
Total (n = 26)	86.2 ± 5.5	.067	92.1 ± 4.4	.075
Expert (n = 19)	87.3 ± 4.6		93.2 ± 3.3
Fellow (n = 7)	83.2 ± 6.8		89.1 ± 5.5
NPA (%)
Total (n = 26)	85.7 ± 16.0	.150	95.4 ± 4.3	.884
Expert (n = 19)	86.8 ± 17.1		95.3 ± 4.5
Fellow (n = 7)	82.5 ± 13.5		95.4 ± 4.0
PPA (%)
Total (n = 26)	86.3 ± 8.4	.385	87.5 ± 12.0	.149
Expert (n = 19)	87.4 ± 7.6		90.2 ± 9.2
Fellow (n = 7)	83.4 ± 1.2		80.3 ± 16.4

OPA, overall percent agreement; NPA, negative percent agreement; PPA, positive percent agreement.

Table 3.

Impact of specimen type on interobserver reproducibility

	EBUS-TBNA (n = 19)	Biopsy (n = 66)	Resection (n = 22)	p-value
ICC for TPS	0.887 ± 0.099	0.899 ± 0.061	0.926 ± 0.050	.023
κ for 3-step evaluation	0.669 ± 0.089	0.776 ± 0.104	0.716 ± 0.126	.001
κ for 1% cutoff	0.383 ± 0.134	0.713 ± 0.137	0.538 ± 0.209
κ for 50% cutoff	0.832 ± 0.128	0.830 ± 0.097	0.839 ± 0.118

EBUS-TBNA, Endobronchial ultrasound-guided transbronchial needle aspiration; ICC, intraclass correlation coefficient; TPS, tumor proportion score.

Table 4.

Impact of histologic subtype on interobserver reproducibility

	SCC (n = 33)	ADC (n = 66)	p-value
ICC for TPS	0.877 ± 0.071	0.917 ± 0.053	.024
κ for 3-step evaluation	0.757 ± 0.107	0.753 ± 0.089	.073

SCC, squamous cell carcinoma; ADC, adenocarcinoma; ICC, intraclass correlation coefficient; TPS, tumor proportion score.

Table 5.

Impact of training on interobserver reproducibility

	Training
	Yes (n = 15)	No (n = 11)	p-value
ICC for TPS	0.922 ± 0.034	0.875 ± 0.074	.043
κ for 3-step assessment	0.776 ± 0.079	0.709 ± 0.101	.026

ICC, intraclass correlation coefficient; TPS, tumor proportion score.

Table 6.

Impact of experience on interobserver reproducibility

	ICC for TPS		κ coefficient for 3-step evaluation
	Spearman correlation	p-value	Spearman correlation	p-value
PD-L1 test experience	0.422	.032	0.277	.170
Practice duration	0.477	.014	0.527	.005

ICC, intraclass correlation coefficient; TPS, tumor proportion score; PD-L1, programmed cell death-ligand 1.

Table 7.

Summary of the interobserver reproducibility study

	Sample	Observer	1% Cut-off	50% Cut-off
Rimm et al. [8]	90	13	κ = 0.537^a	κ = 0.749^a
Cooper et al. [9]	108	10	κ = 0.68	κ = 0.58
Brunnström et al. [10]	55	7	2%–20%^ab	0%–2%^ab
Current study	107	27	κ = 0.633	κ = 0.834

^aFor the mean of all four assays (22C3, 28-8, SP142, and E1L3N);

^bDifferently classified cases by any one pathologist compared with consensus.

REFERENCES

1. Guan J, Lim KS, Mekhail T, Chang CC. Programmed death ligand-1 (PD-L1) expression in the programmed death receptor-1 (PD-1)/PD-L1 blockade: a key player against various cancers. Arch Pathol Lab Med 2017; 141: 851-61.
2. Reck M, Rodríguez-Abreu D, Robinson AG, et al. Pembrolizumab versus chemotherapy for PD-L1-positive non-small-cell lung cancer. N Engl J Med 2016; 375: 1823-33.
3. Langer CJ, Gadgeel SM, Borghaei H, et al. Carboplatin and pemetrexed with or without pembrolizumab for advanced, non-squamous non-small-cell lung cancer: a randomised, phase 2 cohort of the open-label KEYNOTE-021 study. Lancet Oncol 2016; 17: 1497-508.
4. Gettinger S, Rizvi NA, Chow LQ, et al. Nivolumab monotherapy for first-line treatment of advanced non-small-cell lung cancer. J Clin Oncol 2016; 34: 2980-7.
5. Roach C, Zhang N, Corigliano E, et al. Development of a companion diagnostic PD-L1 immunohistochemistry assay for pembrolizumab therapy in non-small-cell lung cancer. Appl Immunohistochem Mol Morphol 2016; 24: 392-7.
6. Cree IA, Booton R, Cane P, et al. PD-L1 testing for lung cancer in the UK: recognizing the challenges for implementation. Histopathology 2016; 69: 177-86.
7. Herbst RS, Baas P, Kim DW, et al. Pembrolizumab versus docetaxel for previously treated, PD-L1-positive, advanced non-small-cell lung cancer (KEYNOTE-010): a randomised controlled trial. Lancet 2016; 387: 1540-50.
8. Rimm DL, Han G, Taube JM, et al. A prospective, multi-institutional, pathologist-based assessment of 4 immunohistochemistry assays for PD-L1 expression in non-small cell lung cancer. JAMA Oncol 2017; 3: 1051-8.
9. Cooper WA, Russell PA, Cherian M, et al. Intra- and interobserver reproducibility assessment of PD-L1 biomarker in non-small cell lung cancer. Clin Cancer Res 2017; 23: 4569-77.
10. Brunnström H, Johansson A, Westbom-Fremer S, et al. PD-L1 immunohistochemistry in clinical diagnostics of lung cancer: interpathologist variability is higher than assay variability. Mod Pathol 2017; 30: 1411-21.

Annals of Diagnostic Pathology.2021; 50: 151590. CrossRef
Interobserver agreement in programmed cell death‐ligand 1 immunohistochemistry scoring in nonsmall cell lung carcinoma cytologic specimens
William Sinclair, Peter Kobalka, Rongqin Ren, Boulos Beshai, Abberly A. Lott Limbach, Lai Wei, Ping Mei, Zaibo Li
Diagnostic Cytopathology.2021; 49(2): 219. CrossRef
Automated PD-L1 Scoring for Non-Small Cell Lung Carcinoma Using Open-Source Software
Julia R. Naso, Tetiana Povshedna, Gang Wang, Norbert Banyi, Calum MacAulay, Diana N. Ionescu, Chen Zhou
Pathology and Oncology Research.2021;[Epub] CrossRef
The Immunohistochemical Expression of Programmed Death Ligand 1 (PD-L1) Is Affected by Sample Overfixation
Angels Barberà, Ruth Marginet Flinch, Montserrat Martin, Jose L. Mate, Albert Oriol, Fina Martínez-Soler, Tomas Santalucia, Pedro L. Fernández
Applied Immunohistochemistry & Molecular Morphology.2021; 29(1): 76. CrossRef
Programmed cell death-ligand 1 assessment in urothelial carcinoma: prospect and limitation
Kyu Sang Lee, Gheeyoung Choe
Journal of Pathology and Translational Medicine.2021; 55(3): 163. CrossRef
Comparison of Semi-Quantitative Scoring and Artificial Intelligence Aided Digital Image Analysis of Chromogenic Immunohistochemistry
János Bencze, Máté Szarka, Balázs Kóti, Woosung Seo, Tibor G. Hortobágyi, Viktor Bencs, László V. Módis, Tibor Hortobágyi
Biomolecules.2021; 12(1): 19. CrossRef
Immunization against ROS1 by DNA Electroporation Impairs K-Ras-Driven Lung Adenocarcinomas
Federica Riccardo, Giuseppina Barutello, Angela Petito, Lidia Tarone, Laura Conti, Maddalena Arigoni, Chiara Musiu, Stefania Izzo, Marco Volante, Dario Livio Longo, Irene Fiore Merighi, Mauro Papotti, Federica Cavallo, Elena Quaglino
Vaccines.2020; 8(2): 166. CrossRef
Utility of PD-L1 testing on non-small cell lung cancer cytology specimens: An institutional experience with interobserver variability analysis
Oleksandr Kravtsov, Christopher P. Hartley, Yuri Sheinin, Bryan C. Hunt, Juan C. Felix, Tamara Giorgadze
Annals of Diagnostic Pathology.2020; 48: 151602. CrossRef


Fig. 1. Programmed cell death-ligand 1 (PD-L1) immunohistochemistry results in non-small cell lung cancer patients using the 22C3 antibody on the fully automated Dako Autostainer Link 48 platform. (A) Negative staining for PD-L1. (B) PD-L1 tumor proportion score (TPS) of 10%. (C) PD-L1 TPS of 70%. (D) PD-L1 TPS of 100%.

Fig. 2. (A) Few tumor cells show weak, partial membrane staining for programmed cell death-ligand 1 (PD-L1). (B) Tumor-associated immune cells show strong staining, whereas the tumor cells lack PD-L1 staining. (C) Tumor cells show a heterogeneous membrane staining pattern with various staining intensities. (D) The tumor shows a patchy membrane staining pattern.


Table 1. Intraclass correlation coefficient of the tumor proportion score

	ICC for TPS	p-value	Weighted κ for 3-tiered assessment	p-value
Total (n = 26)	0.902 ± 0.058	.037	0.748 ± 0.093	.043
Expert (n = 19)	0.919 ± 0.034		0.772 ± 0.073	
Fellow (n = 7)	0.859 ± 0.088		0.680 ± 0.115	

ICC, intraclass correlation coefficient; TPS, tumor proportion score.

Table 2. Interobserver reproducibility of the cut-off

	1% cut-off	p-value	50% cut-off	p-value
Cohen’s κ coefficient
Total (n = 26)	0.633 ± 0.111	.068	0.834 ± 0.095	.082
Expert (n = 19)	0.656 ± 0.104		0.858 ± 0.072	
Fellow (n = 7)	0.570 ± 0.113		0.768 ± 0.123	
OPA (%)
Total (n = 26)	86.2 ± 5.5	.067	92.1 ± 4.4	.075
Expert (n = 19)	87.3 ± 4.6		93.2 ± 3.3	
Fellow (n = 7)	83.2 ± 6.8		89.1 ± 5.5	
NPA (%)
Total (n = 26)	85.7 ± 16.0	.150	95.4 ± 4.3	.884
Expert (n = 19)	86.8 ± 17.1		95.3 ± 4.5	
Fellow (n = 7)	82.5 ± 13.5		95.4 ± 4.0	
PPA (%)
Total (n = 26)	86.3 ± 8.4	.385	87.5 ± 12.0	.149
Expert (n = 19)	87.4 ± 7.6		90.2 ± 9.2	
Fellow (n = 7)	83.4 ± 10.2		80.3 ± 16.4	

OPA, overall percent agreement; NPA, negative percent agreement; PPA, positive percent agreement.

Table 3. Impact of specimen type on interobserver reproducibility

	EBUS-TBNA (n = 19)	Biopsy (n = 66)	Resection (n = 22)	p-value
ICC for TPS	0.887 ± 0.099	0.899 ± 0.061	0.926 ± 0.050	.023
κ for 3-step evaluation	0.669 ± 0.089	0.776 ± 0.104	0.716 ± 0.126	.001
κ for 1% cut-off	0.383 ± 0.134	0.713 ± 0.137	0.538 ± 0.209	
κ for 50% cut-off	0.832 ± 0.128	0.830 ± 0.097	0.839 ± 0.118	

EBUS-TBNA, endobronchial ultrasound-guided transbronchial needle aspiration; ICC, intraclass correlation coefficient; TPS, tumor proportion score.

Table 4. Impact of histologic subtype on interobserver reproducibility

	SCC (n = 33)	ADC (n = 66)	p-value
ICC for TPS	0.877 ± 0.071	0.917 ± 0.053	.024
κ for 3-step evaluation	0.757 ± 0.107	0.753 ± 0.089	.073

SCC, squamous cell carcinoma; ADC, adenocarcinoma; ICC, intraclass correlation coefficient; TPS, tumor proportion score.

Table 5. Impact of training on interobserver reproducibility

	Training: Yes (n = 15)	Training: No (n = 11)	p-value
ICC for TPS	0.922 ± 0.034	0.875 ± 0.074	.043
κ for 3-step assessment	0.776 ± 0.079	0.709 ± 0.101	.026

ICC, intraclass correlation coefficient; TPS, tumor proportion score.

Table 6. Impact of experience on interobserver reproducibility

	ICC for TPS: Spearman correlation	ICC for TPS: p-value	κ for 3-step evaluation: Spearman correlation	κ for 3-step evaluation: p-value
PD-L1 test experience	0.422	.032	0.277	.170
Practice duration	0.477	.014	0.527	.005

ICC, intraclass correlation coefficient; TPS, tumor proportion score; PD-L1, programmed cell death-ligand 1.

Table 7.

Study	Samples	Observers	1% cut-off	50% cut-off
Rimm et al. [8]	90	13	κ = 0.537^a	κ = 0.749^a
Cooper et al. [9]	108	10	κ = 0.68	κ = 0.58
Brunnström et al. [10]	55	7	2%–20%^ab	0%–2%^ab
Current study	107	27	κ = 0.633	κ = 0.834
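
As a minimal illustration of how the cut-off statistics reported in Table 2 (Cohen's κ, OPA, NPA, and PPA) can be derived from tumor proportion scores, the Python sketch below dichotomizes two sets of TPS values at a cut-off and compares them. It is not the authors' analysis code. The function names, the choice of comparing one observer against a single reference score, and the ten TPS values are all hypothetical and serve only to show the arithmetic.

```python
def dichotomize(tps_scores, cutoff):
    """Label each TPS value (in %) as positive (True) when it is >= cutoff."""
    return [score >= cutoff for score in tps_scores]


def agreement_stats(observer, reference):
    """Return OPA, PPA, NPA (all in %) and Cohen's kappa for two binary lists."""
    if len(observer) != len(reference) or not observer:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(observer)
    pairs = list(zip(observer, reference))
    tp = sum(o and r for o, r in pairs)              # both positive
    tn = sum((not o) and (not r) for o, r in pairs)  # both negative
    fp = sum(o and (not r) for o, r in pairs)        # observer positive only
    fn = sum((not o) and r for o, r in pairs)        # reference positive only

    p_observed = (tp + tn) / n
    # Chance agreement from the marginal positive/negative rates of each rater.
    p_expected = ((tp + fp) / n) * ((tp + fn) / n) + ((tn + fn) / n) * ((tn + fp) / n)
    if p_expected == 1:
        kappa = float("nan")  # kappa is undefined when chance agreement is total
    else:
        kappa = (p_observed - p_expected) / (1 - p_expected)
    return {
        "OPA": 100 * p_observed,
        "PPA": 100 * tp / (tp + fn) if (tp + fn) else float("nan"),
        "NPA": 100 * tn / (tn + fp) if (tn + fp) else float("nan"),
        "kappa": kappa,
    }


# Hypothetical TPS values (%) for ten cases; not data from the study.
reference_tps = [0, 0, 1, 5, 10, 30, 50, 60, 80, 100]
observer_tps = [0, 1, 0, 5, 20, 40, 45, 70, 80, 90]

stats_1 = agreement_stats(dichotomize(observer_tps, 1), dichotomize(reference_tps, 1))
stats_50 = agreement_stats(dichotomize(observer_tps, 50), dichotomize(reference_tps, 50))
print("1% cut-off:", stats_1)
print("50% cut-off:", stats_50)
```

In this toy example the two raters disagree on two cases near the 1% cut-off and on one case near the 50% cut-off, so κ is lower at 1% than at 50% even though the underlying scores are close.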
