- 1School of Computer Science and Technology, Faculty of Electronics and Information Engineering, Xi’an Jiaotong University, Xi’an, China
- 2School of Management, Hefei University of Technology, Hefei, China
- 3The Ministry of Education Key Laboratory of Process Optimization and Intelligent Decision-Making, Hefei University of Technology, Hefei, China
- 4Nanjing Geneseeq Technology Inc., Nanjing, China
- 5School of Public Health, Nanjing Medical University, Nanjing, China
- 6State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-Sen University Cancer Center, Guangzhou, China
Tumor mutation burden (TMB) is a recognized stratification biomarker for immunotherapy. Nevertheless, the general TMB-high threshold is unstandardized due to severe clinical controversies, with the underlying cause being inconsistency between multiple assessment criteria and imprecision of the TMB value. The existing methods for determining TMB thresholds all consider only a single dimension of clinical benefit and ignore the interference of the TMB error. Our research aims to determine the TMB threshold optimally based on multifaceted clinical efficacies accounting for measurement errors. We report a multi-endpoint joint model as a generalized method for inferring the TMB thresholds, facilitating consistent statistical inference using an iterative numerical estimation procedure considering mis-specified covariates. The model optimizes the division by combining objective response rate and time-to-event outcomes, which may be interrelated due to some shared traits. We augment previous works by enabling subject-specific random effects to govern the communication among distinct endpoints. Our simulations show that the proposed model has advantages over the standard model in terms of precision and stability in parameter estimation and threshold determination. To validate the feasibility of the proposed thresholds, we pool a cohort of 73 patients with non-small-cell lung cancer and 64 patients with nasopharyngeal carcinoma who underwent anti-PD-(L)1 treatment, as well as validation cohorts of 943 patients. Analyses revealed that our approach could grant clinicians a holistic efficacy assessment, culminating in a robust determination of the TMB screening threshold for superior patients. Our methodology has the potential to yield innovative insights into therapeutic selection and support precision immuno-oncology.
Introduction
Immune checkpoint inhibitor (ICI) therapy has emerged as a promising strategy with confirmed efficacy for advanced or metastatic tumors (Bracarda et al., 2015; Motzer et al., 2015; Chiang et al., 2020; Kuryk et al., 2020; Wołącewicz et al., 2020; Majc et al., 2021). Tumor mutation burden (TMB, defined as the number of somatic mutations per mega-base) is a recognized biomarker of sensitivity to ICIs (Hellmann et al., 2018a; Cristescu et al., 2018; Bai et al., 2020; Wang et al., 2020), according to the underlying connection between the increasing number of somatic mutations and the neo-antigen that the activated T cells can recognize (Van Rooij et al., 2013), enhancing the tumor immunogenicity (Pardoll, 2012; Conway et al., 2018). A high TMB tends to trigger a favorable prognosis (Yarchoan et al., 2017; Legrand et al., 2018), which has been observed in urothelial carcinoma (Rosenberg et al., 2016), small-cell-lung cancer (Hellmann et al., 2018b), non-small-cell lung cancer (NSCLC) (Lim et al., 2015; Rizvi et al., 2015; Carbone et al., 2017; Hellmann et al., 2018c; Singal et al., 2019), and melanoma (Johnson et al., 2016; Goodman et al., 2017). TMB is a suggested test for patients undergoing immunotherapy by both NCCN and FDA (Lemery et al., 2017; Boyiadzis et al., 2018; Subbiah et al., 2020).
In clinical practices, the TMB threshold is a baseline for identifying patients with potential ICI benefits (Samstein et al., 2019). TMB thresholds are typically determined in two ways: either grouped by quartiles, which is obviously imprecise (Campesato et al., 2015; Colli et al., 2016; Riaz et al., 2017; Miao et al., 2018; Wood et al., 2020), or numerical thresholds generated from statistical tests of significance based on efficacy endpoints (Goodman et al., 2017). Notably, among these previous statistical studies, retrospective evaluations of efficacy are limited to a single dimension, most regularly the response. The primary endpoints for immuno-oncology include objective tumor response and time-to-event (TTE), where the TMB biomarker has been observed to be associated with both (Cao et al., 2019). Such diverse efficacy evaluation metrics have sparked controversy in the threshold standardization (Goodman et al., 2017). When assessments base on different endpoints over the same cohort, inconsistent thresholds arise, and clinicians are left inconclusive about which one to choose. Furthermore, clinical decisions need a comprehensive review of the diseased multifaceted efficacy rather than a single endpoint that exhibits a partial treatment effect. Therefore, there is an urgent clinical need for inference on multiple endpoints to derive a comprehensive TMB threshold. However, it is computationally challenging for two reasons. First, if several individual endpoints are to be inferred simultaneously, the intersection cannot be taken directly. Instead, some adjustment for multiple testing is required to control the familywise type I error rate (FWER) (Ristl et al., 2019). Constructing the joint distribution of different endpoints is preferable to the straightforward application of Bonferroni inequalities in terms of maximizing the utilization of available information, providing unbiased results, and allowing for statistical alpha levels closer to nominal levels while boosting the statistical power (Phillips et al., 2003; Asar et al., 2015; Guidance 2017). Secondly, the existing joint modeling studies have mostly taken a perspective on analyzing longitudinal biochemical markers within the survival analysis framework. Whereas the volatility of tumor genomic traits in immunotherapy trials is quite limited, we are more concerned with the within-subject dependence between different endpoints. Binary tumor responses conforming to the Bernoulli distribution do not satisfy the premise of a normal distribution in linear regression. The existing models have limited capacity to comprehend possible shared biologic processes on endpoints of tumor remission with survival and are not applicable to scenarios of immune efficacy investigation.
Moreover, the imprecision of TMB values is another source of threshold controversies (Wood et al., 2020). Despite the different calculation methods of TMB, the accuracy of variant callings can never reach 100% due to technical limitations (Xu et al., 2014; Alioto et al., 2015), and TMB always harbors measurement errors. Existing models neglect the difference between the actual and observed values of TMB, which lead to significant bias in statistical inference (Campesato et al., 2015; Colli et al., 2016; Goodman et al., 2017; Riaz et al., 2017; Miao et al., 2018; Wood et al., 2020). Parameter inference for statistical models is conventionally obtained by maximum likelihood estimation (MLE), and unbiasedness of the score function for likelihood (i.e., expectation equal to zero) is a critical criterion for ensuring estimate consistency. With the accurate TMB values being unascertainable, the observations TMB∗ (
Therefore, we report a generalized method for optimizing the identification of TMB-positive thresholds. Our method integrates binary response and continuous TTE endpoints to provide a comprehensive efficacy assessment, while, to our best knowledge, it is among the first statistical approaches accounting for TMB measurement errors. To verify the viability of the multi-endpoint joint model, we conducted a series of simulation experiments, and the results confirmed our superiority in the accuracy of parameter estimation and fault tolerance of threshold delineation compared with the standard separate regression model. Meanwhile, we gathered a cohort of 73 non-small-cell lung cancer (NSCLC) patients and 64 nasopharyngeal carcinoma (NPC) patients and validation cohorts of 943 patients who underwent ICI treatment to illustrate the applicability of the model across carcinomas. It is known that different cancer types and TMB calculations often yield different thresholds, but we provide here a generalized statistical method applicable for any known scenarios. The data results show that the proposed model can obtain a more comprehensive and robust TMB threshold to support therapeutic refinement for cancer patients. The source code reproduces the figures, and results can be downloaded from https://github.com/YixuanWang1120/TMB_JM.
Materials and Methods
To comprehensively determine TMB-positivity thresholds from multifaceted efficacy analyses while considering inevitable measurement errors, we present a general approach for the simultaneous joint modeling of multiple endpoints, yielding approximately consistent statistical inference for mis-specified covariates by developing an iterative numerical estimation procedure using the corrected-score method. The observed sample information contains the patient’s clinically recorded objective response rate (ORR) and TTE endpoints, other clinical indicators (correctly specified), and the corresponding TMB observations with measurement errors. The data consist of n independent observations of R, T, Δ, Z, and TMB∗, denoting the binary tumor response outcome, continuous survival time, event indicator, vector of accurately measured covariates, and mismeasured TMB, respectively. To simplify, the additive measurement error model relates the true unobserved TMB index to the observed TMB∗: TMB∗ = TMB + e, where
A Joint Model Considers Binary and Continuous Endpoints
For patient
where αz and αm denote the corresponding response regression coefficients, θ represents all unknown parameters in the joint model, and bi denotes the random effect. The exponent of the estimated parameter exp(α) for the logit regression of binary outcomes can be interpreted intuitively as the multiples of change in the odds ratio caused by a one-unit increase in the corresponding variable.
Let Ti denote the observed event time (such as tumor relapses, progression, death, etc.), which is taken as the minimum of the true event time Ui and the censoring time Ci, that is,
where h(t) describes the instantaneous risk for an event in the time interval [t, t + dt) provided survival up to t, while S(t) represents the survival probability. h0(t) is referred to as baseline hazard and follows the Weibull distribution
The maximum likelihood estimates are derived as the modes of the log-likelihood function corresponding to the joint distribution of the observed samples
where the likelihood of the response is:
while the likelihood of the survival is:
By incorporating random effects (Barbieri et al., 2020), it is feasible to jointly model the multiple endpoints and regulate intricate correlations between response probabilities and event times. Then, the joint logarithmic likelihood can be formulated as:
Inference about parameters θ is typically based on the maximization of Eq. 2, while integrals about random effects apparently have no analytical solution. Here, we approximate
where the function f(x) has a unique global maximum at x0. So, the first-order Laplace approximation to the observed-data joint log-likelihood is:
where
and the mode
The difference of Eq. 3 from the previous independent standard regressions lies in that the joint assessment entails examining the endpoint correlations, where
Estimates obtained by maximizing
Based on
where T0 is a pre-specified survival time.
Based on the joint probabilities that characterized the positive prognosis of the patients, we rank them and then label the populations to be analyzed according to the proportion of patients who would potentially benefit for ICI. Ultimately, the proposed joint model can stratify patients into two subgroups according to their TMB levels and the positive prognosis labels using the receiver operating characteristic curve (ROC) to balance the classification performance. Thresholds for the low- and high-TMB groups are selected from the local optima across a range of clinically meaningful values by Yoden Index.
The complete TMB threshold identification procedure based on the aforementioned joint model solved by Laplace approximation is given in Algorithm 1.
Algorithm 1. Identifying TMB threshold without measurement errors
Bias Arising From Measurement Error
Here, we further investigate the negative impact of measurement errors in TMB. The score function in Section 2.1 is unbiased. Base on Eqs. 3, 6,
where
The parameter
What will happen when measurement error exists? We assume the observed TMB∗ is subject to the measurement error model:
As a more specific illustration, we consult the part of survival function:
The additional term
due to the function
Correction of TMB Measurement Error for Threshold Optimization
To reduce the biasing effect caused by measurement errors and obtain a more robust TMB threshold, we integrated the widely applicable corrected score with the joint model, resulting in approximately consistent estimators based on the observed data. A corrected score is a function
which is conditionally unbiased for the true-data score function according to the property of conditional expectation,
Based on Eqs. 4 and 8 and the property of corrected score, we derive a correct
Let
where the complex variate
Furthermore, we obtain the joint corrected-score
We present the joint corrected scores based on the complex variable simulation extrapolation and the property of Eq. 11. Eq. 13 contains
The complete TMB threshold identification procedure based on the aforementioned Laplace approximation and corrected score is given in Algorithm 2.
Algorithm 2. Identifying TMB threshold with measurement errors
Experiments and Results
Simulation Study
In order to assess the performance of the proposed joint model with the corrected-score function, we conducted a series of simulation studies whose primary objective was to assess the fixed effect coefficient estimates and the variance of the random effects. Data are simulated in an oncology trial context, with random effects correlated among patients’ multiple endpoints. In the simulations, we assume 200 patients, i.e.,
Furthermore, we set
TABLE 1. Comparisons of bias and standard errors of estimators between joint model with standard model with varying measurement errors.
According to Table 1, the regression parameter estimates for the two function components perform reasonably well for a variety of measurement error conditions. In the absence of measurement errors, the joint model outperforms ordinary regression models in calculating regression coefficients because it more precisely reflects the potential connections between several endpoints. When considering different levels of measurement errors, the performance of the estimator based on corrected score was significantly superior to that of the naive estimator and only marginally poorer than that of the true-data estimator. Clearly, the performance of the naive estimator deteriorates with increasing error magnitude, which further suggests that the measurement error introduces a more significant bias effect on the parameter estimates. Overall, the results of the simulation experiments support the proposed joint multi-endpoint model and the iterative numerical estimation procedure, as well as the applicability of the associated random effects. Additionally, comparing SE and SD, the precision of the stated standard errors is generally satisfactory. The biases of the joint assessments compared to the standard separate regressions emphasize that the dependence among clinical endpoints could be an important and non-negligible confounder in analyzing the factors determining the treatment effect.
To further exhibit the disturbance of measurement errors on TMB thresholds and the stability of our proposed joint model, we additionally simulated the comparison of efficacy grouped by different TMB thresholds. We simulated the prognosis of a cohort of patients based on the assumption that there is a positive correlation between actual TMB levels and a favorable immunotherapy prognosis, with coefficients set exactly as above. The variance fluctuation of TMB measurement error was set to 1.0. We derive the different thresholds for classifying patients and comparing their efficacy based on the joint statistical inference with the TMB actual values, the quantile method with TMB observations, and the joint-correction statistical inference with TMB observations. The outcomes of the comparison are depicted in Figure 2. We can clearly observe that the discrepancies between the efficacies of different groups are minimized or even reversed (Figure 2C,D) when the patients were classified directly using quartiles in the presence of measurement errors. Contrary to the clinical theory that the higher TMB, the more antitumor immunogenic the patient, patients in the TMB-low subgroup demonstrated greater therapeutic benefit in terms of tumor remission and progression-free survival than those in the TMB-high subgroup. The confounding effect of the measurement errors would dilute the actual link between TMB levels and immunotherapy clinical efficacy (Figures 2A,B), preventing appropriate screening for superior patient populations. In contrast, the bias effect due to measurement errors is reduced when we use the joint model as well as the correction estimation procedure. As shown in Figure 2E,F, the TMB threshold determination based on our proposed method ensures both the validity and a certain degree of error tolerance.
FIGURE 2. Efficacy comparison of patients grouped based on different TMB thresholds. (A) (B) Comparison of ORR and survival curves based on the threshold derived from the joint statistical inference with TMB actual values. (C) (D) Comparison of ORR and survival curves based on median TMB observations. (E) (F) Comparison of ORR and survival curves based on the threshold derived from the joint-correction statistical inference with TMB observed values.
Patient Cohort Characteristics
Sun Yat-sen University Cancer Center recruited 95 NSCLC patients who received anti-PD-(L)1 monotherapy between December 2015 and August 2017, with data collected until January 2019. The study design has already been published (Fang et al., 2019). Between March 2016 and January 2018, R/M NPC patients have enrolled in two single-arm phase I trials (NCT02721589 and NCT02593786), where 128 patients were screened for eligibility. The dose escalation and expansion phases of the study were previously reported (Fang et al., 2018; Ma et al., 2019). Eligible patients aged from 18 to 70 years had histologically or cytologically confirmed locally advanced or metastatic NSCLC or NPC, had an Eastern Cooperative Oncology Group (ECOG) performance-status score of 0 or 1 (on a 5-point scale, with higher numbers indicating greater disability), had at least one measurable lesion according to the Response Evaluation Criteria in Solid Tumors (RECIST version 1.1 (Eisenhauer et al., 2009)), and had failed at least one prior line of systemic therapy. Figure 3 and Supplementary Table S1 depict the distribution of patients’ treatments. Radiographic tumor assessments were taken at the start of the study and every 6 weeks thereafter. The proportion of patients with complete response (CR) and partial response (PR) was known as the ORR. The time from the initial dose until PD or any-cause death was referred to as progression-free survival (PFS). Censored data documented the last radiographic assessment before cut-off, follow-up loss, or treatment change. Overall survival (OS) was defined as the time from the first dosage to death, and patients who remained alive were censored at the date of their last follow-up. The Sun Yat-sen University Cancer Center’s Ethical Review Committee approved this study, which was carried out in conformity with the Declaration of Helsinki. Each patient signed the written informed consent.
FIGURE 3. Patient samples included in the final analysis. (A) Flowchart for NSCLC sample inclusions. Among the 95 patients who underwent anti-PD-(L) 1 therapies and had available FFPE and/or biopsy tumor samples, we performed WES on samples from 73 patients. (B) Flowchart for NPC sample inclusions. Among the 128 patients who underwent anti-PD-(L) 1 or anti-CTLA-4 therapies, we performed targeted NGS on samples from 64 patients.
At Sun Yat-sen University Cancer Center, 95 Chinese patients with NSCLC were treated with anti-PD-(L)1 monotherapies, with 73 patients being included in the final analysis with evaluable radiological results. Concurrently, 128 patients with R/M NPC who had received anti-PD-(L)1 monotherapies were retrospectively investigated, of whom 64 patients were being screened for the final analysis based on sequencing quality and follow-up completeness. When both FFPE and biopsy samples were available for the patient, the FFPE sample was used in the analysis, given the limited intra-tumoral heterogeneity represented by a single biopsy sample. The study design and clinical characteristics of this cohort are summarized in Figure 3 and Table 2 with details in Supplementary Table S1. For lung cancer, 60% of the patients had adenocarcinoma, followed by squamous carcinoma (32%). Almost all patients (99%) were stage IV at diagnosis; the median age of patients with NSCLC and NPC at the treatment initiation was 55 and 46 years, respectively. 49% of the NSCLC patients and 25% NPC patients had a smoking history and more males in both cohorts (70% vs. 30% for NSCLC, 80% vs. 20% for NPC). ORR of the study cohorts was 19% and 12%, and the median progression-free survival (mPFS) was 91 days for lung cancer and 67.5 days for NPC. No difference in PFS was observed among the different immune agents.
In addition to the SYSUCC NSCLC cohort and NPC cohort described above, external cohorts of 943 patients from the public literatures treated with ICI are compiled in Supplementary Table S2, encompassing 453 melanoma patients (Snyder et al., 2014; Van Allen et al., 2015; Hugo et al., 2016; Goodman et al., 2017), 407 NSCLC patients (Goodman et al., 2017; Hellmann et al., 2018a; Miao et al., 2018; Rizvi et al., 2018), 56 RCC (Wood et al., 2020), and 27 bladder (Miao et al., 2018), along with treatment modality and outcome analyzed. The mutation callings are derived from three sequencing platforms (WES, F1CDx, and MSK-IMPACT). F1CDx and MSK-IMPACT are NGS targeted panel being authorized by the FDA as practical diagnostic assays. Table 3 summarizes the sequencing methodology and varied TMB thresholds employed in the gathered research.
Joint Model Prompts a Comprehensive and Robust TMB Subgrouping
The multi-endpoint joint analysis used to locate TMB thresholds is superior to the previous studies as it provides a more comprehensive analysis of patient clinical information. Based on the co-analyzed labels, it can give an overall picture of disease efficacy. Based on these compound indices to establish ROC curves to handle true- and false-positive rates in the classification, we selected a TMB threshold from clinically meaningful values to group patients in the experiment and validation sets.
As shown in Figure 4 and Table 4, we can discern that the ROC curves based on the mixed-endpoint joint labels generally had higher AUCs in either the experiment or validation groups, with an average improvement of about 0.2 over those based on ORR labels alone, and the range of confidence intervals likewise supports this conclusion. More importantly, all the AUCs established on the proposed indices exceeded 0.6, ranges from 0.663 to 0.972, reflecting our model’s more robust discrimination capabilities. For comparison, as for the ROCs based on original ORR labels, despite the classification ability varying among cancer types, the ROCs in most cases showed more inferiority, with half of the cases only marginally exceeding 0.5 not reaching 0.6, even equivalent to random chance. The results in Figure 4 and Table 4 fully demonstrate that the subgrouping of TMB under the joint modeling of multiple endpoints is significantly improved compared to the existing subgrouping based on the ORR single label. We attribute this phenomenon to a proportion of the patients with opposing effects on the two rubrics present in these cases. Although high TMB was reported associated with ICI treatment improvement in terms of overall trends, the status of a single indicator alone is not fully representative of the patient’s actual matter. This is why the ROC curves established based on only a single endpoint have such poor performance. Integrating patients’ multi-dimensional information and joint modeling mixed-endpoints can prompt a more comprehensive stratification of TMB. Our approach could provide clinicians with a full assessment of efficacy, resulting in a comprehensive determination of the TMB screening threshold for superior patients.
FIGURE 4. Receiver operating characteristic curves of the predictive capacity of prognosis label for two experiment cohorts and validation cohorts, depicting the true-positive rate (sensitivity, y-axis) and false-positive rate (1-specificity, x-axis) for the metric across all possible TMB thresholds. The corresponding area under the curve (AUC) is illustrated in the figure legends. (A) ROC curves for NPC (panel) experiment cohort, bladder cohort, and RCC cohort based on ORR labels alone. (B) ROC curves for NSCLC (WES) experiment cohort, NSCLC_35 cohort, NSCLC_57 cohort, and NSCLC_240 cohort based on ORR labels alone. (C) ROC curves for Mel_37 cohort, Mel_52 cohort, Mel_64 cohort, and Mel_105 cohort based on ORR labels alone. (D) ROC curves for NPC (panel) experiment cohort, bladder cohort, and RCC cohort based on the mixed-endpoint labels. (E) ROC curves for NSCLC (WES) experiment cohort, NSCLC_35 cohort, NSCLC_57 cohort, and NSCLC_240 cohort based on the mixed-endpoint labels. (F) ROC curves for Mel_37 cohort, Mel_52 cohort, Mel_64 cohort, and Mel_105 cohort based on the mixed-endpoint labels.
TABLE 4. AUC comparison. The table reports the area under the curve (AUC), as well as the corresponding 0.95 confidence interval, for each metric (columns) applied to a different cancer cohort (rows). Bold-faced values indicate the best value for each cancer cohort.
To verify that our proposed threshold delineation method for TMB remains valid and robust under the perturbation of measurement errors, we added 10%–20% artificial noise according to the actual TMB level. Given the small number of patients in some cases, which are over-sensitive to data noise, we selected several groups of cases with more patients for analysis. The results are shown in Figure 5 and Table 5. Under the perturbation of artificial noise, the AUC of each group showed mostly a slight decrease compared to the error-free cases. However, the ROC curves based on our proposed joint labels still maintain a high AUC, which is about 0.3 higher on average than the ROC curves based on ORR labels only. These results demonstrate the high error tolerance of our proposed joint model.
FIGURE 5. Receiver operating characteristic curves of the predictive capacity of prognosis label for two experiment cohorts and validation cohorts, depicting the true-positive rate (sensitivity, y-axis) and false-positive rate (1-specificity, x-axis) for the metric across all possible TMB thresholds considering measurement errors. The corresponding area under the curve (AUC) is illustrated in the figure legends. (A) ROC curves for NPC (Panel), NSCLC (WES) experiment cohort, Mel_64 cohort, Mel_105 cohort, and NSCLC_240 cohort based on ORR labels considering TMB measurement errors. (B) ROC curves for NPC (panel), NSCLC (WES) experiment cohort, Mel_64 cohort, Mel_105 cohort, and NSCLC_240 cohort based on the mixed-endpoint labels considering TMB measurement errors.
TABLE 5. AUC comparison. The table reports the area under the curve (AUC), as well as the corresponding 0.95 confidence interval, for each metric (columns) applied to a different cancer cohort (rows). Bold-faced values indicate the best value for each cancer cohort.
Joint Analysis Prompts a Significant and Error-Tolerant Patient Subgrouping
In addition to the strengths shown in the ROC curves, based on the derived TMB thresholds, we can classify experimental NSCLC patients into two groups with apparently stratified efficacy. The effect of the dichotomy is shown in Figure 6, where we can notice a significant difference between patients in TMB_Low and TMB_High in terms of immunotherapy benefit (p-values = 0.017 and 0.089). The grouping results on the other cohorts can be seen in Supplementary Figures.
FIGURE 6. Survival curves and ORR comparison between experimental NSCLC patients (n = 73) with low and high TMB. Improved progression-free survival (PFS) and a trend toward increased objective response rate (ORR) are observed in patients with high TMB.
To demonstrate that the TMB thresholds derived from our proposed joint model can significantly separate the treatment effects of patients receiving immunotherapy, we statistically compared the patient outcomes obtained based on our thresholds with those obtained from different quartiles (median, upper tertile, upper quartile) using the log-rank test and the two-sided Mann–Whitney U test. As shown by the results in Table 6, our model-derived TMB thresholds performed satisfactorily and consistently for cohort patient segmentation. This predominance is mainly reflected in the p-values of the statistical tests, which are essentially the lowest among all division scenarios under the threshold division based on the joint model, indicating that the proposed model is predominate. NSCLC_35 and NSCLC_240 were the only two situations in which the p-values performed marginally worse than the quantile divisions. Similarly, five groups of patients were selected to validate the stability of the proposed model in the face of the TMB measurement error. As shown by the results in Table 7, our proposed threshold delineation method for TMB remained efficient and robust under perturbation of measurement error.
TABLE 6. Immunotherapy mPFS or mOS and response probability based on different tumor mutation burden (TMB) thresholds for non-small-cell lung cancer (NSCLC), nasopharyngeal carcinoma (NPC), bladder, renal cell carcinoma (RCC), and melanoma. p values are reported by log-rank test and the two-sided Mann–Whitney U test.
TABLE 7. Immunotherapy mPFS or mOS and response probability based on different tumor mutation burden (TMB) thresholds with measurement errors for non-small-cell lung cancer (NSCLC), nasopharyngeal carcinoma (NPC), bladder, renal cell carcinoma (RCC), and melanoma. p values are reported by log-rank test and the two-sided Mann–Whitney U test.
Discussion
Tumor mutation burden has recently become an area of interest, as high TMB is associated with improved response to ICI therapies. However, the threshold defining the TMB-high/TMB-positive patients is controversial in clinical, which is exacerbated by the presence of multiple evaluation metrics and TMB calculation errors. The existing TMB threshold-identifying approaches are merely based on a single endpoint, which may suffer from excessive information loss. TMB metric, as a predictive marker, is closely associated with both of the two types of clinical endpoints (ORR and TTE), where the effect in two endpoints may be of different magnitude or even point in different directions. Herein, we report a generalized framework for comprehensively determining the positivity TMB thresholds based on a mixed-endpoint joint model and an iterative numerical estimation procedure considering measurement errors. In our joint model, we choose the Weibull–Cox proportional hazard model for the TTE endpoint. Although the baseline risk h0(t) in standard survival analysis usually be left unspecified, such as the advantageous partial likelihood method. However, within the joint modeling framework, it turns out that following such a route may lead to an underestimation of the standard errors of the parameter estimates (Hsieh et al., 2006). Thus, we recommend choosing an explicit definition of h0(t) based on the dataset characteristics, corresponding to a parametric distribution. The Weibull, the log–normal, and the Gamma distributions are typically employed in the survival analysis context. By analyzing the progression-free survival of patients receiving immunotherapy, we found that the trend of their baseline cumulative hazard distribution was consistent with the Weibull distribution with a scale parameter equal to 1 (see in Figure 1), so the Weibull–Cox proportional hazard model was employed in this article. Our joint model sheds new light on the tumor mutation burden stratification based on a multi-endpoint assessment of immunotherapy benefits, suggesting more comprehensive and robust TMB-positive thresholds for clinical physicians. Attending physicians should make treatment recommendations based on patients’ multi-dimensional information.
Conclusion
The existing statistical methods for determining TMB thresholds are based on a single clinical endpoint while ignoring the difference between the true and observed TMB values. Our study considers TMB measurement error and integrates multifaceted clinical efficacy to optimize TMB thresholds. We report a multi-endpoint joint model as a generalized method for inferring TMB thresholds that facilitates consistent statistical inference using an iterative numerical estimation procedure considering mis-specified TMB. Our simulation results show that the proposed model maintains higher accuracy and stability than standard regressions, in terms of both parameter estimation and threshold determination. To validate the feasibility of the proposed thresholds, we pooled a cohort of 73 patients with non-small-cell lung cancer and 64 patients with nasopharyngeal carcinoma treated with anti-PD-(L)1, as well as a validation cohort of 943 patients for retrospective analysis. From the simulation and experimental results, we reasonably conclude that 1) our proposed joint model with the parameter estimation procedure can more robustly assess patient efficacy even under the interference of measurement error in TMB. 2) Integrating patients’ multi-dimensional information to employ multi-endpoint efficacy analysis can prompt a more comprehensive TMB subgrouping. 3) The TMB-positive threshold derived from multi-endpoint joint analysis can classify patients into two groups with more apparently stratified efficacy. Our model is applicable to clinical multiple endpoint data and can better assist physicians in their clinical decisions.
Data Availability Statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.
Ethics Statement
The studies involving human participants were reviewed and approved by the Sun Yat-Sen University Cancer Center’s Ethical Review Committee. The patients/participants provided their written informed consent to participate in this study.
Author Contributions
YW, XL, JW and WF conceived and designed the study. YW, XL and JW developed the methodology. WF, YS, LZ, YW, XL and YX collected and managed the data. YW wrote the first draft. YW, XL, JW, LZ and WF reviewed, edited, and approved the manuscript. XL, JW, YX, XPZ, XYZ, YL, LZ and WF provided administrative, technical, or material support. JW was primarily responsible for the final manuscript.
Funding
This work was funded by the National Natural Science Foundation of China, grant number 92046009, the Natural Science Basic Research Program of Shaanxi, grant number 2020JC-01, the National Natural Science Foundation of China, grant number 82173101, and the National Natural Science Foundation of China, grant number 81972556.
Conflict of Interest
The author YS was employed by the company Nanjing Geneseeq Technology Inc.
The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s Note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors, and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Acknowledgments
The authors thank the patients and their families for participation in the study and the reviewer for the constructive and valuable suggestions.
Supplementary Material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2022.915839/full#supplementary-material
References
Alioto, T. S., Buchhalter, I., Derdak, S., Hutter, B., Eldridge, M. D., Hovig, E., et al. (2015). A Comprehensive Assessment of Somatic Mutation Detection in Cancer Using Whole-Genome Sequencing. Nat. Commun. 6, 10001. doi:10.1038/ncomms10001
Asar, Ö., Ritchie, J., Kalra, P. A., and Diggle, P. J. (2015). Joint Modelling of Repeated Measurement and Time-To-Event Data: an Introductory Tutorial. Int. J. Epidemiol. 44 (1), 334–344. doi:10.1093/ije/dyu262
Augustin, T. (2004). An Exact Corrected Log-Likelihood Function for Cox's Proportional Hazards Model under Measurement Error and Some Extensions. Scand. J. Stat. 31 (1), 43–50. doi:10.1111/j.1467-9469.2004.00371.x
Bai, R., Lv, Z., Xu, D., and Cui, J. (2020). Predictive Biomarkers for Cancer Immunotherapy with Immune Checkpoint Inhibitors. Biomark. Res. 8, 34. doi:10.1186/s40364-020-00209-0
Barbieri, A., and Legrand, C. (2020). Joint Longitudinal and Time-To-Event Cure Models for the Assessment of Being Cured. Stat. Methods Med. Res. 29 (4), 1256–1270. doi:10.1177/0962280219853599
Boyiadzis, M. M., Kirkwood, J. M., Marshall, J. L., Pritchard, C. C., Azad, N. S., and Gulley, J. L. (2018). Significance and Implications of FDA Approval of Pembrolizumab for Biomarker-Defined Disease. J. Immunother. cancer 6 (1), 35. doi:10.1186/s40425-018-0342-x
Bracarda, S., Altavilla, A., Hamzaj, A., Sisani, M., Marrocolo, F., Del Buono, S., et al. (2015). Immunologic Checkpoints Blockade in Renal Cell, Prostate, and Urothelial Malignancies. Seminars Oncol. 42 (3), 495–505. doi:10.1053/j.seminoncol.2015.02.004
Campesato, L. F., Barroso-Sousa, R., Jimenez, L., Correa, B. R., Sabbaga, J., Hoff, P. M., et al. (2015). Comprehensive Cancer-Gene Panels Can Be Used to Estimate Mutational Load and Predict Clinical Benefit to PD-1 Blockade in Clinical Practice. Oncotarget 6 (33), 34221–34227. doi:10.18632/oncotarget.5950
Cao, D., Xu, H., Xu, X., Guo, T., and Ge, W. (2019). High Tumor Mutation Burden Predicts Better Efficacy of Immunotherapy: a Pooled Analysis of 103078 Cancer Patients. Oncoimmunology 8 (9), e1629258. doi:10.1080/2162402X.2019.1629258
Carbone, D. P., Reck, M., Paz-Ares, L., Creelan, B., Horn, L., Steins, M., et al. (2017). First-Line Nivolumab in Stage IV or Recurrent Non-small-cell Lung Cancer. N. Engl. J. Med. 376 (25), 2415–2426. doi:10.1056/NEJMoa1613493
Carroll, R. J., Ruppert, D., Stefanski, L. A., and Crainiceanu, C. M. (2006). Measurement Error in Nonlinear Models: A Modern Perspective. London, UK: Chapman and Hall/CRC.
Chalmers, Z. R., Connelly, C. F., Fabrizio, D., Gay, L., Ali, S. M., Ennis, R., et al. (2017). Analysis of 100,000 Human Cancer Genomes Reveals the Landscape of Tumor Mutational Burden. Genome Med. 9 (1), 34. doi:10.1186/s13073-017-0424-2
Chiang, A. C., and Herbst, R. S. (2020). Frontline Immunotherapy for NSCLC - the Tale of the Tail. Nat. Rev. Clin. Oncol. 17 (2), 73–74. doi:10.1038/s41571-019-0317-y
Colli, L. M., Machiela, M. J., Myers, T. A., Jessop, L., Yu, K., and Chanock, S. J. (2016). Burden of Nonsynonymous Mutations Among TCGA Cancers and Candidate Immune Checkpoint Inhibitor Responses. Cancer Res. 76 (13), 3767–3772. doi:10.1158/0008-5472.CAN-16-0170
Conway, J. R., Kofman, E., Mo, S. S., Elmarakeby, H., and Van Allen, E. (2018). Genomics of Response to Immune Checkpoint Therapies for Cancer: Implications for Precision Medicine. Genome Med. 10 (1), 93. doi:10.1186/s13073-018-0605-7
Cristescu, R., Mogg, R., Ayers, M., Albright, A., Murphy, E., Yearley, J., et al. (2018). Pan-tumor Genomic Biomarkers for PD-1 Checkpoint Blockade-Based Immunotherapy. Science 362 (6411), eaar3593. doi:10.1126/science.aar3593
Davis, P. J., and Rabinowitz, P. (2007). Methods of Numerical Integration. Chelmsford, MA, USA: Courier Corporation.
Eisenhauer, E. A., Therasse, P., Bogaerts, J., Schwartz, L. H., Sargent, D., Ford, R., et al. (2009). New Response Evaluation Criteria in Solid Tumours: Revised RECIST Guideline (Version 1.1). Eur. J. Cancer 45 (2), 228–247. doi:10.1016/j.ejca.2008.10.026
Fang, W., Ma, Y., Yin, J. C., Hong, S., Zhou, H., Wang, A., et al. (2019). Comprehensive Genomic Profiling Identifies Novel Genetic Predictors of Response to Anti-PD-(L)1 Therapies in Non-small Cell Lung Cancer. Clin. Cancer Res. 25 (16), 5015–5026. doi:10.1158/1078-0432.CCR-19-0585
Fang, W., Yang, Y., Ma, Y., Hong, S., Lin, L., He, X., et al. (2018). Camrelizumab (SHR-1210) Alone or in Combination with Gemcitabine Plus Cisplatin for Nasopharyngeal Carcinoma: Results from Two Single-Arm, Phase 1 Trials. Lancet Oncol. 19 (10), 1338–1350. doi:10.1016/S1470-2045(18)30495-9
Goodman, A. M., Kato, S., Bazhenova, L., Patel, S. P., Frampton, G. M., Miller, V., et al. (2017). Tumor Mutational Burden as an Independent Predictor of Response to Immunotherapy in Diverse Cancers. Mol. Cancer Ther. 16 (11), 2598–2608. doi:10.1158/1535-7163.MCT-17-0386
Guidance, D. (2017). Multiple Endpoints in Clinical Trials Guidance for Industry. White Oak: Center for Biologics Evaluation and Research (CBER.
Hellmann, M. D., Callahan, M. K., Awad, M. M., Calvo, E., Ascierto, P. A., Atmaca, A., et al. (2018a). Tumor Mutational Burden and Efficacy of Nivolumab Monotherapy and in Combination with Ipilimumab in Small-Cell Lung Cancer. Cancer Cell. 33 (5), 853–861. e4. doi:10.1016/j.ccell.2018.04.001
Hellmann, M. D., Ciuleanu, T.-E., Pluzanski, A., Lee, J. S., Otterson, G. A., Audigier-Valette, C., et al. (2018b). Nivolumab Plus Ipilimumab in Lung Cancer with a High Tumor Mutational Burden. N. Engl. J. Med. 378 (22), 2093–2104. doi:10.1056/NEJMoa1801946
Hellmann, M. D., Nathanson, T., Rizvi, H., Creelan, B. C., Sanchez-Vega, F., Ahuja, A., et al. (2018c). Genomic Features of Response to Combination Immunotherapy in Patients with Advanced Non-small-cell Lung Cancer. Cancer Cell. 33 (5), 843–852. e4. doi:10.1016/j.ccell.2018.03.018
Hsieh, F., Tseng, Y.-K., and Wang, J.-L. (2006). Joint Modeling of Survival and Longitudinal Data: Likelihood Approach Revisited. Biometrics 62 (4), 1037–1043. doi:10.1111/j.1541-0420.2006.00570.x
Hugo, W., Zaretsky, J. M., Sun, L., Song, C., Moreno, B. H., Hu-Lieskovan, S., et al. (2016). Genomic and Transcriptomic Features of Response to Anti-PD-1 Therapy in Metastatic Melanoma. Cell. 165 (1), 35–44. doi:10.1016/j.cell.2016.02.065
Johnson, D. B., Frampton, G. M., Rioth, M. J., Yusko, E., Xu, Y., Guo, X., et al. (2016). Targeted Next Generation Sequencing Identifies Markers of Response to PD-1 Blockade. Cancer Immunol. Res. 4 (11), 959–967. doi:10.1158/2326-6066.CIR-16-0143
Kuryk, L., Bertinato, L., Staniszewska, M., Pancer, K., Wieczorek, M., Salmaso, S., et al. (2020). From Conventional Therapies to Immunotherapy: Melanoma Treatment in Review. Cancers 12 (10), 3057. doi:10.3390/cancers12103057
Legrand, F. A., Gandara, D. R., Mariathasan, S., Powles, T., He, X., Zhang, W., et al. (2018). Association of High Tissue Tmb and Atezolizumab Efficacy across Multiple Tumor Types. J. Clin. Oncol. 36 (15_Suppl. l), 12000. doi:10.1200/jco.2018.36.15_suppl.12000
Lemery, S., Keegan, P., and Pazdur, R. (2017). First FDA Approval Agnostic of Cancer Site - when a Biomarker Defines the Indication. N. Engl. J. Med. 377 (15), 1409–1412. doi:10.1056/NEJMp1709968
Lim, C., Tsao, M. S., Le, L. W., Shepherd, F. A., Feld, R., Burkes, R. L., et al. (2015). Biomarker Testing and Time to Treatment Decision in Patients with Advanced Nonsmall-Cell Lung Cancer. Ann. Oncol. 26 (7), 1415–1421. doi:10.1093/annonc/mdv208
Lin, X., Taylor, J. M. G., and Ye, W. (2008). A Penalized Likelihood Approach to Joint Modeling of Longitudinal Measurements and Time-To-Event Data. Statistics its Interface 1 (1), 33–45. doi:10.4310/sii.2008.v1.n1.a4
Ma, Y., Fang, W., Zhang, Y., Yang, Y., Hong, S., Zhao, Y., et al. (2019). A Phase I/II Open-Label Study of Nivolumab in Previously Treated Advanced or Recurrent Nasopharyngeal Carcinoma and Other Solid Tumors. Oncologist 24 (7), 891–e431. doi:10.1634/theoncologist.2019-0284
Majc, B., Novak, M., Kopitar-Jerala, N., Jewett, A., and Breznik, B. (2021). Immunotherapy of Glioblastoma: Current Strategies and Challenges in Tumor Model Development. Cells 10 (2), 265. doi:10.3390/cells10020265
Miao, D., Margolis, C. A., Vokes, N. I., Liu, D., Taylor-Weiner, A., Wankowicz, S. M., et al. (2018). Genomic Correlates of Response to Immune Checkpoint Blockade in Microsatellite-Stable Solid Tumors. Nat. Genet. 50 (9), 1271–1281. doi:10.1038/s41588-018-0200-2
Motzer, R. J., Escudier, B., McDermott, D. F., George, S., Hammers, H. J., Srinivas, S., et al. (2015). Nivolumab versus Everolimus in Advanced Renal-Cell Carcinoma. N. Engl. J. Med. 373 (19), 1803–1813. doi:10.1056/NEJMoa1510665
Nakamura, T. (1990). Corrected Score Function for Errors-In-Variables Models: Methodology and Application to Generalized Linear Models. Biometrika 77 (1), 127–137. doi:10.1093/biomet/77.1.127
Novick, S. J., and Stefanski, L. A. (2002). Corrected Score Estimation via Complex Variable Simulation Extrapolation. J. Am. Stat. Assoc. 97 (458), 472–481. doi:10.1198/016214502760047005
Pardoll, D. M. (2012). The Blockade of Immune Checkpoints in Cancer Immunotherapy. Nat. Rev. Cancer 12 (4), 252–264. doi:10.1038/nrc3239
Phillips, A., and Haudiquet, V. (2003). ICH E9 Guideline ?Statistical Principles for Clinical Trials? a Case Study. Stat. Med. 22 (1), 1–11. discussion 13-7. doi:10.1002/sim.1328
Riaz, N., Havel, J. J., Makarov, V., Desrichard, A., Urba, W. J., Sims, J. S., et al. (2017). Tumor and Microenvironment Evolution during Immunotherapy with Nivolumab. Cell. 171 (4), 934–949. e16. doi:10.1016/j.cell.2017.09.028
Ristl, R., Urach, S., Rosenkranz, G., and Posch, M. (2019). Methods for the Analysis of Multiple Endpoints in Small Populations: A Review. J. Biopharm. Statistics 29 (1), 1–29. doi:10.1080/10543406.2018.1489402
Rizopoulos, D. (2011). Dynamic Predictions and Prospective Accuracy in Joint Models for Longitudinal and Time-To-Event Data. Biometrics 67 (3), 819–829. doi:10.1111/j.1541-0420.2010.01546.x
Rizopoulos, D., and Lesaffre, E. (2014). Introduction to the Special Issue on Joint Modelling Techniques. Stat. Methods Med. Res. 23 (1), 3–10. doi:10.1177/0962280212445800
Rizvi, H., Sanchez-Vega, F., La, K., Chatila, W., Jonsson, P., Halpenny, D., et al. (2018). Molecular Determinants of Response to Anti-programmed Cell Death (PD)-1 and Anti-programmed Death-Ligand 1 (PD-L1) Blockade in Patients with Non-small-cell Lung Cancer Profiled with Targeted Next-Generation Sequencing. J. Clin. Oncol. 36 (7), 633–641. doi:10.1200/JCO.2017.75.3384
Rizvi, N. A., Hellmann, M. D., Snyder, A., Kvistborg, P., Makarov, V., Havel, J. J., et al. (2015). Mutational Landscape Determines Sensitivity to PD-1 Blockade in Non-small Cell Lung Cancer. Science 348 (6230), 124–128. doi:10.1126/science.aaa1348
Rosenberg, J. E., Hoffman-Censits, J., Powles, T., van der Heijden, M. S., Balar, A. V., Necchi, A., et al. (2016). Atezolizumab in Patients with Locally Advanced and Metastatic Urothelial Carcinoma Who Have Progressed Following Treatment with Platinum-Based Chemotherapy: a Single-Arm, Multicentre, Phase 2 Trial. Lancet 387 (10031), 1909–1920. doi:10.1016/S0140-6736(16)00561-4
Samstein, R. M., Lee, C.-H., Shoushtari, A. N., Hellmann, M. D., Shen, R., Janjigian, Y. Y., et al. (2019). Tumor Mutational Load Predicts Survival after Immunotherapy across Multiple Cancer Types. Nat. Genet. 51 (2), 202–206. doi:10.1038/s41588-018-0312-8
Singal, G., Miller, P. G., Agarwala, V., Li, G., Kaushik, G., Backenroth, D., et al. (2019). Association of Patient Characteristics and Tumor Genomics with Clinical Outcomes Among Patients with Non-small Cell Lung Cancer Using a Clinicogenomic Database. JAMA 321 (14), 1391–1399. doi:10.1001/jama.2019.3241
Snyder, A., Makarov, V., Merghoub, T., Yuan, J., Zaretsky, J. M., Desrichard, A., et al. (2014). Genetic Basis for Clinical Response to CTLA-4 Blockade in Melanoma. N. Engl. J. Med. 371 (23), 2189–2199. doi:10.1056/NEJMoa1406498
Subbiah, V., Solit, D. B., Chan, T. A., and Kurzrock, R. (2020). The FDA Approval of Pembrolizumab for Adult and Pediatric Patients with Tumor Mutational Burden (TMB) ≥10: a Decision Centered on Empowering Patients and Their Physicians. Ann. Oncol. 31 (9), 1115–1118. doi:10.1016/j.annonc.2020.07.002
Van Allen, E. M., Miao, D., Schilling, B., Shukla, S. A., Blank, C., Zimmer, L., et al. (2015). Genomic Correlates of Response to CTLA-4 Blockade in Metastatic Melanoma. Science 350 (6257), 207–211. doi:10.1126/science.aad0095
Van Rooij, N., van Buuren, M. M., Philips, D., Velds, A., Toebes, M., Heemskerk, B., et al. (2013). Tumor Exome Analysis Reveals Neoantigen-specific T-Cell Reactivity in an Ipilimumab-Responsive Melanoma. J. Clin. Oncol. 31 (32), e439–e442. doi:10.1200/JCO.2012.47.7521
Wang, L., Ge, J., Lan, Y., Shi, Y., Luo, Y., Tan, Y., et al. (2020). Tumor Mutational Burden Is Associated with Poor Outcomes in Diffuse Glioma. BMC Cancer 20 (1), 213. doi:10.1186/s12885-020-6658-1
Wołącewicz, M., Hrynkiewicz, R., Grywalska, E., Suchojad, T., Leksowski, T., Roliński, J., et al. (2020). Immunotherapy in Bladder Cancer: Current Methods and Future Perspectives. Cancers 12 (5), 1181. doi:10.3390/cancers12051181
Wood, M. A., Weeder, B. R., David, J. K., Nellore, A., and Thompson, R. F. (2020). Burden of Tumor Mutations, Neoepitopes, and Other Variants Are Weak Predictors of Cancer Immunotherapy Response and Overall Survival. Genome Med. 12 (1), 33. doi:10.1186/s13073-020-00729-2
Xu, H., DiCarlo, J., Satya, R. V., Peng, Q., and Wang, Y. (2014). Comparison of Somatic Mutation Calling Methods in Amplicon and Whole Exome Sequence Data. BMC Genomics 15, 244. doi:10.1186/1471-2164-15-244
Keywords: clinical immunology, stratification biomarker, tumor mutation burden, joint modeling, multiple endpoints, measurement error
Citation: Wang Y, Lai X, Wang J, Xu Y, Zhang X, Zhu X, Liu Y, Shao Y, Zhang L and Fang W (2022) A Joint Model Considering Measurement Errors for Optimally Identifying Tumor Mutation Burden Threshold. Front. Genet. 13:915839. doi: 10.3389/fgene.2022.915839
Received: 08 April 2022; Accepted: 13 June 2022;
Published: 04 August 2022.
Edited by:
Jian Song, University Hospital Münster, GermanyReviewed by:
Song Cao, Washington University in St. Louis, United StatesHongmin Cai, South China University of Technology, China
Copyright © 2022 Wang, Lai, Wang, Xu, Zhang, Zhu, Liu, Shao, Zhang and Fang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Jiayin Wang, wangjiayin@mail.xjtu.edu.cn; Wenfeng Fang, fangwf@sysucc.org.cn
†These authors have contributed equally to this work and share first authorship