Assessing Cognitive Change and Quality of Life 12 Months After Epilepsy Surgery—Development and Application of Reliable Change Indices and Standardized Regression-Based Change Norms for a Neuropsychological Test Battery in the German Language

Conradi, Nadine; Behrens, Marion; Hermsen, Anke M.; Kannemann, Tabitha; Merkel, Nina; Schuster, Annika; Freiman, Thomas M.; Strzelczyk, Adam; Rosenow, Felix

doi:10.3389/fpsyg.2020.582836

ORIGINAL RESEARCH article

Front. Psychol., 15 October 2020

Sec. Neuropsychology

Volume 11 - 2020 | https://doi.org/10.3389/fpsyg.2020.582836

Nadine Conradi^1,2*

Marion Behrens³

Anke M. Hermsen^1,2

Tabitha Kannemann¹

Nina Merkel^1,2

Annika Schuster¹

Thomas M. Freiman^1,2,4

Adam Strzelczyk^1,2

Felix Rosenow^1,2

¹Epilepsy Center Frankfurt Rhine-Main, Department of Neurology, University Hospital Frankfurt and Goethe University, Frankfurt, Germany
²LOEWE Center for Personalized Translational Epilepsy Research, Goethe University, Frankfurt, Germany
³Department of Neurology, University Hospital Frankfurt and Goethe University, Frankfurt, Germany
⁴Department of Neurosurgery, University Hospital Frankfurt and Goethe University, Frankfurt, Germany

Objective: The establishment of patient-centered measures capable of empirically determining meaningful cognitive change after surgery can significantly improve the medical care of epilepsy patients. Thus, this study aimed to develop reliable change indices (RCIs) and standardized regression-based (SRB) change norms for a comprehensive neuropsychological test battery in the German language.

Methods: Forty-seven consecutive patients with temporal lobe epilepsy underwent neuropsychological assessments, both before and 12 months after surgery. Practice-effect-adjusted RCIs and SRB change norms for each test score were computed. To assess their usefulness, the presented methods were applied to a clinical sample, and binary logistic regression analyses were conducted to model the odds of achieving improvement in quality of life (QOL) after surgery.

Results: The determined RCIs at 90% confidence intervals and the SRB equations for each test score included in the test battery are provided. Cohen’s kappa analyses revealed a moderate mean agreement between the two measures, varying from slight to almost perfect agreement across test scores. Using these measures, a negative association between improvement in QOL and decline in verbal memory functions after surgery was detected (adjusted odds ratio = 0.09, p = 0.006).

Significance: To the best of our knowledge, this study is the first to develop RCIs and SRB change norms necessary for the objective determination of neuropsychological change in a comprehensive test battery in the German language, facilitating the individual monitoring of improvement and decline in each patient’s cognitive functioning and psychosocial situation after epilepsy surgery. The application of the described measures revealed a strong negative association between improvement in QOL and decline in verbal memory functions after surgery.

Introduction

In approximately 30% of epilepsy patients, anti-seizure drugs (ASD) fail to sufficiently control seizures (Sander, 2003). Temporal lobe epilepsy (TLE) represents the most common subgroup of drug-resistant focal epilepsies (Engel, 2001). As demonstrated by numerous studies (Wiebe et al., 2001; Engel et al., 2012; Engel, 2018; Mohan et al., 2018), for selected drug-resistant patients, epilepsy surgery is considered to be a safe, evidence-based, and effective treatment option that is superior to continued medical therapy. Although seizure control remains the primary aim and is the most examined outcome of epilepsy surgery, the investigation of treatment effects on the patients’ cognitive functioning and psychosocial situations has gained increasing importance in the field. Through the repeated administration of standardized psychometric tests and self-report questionnaires, clinical neuropsychology aims (1) to quantify the presurgical functionality of affected brain structures to facilitate individualized predictions of possible cognitive risks (Elger et al., 2004; Helmstaedter, 2004, 2008) and (2) to evaluate postsurgical cognitive and behavioral change and its impact on the patients’ quality of life (QOL) (Baxendale, 2008; Sherman et al., 2011; Ives-Deliperi and Butler, 2017).

Traditionally, unstandardized difference scores and arbitrary cut-off values were applied to determine cognitive and behavioral change (either across time or in response to intervention). However, the assessment of neuropsychological change has been associated with several psychometric difficulties, as change in test results can be attributed to multiple factors that are unrelated to the intervention investigated (e.g., low retest reliability, or measurement errors). Therefore, empirical methods are required to determine whether change in repeated neuropsychological test results represents true change, which can be considered statistically meaningful, or change caused by random fluctuations in measurements (Hermann et al., 1996; Chelune and Franklin, 2003; Perdices, 2005).

1. The reliable change index (RCI), which was originally introduced by Jacobson and Truax (1991), is the most popular empirical technique for the assessment of neuropsychological change. RCIs are computed as ratios, in which the difference between retest- and baseline scores is divided by an error term (standard error of measurement of the difference). Because the calculation of the error term has been continuously debated among researchers, numerous modified RCI approaches exist in the literature, and the choice of RCI approach should be directed by practical and theoretical considerations (Perdices, 2005).

2. Standardized regression-based (SRB) change norms were initially developed by McSweeny et al. (1993) and are based on multiple regression modeling. During this statistical procedure, baseline scores, together with moderating demographic and clinical variables, are used to derive regression equations to predict retest scores. These predictions are compared to the observed retest scores, and differences are z-transformed.

3. The minimal clinically important difference (MCID) describes the amount of change, i.e., the smallest difference between retest- and baseline scores in self-report questionnaires, which a patient considers to be important (McGlothlin and Lewis, 2014). Thus, MCID is a patient-centered approach that not only incorporates the magnitude of change but also its value for the patient. Previous studies have provided thresholds for MCID in commonly used (epilepsy-specific) questionnaires (Wiebe et al., 2002; Button et al., 2015).

Despite their superior informative value, these patient-centered empirical methods have been incorporated into neuropsychological research and clinical routine rather slowly. As reviewed by Sherman et al. (2011), many clinicians continue to apply unstandardized measures, and studies often focus on group-level analyses, during which the results of patients who experience improvement, decline, and no change between assessments are aggregated, which can mask individual change. The slow implementation of empirical measures may be due to availability problems. Although publications examining empirical measures for a wide range of more popular, mostly psychometric tests in the English language exist, data for specific, less widely used tests in languages other than English are often less readily available (Zahra and Hedge, 2010).

The primary objective of this study was to address this issue by developing empirical measures for a comprehensive neuropsychological test battery in the German language. To the best of our knowledge, this study is the first to develop both RCIs and SRB change norms for the assessment of meaningful change for each test score included in the standard test battery, as recommended by the German ILAE Chapter and the Austrian, German and Swiss Working Group on Presurgical Epilepsy Diagnosis and Epilepsy Surgery (Brückner, 2012; Rosenow et al., 2016).

A secondary objective was to illustrate how the application of the presented measures (RCIs, SRB change norms, and MCID thresholds) can facilitate the objective determination of cognitive and behavioral change in individual patients, by following a clinical sample of TLE patients over 12 months after epilepsy surgery. Improvement and decline in the patients’ cognitive functioning and psychosocial situations were monitored individually, and factors that affected improvement in QOL after surgery were assessed.

Materials and Methods

Patients

This study used longitudinal data of a consecutive clinical sample of 50 TLE patients, who underwent epilepsy surgery at the Epilepsy Center Frankfurt Rhine-Main. Patients without formal education (n = 2) or with diagnosed psychiatric comorbidities (n = 1) were excluded (6.0%), resulting in a final sample of 47 patients (70.2% women; mean age: 32.8 years, SD = 12.0). The study was approved by the Ethics Committee of the University of Frankfurt Medical Faculty. Informed consent was waived by the ethics committee because of the retrospective nature of the analysis. Epilepsy syndrome diagnoses were obtained during video-EEG monitoring, and the classifications of epilepsies and etiologies were based on the latest definitions proposed by the ILAE (Fisher et al., 2017; Scheffer et al., 2017) and the four-dimensional epilepsy classification (Lüders et al., 2019a,b; Rosenow et al., 2020). The neuropsychological assessments followed the standards established by the German ILAE Chapter and the Austrian, German, and Swiss Working Group on Presurgical Epilepsy Diagnosis and Epilepsy Surgery (Brückner, 2012; Rosenow et al., 2016). A standard neuropsychological test battery and two self-report questionnaires were administered, both before and 12 months after epilepsy surgery. Hemispheric language lateralization was assessed by functional transcranial Doppler sonography as described previously (Conradi et al., 2019) (46.8%), functional MRI (38.3%) or the Wada test (14.9%). All surgeries were conducted within the temporal lobe, and 48.9% of surgeries were conducted within the language-dominant hemisphere (dominant TL surgery), whereas 51.1% were conducted within the non-dominant hemisphere (non-dominant TL surgery).
Sixteen patients (34.0%) underwent classical two-third temporal lobectomies, 13 patients (27.7%) received amygdalohippocampectomies including the temporal pole, 2 patients (4.3%) received subtemporal selective amygdalohippocampectomies, and 16 patients (34.0%) received extended lesionectomies. Seizure outcome was classified using the Engel Epilepsy Surgery Outcome Scale (Engel, 1993). The socio-demographic and clinical characteristics of the sample are summarized in Table 1.

Table 1. Socio-demographic and clinical characteristics of the full sample (n = 47) and the two subgroups of patients with dominant TL surgery (n = 23) and patients with non-dominant TL surgery (n = 24).

Neuropsychological Assessments

All neuropsychological assessments were performed in a standardized fashion, by trained neuropsychologists, and lasted approximately 3 h each. Patients were verified as not currently being treated with topiramate, receiving no acute treatments with benzodiazepines, and not having seizures or status epilepticus within the 24 h immediately before or during the assessments.

The neuropsychological test battery recommended by the German ILAE Chapter and the Austrian, German and Swiss Working Group on Presurgical Epilepsy Diagnosis and Epilepsy Surgery (Brückner, 2012; Rosenow et al., 2016) was administered at the pre- and postsurgical assessments, consisting of the following standardized psychometric tests, as described elsewhere (Conradi et al., 2020): (1) To evaluate attentional functions, the subtest Divided Attention of the computerized “Testbatterie zur Aufmerksamkeitsprüfung” (TAP) (Zimmermann and Fimm, 2007) was applied. (2) For the assessment of verbal learning and memory functions, the “Verbaler Lern- und Merkfähigkeitstest” (VLMT) (Helmstaedter et al., 2001) was used. (3) For the evaluation of non-verbal learning and memory functions, the “Diagnosticum für Cerebralschädigung II” (DCS-II) (Weidlich et al., 2011) was performed. (4) Additionally, the “Rey-Osterrieth Complex Figure Test” (ROCFT) (Rey, 1941; Osterrieth, 1944) was used. (5) To measure short-term memory and working memory, the subtests Digit Span and Visual Memory Span of the “Wechsler Memory Scale-Revised” (WMS-R) (Härting et al., 2000) were performed. (6) Visuospatial functioning was measured by the completeness of the copy of the complex figure from the ROCFT. (7) Moreover, the subtests Silhouettes and Position Discrimination of the “Visual Object and Space Perception Battery” (VOSP) (Warrington and James, 1992) were used. (8) Language functions were assessed by the phonemic and semantic verbal fluency subtests of the “Regensburger Wortflüssigkeitstest” (RWT) (Aschenbrenner et al., 2001). (10) To assess an aspect of executive functioning, the Flexibility subtest of the TAP was used.

During both assessments, self-reported symptoms of depression were evaluated using the 21-item “Beck Depression Inventory-II” (BDI-II) (Beck et al., 1996), with higher scores indicating greater severity of symptoms (0–13: minimal depression; 14–19: mild depression; 20–28: moderate depression; and 29–63: severe depression). The 31-item “Quality of Life in Epilepsy Inventory-31” (QOLIE-31) (Devinsky et al., 1995) was used during both assessments, to measure the self-reported quality of life on seven subscales (seizure worry, overall QOL, emotional well-being, energy and fatigue, cognitive functioning, medication effects, and social functioning), which were combined for a total score, with standardized scores ranging from 0 to 100 and higher scores demonstrating better QOL.

Additionally, during the presurgical assessments, intelligence was estimated using a prediction formula based on socio-demographic characteristics (Jahn et al., 2013), verbal IQ was measured using a multiple-choice vocabulary test (“Mehrfachwahl-Wortschatz-Intelligenztest”, MWT-B) (Lehrl, 1999), and handedness was determined by the “Edinburgh Handedness Inventory” (EHI) (Oldfield, 1971).

Statistical Analyses

Test results obtained during the presurgical assessments (t₁) are referred to as “baseline scores,” whereas test results from the postsurgical assessments (t₂) are referred to as “retest scores.” The median interval between t₁ and epilepsy surgery was 6 months (SD = 14.47), between epilepsy surgery and t₂ was 12 months (SD = 1.01), and between t₁ and t₂ was 18 months (SD = 14.05).

Analyses of the socio-demographic and clinical characteristics of the full sample and the two subgroups (dominant vs. non-dominant TL surgery) were conducted using independent-samples t-tests for numerical data and chi-square tests for categorical data. Paired-samples t-tests were applied to examine differences between retest and baseline scores at the group level. Analyses were performed in IBM SPSS Statistics 22 (IBM Corporation, Armonk, NY, United States) and Microsoft Excel 2016 (Microsoft, Redmond, WA, United States).

Development of the Presented Measures

To capture not only the degree of expected change in repeated neuropsychological test results associated with random fluctuations in measurements (e.g., practice with test materials) but also the influence of disease-related factors (e.g., continued medical therapy, or brain surgery in general), the computation of RCIs and SRB change norms was based on data obtained from a surgical cohort of epilepsy patients. However, to preclude expected material-specific cognitive change after brain surgery [dominant TL surgery often associated with decline in verbal learning and memory functions, and non-dominant TL surgery often associated with decline in non-verbal learning and memory functions (Sherman et al., 2011)], the calculations for VLMT test scores were only based on data from the subgroup of patients with non-dominant TL surgery. Accordingly, calculations for DCS-II and ROCFT test scores were only based on data from the subgroup of patients with dominant TL surgery. All other calculations were based on the full sample.

Cohen’s kappa analyses were performed to assess the agreement between the two computed measures (RCIs and SRB change norms) in each neuropsychological test score.
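As a minimal sketch of how such an agreement coefficient can be obtained, the following Python function computes Cohen's kappa for two lists of per-patient classifications. The function name, the label strings, and the six example classifications are hypothetical and serve only as illustration; they are not taken from the study data.

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equally long lists of categorical labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty lists of equal length")
    n = len(ratings_a)
    # Observed agreement: share of cases where both methods give the same label.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both methods used one identical label throughout
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical classifications of six patients on one test score.
rci_labels = ["decline", "none", "none", "improve", "none", "none"]
srb_labels = ["decline", "none", "none", "none", "none", "none"]
kappa = cohens_kappa(rci_labels, srb_labels)
```

In this invented example the two methods agree on five of six patients, and the resulting kappa of 0.6 corrects that raw agreement for the agreement expected by chance.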

Reliable change indices

Practice-effect (PE)-adjusted RCIs for each test score included in the neuropsychological test battery were computed according to the approach described by Chelune et al. (1993). First, retest reliability coefficients (r_t₁t₂) for each test score were obtained. Then, the standard error of measurement of the difference (SE_diff) for each test score was computed as follows: SE_diff = SD_t₁ × √[2 × (1 – r_t₁t₂)]. Next, 90% confidence intervals were generated for the resulting SE_diff scores by multiplying them by ± 1.64. To adjust for practice effects, the mean change between t₂ and t₁ for each particular test score (PE) was added to the upper and lower limits of the confidence interval, giving an interval of PE ± 1.64 × SE_diff.
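The steps above can be sketched in a few lines of Python. This sketch assumes that the standard-error term takes the common form SD × sqrt(2 × (1 − r)) and that the practice effect enters only as a shift of the interval limits, as the text describes. The function name and the input values are hypothetical and are not taken from Table 2 or Table 3.

```python
import math

def rci_interval(sd_baseline, retest_reliability, practice_effect, z=1.64):
    """Return (lower, upper) limits of a practice-effect-adjusted RCI interval."""
    if retest_reliability > 1.0:
        raise ValueError("retest reliability must not exceed 1")
    # Standard error of the difference, assuming equal variance at t1 and t2.
    se_diff = sd_baseline * math.sqrt(2.0 * (1.0 - retest_reliability))
    half_width = z * se_diff
    # Centre the interval on the mean change (practice effect) between t2 and t1.
    return practice_effect - half_width, practice_effect + half_width

# Hypothetical inputs, chosen only to keep the arithmetic easy to follow.
lower, upper = rci_interval(sd_baseline=10.0, retest_reliability=0.5, practice_effect=2.0)
```

With these invented inputs, SE_diff equals 10, so the 90% interval spans roughly −14.4 to 18.4 around the practice effect of 2.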

Standardized regression-based change norms

Utilizing the methods described by McSweeny et al. (1993), multiple linear regression analyses were conducted to predict the retest score for each neuropsychological test score using the baseline score combined with potential moderating demographic and clinical variables (intelligence, age at t₁, age at onset of epilepsy, duration of epilepsy, and the interval between t₁ and t₂) confirmed in previous studies (Hermann et al., 1996; Busch et al., 2015). Collinearity statistics were computed to preclude concerns regarding multicollinearity between the predictor variables, and a stepwise procedure (0.05 = threshold for variable entry; 0.10 = threshold for variable removal) was used.
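To illustrate the regression step in its simplest form, the sketch below fits a least-squares line that predicts the retest score from the baseline score alone and returns the standard error of the estimate. It is a deliberately reduced version: the moderating variables and the stepwise selection described above are not implemented, and the function name and the paired scores are hypothetical.

```python
import math

def fit_simple_srb(baseline_scores, retest_scores):
    """Least-squares fit of retest = intercept + slope * baseline, plus SE_est."""
    n = len(baseline_scores)
    if n != len(retest_scores) or n < 3:
        raise ValueError("need at least three paired scores")
    mean_x = sum(baseline_scores) / n
    mean_y = sum(retest_scores) / n
    sxx = sum((x - mean_x) ** 2 for x in baseline_scores)
    if sxx == 0:
        raise ValueError("baseline scores must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(baseline_scores, retest_scores))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Standard error of the estimate: residual SD with n - 2 degrees of freedom.
    ss_res = sum((y - (intercept + slope * x)) ** 2
                 for x, y in zip(baseline_scores, retest_scores))
    se_est = math.sqrt(ss_res / (n - 2))
    return intercept, slope, se_est

# Hypothetical paired scores, for illustration only.
baseline = [10.0, 12.0, 14.0, 16.0, 18.0]
retest = [11.0, 12.0, 15.0, 16.0, 19.0]
intercept, slope, se_est = fit_simple_srb(baseline, retest)
```

A multiple regression with additional predictors follows the same logic, with the degrees of freedom for SE_est reduced by one for each added predictor.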

Longitudinal Follow-Up of TLE Patients

The computed RCIs and SRB change norms were applied to determine meaningful change for each patient in each test score included in the neuropsychological test battery. The MCID in each scale of the self-report questionnaires was assessed for each patient using the thresholds provided in the literature (Wiebe et al., 2002; Button et al., 2015). The proportions of patients who achieved meaningful change in each test score and scale were compared between the two subgroups (dominant vs. non-dominant TL surgery) using chi-square tests.

To investigate the patients’ clinical characteristics and neuropsychological factors that influenced improvement in QOL after surgery, univariate binary logistic regression analyses were computed. Crude odds ratios and respective 95% confidence intervals were used to measure the magnitude of associations between improvement in QOL and the predictor variables. Prior to conducting multiple binary logistic regression analyses, the suitability of the data characteristics for this operation was confirmed. A forward stepwise procedure (Wald, 0.05 = threshold for variable entry; 0.10 = threshold for variable removal) was used, and an adjusted odds ratio and the respective 95% confidence interval for each included predictor variable were reported. The overall percentage accuracy in classification (PAC), specificity, sensitivity, and Nagelkerke R² were computed to evaluate the quality of the resulting models.

Results

Twelve months after surgery, 33 patients were free of disabling seizures (70.2%, Engel class I), among which 29 patients were completely seizure-free since surgery (Engel class IA). Eight patients had rare disabling seizures (17.0%, Engel class II), two patients achieved a worthwhile improvement (4.3%, Engel class III), and four patients experienced no worthwhile improvement in seizure control (8.5%, Engel class IV) after surgery. No significant differences were identified between the two subgroups (dominant vs. non-dominant TL surgery) with regard to socio-demographic and clinical characteristics, except for verbal IQ, which was lower in patients with an epileptogenic focus in the language-dominant hemisphere (Table 1).

A summary of the mean baseline and retest raw scores, the mean change scores and results of the group-level analyses, and the retest reliability coefficients for each test score included in the neuropsychological test battery and each scale of the self-report questionnaires is presented in Table 2.

Table 2. Mean raw scores, standard deviation (SD) and mean change between t₂ and t₁ (compared by paired samples t-tests) together with retest reliability coefficients for each test score and scale.

Reliable Change Indices

The PE-adjusted RCIs at 90% confidence intervals for each neuropsychological test score are provided in Table 3.

Table 3. Practice-effect (PE)-adjusted reliable change indices (RCIs) and respective 90% confidence intervals (CI) for each neuropsychological test score.

Application

Individual change scores (i.e., differences between retest- and baseline scores) for each test score for each patient have to be computed. Change scores that fall within the RCI interval represent change expected to occur by chance in 90% of cases. Change scores that exceed the RCI interval indicate meaningful change.

Example

A patient achieving a 145-ms improvement in auditory reaction times between t₂ and t₁ (which exceeds the lower limit of the RCI interval, as depicted in Table 3) would show a clinically meaningful change. In contrast, an improvement of 145 ms in visual reaction times (which falls within the RCI interval) would be considered insignificant.
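The decision rule can be expressed as a short Python function. This is a sketch only: the function name is hypothetical, and the interval limits used below are invented placeholders, since the actual limits for each test score are those listed in Table 3.

```python
def classify_change(baseline, retest, lower, upper):
    """Label a change score relative to an RCI interval (lower, upper)."""
    change = retest - baseline
    if change < lower:
        return "below interval"   # meaningful decrease in the raw score
    if change > upper:
        return "above interval"   # meaningful increase in the raw score
    return "within interval"      # change compatible with chance fluctuation

# Hypothetical interval limits in ms; not the values from Table 3.
label = classify_change(baseline=600, retest=455, lower=-120.0, upper=90.0)
```

Whether a score below or above the interval counts as improvement depends on the test: for reaction times, a decrease in the raw score is an improvement, whereas for the number of words recalled it is a decline.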

Standardized Regression-Based Change Norms

The SRB equations for each neuropsychological test score, derived from multiple linear regression analyses, are presented in Table 4. Where applicable, one equation with and one without moderating demographic and clinical variables (MV) is provided. The respective baseline score was a significant predictor of the retest score in 67.7% of equations.

Table 4. SRB equations for each neuropsychological test score, with and without moderating demographic and clinical variables (MV) where applicable.

Application

By using the SRB equations, the retest scores for each patient in each test score can be predicted. Each difference between the predicted and the observed retest score can then be transformed into a standardized z-score (SRB change score) by dividing the difference by the respective standard error of the estimate (SE_est). SRB change scores that exceed the 90% confidence interval (z = ± 1.64) indicate meaningful change.

Example

A 49-year-old patient achieves a baseline auditory reaction time of 495 ms and a retest auditory reaction time of 431 ms. By using the respective SRB equation from Table 4, the patient’s retest score can be predicted as follows: predicted retest score = (B × baseline score) + (B_Age × age at t₁) = (0.80 × 495) + (3.53 × 49) = 568.97. The difference between the predicted retest score (568.97) and the observed retest score (431) can then be z-transformed as follows: SRB change score = (observed retest score – predicted retest score)/SE_est = (431 – 568.97)/80.90 = −1.71. Because the resulting SRB change score exceeds the 90% confidence interval at the lower limit, this 64-ms improvement between t₂ and t₁ can be interpreted as a meaningful change.
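The same arithmetic can be reproduced in Python. The coefficients, scores, and SE_est are the ones quoted in the example above; the function and variable names are hypothetical.

```python
def srb_change_score(observed_retest, predicted_retest, se_est):
    """Standardized (z) difference between observed and predicted retest score."""
    return (observed_retest - predicted_retest) / se_est

# Coefficients and scores as quoted in the worked example.
b_baseline, b_age, se_est = 0.80, 3.53, 80.90
baseline_ms, retest_ms, age_years = 495, 431, 49

# Predicted retest score from the baseline score and age at t1.
predicted_ms = b_baseline * baseline_ms + b_age * age_years
z_score = srb_change_score(retest_ms, predicted_ms, se_est)

# Meaningful change if the z-score lies outside the 90% interval of +/- 1.64.
is_meaningful = abs(z_score) > 1.64
```

Rounded to two decimals, the predicted score is 568.97 ms and the SRB change score is −1.71, matching the values in the example.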

Comparison Between RCIs and SRB Change Norms

Cohen’s kappa analyses revealed a moderate mean agreement (Landis and Koch, 1977) between RCIs and SRB change norms across test scores (mean κ = 0.44, SD = 0.26). The coefficients ranged between slight agreement (κ = 0.05, p = 0.511) for the subtest Digit Span forwards of the WMS-R, and almost perfect agreement (κ = 0.91, p < 0.001) for repetitions in the VLMT. SRB change norms were more conservative (i.e., indicating no meaningful change more frequently) in 72.3% of cases. Therefore, only those results based on SRB change norms are reported in the following.

Application of the Presented Measures

According to MCID thresholds, after surgery, 34 patients (72.4%) experienced meaningful improvement in symptoms of depression, and 20 patients (42.5%) achieved meaningful improvement in QOL. After surgery, 39 patients (83.0%) reported reduced seizure worry, 23 patients (48.9%) reported improved overall QOL, 20 patients (42.5%) reported improved emotional well-being, 17 patients (36.2%) reported improved energy, 14 patients (29.8%) reported improved cognitive functioning, 18 patients (38.3%) reported reduced medication side effects, and 14 patients (29.8%) reported improved social functioning. A decline in mood was less frequent than an improvement or no change in every scale of the self-report questionnaires.

The proportions of patients who experienced decline, improvement, or no meaningful change in each neuropsychological test score (according to SRB change norms) and each scale of the self-report questionnaires (according to MCID thresholds) were computed separately for the two subgroups and are presented in Table 5. Chi-square tests revealed that a significantly higher proportion of patients who underwent dominant compared to non-dominant TL surgery experienced decline in verbal learning functions (VLMT, fifth repetition, learned words: 47.8 vs. 4.2%, p = 0.001; all repetitions, learned words: 39.1 vs. 4.2%, p = 0.003) and verbal memory functions (VLMT, recall after distraction, forgotten words: 17.4 vs. 0.0%, p = 0.033; delayed recall: 39.1 vs. 8.3%, p = 0.013; recognition, mistakes: 47.8 vs. 4.2%, p = 0.001; interferences: 34.8 vs. 8.3%, p = 0.027) after surgery. The proportion of patients with non-dominant TL surgery that achieved improvement in emotional well-being (62.5 vs. 21.7%, p = 0.005) and cognitive functioning (45.8 vs. 13.0%, p = 0.014) after surgery was significantly higher compared to patients with dominant TL surgery.
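As an illustration of such a subgroup comparison, the following sketch computes a Pearson chi-square statistic for a 2 × 2 table. The counts are an assumption: 11 of 23 and 1 of 24 patients are inferred from the reported 47.8% and 4.2% for the fifth VLMT repetition. The authors' actual test may have used a different table layout (for example, three outcome categories) or SPSS options, so the resulting p value need not match the reported one exactly.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and p value for a 2x2 table.

    Table layout: [[a, b], [c, d]], rows = groups, columns = outcome yes/no.
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column needs at least one case")
    chi2 = n * (a * d - b * c) ** 2 / denominator
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Assumed counts: decline vs. no decline, dominant (11, 12) and non-dominant (1, 23).
chi2, p_value = chi_square_2x2(11, 12, 1, 23)
```

Under these assumed counts the statistic is about 11.8 on one degree of freedom, which is in line with the small p value reported for this comparison.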

Table 5. Proportion of patients achieving decline, improvement or no meaningful change in each neuropsychological test score (according to SRB change norms) and each scale of the self-report questionnaires (according to MCID thresholds), computed separately for the two subgroups (compared by chi-square tests).

The results of the univariate binary logistic regression analyses used to assess several pre- and postsurgical predictor variables for QOL improvement (QOLIE-31 total score) after surgery are presented in Table 6. Significant associations were observed between improvement in QOL and etiology (long-term epilepsy-associated tumor vs. hippocampal sclerosis, p = 0.019), focal to bilateral tonic-clonic seizures (never occurred vs. occurred before surgery, p = 0.019), side of surgery (dominant vs. non-dominant TL surgery, p = 0.029), verbal memory functions (decline vs. no decline, p = 0.018), and baseline QOLIE-31 total score (for every score point higher, p = 0.015).

Table 6. Crude odds ratios (OR) and respective 95% confidence intervals (CI) measuring the association between improvement in quality of life (QOLIE-31 total score) and several pre- and postsurgical predictor variables.

Table 7 shows the multiple binary logistic regression models for the predictor variables independently associated with improvement in QOL (QOLIE-31 total score) and each QOLIE-31 subscale. The likelihood of achieving QOL improvement after surgery was significantly reduced by 48–98% among patients who experienced decline in verbal memory functions after surgery (p = 0.006), and by 3–18% for every score point higher in the baseline QOLIE-31 total score (p = 0.005). Regarding QOLIE-31 subscales, the likelihood of reporting an improved overall QOL after surgery was negatively associated with a decline in verbal memory functions after surgery (p = 0.003) and with the respective baseline QOLIE-31 score (p = 0.005). The likelihood of reporting improved emotional well-being after surgery was negatively associated with dominant TL surgery (p = 0.016), the etiology of long-term epilepsy-associated tumors (p = 0.034), and the respective baseline QOLIE-31 score (p = 0.006). The likelihood of reporting improved cognitive functioning after surgery was negatively associated with decline in verbal memory functions after surgery (p = 0.012), and with the respective baseline QOLIE-31 score (p = 0.003). Models assessing the odds of reporting reduced seizure worry, reduced medication side effects, and improved social functioning after surgery are depicted in Table 7. No significant model regarding energy and fatigue could be obtained.
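The percentage reductions quoted above follow from the odds ratios via (1 − OR) × 100. The short sketch below makes this conversion explicit. The point estimate of 0.09 is the adjusted odds ratio given in the abstract; the confidence limits of 0.02 and 0.52 are assumptions inferred from the quoted 48–98% range, because Table 7 is not reproduced here. Strictly speaking, the quantity being reduced is the odds of improvement.

```python
def percent_reduction_in_odds(odds_ratio):
    """Convert an odds ratio below 1 into a percentage reduction in the odds."""
    if odds_ratio <= 0:
        raise ValueError("an odds ratio must be positive")
    return (1.0 - odds_ratio) * 100.0

# Adjusted odds ratio for decline in verbal memory functions (from the abstract).
point_estimate = percent_reduction_in_odds(0.09)

# Assumed 95% confidence limits, inferred from the quoted 48-98% range.
smallest_reduction = percent_reduction_in_odds(0.52)
largest_reduction = percent_reduction_in_odds(0.02)
```

The point estimate corresponds to a reduction of about 91% in the odds of QOL improvement among patients with a decline in verbal memory functions.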

Table 7. Multiple binary logistic regression models with adjusted odds ratios (OR) and respective 95% confidence intervals (CI) for the predictor variables that were independently associated with improvement in quality of life (QOLIE-31 total score) and each QOLIE-31 subscale.

Discussion

The prediction and evaluation of meaningful change in epilepsy patients after surgery, through the repeated administration of standardized psychometric tests and questionnaires, represent an important task of clinical neuropsychology. Because epilepsy is a heterogeneous condition, the patients’ clinical, demographic, and etiologic circumstances are diverse. Accordingly, an individual, patient-centered approach is required to precisely monitor cognitive and behavioral change after surgery in each patient. Furthermore, empirically based criteria have to be applied to determine whether the observed change is statistically meaningful or has occurred due to random fluctuations in measurements.

The primary objective of this study was to develop empirical measures necessary for the objective determination of neuropsychological change in a comprehensive test battery in the German language. Because this test battery is recommended by the German ILAE Chapter and the Austrian, German and Swiss Working Group on Presurgical Epilepsy Diagnosis and Epilepsy Surgery (Brückner, 2012; Rosenow et al., 2016) and is used as a standard assessment protocol in many German-speaking epilepsy centers (Conradi et al., 2020), developing empirical measures for this test battery has many advantages. First, clinicians could use them to objectively evaluate postsurgical cognitive and behavioral change in patients and to individually monitor their psychosocial situations after epilepsy surgery. Compared with using unstandardized difference scores or group-level analyses, no individual change would be masked when using these measures, and both improvement and decline could be examined precisely in each patient. Second, communications between epilepsy centers regarding the patients’ postsurgical cognitive functioning could be facilitated using the same measures. Third, researchers could use these measures to pursue (multicenter) studies of neuropsychological change.

In the current study, both RCIs and SRB change norms for each test score included in the standard test battery were developed. We obtained a moderate mean agreement between the two measures, with SRB change norms being more conservative in the majority of cases. Because RCIs provide thresholds for determining meaningful change for each test score, no additional calculations are necessary beyond obtaining the patients’ individual change scores. Therefore, the quick and easy application to patient data represents an important advantage of the RCI method. In contrast, the SRB method is more complicated to use. However, the incorporation of potential moderating demographic and clinical variables and the transformation into a common metric (z-scores) represent clear advantages of SRB change norms.

A secondary objective of this study was to evaluate the usefulness of the presented empirical measures (RCIs, SRB change norms, and MCID thresholds) by applying them in a clinical sample of TLE patients. As expected, the application of the provided measures allowed the objective assessment of meaningful neuropsychological change in each individual patient, 12 months after epilepsy surgery. In line with pooled estimates derived from a large review (Sherman et al., 2011), we observed that a significantly higher proportion of patients with dominant TL surgery experienced decline in verbal learning and memory functions, compared with patients with non-dominant TL surgery. Consistently, we also found no differences between the rates of decline or improvement in non-verbal learning and memory functions associated with the side of surgery, and obtained comparatively low rates of change for attentional functions and executive functioning. Our results also confirmed the finding of an overall positive impact of epilepsy surgery on the patients’ mood, as assessed by improvement in symptoms of depression and QOL, which was demonstrated in previous research (Seiam et al., 2011; Ives-Deliperi and Butler, 2017).

Although cognitive change after epilepsy surgery is a relatively well-documented phenomenon in the literature, comparatively few studies have examined its impacts on the patients’ QOL (Baxendale, 2008). Langfitt et al. (2007) were the first to assess the strong interdependence between QOL, cognitive functioning, and seizure control in patients after epilepsy surgery. In the current study, the patients’ clinical characteristics (Fiest et al., 2014; Pauli et al., 2017) and neuropsychological factors (Langfitt et al., 2007) were used to model the odds of achieving improvement in QOL after surgery. According to multiple logistic regression analysis, higher baseline QOLIE-31 total scores and decline in verbal memory functions remained independently associated with a reduced likelihood of achieving QOL improvement after surgery. The significant association with the baseline QOLIE-31 total score might be explained by a ceiling effect. For a patient with a high QOL before surgery, the likelihood of further improvement is comparatively small. In contrast, a patient who initially reports a low QOL has a rather high likelihood of experiencing improvement after surgery. Our finding of a significant association between decline in verbal memory functions and the reduced likelihood of achieving QOL improvement after surgery adds support to the evidence of many studies demonstrating the strong interdependence between decline in cognitive functioning and QOL after epilepsy surgery (Lineweaver et al., 2004; Langfitt et al., 2007; Helmstaedter, 2008; Seiam et al., 2011; Fiest et al., 2014; Pauli et al., 2017).

Of interest, when analyzing the QOLIE-31 subscales, multiple logistic regression analyses revealed that patients with the etiology of long-term epilepsy-associated tumors and patients who underwent dominant TL surgery were less likely to report improvement in emotional well-being after surgery. Because epilepsy patients with a tumor etiology may experience a variety of negative circumstances, such as stigma and anxiety related to oncologic conditions, even in the absence of chemotherapy or radiation, the negative association observed between tumor etiology and improvement in emotional well-being after surgery is plausible. The association with the side of surgery is in line with previous research (Hamid et al., 2014; Pauli et al., 2017) and may be related to the patients’ increased expectations of a decline in cognitive functioning after dominant TL surgery, due to a more conservative presurgical medical consultation (Lineweaver et al., 2004).

Surprisingly, we found no significant association between improvement in QOL and seizure freedom after surgery in our patients. This result may be a false negative: the study may have been unable to detect an association because of the distribution of seizure outcomes in our sample. Because the majority of patients were, fortunately, completely seizure-free after surgery, seizure outcome was categorized as “seizure-free” (70.2%, Engel class I) and “remaining seizures” (29.8%, Engel class II–IV). However, even partial reductions in seizure frequency (Engel class II and III) can result in improvement in QOL after surgery (Hamid et al., 2014). Seizure outcome would therefore have been better categorized as “improvement in seizure control” (91.5%, Engel class I–III) and “no improvement in seizure control” (8.5%, Engel class IV). Because, fortunately, very few patients fell into the latter category, this analysis could not be conducted in our sample.
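The sparseness problem becomes clearer when the percentages are converted back to patient counts. The sketch below assumes that all 47 patients reported in the Methods entered this analysis; the counts are back-calculated from the reported percentages and are not taken from the study data, and the split between Engel class II and III is not reported, so both classes are kept as one group.

```python
# Assumption: n = 47 patients, counts inferred from the reported percentages.
n_total = 47
counts = {"I": 33, "II-III": 10, "IV": 4}


def pct(n):
    """Percentage of the total sample, rounded to one decimal place."""
    return round(100 * n / n_total, 1)


# Dichotomization used in the analysis.
scheme_used = {
    "seizure-free": counts["I"],
    "remaining seizures": counts["II-III"] + counts["IV"],
}

# Alternative dichotomization discussed in the text.
scheme_alt = {
    "improvement in seizure control": counts["I"] + counts["II-III"],
    "no improvement in seizure control": counts["IV"],
}

for scheme in (scheme_used, scheme_alt):
    for label, n in scheme.items():
        print(f"{label}: n = {n} ({pct(n)}%)")
```

Under these assumptions, the alternative categorization would leave only four patients in the “no improvement in seizure control” group, which is too few to support a stable logistic regression estimate.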

Limitations

Several limitations of the current study deserve further discussion. First, the development of RCIs and SRB change norms was based on a surgical cohort of epilepsy patients, as no suitable control sample could be identified. Using a healthy control group appeared to be inappropriate, because we aimed to incorporate not only measurement-related but also disease-related factors that might influence repeated neuropsychological test results. Using a cohort of non-surgical epilepsy patients also seemed unsuitable, as patients not considered candidates for epilepsy surgery often show clinical characteristics (e.g., fewer ASDs, multifocal etiologies, or diagnosed comorbidities) that are not comparable to those of surgical patients. Thus, the most appropriate control sample would have been a surgical cohort of epilepsy patients who underwent both neuropsychological assessments prior to surgery. However, due to ethical considerations, a study design with an artificial delay of epilepsy surgery would not have been feasible.

Second, some test scores showed rather low retest reliability coefficients in our sample, for example 0.14 for learned figures in the first repetition of the DCS-II, or 0.02 for the subtest Position Discrimination of the VOSP. This finding not only decreases the interpretability of the corresponding RCIs and SRB change norms, but also raises the question of whether these psychometric tests are at all suitable to be included in the standard neuropsychological test battery used in many German-speaking epilepsy centers. In line with that, previous studies examining the appropriateness of the applied measures (Vogt et al., 2017; Conradi et al., 2020) came to the conclusion that the selection of tests assessing non-verbal learning and memory functions requires further improvement.
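A short calculation illustrates why such low coefficients limit interpretability. The sketch assumes the same common RCI formulation as a textbook default (standard error of the difference equal to √2 times the baseline standard error of measurement) and a 90% criterion; neither is confirmed by the text here. It expresses the change threshold in units of the baseline standard deviation, and the value 0.80 is a hypothetical comparison point, not a coefficient from this study.

```python
import math


def rci_half_width_in_sd(r, z_crit=1.645):
    """Change threshold in baseline-SD units for retest reliability r.

    Assumes SE_diff = sqrt(2) * SD * sqrt(1 - r) and a 90% criterion.
    """
    return z_crit * math.sqrt(2.0 * (1.0 - r))


# 0.14 and 0.02 are the coefficients reported above; 0.80 is hypothetical.
for r in (0.80, 0.14, 0.02):
    print(f"r = {r:.2f}: threshold = {rci_half_width_in_sd(r):.2f} SD")
```

Under these assumptions, a patient would need to change by more than two baseline standard deviations before the change counted as reliable for the two tests named above, so most genuine changes on these measures would go undetected.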

Third, due to our limited and heterogeneous clinical sample of TLE patients, we did not focus on the development of generalizable normative data. A larger and more homogeneous sample of epilepsy patients (e.g., only TLE patients with the etiology of hippocampal sclerosis) would have been required to provide clinicians and researchers with empirical measures that can be applied as a standard in the German-speaking field of neuropsychology. Instead, we aimed to encourage future studies that build upon our results and further examine empirically based criteria, by pointing out the advantages of this approach and demonstrating the usefulness of empirical measures for determining neuropsychological change objectively and individually.

Conclusion

In this study, we developed both RCIs and SRB change norms for each test score included in a comprehensive neuropsychological test battery in the German language. As illustrated by the longitudinal follow-up in a clinical sample of TLE patients, the application of the provided measures allowed the precise determination of cognitive and behavioral change in each individual patient, 12 months after epilepsy surgery.

Our finding of a strong negative association between improvement in QOL and decline in verbal memory functions after surgery underscores the importance of an individual and objective assessment of cognitive change and its influence on the patients’ psychosocial situation after surgery. Thus, the establishment of patient-centered measures designed to empirically assess meaningful change represents an important contribution to the improved medical care of epilepsy patients.

Future studies that implement empirical measures and refine our results are required to further resolve the interdependence between QOL, cognitive functioning, and seizure control in patients after epilepsy surgery, and to promote the development of patient-centered interventional strategies and rehabilitation approaches, based on these findings.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics Statement

The studies involving human participants were reviewed and approved by the Ethics Committee of the University of Frankfurt Medical Faculty. Informed consent was waived by the ethics committee because of the retrospective nature of the analysis.

Author Contributions

NC developed the presented idea, designed the computational framework, and analyzed the data. NC, MB, AH, TK, NM, and ASc performed the neuropsychological assessments and contributed to the interpretation of the results. TF, ASt, and FR were involved in planning and supervised the work. All authors discussed the results and contributed to the final manuscript.

Funding

The authors were supported by the LOEWE Center for Personalized Translational Epilepsy Research (CePTER), Goethe University, Frankfurt, Germany, funded by The Hessen State Ministry for Higher Education, Research and the Arts (HMWK) and by the Detlev-Wrobel-Fonds for Epilepsy Research.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

Aschenbrenner, S., Tucha, O., and Lange, K. (2001). RWT, Regensburger Wortflüssigkeits-Test. Göttingen: Hogrefe.