- 1Department of Psychology, Health Technology, University of Twente, Enschede, Netherlands
- 2Department of Research Methodology, Measurement Data Analysis, University of Twente, Enschede, Netherlands
- 3Tactus Addiction Treatment, Enschede, Netherlands
- 4Netherlands eScience Center (NWO), Amsterdam, Netherlands
Nowadays, traditional forms of psychotherapy are increasingly complemented by online interactions between client and counselor. In (some) web-based psychotherapeutic interventions, meetings are exclusively online through asynchronous messages. As the active ingredients of therapy are included in the exchange of several emails, this verbal exchange contains a wealth of information about the psychotherapeutic change process. Unfortunately, drop-out-related issues are exacerbated online. We employed several machine learning models to find (early) signs of drop-out in the email data from the “Alcohol de Baas” intervention by Tactus. Our analyses indicate that the email texts contain information about drop-out, but as drop-out is a multidimensional construct, it remains a complex task to accurately predict who will drop out. Nevertheless, by taking this approach, we present insight into the possibilities of working with email data and present some preliminary findings (which stress the importance of a good working alliance between client and counselor, distinguish between formal and informal language, and highlight the importance of Tactus' internet forum).
Introduction
Addictive behaviors and substance dependencies have a global impact, with alcohol use disorder as the prevailing substance abuse disorder (1). It is estimated that, around the world, 283 million individuals suffer from alcohol use disorder, representing ~5.1% of all adults (2). As these numbers are predicted to increase globally (3), the need for accessible treatment becomes more apparent than ever. Yet, there is a large delay between onset of the disorder and first treatment contact (4). A growing number of people resort to online solutions for their drinking problems (5, 6). Although web-based psychotherapeutic interventions have been established as effective interventions for alcohol use disorder (7, 8), they are plagued by high rates of drop-out (9, 10), thereby adding to the already well-known problems of high drop-out in alcohol treatment (11). The aim of this study is to analyze whether emails that were written by clients early in the treatment process can predict drop-out of the online treatment.
Some general advantages of web-based interventions are that they have a lower threshold for first treatment contact (12, 13), they can be as effective as traditional face-to-face therapy (14–16), they come at a low cost (17), and they have usually no or only short waiting lists (12). Online, many clients feel they can maintain their privacy (18), feel less stigmatized (19, 20), and (sometimes even) prefer the impersonal nature of the web, as they do not have to disclose their feelings and problems in person (21). Online interventions for substance dependencies form a large part of the online offer, with many targeting alcohol use disorder specifically (22).
The specific advantage of web-based interventions for alcohol use disorder is their round-the-clock availability. Websites can be accessed every hour of the day and every day of the year, which is a great advantage over face-to-face treatment for those who cannot attend treatment during business hours (23). This is of special importance when treating alcohol use disorder (5), as the willingness of clients to change their drinking behaviors is often volatile and easily affected by (negative) events, which can also occur during holidays (24). Even though web-based interventions make it difficult for a counselor to react to the non-verbal cues of clients, they are a helpful and welcome addition to the treatment of alcohol use disorder.
Yet, there is no debate about the biggest drawback of web-based interventions (25, 26): they are plagued by a high rate of drop-out, on average ~50% (9), and for some, even as high as 99% (26, 27). The same problem is known for online alternatives for alcohol use disorder specifically (9, 11, 28). Postel (29), for example, reported a drop-out rate of 54%, whereas 84.5% dropped out in the study of Linke et al. (30).
We did not find studies that set out to specifically address the reasons for dropping out of an online intervention for alcohol use disorder (if reported, drop-out analyses usually are one of many complementary analyses). Drop-out is referred to in the literature as pre-mature termination, non-usage, low attrition, or retention (10, 31, 32). Studies that analyze drop-out often use different sample groups, diffuse treatments, and diverse subtypes of disorders (33), which makes it even more difficult to compare drop-out between studies. How drop-out should be defined is also not universally agreed upon: some argue that only those who did not finish the complete intervention dropped out; others argue that clients who did not reach a certain cap of required attended sessions should be considered drop-out (34), and some say that only the judgment of the counselor can determine who dropped out, as it is also possible that a client dropped out because he or she already experienced the beneficial effects of therapy (35, 36). These distinctions matter, as the different approaches affect the drop-out rates reported (37). In line with Eysenbach's Law of Attrition, in this study, participants are considered drop-out when they did not finish all the treatment sessions that are required to complete the treatment protocol (38).
Knowing who is likely to complete—and thereby hopefully benefit from—an intervention provides a better basis for an evidence-based allocation of clients to treatment (39). Therapy change process research is the field that has addressed these kinds of questions (40). Online interventions have the advantage that interactions between clients and counselors are saved over time, for example, in emails that they exchange. The abundance of data these create allows for innovative methods like text mining (41). In the current study, we will use two basic text mining approaches to learn more about their use in predicting drop-out in an online intervention for alcohol use disorder.
Given the high rates of drop-out, also in this intervention, the aim of this study is to assess whether the first emails of clients include any early “warning” signs of drop-out. To the best of our knowledge, this has not been done before, and we did not find text mining applications specifically tailored to study drop-out for alcohol use disorder.
Method
First, we will use a bottom-up approach by “simply” counting the most frequently used words in emails (42), and second, we will also use a top-down approach, building on dictionary-based approaches in psychology (43). The dictionary-based program Linguistic Inquiry and Word Count [LIWC; (44)] has perhaps the most widespread use in (general) psychological research and is available in many languages, including Dutch (45, 46). LIWC allows one to analyze many aspects of the email texts, using both linguistic markers like function words and punctuation and psychological markers like affect words, cognitive processes, and personal concerns (47). Liehr et al. (48) provide an example in which they assessed self-change by applying LIWC to written stories and studied stressful feelings over the course of an intervention for substance dependencies.
We will use an evidence-based intervention for treating alcohol use disorder: the web-based intervention “Alcohol de Baas,” loosely translated from Dutch as “Look at your drinking” (19, 49). AdB is rooted in cognitive behavioral therapy and motivational interviewing, both empirically substantiated approaches for the treatment of substance dependencies (50, 51). The intervention consists of two parts: the first focuses on drinking habits, the second on behavior change. In the first part, counselors support clients in analyzing their drinking habits through several assignments and assessments that are followed up by feedback from the counselor. It ends with a personalized advice to the client.
The second part focuses on changing the drinking habits of clients and aims to replace the thoughts associated with alcohol cravings by more helpful ones. After about 10 weeks, the intervention ends with the formulation of an action plan for maintaining the new drinking behavior or sobriety to prevent relapse. Postel (29) demonstrated that the intervention led to a significant decrease in alcohol consumption, which was maintained at a 6-month follow-up. The intervention attracted almost 1,000 users per year, a substantial interest given the size of the Dutch population. Clients and counselors primarily used email to establish the beneficial effects of the intervention, so important aspects of the therapeutic process should be included in these emails.
Study Design
The current study uses a naturalistic, prospective design with consecutive clients who signed up for the online intervention Alcohol de Baas [AdB; (29)]. The data includes personal characteristics, the first four emails written by clients, and information about treatment drop-out.
Postel et al. (19) received ethical approval for (re-)analysis of AdB. Prior to starting treatment, participants gave their informed consent that their data could be (re-)used, but had the right to withdraw at any moment.
Participants
Visitors who were concerned about their own drinking patterns had direct access to the webpages of AdB and could register for the program online. All participants who registered themselves for AdB with self-reported alcohol problems had to be over 16 years old, which was the legal drinking age in the Netherlands when the study was conducted. Of the 1,987 consecutive persons who registered up to 2017, 4 were excluded because they retracted their informed consent, 1,060 did not start the intervention, 132 did not send any emails, and 21 had too much missing data (see Figure 1). Hence, 770 participants were included in the current study. Tables 1, 2 provide an overview of their personal characteristics. Their median age was 46 years (range between 17 and 78 years). The majority were female, of Dutch nationality (and spoke Dutch), were married and finished a higher vocational degree. They smoked occasionally, but did not use drugs, nor did they gamble. Their main reason to start with treatment was that they worried about their drinking behavior: the median number of units of alcohol (10 g of ethanol per unit) consumed per week was 36 at onset of the program. About half of the sample frequently experienced feelings of depression or other psychological problems (24). Of the 770 clients, 346 completed the full treatment, resulting in a drop-out rate of 55.1%. For a more in-depth characterization of clients, we refer the interested reader to the four case descriptions provided in the Appendix in Supplementary Material.
Table 1. Overview of the client's age, years of problematic alcohol consumption, and the average units of consumed alcohol.
Table 2. Overview of the demographic characteristics from the intake questionnaire, split by drop-out and completer.
Instruments
Personal characteristics of clients were assessed before the intervention started (for a complete overview of the variables used, see Tables 1, 2).
Available text data consisted of emails that the clients and their counselors exchanged during the intervention AdB. Depending on the part of the program, the emails were more or less tailored to each individual client. We only considered texts of clients containing 20 words or more. The mean number of emails written by clients was 20.8, with a maximum number of 116 emails. The current study focused on the first four emails written by clients as an early indicator for drop-out.
Clients were labeled as a “completer” when they received an email with the word afsluiting (Dutch for closure). These emails were inspected to make sure that they were indeed related to a completed treatment which, for example, included finishing the final assignment actieplan (“action plan” in Dutch). All clients who did not qualify for these criteria were labeled as “treatment drop-out.” The labeling of emails was conducted by Van den Hazel (52).
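The keyword step of this labeling rule can be illustrated with a minimal sketch. It assumes that each client comes with a list of the emails received from the counselor; the function name, the data layout, and the example sentences are hypothetical, and the manual inspection described above is not reproduced.

```python
import re

# The marker word comes from the text; everything else here is an assumption
# made for illustration (function name, data layout, example emails).
CLOSURE_PATTERN = re.compile(r"\bafsluiting\b", re.IGNORECASE)


def label_client(received_emails):
    """Return 'completer' if any received email mentions 'afsluiting'."""
    for text in received_emails:
        if CLOSURE_PATTERN.search(text):
            return "completer"
    return "drop-out"


# Made-up example emails, not taken from the dataset.
emails_a = ["Welkom bij het programma.", "Dit is de afsluiting van uw behandeling."]
emails_b = ["Welkom bij het programma."]
labels = [label_client(emails_a), label_client(emails_b)]
```

A keyword match of this kind only proposes candidates; as described above, the flagged emails still had to be inspected to confirm that the treatment was actually completed.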
Data Processing
The first step in the analysis is the preprocessing of the data using Python, an interpreted, high-level, and general-purpose programming language (53). The goal is to have the data in a format suitable for further data analysis. First, some client emails included texts from a previous email by the counselor. As these quotes were not written by the client, we removed them. Next, we normalized all texts by converting all capitals to lowercase characters. We then divided the text into tokens and sentences with Frog (54) and NLTK (55). NLTK tallies sentences by counting word-terminal end-of-sentence punctuation marks like the period, question mark, and exclamation mark. NLTK has a list of abbreviations, which are not included in the punctuation and sentence count. Word-internal punctuation, like the first period in “e.g.,” is ignored. Handling of interjections depends on their punctuation, for example, “Oh?” is a separate sentence while “Oh,” is part of the following sentence. Sentence fragments and quotes with end-of-sentence punctuation are counted as separate sentences.
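The preprocessing steps above can be sketched with the standard library alone. This is a simplified stand-in, not the Frog/NLTK pipeline used in the study: it assumes that quoted counselor text is marked with a leading ">" (an assumption, since the quoting format is not described), and it splits sentences with a plain regular expression that has no abbreviation list.

```python
import re


def strip_quoted_lines(text):
    """Drop lines that look like quoted text (assumed to start with '>')."""
    kept = [line for line in text.splitlines() if not line.lstrip().startswith(">")]
    return "\n".join(kept)


def normalize(text):
    """Convert all capitals to lowercase characters."""
    return text.lower()


def split_sentences(text):
    """Split after '.', '?' or '!' when followed by whitespace.

    A crude stand-in for a trained sentence tokenizer: abbreviations
    are not handled here.
    """
    parts = re.split(r"(?<=[.?!])\s+", text.strip())
    return [part for part in parts if part]


def tokenize(sentence):
    """Split a sentence into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", sentence)


# Made-up example email with one quoted counselor line.
raw = "> How was your week?\nOh? It went well. I drank less!"
clean = normalize(strip_quoted_lines(raw))
sentences = split_sentences(clean)
tokens = tokenize(sentences[0])
```

In this toy example the interjection "Oh?" ends up as its own sentence, in line with the punctuation-based rule described above.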
The next step is to anonymize the emails by replacing the names, dates, numbers, locations, medical problems, and other (“miscellaneous”) entities with placeholder abbreviations such as “PER,” “DATE,” “NUM,” “PRO,” and “MISC” (56). We used the Frog program for named entity recognition and for anonymization of these entities (54). For example, a first client email started as: “Dear [PER], my name is [PER].” Because named entity recognition is a machine learning task, the anonymization procedure was not without flaws, for example, because entities were misspelled. To ensure that all personal information was removed, we checked the analyses of Frog repeatedly and adjusted the anonymization and pre-processing accordingly.
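The idea of placeholder-based anonymization can be illustrated with a rule-based sketch. It is not the Frog named entity recognizer used in the study: dates and numbers are matched with regular expressions, and names are looked up in a small hypothetical list (`KNOWN_NAMES`), which is an assumption made purely for illustration.

```python
import re

# Hypothetical name list; a real pipeline would rely on named entity recognition.
KNOWN_NAMES = {"anna", "peter"}

DATE_RE = re.compile(r"\b\d{1,2}[-/]\d{1,2}[-/]\d{2,4}\b")
NUM_RE = re.compile(r"\b\d+\b")
WORD_RE = re.compile(r"[A-Za-z]+")


def anonymize(text):
    """Replace dates, numbers, and listed names with bracketed placeholders."""
    # Dates go first so that their digits are not tagged as separate numbers.
    text = DATE_RE.sub("[DATE]", text)
    text = NUM_RE.sub("[NUM]", text)

    def replace_name(match):
        word = match.group(0)
        return "[PER]" if word.lower() in KNOWN_NAMES else word

    return WORD_RE.sub(replace_name, text)


# Made-up example sentence.
example = anonymize("Dear Anna, my name is Peter. On 12-03-2015 I drank 6 glasses.")
```

A list lookup like this misses misspelled or unlisted names, which mirrors the reason given above for checking the output repeatedly.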
After pre-processing and anonymization, word use was investigated using n-grams. N-grams are sequences of n consecutive words occurring in the text. More specifically, we employed unigrams, the simplest of the n-grams, as they count the frequencies of the individual words in the text. Next, we analyzed the content of the emails with the Linguistic Inquiry and Word Count [LIWC; (44)]. We used the Dutch translations of LIWC (45, 46). LIWC consists of several dictionaries with subcategories (57, 58). For example, positive emotion words is a subcategory of the dictionary affective processes and consists of words like happy, pretty, and good. LIWC counts the percentage of words in an email belonging to a specific subcategory. For each email, the output contained 76 variables. Besides the score per email, we calculated the average across the four emails for each client for each of these 76 variables. The repeated measures consisted of 76 × 4 LIWC variables, whereas the averages consisted of 76 LIWC variables.
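The two feature types described above, unigram counts and dictionary-based percentages, can be sketched as follows. The mini-dictionary `CATEGORIES` is hypothetical and contains two toy categories instead of the 76 LIWC variables; the actual LIWC word lists are not reproduced here.

```python
from collections import Counter

# Hypothetical mini-dictionary with two toy categories (assumption).
CATEGORIES = {
    "posemo": {"happy", "pretty", "good"},
    "family": {"family", "mother", "father"},
}


def category_percentages(tokens):
    """Percentage of tokens in one email that belong to each category."""
    total = len(tokens)
    return {
        name: (100.0 * sum(tok in words for tok in tokens) / total) if total else 0.0
        for name, words in CATEGORIES.items()
    }


def average_over_emails(per_email_scores):
    """Average each category score across the emails of one client."""
    n = len(per_email_scores)
    return {name: sum(s[name] for s in per_email_scores) / n for name in CATEGORIES}


def repeated_measures(per_email_scores):
    """Keep one feature per category per email (e.g. 'posemo_1' ... 'posemo_4')."""
    return {
        f"{name}_{i}": scores[name]
        for i, scores in enumerate(per_email_scores, start=1)
        for name in CATEGORIES
    }


# Four made-up, already tokenized emails of one client.
emails = [
    "i feel good today".split(),
    "my family is happy".split(),
    "i drank less".split(),
    "a very good week".split(),
]
unigrams = Counter(tok for email in emails for tok in email)
per_email = [category_percentages(email) for email in emails]
averages = average_over_emails(per_email)
wide = repeated_measures(per_email)
```

With two categories and four emails, `wide` holds 2 × 4 = 8 features and `averages` holds 2, which parallels the 76 × 4 and 76 variables mentioned above.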
Statistical Analysis
All analyses were performed in Python. We included all personal variables in a first step (see Tables 1, 2). In a second step, we included all text variables to see what could be gained from the text analyses over and above the personal variables. In the second step, we did one set of analyses with the LIWC variables for the first four emails as repeated measures, and another set with averages across these four emails.
We employed three types of statistical models: a logistic regression, a neural network, and decision trees. For all logistic regression models, our dependent variable was drop-out (yes/no), whereas we used all personal characteristics and all LIWC categories as independent variables. A random (or “naive”) distribution of clients would result in a correct classification of ~50%, because both groups (roughly) had an equal number of observations. To have an impression of the “baseline” performance of statistical models, we first conducted a (standard) logistic regression. The training method we used for the logistic regression was mini-batch gradient descent, with the binary cross-entropy as the loss function.
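To make the training procedure concrete, the following is a minimal pure-Python sketch of a logistic regression trained with mini-batch gradient descent on the binary cross-entropy loss. The learning rate, batch size, number of epochs, and the one-feature toy data are all assumptions chosen for illustration; they are not the settings or the data of the study.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def bce(y_true, p, eps=1e-12):
    """Binary cross-entropy for one observation."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


def predict_proba(x, w, b):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def mean_loss(X, y, w, b):
    return sum(bce(yi, predict_proba(xi, w, b)) for xi, yi in zip(X, y)) / len(X)


def train(X, y, lr=0.5, epochs=200, batch_size=4, seed=0):
    """Mini-batch gradient descent; all hyperparameters are illustrative."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    idx = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(idx)
        for start in range(0, len(idx), batch_size):
            batch = idx[start:start + batch_size]
            grad_w, grad_b = [0.0] * len(w), 0.0
            for i in batch:
                err = predict_proba(X[i], w, b) - y[i]  # dLoss/dlogit for BCE
                for j, xj in enumerate(X[i]):
                    grad_w[j] += err * xj
                grad_b += err
            w = [wj - lr * g / len(batch) for wj, g in zip(w, grad_w)]
            b -= lr * grad_b / len(batch)
    return w, b


# Made-up, centered toy feature; 1 = drop-out, 0 = completer.
X = [[-2.0], [-1.5], [-1.0], [-0.5], [0.5], [1.0], [1.5], [2.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
initial_loss = mean_loss(X, y, [0.0], 0.0)
w, b = train(X, y)
final_loss = mean_loss(X, y, w, b)
```

With all weights at zero, every predicted probability is 0.5, so the starting loss equals ln 2; this corresponds to the "naive" ~50% baseline for two groups of roughly equal size.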
Neural networks are well-known for their predictive accuracy on complex tasks (59, 60) and are often applied in text mining (13, 61–64). We used a multi-layer perceptron, with five fully connected layers. The repeated measures used a slightly different architecture: the first layer of the network only takes the personal data as input, and the LIWC scores only enter the network in the second layer. Maity and Pal (65) showed that this architecture could improve results when dealing with repeated measures data.
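The repeated-measures architecture can be sketched as a forward pass in which the personal variables enter the first layer and the LIWC scores are concatenated to the hidden representation before the second layer. This is one possible reading of the description above, not the authors' implementation: the hidden layer widths, the activation functions, and the random (untrained) weights are assumptions.

```python
import math
import random


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b)."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def init_layer(n_in, n_out, rng):
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_network(n_personal, n_liwc, hidden=(8, 8, 8, 8), seed=0):
    """Five fully connected layers; the hidden widths are hypothetical."""
    rng = random.Random(seed)
    return [
        init_layer(n_personal, hidden[0], rng),          # layer 1: personal data only
        init_layer(hidden[0] + n_liwc, hidden[1], rng),  # layer 2: LIWC scores enter here
        init_layer(hidden[1], hidden[2], rng),
        init_layer(hidden[2], hidden[3], rng),
        init_layer(hidden[3], 1, rng),                   # layer 5: drop-out probability
    ]


def forward(layers, personal, liwc):
    h = dense(personal, *layers[0], relu)
    h = dense(h + liwc, *layers[1], relu)  # concatenate hidden units and LIWC scores
    h = dense(h, *layers[2], relu)
    h = dense(h, *layers[3], relu)
    return dense(h, *layers[4], sigmoid)[0]


# Toy sizes: 3 personal variables and 8 LIWC inputs (76 x 4 = 304 in the study).
layers = build_network(n_personal=3, n_liwc=8)
probability = forward(layers, [0.2, 1.0, 0.0], [0.25, 0.0, 0.1, 0.0, 0.3, 0.0, 0.2, 0.1])
```

The sketch only shows where the two input blocks enter the network; training would use the same loss and optimizer logic as for the logistic regression.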
Decision trees are well-known for the insightful “decision-maps”, which are relatively straightforward to interpret (66–68). Ensemble methods are a class of decision tree models that have better predictive performance (62), and these so-called boosting methods combine several “weak” classifiers to improve prediction of the final boosted classifier (69). We applied XGBoost (70), a (boosted) decision tree that—for many tasks—is known to outperform standard tree-based models (13). For the repeated measures, we also used two Mixed Effect Random Forests (MERF; “longitudinal decision trees”). A MERF enhances the standard decision tree by including mixed (or “random”) effects, which can lead to substantial performance improvements when dealing with clustered data (71). To our understanding, the MERF software of Hajjem et al. (71) does not include the option to assess both clustered structures simultaneously. Therefore, our first MERF used the repeatedly observed LIWC scores as clusters, and as a result, the MERF model took the longitudinal structure of the data into account. The second MERF used the clients as clusters.
In order to adequately assess model performance while still maintaining an acceptable training–test-set ratio, we used five-fold cross-validation for each model. We will only discuss test-set performance (the confusion matrices in the next section only report the test-set numbers). We reported the accuracy, the F1-score, precision, and recall. We assign the most weight to the first two scores, as they balance the other two (and some other aspects of correct and incorrect classifications).
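The evaluation scheme can be sketched in a few lines of standard-library Python. The fold construction below uses contiguous blocks and assumes that the data were shuffled beforehand; the labels and predictions are made up to show how the four metrics follow from the confusion matrix, with drop-out coded as the positive class (an assumption).

```python
def kfold_indices(n, k=5):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def confusion(y_true, y_pred):
    """Return (tp, fp, fn, tn) with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    return tp, fp, fn, tn


def metrics(y_true, y_pred):
    tp, fp, fn, tn = confusion(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(y_true)
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "f1": f1, "precision": precision, "recall": recall}


# Made-up test-set labels and predictions (1 = drop-out, 0 = completer).
y_true = [1, 1, 1, 0, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0, 0, 0]
scores = metrics(y_true, y_pred)
folds = list(kfold_indices(len(y_true), k=5))
```

With 770 clients, five folds of this kind would each hold out 154 clients for testing and train on the remaining 616.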
Results
We first studied the most frequently used words in the emails of the drop-out and the completers [see Table 3 for an overview of (the translations of) these words]. Table 4 contains the performance metrics of the models that we employed and also contains the “naive” baseline model based on random chance. As can be seen in Table 4, the unigram model does not outperform baseline classification. This means that none of the words in Table 3 is able to discriminate between the drop-out and completers. However, some word classes in Table 3 stand out as they could indicate some potentially relevant differences between the drop-out and completers: we discuss these three word classes in the next section.
Table 3. Top ten most commonly used words in the e-mails for those who completed the intervention, and for those who dropped out.
Word Use
Our first observation involves the usage of formal pronouns; in Dutch, there is a formal and an informal way of addressing others. Three of the most frequently used words by the clients who dropped out of the treatment were formal (see number 1 in Table 3), addressing the counselor in a somewhat remote manner (“I don't know what you can do for me, aside from forwarding my file”). Aside from the “distance” between the client and counselor, formal language is also used to express some sort of misunderstanding (“Did I understand you correctly, you want me to answer all the questions? That will take ridiculously long for you to read”).
Second, the clients who completed the intervention often refer to a psychotherapist, whereas the ones who dropped out frequently mention a general practitioner (see number 2 in Table 3). It appears that the first—in addition to the counselor from Tactus—also have a psychotherapist (“Yes, I really believe I need a therapist with whom I can fight about my ideas and thoughts”). This therapist is often perceived as a source of support (“My therapist agrees that I could benefit from these situations as well”), and it appears that the therapist often discusses topics that are similar to the ones in the Tactus intervention (“I discussed this yesterday as well with my therapist”). The clients who dropped out on the other hand often mention their general practitioner, to whom the Tactus counselors refer excessive drinkers (“I went to my general practitioner, and my blood pressure was good”). The general practitioner of some appears to be aware of the alcohol dependency (“My general practitioner knows about my alcohol abuse”), whereas this is not the case for others (“I tried to discuss this with my general practitioner, but I was shocked by the reaction I got”).
Third, the clients who completed the intervention mention a forum. In addition to the AdB program, Tactus also offers access to an online forum where it is possible to discuss and meet with other participants of the program. According to Postel (29), the forum receives great user satisfaction, and offers support, motivation, and engagement (p. 136). It appears that the clients who dropped out do not use this forum, as they do not mention it.
LIWC Analyses
The results of the LIWC averages can be found in Tables 4 (performance metrics) and 5 (confusion matrix). Table 4 does not indicate that any of the analyses based on the LIWC averages outperform naive classification: the accuracy and F1-score rarely exceed the random baseline classification. For the LIWC averages, the F1-scores are low for the logistic regression and the multilayered perceptron (MLP in Table 4), mainly due to poor recall. The decision tree performed slightly better, with a higher accuracy and a recall that is substantially higher than for the other two models.
The results for the repeated measures are displayed in Table 4. As the confusion matrices of the LIWC repeated measure analyses were similar to Table 5, we did not include these here for the sake of brevity. Table 4 indicates that model performance remains similar to the naive classification. The performance of the longitudinal decision tree and MERF is on par with the neural network of the LIWC averages. Even though we included all LIWC categories and personal characteristics as input in our analyses, Table 4 does not indicate that the longitudinal models are (better) capable of predicting drop-out. Given that we employed a wide array of models, we conclude that there are no “large,” “powerful,” or “strong” predictors of drop-out in the first four emails.
Discussion
Web-based psychotherapy is an established alternative to classic face-to-face therapy, with the large drawback that almost all online interventions are plagued by a high rate of drop-out. In our data, we found a drop-out rate of 55.1%, which was high, but similar to past studies (9). So, why did these clients drop out? We tried to answer this question by comparing the first four emails of clients who completed the intervention to those of clients who dropped out. We used a wide array of models, but could not associate the email texts with drop-out.
Word Use
The analysis of word use showed that there are some differences in the first four emails between clients who did or did not complete the intervention. The (Dutch) words “u” and “uw,” indicating polite and formal ways to address a counselor, were used with greater frequency by those who dropped out. This could point to differences in the therapeutic alliance between client and counselor (72). The fact that the completers do not address their counselors in a formal way could be an indication that they feel a stronger therapeutic alliance (73). Establishing openness and trust requires effort which is perhaps difficult to do online for those who dropped out. It is possible that clients who already know “how to work with a counselor” could be further in their process of becoming less dependent on alcohol.
There is also a difference between the usage of the words therapist by clients who completed and general practitioner by those who dropped out. For some, the intervention could be a recommendation from their general practitioner, while others could have found the intervention on their own, suggesting a difference in extrinsic or intrinsic motivation. Another interpretation is that those who dropped out could perceive their alcohol usage as a medical problem, whereas those who completed could perceive their drinking behavior as a psychological problem and thereby be more open for the psychological support the intervention offers (74). Some clients might perceive medical care as the only form of “real” healthcare, not expressing much faith in psychological counseling and thereby investing less in the therapeutic alliance.
The last finding is that the forum was mentioned only by those who completed the intervention. The forum could be a great source of support for them, whereas for those who dropped out, this could indicate less engagement with the intervention. It could also be that those who actively participate in the forum are further in their own psychotherapeutic process, as they know it also takes some effort from themselves to become less dependent on alcohol. It would therefore not be far-fetched to recommend that the intervention could try to establish more tie-ins with the forum (and vice versa).
LIWC Analysis
LIWC is an often used program that could be helpful in determining textual aspects of therapy that are relevant for drop-out. However, we were unable to achieve satisfactory predictive accuracy based on LIWC features, even though different statistical models were used. Perhaps LIWC is too “crude” to pick up the nuances that are present in these kinds of text data. For example, “I was so angry when I was a child” counts toward the category anger in the same way as “I am so angry right now,” even though the first sentence describes the past and this might have changed by now. Also, “I hate my family” and “I love my family” both contain words for the LIWC category family, yet have an entirely different emotional connotation. So the category family in this example does not allow for a meaningful differentiation between these two statements.
LIWC has been developed with a broad purpose and not specifically for the problem addressed in this study. Although LIWC is a popular tool among psychologists, our study contributes to a more nuanced understanding of LIWC. Other researchers did show the potential of creating more fine-tuned dictionaries for alcohol use disorder (75) or suggested to combine several LIWC categories (76). Perhaps dictionaries that target nuances that are specifically tailored to alcohol use disorder will be more helpful in understanding drop-out. Our list of most frequently used words could be helpful as a first step.
Dutch is a relatively small language with ~24 million speakers, and LIWC is the best and only readily available alternative at the moment. There might be other relevant dictionaries in English or another language. However, simply translating such dictionaries through standard available translation software is arguably naive, as it could result in a loss of linguistic and cultural nuances.
Strengths and Limitations
To the best of our knowledge, there are no earlier studies that explore drop-out for alcohol use disorder through a text-mining approach. Having a better understanding of how (and why) drop-out occurs does not only have scientific value, it can also greatly benefit the clinical practice, for example, making counselors aware of the word use of their clients that might be related to drop-out. As we present one of the first attempts to systematically study emails for alcohol use disorder, the combination of bottom-up analyses of word use and top-down dictionaries like the LIWC seemed well-suited for the purpose at hand. Starting analyses in an earlier phase would be an important recommendation to detect possible problems for analyses such as clients including quotes from counselors in their emails and to check the feasibility of programs used.
The measure for drop-out was in a sense rather crude, as drop-out is a complex and multidimensional construct. Clients could drop out for several reasons: they might not like the intervention, be hesitant to build a therapeutic relation, experience a crisis or even quit the intervention because they already experienced benefits. A recommendation for future research is therefore to include the nuances of drop-out. Studies in therapy change process research would benefit from structured datasets that include more labeling from counselors and clients, so that it becomes possible to conduct a more nuanced analysis of the text constructs. This could decrease the time spent on pre-processing, while the value of the data analysis increases greatly.
Although from a pragmatic viewpoint the dataset is quite large, especially in the Netherlands, the dataset was relatively small for machine learning approaches. Including more data points, like more emails of clients as well as emails of counselors, might improve the performance of the models. There is a need for further developments: as more email data become available, a complete “manual” study becomes too labor intensive (77). Developing systems that are more suitable for the export of the large numbers of (relevant) texts from emails is important for numerous reasons. As of yet, more often than not, this requires some familiarity with programming, which is not part of the standard curriculum of psychologists trained at the graduate level.
Ever since Sigmund Freud introduced the talking cure, conversation is the cornerstone to most forms of psychotherapy. Given the central position of the therapeutic exchange, a careful assessment of the therapeutic language could provide insight into what is happening in therapy. As AdB primarily relies on the exchange of emails between client and counselor, information about the therapeutic processes (drop-out included) should be present in these emails. Even though we were not able to produce models that were helpful in discriminating between the drop-out and completers, the qualitative interpretation of our findings does suggest that the emails contain relevant information.
Data Availability Statement
The dataset presented in this article is not available: European Privacy regulation prohibits data sharing that can make human subjects identifiable. Inquiries about the datasets can be directed to g.j.westerhof@utwente.nl.
Ethics Statement
Ethical review and approval was not required for this study, because we reused an existing dataset in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in the original study, which included later reuse of their data.
Author Contributions
WS wrote the manuscript and supervised AE for his bachelor thesis and LL-M for his master thesis. ET conducted the data handling in Python and pre-processed the data. MP contributed the data and gave feedback throughout the process. AS helped WS with the supervision of AE and LL-M and gave, together with BV and GW, feedback throughout the process. BV and GW revised the manuscript for review. The data-analyses of AE formed the basis of the results-section, the literature review of LL-M contributed greatly to the introduction and discussion. All authors contributed to the article and approved the submitted version.
Funding
This manuscript is the result of the What Works When for Whom project, which was supported by the Life Science eHealth domain of the Accelerating Scientific Discovery (ASDI) call from the Netherlands eScience Center (NLeSC; Amsterdam, the Netherlands): Grant No. 027.015.G04 awarded to AS. The NLeSC is the national knowledge center for the development and application of research software to advance scientific research, and was funded by the Netherlands Organization for Scientific Research (in Dutch: Nederlandse organisatie voor Wetenschappelijk Onderzoek; NWO) and SURF (abbreviation for Samenwerkende Universitaire Rekenfaciliteiten in Dutch).
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher's Note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Acknowledgments
We would like to thank Bert-Jan van Regteren for his continuing support throughout the process: from start to finish, Van Regteren was incredibly helpful. We would also like to thank Tinka van den Hazel, Marijana Kristić, and Laura Giesler for their help with reviewing the literature. We refer to their theses throughout the manuscript.
Supplementary Material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyt.2021.575931/full#supplementary-material
References
1. Degenhardt L, Charlson F, Ferrari A, Santomauro D, Erskine H. The global burden of disease attributable to alcohol and drug use in 195 countries and territories, 1990–2016: a systematic analysis for the global burden of disease study 2016. Lancet Psychiat. (2018) 5:987–1012. doi: 10.1016/S2215-0366(18)30337-7
2. Hammer JH, Parent M, Spiker D. Global Status Report on Alcohol and Health 2018. Poznyak V, Rekve D, editors. Geneva: Global Status Report on Alcohol. (2018).
3. Whiteford HA, Degenhardt L, Rehm J, Baxter AJ, Ferrari AJ. Global burden of disease attributable to mental and substance use disorders: findings from the global burden of disease study 2010. Lancet. (2013) 382:1575–86. doi: 10.1016/S0140-6736(13)61611-6
4. Chapman C, Slade T, Hunt C, Teesson M. Delay to first treatment contact for alcohol use disorder. Drug Alc Depend. (2015) 147:116–21. doi: 10.1016/j.drugalcdep.2014.11.029
5. Cloud RN, Peacock PL. Internet screening and interventions for problem drinking: Results from the www.carebetter.com pilot study. Alc Treat Q. (2001) 19:23–44. doi: 10.1300/J020v19n02_02
6. James SL, Abate D, Abate KH, Abay SM, Abbafati C. Global, regional, national incidence prevalence, and years lived with disability for 354 diseases and injuries for 195 countries and territories, 1990–2017: a systematic analysis for the global burden of disease study 2017. Lancet. (2018) 392:1789–858. doi: 10.1016/S0140-6736(18)32279-7
7. Pedersen ER, Marshall GN, Schell TL. Study protocol for a web-based personalized normative feedback alcohol intervention for young adult veterans. Addict Science Clin Pract. (2016) 11:6. doi: 10.1186/s13722-016-0055-8
8. Postel MG, De Haan HA, De Jong CA. Evaluation of an e-therapy program for problem drinkers: a pilot study. Subst Use Misuse. (2010) 45:2059–75. doi: 10.3109/10826084.2010.481701
9. Kelders SM, Kok RN, Ossebaard HC, van Gemert-Pijnen JE. Persuasive system design does matter: a systematic review of adherence to web-based interventions. J Med Internet Res. (2012) 14:e152. doi: 10.2196/jmir.2104
10. Melville KM, Casey LM, Kavanagh DJ. Dropout from internet-based treatment for psychological disorders. Brit J Clin Psychol. (2010) 49:455–71. doi: 10.1348/014466509X472138
11. Schroder R, Sellman D, Frampton C, Deering D. Youth retention: factors associated with treatment drop-out from youth alcohol and other drug treatment. Drug Alcoh Rev. (2009) 28:663–8. doi: 10.1111/j.1465-3362.2009.00076.x
12. Amichai-Hamburger Y, Klomek AB, Friedman D, Zuckerman O, Shani-Sherman T. The future of online therapy. Comp Hum Behav. (2014) 41:288–94. doi: 10.1016/j.chb.2014.09.016
13. Hoogendoorn M, Berger T, Schulz A, Stolz T, Szolovits P. Predicting social anxiety treatment outcome based on therapeutic email conversations. IEEE J Biomed Health Inform. (2017) 21:1449–59. doi: 10.1109/JBHI.2016.2601123
14. Barak A, Hen L, Boniel-Nissim M, Shapira N. A comprehensive review and a meta-analysis of the effectiveness of internet-based psychotherapeutic interventions. J Tech Hum Serv. (2008) 26:109–60. doi: 10.1080/15228830802094429
15. Gainsbury S, Blaszczynski A. A systematic review of internet-based therapy for the treatment of addictions. Clin Psychol Rev. (2011) 31:490–8. doi: 10.1016/j.cpr.2010.11.007
16. Howes C, Purver M, McCabe R, Healey PG, Lavelle M. Helping the medicine go down: repair and adherence in patient-clinician dialogues. In: Proceedings of SemDial 2012 (SeineDial): The 16th Workshop on the Semantics and Pragmatics of Dialogue. Paris (2012).
17. Schweitzer J, Synowiec C. The economics of eHealth and mHealth. J Health Comm. (2012) 17:73–81. doi: 10.1080/10810730.2011.649158
18. Berger M, Wagner TH, Baker LC. Internet use and stigmatized illness. Soc Sci Med. (2005) 61:1821–27. doi: 10.1016/j.socscimed.2005.03.025
19. Postel MG, de Haan HA, Ter Huurne ED, Becker ES, de Jong CA. Effectiveness of a web-based intervention for problem drinkers and reasons for dropout: randomized controlled trial. J Med Intern Res. (2010) 12:e68. doi: 10.2196/jmir.1642
20. Rooke S, Thorsteinsson E, Karpin A, Copeland J, Allsop D. Computer-delivered interventions for alcohol and tobacco use: a meta-analysis. Addiction. (2010) 105:1381–90. doi: 10.1111/j.1360-0443.2010.02975.x
21. Griffiths M. A 'components' model of addiction within a biopsychosocial framework. J Substance Use. (2005) 10:191–7. doi: 10.1080/14659890500114359
22. Rogers MAA, Lemmen K, Kramer R, Mann J, Chopra V. Internet-delivered health interventions that work: systematic review of meta-analyses and evaluation of website availability. J Med Internet Res. (2017) 19:e90. doi: 10.2196/jmir.7111
23. Moritz S, Schröder J, Meyer B, Hauschildt M. The more it is needed, the less it is wanted: attitudes toward face-to-face intervention among depressed patients undergoing online treatment. Depress Anx. (2013) 30:157–67. doi: 10.1002/da.21988
24. Cunningham JA, Sobell LC, Sobell MB, Gaskin J. Alcohol and drug abusers' reasons for seeking treatment. Addict Behav. (1994) 19:691–6. doi: 10.1016/0306-4603(94)90023-X
25. Fernández-Álvarez J, Díaz-García A, González-Robles A, Baños R, García-Palacios A, Botella C. Dropping out of a transdiagnostic online intervention: a qualitative analysis of client's experiences. Internet Intervent. (2017) 10:29–38. doi: 10.1016/j.invent.2017.09.001
26. Karyotaki E, Kleiboer A, Smit F, Turner DT, Pastor AM, Andersson G, et al. Predictors of treatment dropout in self-guided web-based interventions for depression: an ‘individual patient data' meta-analysis. Psychol Med. (2015) 45:2717–26. doi: 10.1017/S0033291715000665
27. Andrade AL, de Lacerda RB, Gomide HP, Ronzani TM, Sartes LM, Martins LF, et al. Web-based self-help intervention reduces alcohol consumption in both heavy-drinking and dependent alcohol users: a pilot study. Addict Behav. (2016) 63:63–71. doi: 10.1016/j.addbeh.2016.06.027
28. Copeland J, Martin G. Web-based interventions for substance use disorders. J Subst Abuse Treat. (2004) 26:109–16. doi: 10.1016/S0740-5472(03)00165-X
29. Postel MG. Well Connected: Web-Based Treatment for Problem Drinkers. (Doctoral dissertation), Radboud Universiteit, Nijmegen (2011). Available online at: https://repository.ubn.ru.nl/handle/2066/91233
30. Linke S, Murray E, Butler C, Wallace P. Internet-based interactive health intervention for the promotion of sensible drinking: patterns of use and potential impact on members of the general public. J Med Internet Res. (2007) 9:e10. doi: 10.2196/jmir.9.2.e10
31. Lau PL, Jaladin RAM, Abdullah HS, Li PL, Abdullah HS. Understanding the two sides of online counseling and their ethical and legal ramifications. Proc Soc Behav Sci. (2013) 103:1243–51. doi: 10.1016/j.sbspro.2013.10.453
32. Oh H, Rizo C, Enkin M, Jadad A, Powell J, Pagliari C. What is eHealth (3): a systematic review of published definitions. J Med Internet Res. (2005) 7:e1. doi: 10.2196/jmir.7.1.e1
33. Khazaie H, Rezaie L, Shahdipour N, Weaver P. Exploration of the reasons for dropping out of psychotherapy: a qualitative study. Eval Progr Plan. (2016) 56:23–30. doi: 10.1016/j.evalprogplan.2016.03.002
34. Swift JK, Greenberg RP. Premature discontinuation in adult psychotherapy: a meta-analysis. J Consult Clin Psychol. (2012) 80:547–59. doi: 10.1037/a0028226
35. Krishnamurthy P, Khare A, Klenck SC, Norton PJ. Survival modeling of discontinuation from psychotherapy: a consumer decision-making perspective. J Clin Psychol. (2015) 71:199–207. doi: 10.1002/jclp.22122
36. Zandberg LJ, Rosenfield D, Alpert E, McLean CP, Foa EB. Predictors of dropout in concurrent treatment of posttraumatic stress disorder and alcohol dependence: rate of improvement matters. Behav Res Ther. (2016) 80:1–9. doi: 10.1016/j.brat.2016.02.005
37. Yeung WF, Chung KF, Ho FYY, Ho LM. Predictors of dropout from internet-based self-help cognitive behavioral therapy for insomnia. Behav Res Ther. (2015) 73:19–24. doi: 10.1016/j.brat.2015.07.008
39. Lincoln TM, Rief W, Westermann S, Ziegler M, Kesting ML, Heibach E, et al. Who stays, who benefits? Predicting dropout and change in cognitive behaviour therapy for psychosis. Psychiat Res. (2014) 216:198–205. doi: 10.1016/j.psychres.2014.02.012
40. Smink WAC, Sools AM, van der Zwaan JM, Wiegersma S, Veldkamp BP, Westerhof GJ. Towards text-mining therapeutic change: a systematic review of text-based therapeutic change process research methods. PLoS ONE. (2019) 14:e0225703. doi: 10.1371/journal.pone.0225703
41. Smink WAC, Fox JP, Tjong Kim Sang E, Sools AM, Westerhof GJ, Veldkamp BP. Understanding therapeutic change process research through multilevel modelling and text mining. Front Psychol. (2019) 10:1186. doi: 10.3389/fpsyg.2019.01186
42. Lioma C, van Rijsbergen CJK. Part of speech n-grams and information retrieval. Rev Franc Ling Appl. (2008) 13:9–22. doi: 10.3917/rfla.131.0009
43. Smink WAC, Sools AM, Tjong Kim Sang E, Veldkamp BP, Westerhof GJ. The automation and explication of research methods: understanding their interplay through a framework, with therapeutic change process research as an use-case. In: Smink, WAC, editor. What Works When for Whom? A Methodological Reflection on Therapeutic Change Process Research. University of Twente (2021). p. 83–108. doi: 10.3990/1.9789036550338
44. Pennebaker JW, Boyd RL, Jordan K, Blackburn K. The Development and Psychometric Properties of LIWC2015. UT Faculty/Researcher Works (2015).
45. Boot P, Zijlstra H, Geenen R. The Dutch translation of the linguistic inquiry and word count (LIWC) 2007 dictionary. Dutch J Applied Ling. (2017) 6:65–76. doi: 10.1075/dujal.6.1.04boo
47. Pennebaker JW, Mehl MR, Niederhoffer KG. Psychological aspects of natural language use: our words, our selves. Ann Rev Psychol. (2003) 54:547–77. doi: 10.1146/annurev.psych.54.101601.145041
48. Liehr P, Marcus MT, Carroll D, Granmayeh LK, Cron SG, Pennebaker JW. Linguistic analysis to assess the effect of a mindfulness intervention on self-change for adults in substance use recovery. Substance Abuse. (2010) 31:79–85. doi: 10.1080/08897071003641271
49. Postel MG, de Haan HA, De Jong CA. E-therapy for mental health problems: a systematic review. Telemed e-Health. (2008) 14:707–14. doi: 10.1089/tmj.2007.0111
50. Miller WR, Rollnick S. Motivational Interviewing, Helping People Change. 3rd edn. New York, NY: Guilford Press (2012).
51. Miller WR, Rose GS. Toward a theory of motivational interviewing. Am Psychol. (2010) 64:527–37. doi: 10.1037/a0016830
52. van den Hazel TS. Predicting Early Indicators of Dropout in Online Therapy for Problem Drinkers: Using LIWC to Analyse Email Contact Between Client and Counsellor. Enschede: University of Twente (2020).
53. Python Software Foundation. Python Language Reference. (2020). Available online at: http://www.python.org/
54. van den Bosch A, Busser B, Canisius S, Daelemans W. An efficient memory-based morphosyntactic tagger and parser for Dutch. In: van Eynde F, Dirix P, Schuurman I, Vandehinste V, editors. Selected Papers of the 17th Computational Linguistics in the Netherlands Meeting. Leuven (2007). p. 99–114.
55. Bird S, Klein E, Loper E. Natural language processing with Python. 1st edn. Sebastopol: O'Reilly Media (2009).
56. Tjong Kim Sang E, de Vries BL, Smink WAC, Veldkamp BP, Westerhof GJ, Sools AM. De-Identification of Dutch Mental Health Data [Poster Presentation] (2019).
57. Chung C, Pennebaker JW. The psychological functions of function words. In: Fiedler K, editor. Frontiers Social Psychology. 1st edn. New York, NY: Psychology Press (2007). p. 343–59.
58. Tausczik YR, Pennebaker JW. The psychological meaning of words: LIWC and computerized text analysis methods. J Lang Soc Psychol. (2010) 29:24–54. doi: 10.1177/0261927X09351676
60. Smink WAC, Sools AM, Tjong Kim Sang E, Veldkamp BP, Westerhof GJ. The Automation and Explication of Research Methods: Understanding Their Interplay Through a Framework, With Therapeutic Change Process Research as an Use-Case (2020).
61. Gibson J, Can D, Xiao B, Imel ZE, Atkins DC, Georgiou PG, et al. A deep learning approach to modeling empathy in addiction counseling. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. San Francisco (2016). doi: 10.21437/Interspeech.2016-554
62. Ji S, Yu CP, Fung SF, Pan S, Long G. Supervised learning for suicidal ideation detection in online user content. Complexity. (2018) 2018:6157249. doi: 10.1155/2018/6157249
63. Tanana M, Hallgren KA, Imel ZE, Atkins DC, Smyth P, Srikumar V. Recursive neural networks for coding therapist and patient behavior in motivational interviewing. In: Proceedings of the 2nd Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality. Denver (2015). p. 71–9. doi: 10.3115/v1/W15-1209
64. Xiao B, Imel ZE, Georgiou PG, Atkins DC, Narayanan S. Computational analysis and simulation of empathic behaviors: a survey of empathy modeling with behavioral signal processing framework. Curr Psychiat Rep. (2016) 18:49. doi: 10.1007/s11920-016-0682-5
65. Maity TK, Pal AK. Subject specific treatment to neural networks for repeated measures analysis. Proc Int MultiConf Eng Comput Sci. (2013) 1:60–5.
67. Breiman L. Statistical modeling: the two cultures. Stat Sci. (2001) 16:199–231. doi: 10.1214/ss/1009213726
68. Kotsiantis SB. Decision trees: a recent overview. Artif Intell Rev. (2013) 39:261–83. doi: 10.1007/s10462-011-9272-4
69. Bonab HR, Can F. A theoretical framework on the ideal number of classifiers for online ensembles in data streams. In: International Conference on Information and Knowledge Management, Proceedings. (2016). p. 2053–6. doi: 10.1145/2983323.2983907
70. Chen EE, Wojcik SP. A practical guide to big data research in psychology. Psychol Methods. (2016) 21:458–74. doi: 10.1037/met0000111
71. Hajjem A, Bellavance F, Larocque D. Mixed-effects random forest for clustered data. J Stat Comp Sim. (2014) 84:1313–28. doi: 10.1080/00949655.2012.741599
72. Clarke G, Eubanks D, Reid E, Kelleher C, O'Connor E. Overcoming depression on the internet (ODIN) (2): a randomized trial of a self-help depression skills program with reminders. J Med Internet Res. (2005) 7:1–12. doi: 10.2196/jmir.7.2.e16
73. Eubanks CF, Burckell LA, Goldfried MR. Clinical consensus strategies to repair ruptures in the therapeutic alliance. J Psychother Integr. (2018) 28:60–76. doi: 10.1037/int0000097
75. Jensen M, Hussong AM. Text message content as a window into college student drinking: development and initial validation of a dictionary of “alcohol-talk”. Int J Behav Devel. (2019) 45:3–10. doi: 10.1177/0165025419889175
76. Bliuc AM, Doan TN, Best D. Sober social networks: the role of online support groups in recovery from alcohol addiction. J Comm Appl Soc Psychol. (2019) 29:121–32. doi: 10.1002/casp.2388
Keywords: therapeutic change process research (TCPR), alcohol use disorder (AUD), drop-out, web-based psychotherapeutic interventions, e-mail data, machine learning
Citation: Smink WAC, Sools AM, Postel MG, Tjong Kim Sang E, Elfrink A, Libbertz-Mohr LB, Veldkamp BP and Westerhof GJ (2021) Analysis of the Emails From the Dutch Web-Based Intervention “Alcohol de Baas”: Assessment of Early Indications of Drop-Out in an Online Alcohol Abuse Intervention. Front. Psychiatry 12:575931. doi: 10.3389/fpsyt.2021.575931
Received: 07 July 2020; Accepted: 18 October 2021; Published: 15 December 2021.
Edited by:
Yasser Khazaal, University of Lausanne, Switzerland
Reviewed by:
Domenico De Berardis, Azienda Usl Teramo, Italy
Saeideh Valizadeh-Haghi, Shahid Beheshti University of Medical Sciences, Iran
Copyright © 2021 Smink, Sools, Postel, Tjong Kim Sang, Elfrink, Libbertz-Mohr, Veldkamp and Westerhof. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Gerben J. Westerhof, g.j.westerhof@utwente.nl