The Law of Recency: An Episodic Stimulus-Response Retrieval Account of Habit Acquisition

Giesen, Carina G.; Schmidt, James R.; Rothermund, Klaus

doi:10.3389/fpsyg.2019.02927

ORIGINAL RESEARCH article

Front. Psychol., 15 January 2020

Sec. Cognitive Science

Volume 10 - 2019 | https://doi.org/10.3389/fpsyg.2019.02927

This article is part of the Research Topic On the Nature and Scope of Habits and Model-Free Control

Carina G. Giesen¹

James R. Schmidt²

Klaus Rothermund¹*

¹Department of Psychology, Friedrich Schiller University Jena, Jena, Germany
²Department of Psychology, Université Bourgogne Franche-Comté, Dijon, France

A habit is a regularity in automatic responding to a specific situation. Classical learning psychology explains the emergence of habits by an extended learning history during which the response becomes associated to the situation (learning of stimulus-response associations) as a function of practice (“law of exercise”) and/or reinforcement (“law of effect”). In this paper, we propose the “law of recency” as another route to habit acquisition that draws on episodic memory models of automatic response regulation. According to this account, habitual responding results from (a) storing stimulus-response episodes in memory, and (b) retrieving these episodes when encountering the stimulus again. This leads to a reactivation of the response that was bound to the stimulus (c) even in the absence of extended practice and reinforcement. As a measure of habit formation, we used a modified color-word contingency learning (CL) paradigm, in which irrelevant stimulus features (i.e., word meaning) were predictive of the to-be-executed color categorization response. The paradigm we developed allowed us to assess effects of global CL and of an instance-based episodic response retrieval simultaneously within the same experiment. Two experiments revealed robust CL as well as episodic response retrieval effects. Importantly, these effects were not independent: Controlling for response retrieval effects eliminated effects of CL, which supports the claim that habit formation can be mediated by episodic retrieval processes, and that short-term binding effects are not fundamentally separate from long-term learning processes. Our findings have theoretical and practical implications regarding (a) models of long-term learning, and (b) the emergence and change of habitual responding.

Introduction

In the cafeteria, you might notice that you bought some fries for lunch – yet again – instead of the much healthier salad. After a long day at work, you might find yourself taking the way home to your old place rather than the new one you recently moved to. Everyone knows situations like these, in which we behave by mere force of habit, sometimes even against our good intentions. But how did we acquire these habits? What is the source of habitual behavior? Psychologists have pondered over the processes underlying habit formation for over a century now.

Currently, the theoretical terrain on habit acquisition is dominated by two accounts, based on either the “law of effect” or the “law of exercise” (for overviews, see, e.g., Wood and Rünger, 2016; Wood, 2017; Miller et al., 2019). Early accounts explained habit acquisition in terms of operant conditioning (Thorndike, 1898; Hull, 1943). According to Hull (1943), habit strength is a direct function of the reinforcement history of a particular response in a specific situation. Whereas responding is initially based on the trial-and-error principle, the likelihood of showing a particular response again in a given situation will increase if the response was rewarded, but will decrease if the response was punished in the past. This emergence of habits for behaviors that were reinforced before is called the “law of effect” (Thorndike, 1898). Learning psychology has seen some debate about what counts as a reward or reinforcer, with suggestions ranging from stimuli that reduce states of deprivation of biological needs and that are adaptive for survival (Hull, 1943), to more formal definitions focusing on the transituationally stable quality of a stimulus to increase the probability of different behaviors of a specific organism (Meehl, 1950), to opportunities to execute behaviors that are chosen with high frequency under free-choice conditions (Premack, 1965). A detailed discussion of these accounts is beyond the scope of this article, but it is evident that rewards can also be subtle effects and qualities of the behaviors that are studied. We will take up this important point again in the General Discussion (section “What Is a Reward?”).

Even early learning psychology, however, already had another explanation of habit acquisition that was independent of reinforcement: According to the “law of exercise,” habits can emerge as a mere result of repeating the same behavior in the same situation over and over again (Thorndike, 1898). Since reinforcement and repetition are typically confounded, the outcome devaluation paradigm has been used in order to assess habitual behavior that is independent of reward or valuable outcomes (Dickinson, 1985). Several studies have shown that although outcomes have a strong influence on instrumental behavior, behavior that has been highly overlearned in many repetitions continues to be shown even in the absence of reward or after the outcome has lost all its reinforcing qualities. For instance, the behavior might still be present after having paired the outcome with shock or after providing so much of the reward (e.g., food) that the animal is completely satiated, resulting in a refusal to consume the previously rewarding outcome when it is available (e.g., Rescorla, 1991; Colwill, 1993). These findings provide unambiguous evidence that mere repetition of a response can produce habitual behavior independently of expected reward or reinforcement. In sum, then, the concept of a habit captures the fact that behaviors eventually are elicited in a more or less automatic fashion by situational cues, even in the absence of rewards and intentions.

The concept of a habit can be broadly defined to reflect automatic operant behavior that is elicited by certain stimuli or situations. According to this definition, habitual behavior is necessarily characterized as being automatic, although the reverse does not hold: Behaviors can share features of automaticity, without necessarily reflecting habitual behavior (e.g., Amodio and Ratner, 2011). For instance, behavior that is based on instincts or autonomous reflexes (“respondent behavior”) can operate automatically without being habitual, and automatic processes without a behavioral component are also not considered to reflect habits (e.g., automatic semantic activation). Thus, a crucial feature that characterizes habits on top of their reflecting features of automaticity is that habits refer to operant behaviors that result from some kind of learning or experience.

Importantly, this definition describes what a habit is, but it does not imply specific assumptions regarding its explanation. That is, a habit can be observed regardless of whether the behavior was reinforced in a certain situation or whether it was just executed (repeatedly or just once) in this situation (without necessarily having been reinforced). Relatedly, the definition of habitual behavior is mute with regard to its underlying causes. Habits might reflect associations between situational cues and responses that will emerge gradually as a consequence of repeated and/or rewarded pairings, as early learning theories have assumed. Again, however, alternative conceptions are possible that explain habitual behavior by automatic memory processes, without necessarily drawing on the concept of associations. Whatever the correct theoretical explanation is, characterizing a behavior as habitual implies that it is assumed to share some features of automaticity (e.g., goal-independence, efficiency, speed, unawareness; Bargh, 1994; Moors and De Houwer, 2006), that it is categorized as operant behavior, and that it is somehow related to learning/experience.¹

The present study proposes an alternative view according to which habit acquisition can be explained by recent cognitive accounts of automatic action regulation that draw on episodic memory models (indeed, this view is also suggested by Wood and Rünger, 2016). In line with such a perspective, we propose the “law of recency” as another route to habitual behavior. According to this instance-based account of habit acquisition, having executed a behavior in a specific situation increases the likelihood of executing the same behavior in the same situation again when it is encountered the next time, even in the absence of reward and although the behavior was executed only once (i.e., in the absence of multiple repetitions). The core focus of our study is to provide a test of the law of recency, and to dissociate influences of an instance-based retrieval of the behavior that was executed during the last encounter with the current situation from alternative explanations in terms of multiple repetitions (global contingencies) and reward. Specifically, we investigate whether habitual behavior resulting from pairings between a stimulus and a response can be explained in terms of such an episodic retrieval of responses. To provide a pure test of habitual behavior resulting from previous pairings, we used a paradigm that does not contain any kind of rewards, thus effectively ruling out any influence of reinforcement on the emergence of habits in our study.

It is important to note that our study does not claim to show that reinforcement is irrelevant for the emergence of habits. We just want to limit our study to the investigation of mere repetition effects, without making any claims regarding the validity of the “law of effect” or its underlying causes. Even if we fully succeeded in explaining effects of practice on the basis of episodic response retrieval, this would still leave room for the possibility of reinforcement having an independent, additional effect on habit acquisition, which may or may not be mediated by episodic retrieval.

Episodic Memory Models of Automatic, Stimulus-Based Action Regulation

The idea of stimulus-response bindings (“event files,” Hommel, 1998) is a central feature of stimulus-based action regulation accounts (Logan, 1988; Hommel et al., 2001; Rothermund et al., 2005). According to these accounts, whenever a response is executed to a stimulus, their mental codes become integrated, resulting in episodic stimulus-response bindings that are stored in memory. Stimulus repetition on a later occasion triggers retrieval of the response that was bound to the stimulus. This will facilitate or impede performance, depending on whether the retrieved response is appropriate or not on the current trial. To date, a growing body of findings attests that storage and retrieval of these episodic stimulus-response bindings are pervasive principles of action regulation and apply to a broad scope of stimuli and responses (for an overview, see Henson et al., 2014).

A crucial difference between stimulus-response bindings and stimulus-response associations in standard learning paradigms is that stimuli and responses are typically not correlated in designs which are used to investigate stimulus-response binding and retrieval (SRBR) effects. Specifically, SRBR effects are assessed in a sequential trial design, in which the factors Stimulus Relation (i.e., does the stimulus repeat or change from trial n-1 to trial n) and Response Relation (i.e., does the response repeat or change from trial n-1 to trial n) are orthogonally manipulated. In other words, there simply is nothing to learn over the course of the experiment in these tasks, since each word is presented equally often with each response. Yet, it is an unresolved issue how SRBR effects relate to learning effects. Although this is a much debated and discussed topic, empirical findings so far are scarce and unsystematic (Colzato et al., 2006; Herwig and Waszak, 2012; Moeller and Frings, 2014, 2017; Schmidt et al., 2016, 2019). Some of these studies suggest that SRBR effects are only a transient “by-product” of distributed processing and intentional action planning but are unrelated to persistent learning effects (Colzato et al., 2006; Herwig and Waszak, 2012; Moeller and Frings, 2014, 2017). In turn, other studies favor the view that short-term binding effects and more persistent learning effects are essentially the same thing, only studied at different time scales (Schmidt et al., 2019). Hence, one could conceive of SRBR effects as “one trial learning” that serves as a founding stone for contingent associations which are stored in memory on a long-term basis. This reasoning is further supported by recent computational modeling simulations (Schmidt et al., 2016) which indicate that both types of effects might result from the same underlying learning mechanism.

An Episodic Account of Habit Acquisition

According to the present account, habitual responding results from (a) storing stimulus-response bindings in memory and (b) retrieving the most recent of these bindings when the stimulus is re-encountered on a later occasion. This leads to a reactivation of the response that was bound to the stimulus during the last occurrence of the stimulus. In other words, habitual responding can be understood as a result of previous stimulus-response bindings that emerged over the course of the experiment. First and foremost, we propose this account – the “law of recency” – as an explanation for habits that are based on repetition. According to this account, it is always the most recent instance of the current stimulus situation that is retrieved on the next occasion, and that influences responding in the current situation via a retrieval of the response that was shown during the previous instance. Our account provides an alternative explanation of repetition effects that competes with association- or frequency-based accounts of repetition-based habits that were proposed in the tradition of the law of exercise (e.g., Miller et al., 2019). The crucial difference between the two accounts is that according to the law of recency, it is the most recent episode that drives responding, whereas according to the law of exercise, the global frequency or contingency of responding to all previous occurrences of this situation is the decisive factor. To distinguish between these accounts, the behavior that was shown during the last occurrence has to be manipulated independently of the global context in which this behavior has been shown.² In the current study, we will manipulate these two factors independently.

Importantly, and in contrast to existing accounts on habit formation, stimulus-response bindings can emerge even in the absence of past reinforcement and hence do not rely on any behavior-reward correlation. Hence, our account predicts that habit formation should be possible even though responses are never reinforced. Importantly, our study is not meant to rule out any effects of reinforcement on habit acquisition (“law of effect”), nor do we test whether any such effect is due to episodic retrieval processes. We just wanted to make sure that the habitual behavior we studied reflects pure repetition effects, which is why we studied behavior in the absence of any tangible rewards.

To test the underlying causes of habit formation in the absence of reinforcement, we used a modified color-word contingency learning (CL) paradigm (e.g., Schmidt et al., 2007; for a review, see MacLeod, 2019). In our task, participants classify the color of printed words (neutral adjectives) on each trial. However, each word is presented most often in two of four colors (high contingency combinations) and less often in the remaining two colors (low contingency combinations, see Table 1). Although the word meaning is irrelevant for the color categorization task, participants learn the contingencies between word stimuli and color responses. Learning of contingencies served as an index of habit formation and is reflected in faster and more accurate performance on high compared with low contingency combinations (Schmidt et al., 2007; for related work, see Miller, 1987; Carlson and Flowers, 1996).

Table 1. Example for word-color contingency manipulation in Experiments 1 and 2.

Deviating from previous research on CL, we chose to study the effects of comparatively weak and complex contingencies on behavior. Previous research already showed that participants produce contingency effects even when unaware of the contingencies, thus establishing the automatic (i.e., habitual) nature of behavior that is driven by the CL (Schmidt et al., 2007). Furthermore, learning in this paradigm is incidental, as participants are not informed in advance of contingencies and the words are irrelevant to the main task of color identification. In our study, we used much weaker contingencies than in the original paradigm, and we employed more complex rules in which one stimulus was systematically paired with two instead of just one response. Through these measures, the contingencies in our study were more subtle and much harder to detect, and they could not be translated into simple S→R rules (due to the dual response pairings), making it even less likely that our participants would be able to use the contingencies strategically. By implication, any effect of CL in our study can be taken as evidence for automatic behavior regulation, thus representing an index of habitual responding.

The core idea of our study is that habit acquisition that is based on CL can be explained in terms of an episodic retrieval of previous stimulus-response episodes (cf. Schmidt et al., submitted). For high contingency trials, probabilities are above chance (which is p = 0.25 in a four color choice task) that the word of the current trial was presented in the same color also during its last occurrence (in our study, this probability is p = 0.33 and p = 0.40 for Experiments 1 and 2, respectively), whereas for low contingency trials, probabilities of word-color repetitions are lower than chance (p = 0.17 and p = 0.10 for Experiments 1 and 2, respectively). By implication, retrieving the response that was stored together with the word during its last occurrence will facilitate responding for 33% (Experiment 1) or 40% (Experiment 2) of the high contingency trials, but for only 17% (Experiment 1) or 10% (Experiment 2) of the low contingency trials. Likewise, response retrieval of the last episode in which the word was presented will activate a different response and will delay responding for 67% (Experiment 1) or 60% (Experiment 2) of the high contingency trials but for 83% (Experiment 1) or 90% (Experiment 2) of the low contingency trials. Our study aims to test the hypothesis that retrieving the response from the last occurrence of the word stimulus drives the CL effect, and is the underlying mechanism of habit formation. We predicted that controlling for these differences in retrieving either the same or a different response should eliminate the global CL effect (cf. Schmidt et al., 2019).
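The arithmetic behind these percentages can be checked with a short sketch. It assumes that the color of a word's previous occurrence is an independent draw from that word's color distribution. The 2:1 frequency ratio for Experiment 1 is taken from the Design section; the 4:1 ratio for Experiment 2 is not stated in this passage and is only inferred here from the reported p = 0.40 and p = 0.10. The function name is illustrative.

```python
from fractions import Fraction


def repetition_probabilities(high_weight, low_weight, n_high=2, n_low=2):
    """Probability that a word's previous occurrence had the same color.

    Each word appears in n_high colors with relative frequency high_weight
    and in n_low colors with relative frequency low_weight. Assumes that the
    color of the previous occurrence is an independent draw from that
    distribution (an illustrative simplification).
    """
    total = n_high * high_weight + n_low * low_weight
    p_same_high = Fraction(high_weight, total)  # current trial is high contingency
    p_same_low = Fraction(low_weight, total)    # current trial is low contingency
    return p_same_high, p_same_low


# Experiment 1 uses the 2:1 ratio from the Design section; the 4:1 ratio for
# Experiment 2 is an assumption inferred from the reported p = 0.40 / p = 0.10.
for label, (high, low) in {"Experiment 1": (2, 1), "Experiment 2": (4, 1)}.items():
    p_high, p_low = repetition_probabilities(high, low)
    print(f"{label}: same response on {float(p_high):.2f} of high and "
          f"{float(p_low):.2f} of low contingency trials "
          f"(chance = {1 / 4:.2f})")
```

Under these assumptions, the complements of the two probabilities give the shares of trials on which a different response would be retrieved.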

As a crucial design feature of our study, we aimed to assess episodic response retrieval effects and CL effects simultaneously, that is, in the very same experiment. We had the following expectations: First, we expected to find robust CL effects. Second, we expected to find response retrieval effects (reflected in an effect of response relation regarding the current and previous occurrence of the word). Third, and most central to our research aims, we tested whether response retrieval effects can explain habit formation (i.e., the CL effect). We expected that CL would be substantially reduced (or even eliminated) as soon as we controlled for differences in response retrieval effects. Such a pattern of results would support the law of recency as an explanation of habitual behavior, while at the same time controlling for (and ruling out) an alternative explanation in terms of the law of exercise (i.e., a global, frequency-based account of repetition effects).

Experiment 1

Method

Participants

Thirty native German-speaking FSU Jena students (18 female; M_age = 23.03 years; range: 18–30 years) took part in the experiment. A priori power calculations (G*Power 3; Faul et al., 2007) showed that we needed at least 27 participants to detect a medium-sized effect (d = 0.5) with sufficient power (1-β ≥ 0.8). Up to six participants were tested in parallel. Each participant was seated individually in a small cubicle. Sessions lasted 25 min. Participants received €2.50 for their participation plus a chocolate bar or ice cream voucher if they fulfilled criteria for speed (more than 80% of all reaction times [RT] faster than 1000 ms) and accuracy (less than 15% errors) in the experimental trials. In accordance with guidelines of the American Psychological Association, prior to the study, all participants gave their explicit consent to take part via pressing the “j” key of the keyboard (responses to the informed consent were saved for each participant). The study was canceled before any data collection started for participants who did not give their consent. An ethics approval was not required as per applicable institutional and national guidelines and regulations because no cover-story or otherwise misleading or suggestive information was conveyed to participants (this procedure is in accordance with the ethical standards at the Institute of Psychology of the FSU Jena).

Apparatus and Stimuli

The experiment was programmed with E-Prime 3.0. Stimuli were the four neutral monosyllabic German adjectives “warm” (“warm”), “klein” (“small”), “ganz” (“whole”) and “fast” (“almost”). Stimuli were presented in Times New Roman font (16 pt) on a black background on a 17-inch CRT screen. A response pad, attached to the computer via the parallel port, served to collect responses. Participants responded by pressing four colored keys on the response pad with their middle and index fingers of the left and right hand (key order from left middle to right middle finger: red, green, blue, yellow). A fifth key, operated via (left or right) thumb press, was labeled with “Los” (“go”) and served to start the experiment.

Design

Central to our study, we manipulated the contingency between word stimuli and color responses: Each of the four word stimuli appeared in each of the four colors; however, combinations differed in their frequencies. Specifically, each word appeared twice as often in two colors (high contingency combinations) as in the two remaining colors (low contingency combinations), yielding a contingency ratio of 2:1. Thus, each word was predictive of two colors/responses (high contingency combinations) and non-predictive of the other two colors/responses (low contingency combinations).³

The contingency manipulation resulted in 16 different word-color combinations. Given that high contingency combinations were shown twice as often as low contingency combinations, this amounted to a total of 24 word-color combinations (i.e., 16 word-color combinations plus 8 “duplicates” resulting from the 2:1 contingency manipulation, see Table 1). Each word-color combination was presented as stimulus in trial n-1 and as stimulus in trial n, resulting in a total of 24 × 24 = 576 experimental trials.

As another advantage, the chosen design allowed us to analyze immediate trial sequences to assess SRBR effects in a systematic and fully controlled manner. For immediate trial sequences within each experimental list, we realized a maximally balanced 2 (contingency of present trial n: high vs. low) × 2 (contingency of preceding trial n-1: high vs. low) × 2 (Stimulus Relation between trial n and trial n-1: stimulus repetition [SR; 25%] vs. stimulus change [SC; 75%]) × 2 (Response Relation between trial n and trial n-1: response repetition [RR; 25%] vs. response change [RC; 75%]) design. Note that trial sequences for the SR-RR cell are only possible when trial n-1 and trial n both represent high contingency trials or when both represent low contingency trials (i.e., when the contingency matches within a trial sequence). Put differently, if both the stimulus and the response repeat in a given trial n from the previous trial n-1, then the contingency from trial n-1 has to repeat as well. In turn, contingency mismatches (e.g., high contingency on trial n-1, but low contingency on trial n, or vice versa) are impossible to create within the SR-RR cell. Thus, to analyze SRBR effects, only trial sequences with matching contingencies were considered.
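The structure of this design can be reproduced by enumerating all trial sequences. The sketch below is only an illustration: the assignment of high contingency colors to words is hypothetical (the actual assignment is given in Table 1, which is not reproduced here) and is chosen so that each color is a high contingency color for exactly two words. Under that assumption, the enumeration recovers the 24 items, the 576 sequences, the 25% shares of stimulus and response repetitions, and the absence of contingency mismatches in the SR-RR cell.

```python
from itertools import product

WORDS = ["warm", "klein", "ganz", "fast"]
COLORS = ["red", "green", "blue", "yellow"]

# Hypothetical assignment of high contingency colors to words; the actual
# assignment is given in Table 1, which is not reproduced here.
HIGH_COLORS = {
    "warm": {"red", "green"},
    "klein": {"green", "blue"},
    "ganz": {"blue", "yellow"},
    "fast": {"yellow", "red"},
}

# Build the 24-item list: high contingency combinations are listed twice (2:1).
items = []
for word, color in product(WORDS, COLORS):
    is_high = color in HIGH_COLORS[word]
    items.extend([(word, color, is_high)] * (2 if is_high else 1))

# Every item serves once as trial n-1 for every item as trial n.
sequences = list(product(items, items))

stim_rep = sum(prev[0] == cur[0] for prev, cur in sequences)
resp_rep = sum(prev[1] == cur[1] for prev, cur in sequences)

# In SR-RR sequences the word and the color both repeat, so the contingency
# level of the two trials is necessarily the same.
sr_rr = [(prev, cur) for prev, cur in sequences
         if prev[0] == cur[0] and prev[1] == cur[1]]
mismatches = sum(prev[2] != cur[2] for prev, cur in sr_rr)

print(len(items), len(sequences))   # number of items and of sequences
print(stim_rep / len(sequences))    # share of stimulus repetitions
print(resp_rep / len(sequences))    # share of response repetitions
print(len(sr_rr), mismatches)       # SR-RR sequences and contingency mismatches
```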

Procedure

Instructions were given on screen. Participants were informed that on every trial, a word stimulus would first appear in white font and then change its color to red, green, blue, or yellow. Their task was to categorize the color of each word stimulus by pressing the corresponding key on the response pad. After reading the instructions, participants worked through 24 practice trials that were identical to trials in the experimental blocks. The practice block was repeated if more than 20% errors were committed. If error rates still exceeded 20% after the third run of the practice block, the experiment was terminated (however, this never occurred during data collection). Upon successful completion of the practice block, the main experiment started, consisting of 576 experimental plus 1 filler trial (i.e., trial 1, which had no preceding trial). After 288 trials were completed, participants were given a small, self-paced break. The first trial after the break was identical to the last trial before the break and served as filler. Filler trials were not analyzed. Experimental trials were presented in a continuous fashion. At the end of the experiment, participants were rewarded accordingly.

Each trial started with a fixation cross (500 ms), followed by the word in white font for a variable duration that was randomly selected from five possible durations (150, 200, 250, 300, or 350 ms), after which the word changed its color until key press. Erroneous responses elicited the feedback message “Fehler – reagiere sorgfältiger! Weiter mit ‘Los’ Taste” (“Error – be more accurate! Continue with ‘go’ key…”). Responses slower than 1000 ms elicited the feedback message “Zu langsam – reagiere schneller! Weiter mit ‘Los’ Taste” (“Too slow – respond faster! Continue with ‘go’ key…”). Feedback was displayed in white font on red background until key press. Then, the next trial started.

Results

Trials with erroneous responses (6.8%) and RT outliers⁴ (2.6%) were excluded from all analyses.

Contingency Learning Effects

We compared performance in low contingency (M_RT = 534 ms; M_err = 6.7%) with high contingency trials (M_RT = 528 ms; M_err = 6.9%). For RTs, this comparison yielded a significant CL effect of Δ_low–high = 6 ms, t(29) = 3.13, p = 0.004, d_z = 0.57, BF₁₀ = 9.08. For error rates, the effect was not significant (Δ_low–high = −0.2%, |t| < 1).

Explaining Contingency Learning Effects by Response Retrieval Effects

We investigated whether response retrieval effects influenced responding, and whether they can explain CL effects. To this end, every trial was referenced back to the last prior occurrence of the current stimulus; effectively, this means that the analysis is based on stimulus repetitions (see Figure 1B). Furthermore, stimulus repetition trials were coded with regard to two additional factors: First, we coded the relation between the response to the word in the current trial and the response during its last occurrence, which could be the same or different (factor Previous Response). Second, we coded how distant the last occurrence was from the present stimulus repetition trial (factor Distance: immediate vs. non-immediate stimulus repetition). Distance was coded as a binary factor with “immediate stimulus repetition” indicating that the present stimulus was repeated from the immediately preceding trial n-1. In turn, trials in which the last occurrence of the current word stimulus was further away (i.e., trials n-2 to n-30) were coded as “non-immediate stimulus repetition” (see Figure 1B for illustrations). Only last occurrences in which a correct response was given were included. Thus, data were analyzed in a 2 (contingency: high vs. low) × 2 (previous response: same vs. different) × 2 (distance: immediate vs. non-immediate stimulus repetition) ANOVA on mean RTs (the pattern of means is shown in Table 3).
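The coding scheme described above can be sketched as a single pass over the trial sequence. This is a minimal illustration, not the authors' analysis script: the function name, the dictionary keys, and the `max_lag` parameter are hypothetical, and the sketch treats a repeated color as a repeated response (the color determines the correct key). Excluding error trials and RT outliers on the current trial would be a separate filtering step.

```python
def code_last_occurrence(trials, max_lag=30):
    """Code each trial relative to the last occurrence of the same word.

    trials: list of dicts with keys "word", "color" and "correct" (bool).
    Returns one entry per trial: None if the trial is not analyzed, otherwise
    a dict with the factors Previous Response and Distance plus the raw lag.
    """
    last_seen = {}  # word -> (trial index, color, correct) of its latest occurrence
    coded = []
    for i, trial in enumerate(trials):
        entry = None
        if trial["word"] in last_seen:
            j, prev_color, prev_correct = last_seen[trial["word"]]
            lag = i - j
            # Keep only last occurrences with a correct response, within max_lag.
            if prev_correct and lag <= max_lag:
                entry = {
                    "previous_response": "same" if prev_color == trial["color"] else "different",
                    "distance": "immediate" if lag == 1 else "non-immediate",
                    "lag": lag,
                }
        coded.append(entry)
        last_seen[trial["word"]] = (i, trial["color"], trial["correct"])
    return coded


# Made-up mini sequence for illustration only.
demo = [
    {"word": "warm", "color": "red", "correct": True},
    {"word": "warm", "color": "red", "correct": True},
    {"word": "klein", "color": "blue", "correct": False},
    {"word": "warm", "color": "green", "correct": True},
    {"word": "klein", "color": "blue", "correct": True},
]
for row in code_last_occurrence(demo):
    print(row)
```

In the made-up sequence, the first trial of each word has no earlier occurrence, and the final trial is dropped because the last occurrence of its word was answered incorrectly.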

Figure 1. Schematic trial procedure in Experiments 1 and 2. Note that in the experiments, all stimuli were presented on black background in white font or in the respective colors (see Table 1). For both figures, we inverted the coloring scheme only for illustrative purposes. Stimuli are not drawn to scale. Trials are classified as high vs. low contingency trials (for details, see Table 1). Arrows in (A) illustrate different trial types for immediate sequence effects from trial n-1 to trial n to test for immediate SRBR effects (SR, stimulus repetition; SC, stimulus change; RR, response repetition; RC, response change). Arrows in (B) illustrate trial classification for the central analyses of interest to explain contingency learning effects by response retrieval effects, i.e., whether a given trial reflected an immediate (solid/blue lines) vs. non-immediate (dotted/gray lines) stimulus repetition trial (factor Distance) with same or different response (factor Previous Response) compared to the last occurrence of the stimulus word. See main text for details.

Although we obtained a significant CL effect in our first analysis (without controlling for SRBR effects, see above), the main effect of contingency was no longer significant in the final analysis, F < 1, BF₀₁ = 6.79. Instead, the ANOVA yielded a main effect of previous response, F(1,29) = 179.96, p < 0.001, η_p² = 0.86, BF₁₀ = 3.817 × 10²¹, indicating that performance was faster if the current stimulus repetition required the same previous response (M = 480 ms) compared with a different previous response (M = 548 ms). This pattern of findings confirms our hypothesis that controlling for episodic SRBR effects effectively eliminated the CL effect in Experiment 1. The main effect of the distance factor was also significant, F(1,29) = 141.22, p < 0.001, η_p² = 0.83, indicating that performance was generally faster for immediate stimulus repetitions (M = 497 ms) compared to trials in which the last occurrence of the same word stimulus was more distant (M = 531 ms). Main effects were qualified by a Distance × Previous Response interaction, F(1,29) = 322.52, p < 0.001, η_p² = 0.92. Follow-up tests showed that response retrieval effects were significantly stronger for immediate stimulus repetitions (M_same response = 432 ms; M_different response = 562 ms; t[29] = 16.53, p < 0.001, d_z = 2.78), but were also significant for stimulus repetitions of more distant trials (M_same response = 528 ms; M_different response = 534 ms; t[29] = 1.76, p = 0.045, one-tailed, d_z = 0.32). No other effect was significant (all Fs < 2.9, all ps ≥ 0.10).

Multi-Level Analyses

We also conducted multi-level analyses on the basis of individual trials, treating trials as nested within subjects. In these analyses, CL and response retrieval reflect between factors (on the level of trials), which allows us to simulate a stepwise regression approach to test whether entering response retrieval as an additional predictor in a second step eliminates effects of CL that had been significant when entered as a single predictor into the regression equation in step 1. The multi-level analyses also allow us to treat distance of the last occurrence as a continuous predictor, so we can calculate at which distance the effect of response retrieval effectively becomes zero.

A multilevel analysis with contingency (high frequency = 1 vs. low frequency = 2) as the only level 1 predictor, allowing for random intercepts and slopes, yielded a significant CL effect, β = 6.19, t = 3.15, p = 0.004, replicating the effect of the previous analysis. Adding Previous Response (same = 1 vs. different = 2) as an additional level 1 predictor in a second step produced a highly significant effect for this variable, β = 34.21, t = 9.30, p < 0.001, and rendered the effect of the CL variable non-significant, β = 0.59, t = 0.28, p = 0.78. Effectively, then, although CL predicts RT when considered in isolation, this effect is fully explained by response retrieval.

Although we were primarily interested in the main effects of CL and response retrieval, the multilevel model also allows us to introduce an interaction term for the two variables (CL × previous response). Adding the product term in a third step yields a beta that is positive and significant (t = 2.19, p = 0.029). This interaction indicates that effects of response retrieval were slightly stronger for low contingency trials, that is, responses were slowest for low contingency trials in the “different response” condition. A plausible explanation for this asymmetry is that response retrieval is not influenced only by the last occurrence of the stimulus but may sometimes also retrieve an earlier episode in which the stimulus was presented. For low contingency trials in the “different response” condition, such a retrieval of an earlier episode will retrieve a different response in 83% of these trials. For high contingency trials in the “different response” condition, only 67% of the previous occurrences of the word contained a different response, whereas 33% contained an identical response. It is thus possible that in some high contingency trials in the “different response” condition, the correct response was retrieved from an earlier episode (leading to a facilitative effect that counteracted the delay effect in the “different response” condition), even though the last occurrence of the word was paired with a different response.

Another multi-level analysis was used to evaluate the moderating effect of distance on effects of response retrieval. For this purpose, we predicted RT with the previous response factor (pr), distance (d), and their interaction (pr × d). We also added a squared term for distance (d²) and the interaction of this term with previous response (pr × d²) to allow for a non-linear decline of the influence of response retrieval with increasing distance. The full model yielded significant effects for all predictors (all ps < 0.001). The regression equation is given by the following set of parameter values: RT = 341 + 105.31 pr + 46.72 d − 2.11 d² − 25.43 (pr × d) + 1.15 (pr × d²). Rearranging this equation so that it represents the slope of pr as a function of d and d² gives: RT = 341 + (105.31 − 25.43 d + 1.15 d²) × pr + 46.72 d − 2.11 d². Setting the quadratic expression in brackets, which represents the slope for pr, to zero and solving for d yields d = 5.52, that is, the slope for response retrieval becomes zero at a distance between 5 and 6 trials.
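The zero crossing of the response retrieval slope can be checked with a short computation. The sketch below takes the three slope coefficients from the regression equation reported above and solves the quadratic with the standard formula; the function and variable names are illustrative only.

```python
import math

# Coefficients of the slope for pr, taken from the reported regression equation:
# slope(d) = 105.31 - 25.43 * d + 1.15 * d**2
B0, B1, B2 = 105.31, -25.43, 1.15


def pr_slope(d: float) -> float:
    """Slope of the previous-response predictor at distance d."""
    return B0 + B1 * d + B2 * d ** 2


def zero_crossings(b0: float, b1: float, b2: float) -> list:
    """Real roots of b0 + b1*d + b2*d**2 = 0, in ascending order."""
    discriminant = b1 ** 2 - 4 * b2 * b0
    if discriminant < 0:
        return []
    root = math.sqrt(discriminant)
    return sorted([(-b1 - root) / (2 * b2), (-b1 + root) / (2 * b2)])


roots = zero_crossings(B0, B1, B2)
print([round(r, 2) for r in roots])
```

The smaller root corresponds to the value of d = 5.52 reported in the text. The quadratic also has a second, larger root, which the text does not discuss.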

Stimulus-Response Binding and Retrieval Effects

To test for SRBR effects, we analyzed immediate sequence effects from trial n-1 to trial n (cf. Figure 1A). In these analyses, only sequences with matching contingencies were considered (see Method section for details). We performed two separate 2 × 2 × 2 repeated-measures analyses of variance (ANOVAs) with the factors stimulus relation (stimulus repetition vs. stimulus change from trial n-1 to trial n), response relation (response repetition vs. response change from trial n-1 to trial n), and type of prime-probe contingency match (both trial n-1 and trial n high contingency vs. both low contingency) on trial n performance (i.e., RTs and error rates; see Table 2 for means).

Table 2. Results for SRBR effects (probe RT and error rates) in Experiments 1 and 2.

For RTs, the ANOVA yielded significant main effects of stimulus relation, F(1,29) = 7.74, p = 0.009, η_p² = 0.21, and response relation, F(1,29) = 174.16, p < 0.001, η_p² = 0.86, indicating that RTs were faster for stimulus repetition (M = 495 ms) compared with stimulus change trials (M = 505 ms) and that probe RTs were faster for response repetitions (M = 444 ms) than for response changes (M = 556 ms). Most importantly, both effects were qualified by a significant Stimulus Relation × Response Relation interaction, F(1,29) = 39.62, p < 0.001, η_p² = 0.58, that reflected the typical pattern of SRBR effects. Follow-up tests showed that compared to stimulus change from trial n-1 to trial n, stimulus repetition significantly sped up performance by Δ_SCRR–SRRR = 29 ms, t(29) = 5.48, p < 0.001, d_z = 1.00, for response repetition. In turn, stimulus repetition (compared with stimulus change from trial n-1 to n) significantly slowed down performance by Δ_SCRC–SRRC = −10 ms, t(29) = 2.42, p = 0.022, d_z = 0.44, for response changes. No other effect was significant (all Fs < 1.06, all ps > 0.30).
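The logic of the two follow-up contrasts and of the interaction can be sketched as follows. The four cell means used here are illustrative placeholders chosen so that the differences match the reported values of 29 ms and −10 ms; they are not the means from Table 2. The function name and the cell labels (SC = stimulus change, SR = stimulus repetition, RR = response repetition, RC = response change) follow the subscripts used above but are otherwise assumptions.

```python
def srbr_contrast(scrr: float, srrr: float, scrc: float, srrc: float):
    """Return the stimulus repetition effect for response repetitions,
    the same effect for response changes, and their difference
    (the Stimulus Relation x Response Relation interaction contrast)."""
    effect_rr = scrr - srrr  # positive = repetition benefit
    effect_rc = scrc - srrc  # negative = repetition cost
    return effect_rr, effect_rc, effect_rr - effect_rc


# Placeholder cell means in ms (NOT the Table 2 values)
cells = dict(scrr=459.0, srrr=430.0, scrc=551.0, srrc=561.0)
effect_rr, effect_rc, interaction = srbr_contrast(**cells)
print(effect_rr, effect_rc, interaction)
```

With the reported differences, the interaction contrast amounts to 29 − (−10) = 39 ms, which reflects the combination of a repetition benefit for response repetitions and a repetition cost for response changes.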

For error rates, the same ANOVA yielded only a main effect of response relation, F(1,29) = 65.81, p < 0.001, η_p² = 0.69, indicating that participants made fewer errors on response repetition (M = 2.4%) than on response change sequences (M = 7.4%). No other effect was significant (all Fs < 3.2, all ps > 0.08).
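As a plausibility check, the reported effect sizes can be recovered from the test statistics with two standard conversions: partial eta squared from F and its degrees of freedom, and d_z from a paired t value and the sample size. The text does not state how the effect sizes were computed, so these formulas are an assumption, as is the sample size of n = 30, which is inferred from the 29 error degrees of freedom.

```python
import math


def partial_eta_squared(f: float, df_effect: int, df_error: int) -> float:
    """eta_p^2 = (F * df_effect) / (F * df_effect + df_error)."""
    return (f * df_effect) / (f * df_effect + df_error)


def dz_from_t(t: float, n: int) -> float:
    """d_z for a paired comparison: t / sqrt(n)."""
    return t / math.sqrt(n)


N = 30  # assumed from df = 29

# (F, reported eta_p^2) pairs from the SRBR analyses above
for f, reported in [(7.74, 0.21), (174.16, 0.86), (39.62, 0.58), (65.81, 0.69)]:
    print(f, round(partial_eta_squared(f, 1, 29), 2), reported)

# (t, reported d_z) pairs from the follow-up tests on RTs
for t, reported in [(5.48, 1.00), (2.42, 0.44)]:
    print(t, round(dz_from_t(t, N), 2), reported)
```

Under these assumptions, the computed values agree with the effect sizes reported for the SRBR analyses after rounding to two decimals.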

Discussion

The results of Experiment 1 are clear-cut: First, we obtained a CL effect, indicating that participants incidentally learned the word-color response associations over the course of the experiment. Second, we obtained robust response retrieval effects, reflecting faster RTs in the current trial when the same response had been given during the last occurrence of the word stimulus that was also presented in the current trial, compared to trials when a different response had been executed during the last occurrence. Third and most central to our research aims, the CL effect was effectively eliminated after controlling for effects of response retrieval. This pattern of findings emerged both in ANOVAs with aggregated data and in multilevel analyses in which CL and response retrieval were coded on a trial level. Importantly, effects of response retrieval were not limited to the immediately preceding trial, but were found for distances up to 5–6 trials, ruling out alternative explanations of the effect in terms of mere response repetition. For immediate stimulus repetition sequences (distance = 1), effects of response retrieval are identical to effects of response repetition, unless sequences in which the stimulus changes are used as a baseline. These analyses replicated the standard pattern of SRBR effects that has been obtained in many previous studies (Rothermund et al., 2005; see also Frings et al., 2007; Giesen and Rothermund, 2014), rendering explanations of response retrieval effects in terms of mere response repetition unlikely. Together, findings from Experiment 1 support predictions derived from the law of recency that episodic retrieval of responses from the most recent occurrence of the stimulus represents a central process underlying habit formation (i.e., learning of word-response contingencies).
Effects of global SR contingencies were completely eliminated after controlling for the influence of the most recent episode, which rules out frequency-based explanations (law of exercise) of habitual responding in the current study.

The CL effect observed in Experiment 1 was smaller than in previous studies [d_z = 0.57, reflecting a medium-sized effect according to Cohen (1969), compared with effect sizes between d_z = 0.62 and d_z = 1.24, reflecting medium-to-large to very large effects, in Schmidt et al., 2007]. In our view, this is probably due to the fact that Experiment 1 had a contingency ratio of only 2:1, which is a rather weak contingency manipulation in and of itself. It is known that the magnitude of contingency effects is proportional to the contingency (Forrin and MacLeod, 2018; see also Schmidt and De Houwer, 2016). The low contingency was chosen on purpose, since we wanted to make sure that contingencies went undetected and thus could not be applied in a strategic fashion. However, being aware of the fact that single studies pose the risk of being unreliable (Cesario, 2014; see also Tversky and Kahneman, 1971) and that replication is an increasingly important research value (Nosek et al., 2012), we ran a second experiment with the aim of replicating our initial findings from Experiment 1, but with a stronger contingency manipulation (ratio of 4:1) to boost CL effects. By increasing the contingency, we wanted to establish that the contingency effect itself is strong beyond any reasonable doubt, so that eliminating the effect by controlling for effects of response retrieval cannot be attributed to the contingency effect being unreliable in the first place. Although the contingency that was chosen in Experiment 2 is stronger than in Experiment 1, we want to emphasize that it is still much weaker than in previous studies that already demonstrated contingency effects in the absence of awareness (Schmidt et al., 2007). Furthermore, Experiment 2 again used contingencies in which one stimulus was predictive of two different responses, preventing a simple strategic use of the contingencies for response preparation.
In addition, Experiment 2 was preregistered online before any data collection started (see details below).