- ¹Vision and Cognitive Neuroscience Lab, Department of Psychology, Neuroscience, and Behaviour, McMaster University, Hamilton, ON, Canada
- ²Rotman Research Institute, Baycrest Centre for Geriatric Care, North York, ON, Canada
- ³Department of Psychology, University of Toronto, Toronto, ON, Canada
- ⁴Department of Neuroscience, University of Lethbridge, Lethbridge, AB, Canada
When the outcome of a choice is less favorable than expected, humans and animals typically shift to an alternate choice option on subsequent trials. Several lines of evidence indicate that this “lose-shift” responding is an innate sensorimotor response strategy that is normally suppressed by executive function. Therefore, the lose-shift response provides a covert gauge of cognitive control over choice mechanisms. We report here that the spatial position, rather than visual features, of choice targets drives the lose-shift effect. Furthermore, the ability to inhibit lose-shift responding to gain reward differs between male and female habitual cannabis users. Increased self-reported cannabis use was concordant with suppressed response flexibility and an increased tendency to lose-shift in women, which reduced performance in a choice task in which random responding is the optimal strategy. On the other hand, increased cannabis use in men was concordant with reduced reliance on spatial cues during decision-making, and had no impact on the number of correct responses. These data (63,600 trials from 106 participants) provide strong evidence that spatial-motor processing is an important component of economic decision-making, and that its governance by executive systems differs between men and women who use cannabis frequently.
1. Introduction
Adaptive decision-making is governed by the interaction of several brain circuits, each of which has unique aspects that are advantageous under particular circumstances. For instance, a classic distinction has been made between goal-directed control, involving the prefrontal cortex (PFC) and medial striatum, and habitual control systems comprising the sensorimotor cortex and lateral striatum (Balleine and O'Doherty, 2010; Gruber and McDonald, 2012). The goal-directed system appears to implement executive functions, such as working memory and strategic planning (Fuster, 1989; Passingham and Wise, 2012). Cannabis abuse compromises the normal ability of this executive system to suppress sensorimotor responding (Knight et al., 1999; Malone and Taylor, 2006; Filbey and Yezhuvath, 2013; Rae et al., 2015). Here we report a new dissociation between the executive and sensorimotor systems governing choice, which allows us to quantify their interaction while accounting for important confounding factors such as decision time and learning.
When rewards are uncertain, the most pervasive strategy in animals and humans is to repeat choices that have previously led to reward (win-stay), and to shift away from choices following reward omission (lose-shift; Kamil and Hunter, 1970; Worthy et al., 2013; Thorndike, 2017). Although win-stay and lose-shift are complementary response strategies, they are anatomically dissociated between goal-directed and sensorimotor systems. Lesions to the rodent lateral striatum (LS), which is homologous to the human putamen and essential for sensorimotor control (Parent and Hazrati, 1995), disrupt lose-shift responding but not win-stay (Skelin et al., 2014; Gruber et al., 2017; Thapa and Gruber, 2018). A similar shifting deficit has been observed in humans with damage to putamen or insula (Danckert et al., 2011). Conversely, lesions of the rodent ventromedial striatum (VS), a key structure in goal-directed control that receives inputs from prefrontal cortex (Voorn et al., 2004), disrupt win-stay but not lose-shift responding (Gruber et al., 2017). Several other behavioral features in rodents and humans support this anatomical dissociation. Win-stay and lose-shift exhibit different temporal dynamics (Gruber and Thapa, 2016), developmental trajectories (Ivan et al., 2018), and responses to reward feedback (Banks et al., 2018). Moreover, lose-shift responding (but not win-stay) drastically increases in adult humans under cognitive load, as well as in young children (Ivan et al., 2018). These data suggest that executive function can override lose-shift responding, which can be characterized as a reflexive response by the sensorimotor striatum. This is consistent with a long history of research indicating that executive function can suppress reflexive or habitual motor responses (Chamberlain and Sahakian, 2007).
The LS/putamen receives prominent inputs from both the somatosensory and motor cortices (Brasted et al., 1999), and encodes the motor aspects of decision-making (Burton et al., 2015). Consequently, decisions and their associated motor actions are represented in egocentric (body-centered) spatial coordinates (Kesner and DiMattia, 1987; Palencia and Ragozzino, 2005). The dorsolateral caudate, which receives inputs from the dorsolateral PFC in primates, is also necessary for egocentric spatial processing (Possin et al., 2017). The VS/nucleus accumbens encodes the value of choices and can engage a wide range of spatial-motor actions when executing a single decision involving abstract representations (Burton et al., 2015). It encodes responses in both egocentric and allocentric (world-centered) spatial coordinates, likely involving its prominent inputs from the hippocampus and PFC (Voorn et al., 2004; De Leonibus et al., 2005; Possin et al., 2017). These data suggest that the control of actions by sensorimotor systems will be restricted to an egocentric framework heavily dependent on the position of targets, whereas control by executive systems has the capacity to use abstract features of targets. This is supported by the dissociated effects of cannabis on neural structures and performance on spatial vs. non-spatial tasks, as described next.
The recreational use of psychoactive substances has complex short-term and long-term effects on the brain, some of which dissociate. Δ9-tetrahydrocannabinol (THC) administration increases dopamine release in the LS, while the VS remains unaffected (Sakurai-Yamashita et al., 1989). Behaviorally, dopamine signalling in the dorsolateral striatum is necessary for normal spatial memory, motor control, and visuospatial learning, while reward processing and goal-directed learning rely on dopamine signalling in the medial striatum (Darvas and Palmiter, 2009, 2010). This provides an explanation for the observation that THC reduces spatial processing and visuospatial memory in humans (Cha et al., 2007), particularly in females (Pope et al., 1997). Recreational drugs also differentially influence win-stay and lose-shift responding. THC and amphetamine cause large changes in lose-shift behavior in rats and humans, while win-stay is only weakly affected (Paulus et al., 2002; Wong et al., 2017a,b).
Because the LS is necessary for lose-shift responding, and it processes information in egocentric coordinates, we hypothesized that lose-shift responses are calculated according to the position of a target relative to the participant, rather than other visual features of target identity. We further hypothesized that frequent cannabis use will disrupt the normal positional dependence of lose-shift and the ability of executive systems to govern sensorimotor control. We tested these hypotheses by having human participants engage in a competitive decision-making task between two choices. Crucially, the choices were visually distinct and changed their spatial configuration unpredictably between trials. We found that following a loss, lose-shift behavior was robustly associated with a choice's previous location, rather than its visual identity. Win-stay was only weakly associated with previous choice position, and this association was eliminated by global changes in target position. Although female cannabis users exhibited reduced task performance and increased lose-shift responding, their reliance on spatial information did not differ from that of controls. Male cannabis users, however, did exhibit a reduced reliance on spatial information. These data support the dissociation of choice among systems with different spatial propensities, and reveal a sexual dimorphism in the relationship between recreational cannabis use and the function of these systems.
2. Methods
2.1. Behavioral task
During the experiment, participants played a competitive game colloquially called “Matching Pennies” against a computer opponent. The task display consisted of two distinct targets (a blue circle and yellow square) presented on a 15″ touchscreen monitor (Figure 1). On each trial, participants would choose either target by touching it on the screen. They would then receive visual feedback indicating “You Win” or “You Lose” for 1.5 s. On each trial, the computer algorithm attempted to predict which target would be selected. If the participant selected this target, the trial was a loss. Otherwise it was a win. The algorithm attempted to minimize the number of wins for participants. The optimal strategy for the participants is to be unpredictable in choice, in which case they win on 50% of trials. Because the win-stay and lose-shift are predictable by the computer, subjects should learn to suppress these responses as the session progresses. This task provides measures of lose-shift, win-stay, and cognitive flexibility (i.e., response entropy) as participants adapted their choices to the computer opponent.
 
  Figure 1. Behavioral task. (A) Timeline of trials in the matching pennies game. (B) Reconfiguration of targets, which could undergo local (swap and displace) and global changes in position.
The computer used four types of algorithms to detect patterns in (i) participants' choices, (ii) switching from one choice to another, (iii) choices paired with rewards (e.g., choosing the blue target after a loss), and (iv) switching paired with rewards (e.g., swapping choices after a loss). Specifically, on each trial the computer examined a subject's recent choice and reward history (e.g., shifting from the blue target to the yellow after a loss). The choice that most accurately predicted the subject's past choice history was selected as the prediction of the present choice. Patterns of choices 1–6 trials in length were considered for each of the four algorithm types, resulting in 24 total prediction strategies. On each trial, the best performing strategy (computed over all previous trials in the session) was used to predict participants' choices. If all strategies failed to beat the participant on ≥ 50% of past trials, the computer would select choices randomly.
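The prediction scheme described above can be illustrated with a minimal Python sketch. This is not the study's implementation: the class name `PatternOpponent`, the helper `play`, and the bookkeeping details are hypothetical, and only two of the four pattern families (choices alone, and choices paired with rewards) are included, giving 12 rather than 24 strategies. Each strategy predicts the choice that most often followed the current context, the strategy with the most correct past predictions is used, and the opponent falls back to random guessing when that strategy has been correct on fewer than 50% of past trials.

```python
import random
from collections import defaultdict

KINDS = ("choice", "choice+reward")  # two of the four pattern families (assumption)

class PatternOpponent:
    """Simplified pattern-matching opponent for Matching Pennies (illustrative only)."""

    def __init__(self, max_len=6, seed=0):
        self.max_len = max_len
        self.rng = random.Random(seed)
        self.choices, self.wins = [], []  # participant history (0/1 choices, 0/1 wins)
        # counts[strategy][context] = [times followed by choice 0, by choice 1]
        self.counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
        self.hits = defaultdict(int)      # correct predictions per strategy
        self.preds = {}                   # predictions made on the current trial

    def _strategies(self):
        return [(kind, n) for kind in KINDS for n in range(1, self.max_len + 1)]

    def _context(self, kind, n):
        if len(self.choices) < n:
            return None
        ctx = tuple(self.choices[-n:])
        return ctx if kind == "choice" else ctx + tuple(self.wins[-n:])

    def predict(self):
        self.preds = {}
        for s in self._strategies():
            ctx = self._context(*s)
            if ctx is not None:
                zeros, ones = self.counts[s][ctx]
                if zeros != ones:         # predict only when history favors one choice
                    self.preds[s] = int(ones > zeros)
        best = max(self.preds, key=lambda s: self.hits[s], default=None)
        # Guess randomly if the best strategy was right on < 50% of past trials.
        if best is None or self.hits[best] < 0.5 * len(self.choices):
            return self.rng.randint(0, 1)
        return self.preds[best]

    def update(self, choice, prediction):
        for s, p in self.preds.items():
            self.hits[s] += int(p == choice)
        for s in self._strategies():
            ctx = self._context(*s)
            if ctx is not None:
                self.counts[s][ctx][choice] += 1
        win = int(choice != prediction)   # participant wins when not predicted
        self.choices.append(choice)
        self.wins.append(win)
        return win

def play(participant_choices, seed=0):
    """Return the participant's number of wins for a fixed choice sequence."""
    opponent = PatternOpponent(seed=seed)
    return sum(opponent.update(c, opponent.predict()) for c in participant_choices)
```

Under this sketch, a strictly alternating participant (`play([t % 2 for t in range(200)])`) is detected within a few trials and loses every trial thereafter, which is why predictable strategies such as win-stay and lose-shift are costly in this task.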
The effect of cue position was investigated by moving the location of one or both cues from one trial to the next. The changes came in two types—local and global. The screen was divided into four equal quadrants, each of which contained an invisible 2 × 2 grid in its center. Local changes occurred within each grid, while global changes involved shifting among quadrants. Three local manipulations are of particular interest to investigate the importance of position and cue identity. The “Control” case is when the cue positions remain in the same locations. The “Swap” case occurs when the targets swap positions (a local change). The “Displace” condition occurs when the previously selected choice moves to a previously empty position in the same 2 × 2 choice grid, while the other target remains in its previous position. Global changes occurred independently of each of these three local changes, for a total of six possible changes of target positions across subsequent trials, which were selected randomly on each trial. This manipulation allowed us to determine how the position of targets relative to each other, and to the participants, affected choice following wins and losses. In particular, this design allows us to test if participants avoid the screen position of a target following a loss, as expected by the egocentric processing framework of LS. Alternatively, they may instead avoid the target regardless of position.
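The logic of the swap manipulation can be made concrete with a small simulation. The sketch below is illustrative only and makes simplifying assumptions: two screen positions, only the "control" and "swap" reconfigurations, random outcomes, and a hypothetical agent that always avoids the previously unrewarded position. The function names and the trial dictionary layout are hypothetical. Lose-shift is scored with respect to target identity, so a purely position-driven agent should shift targets on control trials but keep the same target on swap trials.

```python
import random
from collections import defaultdict

def simulate_position_driven_agent(n_trials=1000, seed=1):
    """Simulate an agent that avoids the screen position of its last loss."""
    rng = random.Random(seed)
    layout = ["circle", "square"]  # layout[p] = target shown at position p
    trials, prev_pos, prev_win = [], None, True
    for _ in range(n_trials):
        # Reconfiguration applied before this trial (first trial: no change).
        local = rng.choice(["control", "swap"]) if trials else "control"
        if local == "swap":
            layout.reverse()
        if prev_pos is not None and not prev_win:
            pos = 1 - prev_pos      # lose-shift in spatial coordinates
        else:
            pos = rng.randint(0, 1)
        win = rng.random() < 0.5    # outcomes are random in this sketch
        trials.append({"target": layout[pos], "win": win, "local": local})
        prev_pos, prev_win = pos, win
    return trials

def lose_shift_by_condition(trials):
    """Proportion of identity-based shifts after losses, per reconfiguration type."""
    shifts, losses = defaultdict(int), defaultdict(int)
    for prev, cur in zip(trials, trials[1:]):
        if prev["win"]:
            continue
        losses[cur["local"]] += 1
        shifts[cur["local"]] += int(cur["target"] != prev["target"])
    return {cond: shifts[cond] / losses[cond] for cond in losses}
```

For this position-driven agent, the identity-based lose-shift proportion is 1.0 on control trials and 0.0 on swap trials. An agent that avoided the losing target regardless of its position would instead show the same proportion in both conditions.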
2.2. Procedure
All procedures and experimental tasks were approved by the McMaster University Research Ethics Board. One hundred and six undergraduates (53 males, mean age = 19.40, SD = 2.74) from McMaster University participated in the study in exchange for payment. After providing informed consent, participants played 600 trials of the task. They were informed that they were competing against a computer opponent and would win nothing each time the computer predicted their choice and $0.03 each time it could not, rounded up to the nearest $5 upon completion of the experiment. Participants were given no guidance as to optimal decision-making strategies.
After task completion, participants were screened with the South Oaks Gambling Screen (SOGS), the Alcohol, Smoking, and Substance Involvement Screening Test (ASSIST) v3.0, the Adult ADHD Self-Report Scale (ASRS) v1.1, and an additional demographic questionnaire. Habitual cannabis users were defined as those meeting the criteria for brief or intensive treatment (score >3) on the ASSIST cannabis subtest. Controls were defined as those reporting zero cannabis use within the three months prior to the experiment. Total drug use was also recorded as the ASSIST score summed across all drug subtypes. Males had a mean ASSIST score of 19.11, with 32.08, 39.62, and 60.38% meeting the criteria for intervention for alcohol, cannabis, or any recreational drug use, respectively. Females had a mean ASSIST score of 16.66, with 28.30, 30.19, and 45.28% meeting the alcohol, cannabis, or general drug use criteria, respectively (see Table 1). Of the 37 subjects who met the cannabis use criteria, 18 (7 females) indicated having used cannabis once or twice in the last three months, 9 (4 females) indicated monthly usage, 4 (1 female) indicated weekly usage, and 6 (3 females) indicated daily usage.
In addition, 25 subjects (11 females) reported symptoms consistent with adult ADHD, as assessed by the ASRS. Only two males reported behavior indicative of pathological gambling. Consequently, the effects of gambling history on task performance were not assessed. Similarly, there was insufficient variability in subject age to investigate whether win-stay or lose-shift behavior changes with age.
2.3. Analysis
Participants' responses were analyzed for proportion of lose-shift and win-stay responses, averaged over five blocks of 120 trials each, and conditioned on the type of cue shift relative to the previous trial's target positions. As a measure of behavioral flexibility, the binary response entropy (H) for each participant was calculated from four-trial choice sequences as:
$$H = -\sum_{i=1}^{k} P_i \log_2 P_i \tag{1}$$
where $P_i$ is the probability of each choice sequence, and $k$ is the total number of sequences possible (i.e., 16). For example, a participant that exhibited the choice pattern “circle-square-circle-square” to the exclusion of all other patterns would have an entropy of 0 bits, while a participant responding randomly would have an entropy of four bits. Response entropy and task performance were averaged over the experimental session for each participant. Decision times were measured as the time to make a response following presentation of the choice selection screen. They were normalized using the inverse transform (1/RT) and averaged after removing 131 erroneous RTs of <3 ms. The inverse transform was used because it produced distributions that were closer to Gaussian than did the log or square-root transforms.
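As a minimal sketch of the entropy measure described above, the following Python function computes the Shannon entropy (in bits) of four-trial choice sequences. It assumes that the session is cut into non-overlapping four-trial windows, which is consistent with the 0-bit example above but is not stated explicitly in the text; the function name `response_entropy` is hypothetical.

```python
import math
from collections import Counter

def response_entropy(choices, n=4):
    """Shannon entropy (bits) of non-overlapping length-n choice sequences.

    choices: sequence of binary choices (e.g., 0 = circle, 1 = square).
    Assumption: windows do not overlap; trailing trials that do not fill
    a complete window are ignored.
    """
    windows = [tuple(choices[i:i + n]) for i in range(0, len(choices) - n + 1, n)]
    counts = Counter(windows)
    total = len(windows)
    # Sum of p * log2(1/p) over the observed sequences.
    return sum((c / total) * math.log2(total / c) for c in counts.values())
```

A strictly alternating participant produces a single repeated window and therefore an entropy of 0 bits, whereas a participant who uses all 16 possible four-trial sequences equally often reaches the maximum of 4 bits.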
The differences of marginal means of derived quantities (decision time, lose-shift, etc.) were tested by analysis of variance (for categorical factors) or covariance (for continuous factors) using repeated-measures, mixed-effects models. Each model utilized a maximal random-effects structure and was fit in R using the lme4 package (Bates et al., 2014). A maximal model ensured that variations in effects between participants, and between trial blocks within each participant, were properly controlled (Barr et al., 2013). Degrees of freedom and p-values were calculated using the Welch-Satterthwaite equation and type-III sums of squares. The effects of local changes in position were assessed via planned paired t-tests comparing the effects of spatial swaps and displacement relative to the no-change condition. For all statistical comparisons, t-tests were only employed following significant ANOVA results, with statistical significance being assessed with a firm p < 0.05 cutoff. As such, no statistical corrections for multiple comparisons were employed. Additional statistical characteristics (e.g., Cohen's d, confidence intervals) are reported in Table 2 for analyses of primary importance to our interpretation of the data.
We also used the Q-learning with forgetting (Barraclough et al., 2004) reinforcement learning model to examine the effects of cannabis use, local changes, and global changes on reward sensitivity, choice stochasticity, and learning rates. In this model the probability of selecting one of the two choices (C) on a given trial (t) is calculated according to the softmax equation (Sutton and Barto, 2018):
$$P(C_t = i) = \frac{e^{\beta Q_i(t)}}{e^{\beta Q_i(t)} + e^{\beta Q_j(t)}} \tag{2}$$
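where $Q_i$ and $Q_j$ are the values each subject assigns to choices i and j. β refers to the inverse temperature that balances exploiting known action-reward associations with exploring more of the state/action space. As such, larger values of β indicate a greater tendency to choose the most highly valued action. The values of each choice are updated from rewards (R) according to the following rules:
$$Q_i(t+1) = \begin{cases} (1-\alpha)\,Q_i(t) + \alpha\kappa_1 & \text{if } C_t = i \text{ and } R_t = 1 \\ (1-\alpha)\,Q_i(t) - \alpha\kappa_2 & \text{if } C_t = i \text{ and } R_t = 0 \\ (1-\alpha)\,Q_i(t) & \text{if } C_t \neq i \end{cases} \tag{3}$$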
where α is the learning rate for the chosen action and the forgetting rate for the unchosen action, κ1 is the strength of reinforcement from reward, and κ2 is the strength of aversion from failing to receive a reward. These three parameters were treated as stochastic variables that follow a random walk process. As such, they were free to vary throughout the experiment. Conversely, β was treated as a deterministic variable that remained fixed throughout the experiment. These parameters were fit for each subject using the VBA toolbox (Daunizeau et al., 2014).
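The model described above can be sketched in a few lines of Python. This is a simplified illustration and not the fitting procedure used in the study: it assumes the commonly used form of the forgetting Q-learning update (chosen values move toward κ1 after reward or toward −κ2 after reward omission at rate α, while unchosen values decay toward zero at the same rate), holds all parameters fixed rather than letting them follow a random walk, and does not use the VBA toolbox. All function names are hypothetical.

```python
import math

def softmax_prob(q_i, q_j, beta):
    """Probability of choosing option i over option j (two-option softmax)."""
    return 1.0 / (1.0 + math.exp(-beta * (q_i - q_j)))

def fq_update(q, choice, reward, alpha, kappa1, kappa2):
    """One forgetting-Q-learning step under the assumed update form.

    q: list of two action values; choice: index of the chosen action;
    reward: True if the trial was a win.
    """
    new_q = []
    for i, q_i in enumerate(q):
        value = (1.0 - alpha) * q_i          # both options decay at rate alpha
        if i == choice:                      # only the chosen option is reinforced
            value += alpha * (kappa1 if reward else -kappa2)
        new_q.append(value)
    return new_q

def negative_log_likelihood(choices, rewards, alpha, beta, kappa1, kappa2):
    """Negative log-likelihood of a choice sequence with fixed parameters."""
    q = [0.0, 0.0]
    nll = 0.0
    for c, r in zip(choices, rewards):
        nll -= math.log(softmax_prob(q[c], q[1 - c], beta))
        q = fq_update(q, c, r, alpha, kappa1, kappa2)
    return nll
```

With equal action values the softmax returns a choice probability of 0.5 regardless of β, and a larger κ2 relative to κ1 makes the simulated agent more prone to abandon an option after a loss, which corresponds to stronger lose-shift responding.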
To determine how local swaps, displacement, and global changes influenced RL parameters (i.e., hidden state values), we performed a Volterra decomposition of α, κ1, and κ2 values for each trial onto five basis functions (u): previous choice, outcome, local displacement, swap, and global change (relative to no change), according to (Equation 4):
Volterra modeling allows for observation of input response characteristics of non-linear systems as Volterra weights (Boyd et al., 1984). At each trial t the Volterra weight x of a given parameter is estimated from inputs u over trials t to a lag of τ (set to 32 trials) using a series of Volterra kernels ω. The first kernel ω1 represents the linear transformation of lagged input basis functions into the output, ω2 represents the effect of past inputs being dependent on other earlier inputs, and so on. These weights provide a measure of how subjects' valuation of each choice changes from baseline in response to past choices and outcomes. The benefit of Volterra modeling over analysis of raw prediction error is that the effect of current and past inputs on hidden state responses can be estimated. Inputs were also orthogonalized so that the effect of one input (e.g., local swaps) is computed independently of all other inputs (e.g., global changes). To control for trial order effects, we also detrended inputs prior to decomposition using a cubic polynomial.
3. Results
3.1. Cannabis use coincides with sensorimotor dominance of decisions in women
Each of the 106 included participants performed 600 trials of the task, for a total of 63,600 trials in the dataset. We first sought to reveal how recreational cannabis use and biological sex affected overall performance on the task. We compared the effects of sex (male, female) and habitual cannabis use on task performance via a 2 × 2 analysis of variance (ANOVA) with type-III sums of squares and a zero-sum constraint. Following statistically significant ANOVA results, t-tests were used to further elucidate differences between individual groups.
There was a significant main effect of cannabis use on task performance [F(1,102) = 4.772, p = 0.032] and a sex × cannabis use interaction [F(1,102) = 6.540, p = 0.012], while the main effect of sex was not significant (p = 0.271). Follow-up t-tests showed that cannabis use was associated with decreased task performance in females [t(51) = −3.123, p = 0.003, d = −0.934]A, while males were unaffected (p = 0.777; Figure 2A). The interaction between sex and cannabis use was reflected in similar trends in response entropy and decision times (Figures 2B,C). The effects of sex, cannabis use, and sex × cannabis use on response entropy were not significant (p > 0.055 in all cases). However, cannabis use was associated with decreased decision times in females [t(51) = −3.024, p = 0.004, d = −0.905]B, while those of men were again unaffected (p = 0.399), as indicated by a significant sex × cannabis use interaction [F(1,102) = 7.701, p = 0.007]. No main effects on decision times were present (p > 0.112 in both cases).
 
  Figure 2. Effect of recreational drug use on measures of task performance in males and females: proportion of wins (A); response entropy (B); and decision times (C). Plots show the conventional descriptive statistics: mean (diamond), median (horizontal line in the box), 25th/75th percentiles (box edges), and outliers (dots). *p < 0.05, **p < 0.01.
As expected, task performance was highly correlated with response entropy [r(104) = 0.699, p < 0.001], highly anti-correlated with mean lose-shift tendencies [r(104) = −0.605, p < 0.001], and not correlated with win-stay responding [r(104) = −0.095, p = 0.333]. Therefore, frequent cannabis use in females strongly coincides with increased lose-shift responding and decreased response times. These features are consistent with dominance of sensorimotor control in decision processes. Moreover, the tendency for increasingly stereotyped response sequences in females that frequently used cannabis further suggests a reduction in cognitive flexibility, defined here as a loss of ability to generate varied response types following many losses so as to improve task performance. In no case was task performance significantly affected by age, gambling history, or ADHD symptoms (p > 0.105 in all cases).
3.2. Spatial cues drive lose-shift and win-stay responding
The optimal strategy on the task is to simply select targets at random on each trial of the session. Deviation from this optimal strategy is revealing of neural processes guiding behavioral choice. Lose-shift responding is maladaptive in this context, but nonetheless is a prevalent strategy. We next investigated to what extent spatial and/or other visual features of targets affect the propensity for lose-shift and win-stay behaviors. The task design allows us to test if participants avoid the screen position of a target following a loss, as expected by the egocentric processing framework of LS, or if they avoid the target itself regardless of position. We compared the effects of local and global changes in target position via 2 (global change, no change) × 3 (no local change, displace, swap) mixed-effects models with repeated measures and a full random effect structure (see Section Methods). We then conducted planned comparisons of marginal means for significant effects revealed by ANOVA.
Lose-shift behavior was strongly affected by local changes in position across the 600 trials of the session parsed into five blocks [RMANOVA, F(2,117.43) = 25.643, p < 0.001]. No effects of global changes [F(1,320.75) = 0.368, p = 0.545], nor a local × global interaction [F(2,147.87) = 2.285, p = 0.105] were present. As seen in the left panel of Figure 3A, participants exhibited a high degree of lose-shift responding when choices did not move between trials, particularly in the first three blocks (360 trials). However, swapping choice positions strongly reversed their associated lose-shift probabilities [t(529) = −7.249, p < 0.001, d = −0.507]C. This is particularly evident in the first three blocks. Because we are computing shifts with respect to each target (rather than position), a lose-shift probability <0.5 on swap trials indicates that participants are selecting the same target, but in a new location. This is a lose-stay response in terms of target identity, but a lose-shift in terms of spatial position. In other words, the blue and orange lines should overlap if lose-shift is computed with respect to target identity irrespective of location. Therefore, lose-shift is based on the previous position of an unrewarded target, rather than its identity as distinguished by other features (color and shape).
 
  Figure 3. Effects of target reconfiguration and recreational drug use on reinforcement-driven behavior. (A,B) Effect of local and global changes in choice position on lose-shift and win-stay tendencies for all participants. SEM in error bars. (C,D) Box plots of the difference in lose-shift and win-stay when target positions are swapped as compared to no change. The effect of swapping targets on lose-shift is higher in women who use cannabis, but lower in men who use cannabis, than their sex-matched controls. (E) Correlation between total drug use (ASSIST) and swap effect on lose-shift. *p < 0.05.
Although participants are able to eventually suppress lose-shift responding after hundreds of trials, this only occurs in the absence of global changes in target positions (compare left and right panels of Figure 3A). The effects of local swaps remained even in the presence of global changes [t(529) = −8.056, p < 0.001, d = −0.459]D. Furthermore, the effect of local displacement was only significant in the presence of global changes [t(529) = −2.827, p = 0.005, d = −0.160]E, and reduced participants' use of the lose-shift as compared to global changes and no displacement. Global changes thus appear to immediately reduce the probability of lose-shift early in the session, but to also interfere with the ability of participants to eventually learn to suppress this sub-optimal response near the end of the session.
The same analysis was repeated for win-stay responses. As seen in Figure 3B, local changes [F(2,105.65) = 5.470, p = 0.005] and a local × global interaction [F(2,159.33) = 11.070, p < 0.001] had a significant effect on win-stay responding, while the main effect of global changes was not significant [F(1,1117.7) = 3.792, p = 0.052]. Across the entire session, both local swaps [t(529) = −6.565, p < 0.001, d = −0.438]F and displacement [t(529) = −4.388, p < 0.001, d = −0.221]G reduced win-stay responding when no global change was present. Unlike the lose-shift, win-stay behavior was initially unaffected by local changes. As trial blocks progressed, however, both win-stay and the effects of local changes increased. Global changes completely eliminated any effects of local changes throughout the session. The most parsimonious explanation is that subjects eventually learn to suppress the shift response, which reveals the stay response as the session progresses. To test if these are in competition, we computed the correlation of lose-shift and win-stay using data separated into the five trial blocks in the no change condition. Initially they were uncorrelated [r(104) = 0.008, p = 0.935]. However, as trials progressed the win-stay and lose-shift exhibited an increasingly negative correlation of [r(104) = −0.310, p = 0.001] in block 3 and [r(104) = −0.406, p < 0.001] in block 5. Therefore, competition between these strategies increases with time such that the competition is initially biased toward shift responses, but becomes biased toward stay responses as the session progresses.
Because the time between reinforcement and subsequent decisions affects lose-shifting (Gruber and Thapa, 2016; Ivan et al., 2018), we next analyzed whether changes in these response types could be explained by effects of target manipulation on decision times. Analysis of decision times (inverse transformed) indicated significant effects of local [F(2,140.14) = 9.668, p < 0.001] and global [F(1,109.81) = 29.731, p < 0.001] changes in position, and a local × global interaction [F(2,140.15) = 9.672, p < 0.001]. To further elucidate these effects, Table 3 provides decision times for each response type following local and global changes. Regardless of response type, global changes significantly increased decision times. More importantly, local swaps increased the time of lose-shift, win-shift, and win-stay responses, particularly when no global changes were present. While it could be argued that this increase is only due to the extra time needed to move to a new location, the fact that these effects are not consistent between response types indicates otherwise. Instead, this finding suggests that actions are planned prior to target presentation, and must be updated when target positions change to unexpected locations. The difference in decision time when targets are moved is on the order of 0.1 s, which is too small to account for changes in lose-shift or win-stay responding as a decay in memory of the previous reinforcement (Ivan et al., 2018).
3.3. Cannabis use modulates the lose-shift
We next analyzed the effects of cannabis use on lose-shift by a mixed-effects model testing for the effects of sex (male, female), local changes (no change, displace, swap), and cannabis use (controls, habitual users). Models were fit separately to trials with and without global changes in position, in order to simplify model interpretation. A random-intercepts-only structure was used because the full random-effects structure resulted in an over-fit model.
Following no global change, there was again a significant main effect of local changes [F(2,1476) = 43.347, p < 0.001] on lose-shift behavior. Significant local × sex [F(2,1476) = 5.711, p = 0.003], sex × cannabis [F(1,102) = 8.342, p = 0.005], and local × sex × cannabis [F(2,1476) = 13.008, p < 0.001] interactions were also present. No other effects or interactions were present (p > 0.148 in each case). As shown in Figure 4, male and female controls exhibit similar effects of positional changes on lose-shift behavior. Both exhibit a strong lose-shift tendency that is extinguished following local swaps and with experience on the task. In females, the difference between the no change and swap conditions increases with heavy cannabis use [t(263) = 2.157, p = 0.032, d = 0.289]H. This effect on lose-shift behavior may be due to an increased reliance on spatial choice cues or an increase in baseline lose-shift behavior. We find that while female cannabis users lose-shift more in the no change condition [t(263) = 2.402, p = 0.017, d = 0.321], the effect of local swaps was the same [t(263) = −0.951, p = 0.343, d = −0.127]. In other words, female cannabis users are no more reliant on spatial choice cues than are female controls. Instead, they exhibit an elevated lose-shift response at baseline (Figure 3C). Conversely, men exhibited the opposite trajectory. While male controls show a large effect of spatial swaps, this behavior is extinguished, and in some cases reversed, with elevated drug use [t(263) = 3.737, p < 0.001, d = 0.469]I. Importantly, male cannabis users exhibited decreased lose-shifting in the no change condition [t(263) = −3.489, p < 0.001, d = −0.438] and an increase following local swaps [t(263) = 2.494, p = 0.013, d = 0.313] relative to controls. Therefore, male cannabis users exhibit less reliance on spatial cues when responding after losses.
 
  Figure 4. Effect of local and global changes in choice position on lose-shift tendencies in male and female cannabis users, relative to controls. SEM in error bars.
Modulation of the lose-shift response may not be specific to cannabis alone. As seen in Figure 3E, total drug use (ASSIST score) is positively correlated with the lose-shift swap effect in men [r(51) = 0.378, p = 0.005]. In women, however, they are uncorrelated [r(51) = −0.143, p = 0.308], suggesting that cannabis use provides a more informative metric. Furthermore, ASSIST cannabis-use scores were more heavily correlated with total drug use [r(104) = 0.847, p < 0.001] than were tobacco [r(104) = 0.762, p < 0.001] or alcohol use [r(104) = 0.651, p < 0.001]. Dunn and Clark's z-test (Dunn and Clark, 1969), conducted via the R cocor package (Diedenhofen and Musch, 2015), indicated that the correlation between cannabis and total drug use was significantly greater than that of tobacco [z = 2.028, p = 0.043] or alcohol [z = 3.712, p < 0.001]. The lose-shift swap effect was also uncorrelated with subject age, gambling history, and ADHD (p > 0.227 in all cases). Therefore, in our population cannabis use is the most indicative of total drug use, while also remaining a clinically relevant classification.
Following global changes in position, there was a significant main effect of local changes [F(2,1476) = 34.300, p < 0.001] on lose-shift behavior, as well as significant local × sex [F(2,1476) = 5.633, p = 0.004] and sex × cannabis [F(1,102) = 5.764, p = 0.018] interactions. No other effects were significant (p > 0.088 in all cases).
Similar models were applied to win-stay behavior. While the effects of local changes remained significant [F(2,1578) = 22.755, p < 0.001], the win-stay was not affected by drug use, sex, or interactions with local changes in position (p > 0.062 in all cases). As seen in Figure 3D, processing of the win-stay did not differ with sex or drug use, and was not considered further. The same was true following global changes in position, where no effects were significant (p > 0.135 in all cases).
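For readers who wish to reproduce the dependent measures, the following is a minimal sketch of how win-stay and lose-shift proportions can be computed from a trial sequence. It assumes choices are coded by target position (consistent with the spatial effect reported above) and outcomes as win/loss flags; the function name and data layout are illustrative assumptions, not the authors' analysis code.

```python
from typing import Sequence, Tuple


def win_stay_lose_shift(
    choices: Sequence[int], wins: Sequence[bool]
) -> Tuple[float, float]:
    """Return (P(stay | previous win), P(shift | previous loss)).

    `choices` holds the chosen position on each trial (hypothetical coding);
    `wins` holds True for rewarded trials and False for unrewarded trials.
    """
    if len(choices) != len(wins):
        raise ValueError("choices and wins must have the same length")

    n_win = n_loss = stay_after_win = shift_after_loss = 0
    # Pair each trial (choice, outcome) with the choice made on the next trial.
    for prev_choice, prev_win, cur_choice in zip(choices, wins, choices[1:]):
        if prev_win:
            n_win += 1
            stay_after_win += int(cur_choice == prev_choice)
        else:
            n_loss += 1
            shift_after_loss += int(cur_choice != prev_choice)

    win_stay = stay_after_win / n_win if n_win else float("nan")
    lose_shift = shift_after_loss / n_loss if n_loss else float("nan")
    return win_stay, lose_shift


# Usage with a short, made-up sequence of positions and outcomes.
ws, ls = win_stay_lose_shift([0, 0, 1, 0, 0], [False, False, True, False, True])
print(f"win-stay = {ws:.3f}, lose-shift = {ls:.3f}")
```

Passing target identities instead of positions in `choices` would give the identity-based version of the same measures, which is one way to contrast the two coding schemes.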
3.4. Computational results
The results presented above demonstrate that the location of choice targets, rather than their visual identity, is more important for choice adaptation based on reinforcement in the immediately previous trial. The importance of spatial configuration is evidenced by changes in win-stay and lose-shift probabilities following manipulations of cue position. We next sought to determine how choices, cue configurations, and reinforcement affected choice over multiple trials. We therefore used a biologically-relevant computational model to determine how learning rate, reward valuation, and loss aversion affected choice. Each participant's choice behavior was analyzed with the Q-learning with forgetting (FQ) model, as described by Barraclough et al. (2004). It uses learning rate (α), inverse temperature (β), reward strength (κ1), and punishment strength (κ2) as parameters (hidden states) to estimate action values. We compared model performance against Q-learning (Q) (Sutton and Barto, 2018) and Q-learning with differential forgetting (DFQ) (Ito and Doya, 2009). The former only includes the α and β parameters, while the latter includes a second α2 parameter, in order to describe forgetting as a different process from learning. Hidden states were estimated for each subject, using the negative log-likelihood to assess model performance:
$$\mathrm{NLL} = -\sum_{i=1}^{n} \ln P(i)$$
where n is the number of trials and P(i) the probability that the model predicted the choice made by each subject on trial i. As seen in Figure 5A, both the FQ and DFQ models performed better than the Q-learning model. However, the FQ model fit no worse than DFQ (p = 0.503) while requiring one fewer parameter. Therefore, Q-learning with forgetting provided the best model of human choice in the present task. In no instance was model fit different as a result of sex or cannabis use (p > 0.154 in all cases), indicating that comparison of the parameter values is well-founded. Note that the parameters were free to vary within the session, and so take a range of values for each subject. Figures 5B–D show a logistic-like relationship between parameter values and win-stay/lose-shift response probability. When fit against a mixed effects logistic function with random effects for logistic asymptote, intercept, and slope, we found a strong relationship between reward strength (κ1) and win-stay behavior. Consequently, the asymptote [β = 0.370, F(1,30691) = 3069.663, p < 0.001], intercept [β = 0.019, F(1,30691) = 10.986, p = 0.001], and slope [β = −0.153, F(1,30691) = 237.979, p < 0.001] parameters were significant. As seen in Figure 5B, when reward strength is low, subjects win-stay at a fixed baseline of 37.0% (SD = 10.9%). However, when κ1 is high (>0.019), subjects almost exclusively win-stay. The same is true of the lose-shift. At low values of κ2 subjects exhibit a lose-stay policy. However, as κ2 increases, they reach a stable lose-shift strategy of 64.8% (SD = 10.1%). Consequently, for the lose-shift the asymptote [β = 0.648, F(1,32587) = 3712.685, p < 0.001], intercept [β = −0.033, F(1,32587) = 20.921, p < 0.001], and slope [β = 0.118, F(1,32587) = 254.602, p < 0.001] parameters were significant. Lose-shift behavior was also associated with learning rates (α, Figure 5D).
A mixed-effects model with random asymptotes indicated the asymptote [β = 0.567, F(1,32587) = 14089.972, p < 0.001], intercept [β = 0.026, F(1,32587) = 3641.362, p < 0.001], and slope [β = 0.595, F(1,32587) = 122.090, p < 0.001] of the relationship between α and lose-shifting were significant. As subjects increase the rate at which new reinforcement updates past knowledge of choice-outcome associations, they lose-shift more before reaching an asymptote of 56.7% (SD = 4.8%). Similar analysis of the relationship between win-stay and α found no relationships (p > 0.128 in all cases).
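The FQ model and its negative log-likelihood can be illustrated with a short sketch. The update rule below follows the general form of Q-learning with forgetting given by Ito and Doya (2009), in which all action values decay at rate α and the chosen action is additionally moved by ακ1 after a win or −ακ2 after a loss, with a softmax choice rule scaled by β. This is an assumption about the form of the model; parameters are held fixed across trials here, whereas the fits reported above allowed them to vary within a session. The function name, the zero initial values, and the default of two actions are hypothetical.

```python
import math
from typing import Sequence


def fq_negative_log_likelihood(
    choices: Sequence[int],
    rewards: Sequence[bool],
    alpha: float,
    beta: float,
    kappa1: float,
    kappa2: float,
    n_actions: int = 2,  # assumed number of choice targets
) -> float:
    """Negative log-likelihood of a choice sequence under a fixed-parameter FQ model."""
    q = [0.0] * n_actions  # action values, initialised to zero (assumption)
    nll = 0.0
    for choice, rewarded in zip(choices, rewards):
        # Softmax choice probabilities; subtract the max for numerical stability.
        scaled = [beta * value for value in q]
        peak = max(scaled)
        weights = [math.exp(s - peak) for s in scaled]
        p_choice = weights[choice] / sum(weights)
        nll -= math.log(p_choice)

        # Forgetting: every action value decays toward zero at rate alpha.
        q = [(1.0 - alpha) * value for value in q]
        # Learning: the chosen action is pushed up by a win, down by a loss.
        q[choice] += alpha * (kappa1 if rewarded else -kappa2)
    return nll


# Usage with made-up data: a repeatedly rewarded choice is well predicted.
print(fq_negative_log_likelihood([0, 0, 0, 0], [True] * 4, 0.5, 3.0, 1.0, 1.0))
```

In practice the four parameters would be estimated per subject by minimising this quantity with a numerical optimiser; the plain Q model corresponds to removing the decay of unchosen actions, and DFQ to using a separate decay rate for them.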
 
  Figure 5. (A) Performance of the Q-learning (Q), Q-learning with forgetting (FQ), and Q-learning with differential forgetting (DFQ) models. (B) Relationship between κ1 and win-stay behavior with curves fit to individual subjects (blue lines), the population average (black line), and averages of binned raw data (points). (C,D) Relationship between κ2 & α and lose-shift behavior. α was normalized via the logit transform prior to model fitting.
Given the relevance of the FQ model to human behavior, we next sought to quantify how hidden states changed in response to reinforcement and cue positions using Volterra decomposition. The method accounted for the past effects of wins, local displacement, swaps, and global changes on changes in α, κ1, and κ2 over the preceding n∈(1,32) trials. The effects of wins were calculated relative to those of losses, while local displacement, swaps, and global changes were calculated relative to the no change condition. Their impact on hidden states over time were tested with a mixed-effects model incorporating random intercepts and slopes for each subject. Each model was reparameterized to exclude a global intercept, but fit a separate intercept for each group (trial type). Therefore, for each model we tested whether each trial type differed from zero (null hypothesis of no effect) to determine if it had a significant impact on RL parameters.
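As an illustration of the idea behind this analysis, a first-order Volterra kernel can be approximated by regressing a trial-wise response (for example, the change in α) on an event indicator at each of the preceding lags. The sketch below is a simplified, single-subject, single-event version using ordinary least squares; it is not the authors' procedure, which tested several trial types jointly and assessed the weights with mixed-effects models. The function name and inputs are hypothetical.

```python
import numpy as np


def first_order_volterra_weights(events, response, n_lags=32):
    """Estimate one weight per lag (1..n_lags) linking past events to the response.

    events   : 1-D array, 1.0 on trials where the event occurred (e.g., a win), else 0.0
    response : 1-D array, trial-wise quantity to explain (e.g., change in a parameter)
    """
    events = np.asarray(events, dtype=float)
    response = np.asarray(response, dtype=float)
    if events.ndim != 1 or response.shape != events.shape:
        raise ValueError("events and response must be 1-D arrays of equal length")
    n_trials = events.shape[0]
    if n_trials - n_lags < n_lags + 1:
        raise ValueError("not enough trials for the requested number of lags")

    # Design matrix: column 0 is an intercept, column k is the event k trials back.
    design = np.zeros((n_trials, n_lags + 1))
    design[:, 0] = 1.0
    for lag in range(1, n_lags + 1):
        design[lag:, lag] = events[: n_trials - lag]

    # Discard the first n_lags trials, which lack a complete history.
    design = design[n_lags:]
    target = response[n_lags:]

    coefficients, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coefficients[1:]  # drop the intercept, keep the lag weights


# Usage with synthetic data: the response depends on events 1 and 2 trials back.
rng = np.random.default_rng(0)
ev = rng.integers(0, 2, size=600).astype(float)
resp = np.ones(600)
resp[1:] += 0.5 * ev[:-1]
resp[2:] += 0.2 * ev[:-2]
print(np.round(first_order_volterra_weights(ev, resp, n_lags=5), 3))
```

Plotting the returned weights against lag gives a curve of the kind described for Figure 6, where the weight at each lag indicates how strongly an event that many trials ago influences the present value.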
Initially we collapsed data over sex and cannabis use to determine what variance between subjects is explained by the model. For learning rate (α), there was a significant effect of trial type (of past trials) on the learning rate of the present trial [F(4,404.68) = 7.828, p < 0.001]. There was a significant change from baseline following local displacement [t(121.5) = 2.916, p = 0.004, d = 0.529]J or global changes [t(121.5) = 2.108, p = 0.037, d = 0.383]K, but not local swaps or winning outcomes (p > 0.091 in both cases). As seen in Figure 6A, all trial types resulted in a slight increase in learning rates relative to baseline (Volterra weight intercept >0). The plot indicates that the effect persists for a maximum of about 10 previous trials.
 
  Figure 6. Effect of wins, local, and global changes in choice position on Q-learning parameters α (A) (learning rate), κ1 (B) (reward strength), and κ2 (C) (punishment strength). The influence of wins and cue rearrangement during the previous 32 trials is estimated by Volterra decomposition, which provides a weight (loading) for each trial lag. SEM in shaded area.
There was also a significant effect of past trial type on reward strength [F(4,404.68) = 19.629, p < 0.001]. Wins [t(125.9) = −2.209, p = 0.029, d = −0.394] and local swaps [t(125.9) = −4.266, p < 0.001, d = −0.761]L both caused significant decreases in reward strength (Volterra weight intercept <0). Therefore, as multiple wins (or swaps) are experienced, future rewards become progressively less impactful on choice. Local displacement and global changes had no effect on reward strength (p > 0.461 in both cases). Punishment strength (κ2) was also affected by trial type [F(4,404.68) = 67.857, p < 0.001]. As with κ1, wins (relative to losses) decreased the strength of future punishments [t(142.4) = −8.025, p < 0.001, d = −1.345]. Consequently, losses increased the strength of future punishment, so that experiencing multiple losses would have a cumulative effect. As seen in Figures 6B,C, reward strength quickly recovered in response to wins. However, κ2 exhibited a much more prolonged change, suggesting that the effects of losses were more impactful over a longer time course. No other trial type had a significant effect on κ2 (p > 0.321 in all cases).
In sum, these data indicate that recent rewards and manipulation of choice target locations increase the learning rate. Wins reduce the sensitivity of subjects to future reward (κ1) and punishment (κ2), whereas losses increase the sensitivity. We interpret this to indicate that subjects who have been winning on recent trials persist in their long-term strategy (e.g., executive control), rather than engaging in reflexive responding strongly influenced by the immediately previous reinforcement (e.g., sensorimotor control).
We next tested whether cannabis use and sex modulated the response of reinforcement learning parameters to wins, local displacements, swaps, and global changes. We used a mixed effects model with random slopes and intercepts for each subject. In this case, a global intercept was used because we were testing differences between conditions, rather than between each group relative to the null hypothesis of no change within each condition.
Cannabis use and sex had a significant effect on the change in learning rates (α) following local displacement, as evidenced by a cannabis × sex interaction [F(1,102) = 4.748, p = 0.032], while there were no main effects of sex or cannabis use (p > 0.178). The same sex × cannabis interaction was also present in the response to global changes [F(1,102) = 7.443, p = 0.007]. However, there were no differences in the response to winning outcomes or local swaps (p > 0.108 in all cases). The immediate responses to each trial type (in the following trial, or at lag=1) are shown in Figure 7A. Males exhibit a significant increase in learning rates immediately following local displacement [t(51) = 2.325, p = 0.024, d = 0.653]M. Therefore, local displacement increases the rate at which new information updates choice value estimates. While male cannabis users also exhibited a similar increase in response to local swaps, the effect was not significant [t(51) = 1.801, p = 0.078, d = 0.506]. Conversely, learning rates fell in response to global changes for male cannabis users, relative to male controls [t(51) = −2.786, p = 0.007, d = −0.782]N. For κ1, there was a significant effect of sex on the response to displacement [F(1,102) = 4.517, p = 0.036], as seen in Figure 7B. In addition, there was a significant sex × cannabis interaction in the effect of global changes on κ1 [F(1,102) = 6.242, p = 0.014]. However, for κ2, male and female cannabis users did not differ from controls in their response to wins, local displacement, swaps, and global changes (p > 0.138 in all cases).
 
  Figure 7. Cannabis × Sex interactions on mean reinforcement learning parameter values estimated by Volterra decomposition. Learning rate α (A) and effect of wins κ1 (B) to wins, local displacement, swaps, and global changes. SEM in error bars. *p < 0.05, **p < 0.01.
In sum, male cannabis users tended to increase learning following local, but not global, changes of target positions, which is different from all other groups. Moreover, the parameter values for female cannabis users were not different from controls, suggesting that their reduced task performance is related mostly to the processing of the previous reinforcement (e.g., lose-shift) rather than effects spanning multiple trials.
4. Discussion and conclusions
The data here provide novel behavioral evidence that the lose-shift response is a product of sensorimotor systems, and that the regulation of such responding is compromised differently in men and women with high recreational cannabis use. A high proportion of lose-shift responding is sub-optimal in the present task because it is predictable, and can therefore be exploited by the computer opponent. Indeed, the propensity for lose-shift responding is negatively correlated with task performance here. Nonetheless, subjects engage this response above chance levels for several hundred trials before learning to suppress it. Similarly, humans exhibit considerable difficulty in generating random response sequences, preferring to alternate choices at a rate well above chance (Falk and Konold, 1997; Sun and Wang, 2012). Consequently, the lose-shift is likely a default strategy, consistent with previous work in humans (Ivan et al., 2018), and analogous to what has been observed in animals performing a similar task (Gruber and Thapa, 2016). As lose-shift responses eventually converge to chance levels in trials with no global changes, the probability of win-stay responses increases above chance levels. We found that this negative correlation between lose-shift and win-stay was significant and strikingly similar to that observed in rodents (Gruber and Thapa, 2016), suggesting that lose-shift and win-stay are expressed by neural systems in competition with one another.
We show here that participants overwhelmingly perform the lose-shift according to target position, rather than target identity. In other words, participants avoided the prior position of the previously chosen target when it was unrewarded. This novel observation reveals a strong spatial component to the lose-shift. These data are consistent with the notion that lose-shift is a product of sensorimotor systems. Loss-driven response shifting is reduced following lesions to sensorimotor striatum in animals (Skelin et al., 2014; Gruber et al., 2017; Thapa and Gruber, 2018) or damage to putamen/insula in humans (Danckert et al., 2011); these homologous structures are strongly involved with sensorimotor control. Moreover, decision times are lower for lose-shifting than for lose-stay responses, even when global changes in position (that require equally distant arm movements) are present. There are multiple reasons this may occur. First, the visual systems of the brain process information about spatial position independently from other object characteristics (Mishkin et al., 1983; Haxby et al., 1991). The dorsal “where” pathway processes information more quickly than the ventral “what” pathway (Goodale and Milner, 1992; Deubel et al., 1998). Second, the dorsal pathway may be used to compute actions prior to stimulus presentation. In the perceptual learning literature, activity in both the motor and visual cortices builds up prior to stimulus onset, and reflects stimulus expectation and the associated motor responses (de Lange et al., 2013). Moreover, pre-response fluctuations in beta-power motor activity are also predictive of choice alternation (i.e., lose-shift), regardless of associated motor action (Pape and Siegel, 2016). There is evidence that loops involving premotor cortex and the lateral striatum map vision and other sensory modalities into an egocentric space.
The ventral premotor cortex contains neurons that both drive motor actions and encode the locations of visual, tactile, and auditory stimuli (Fadiga et al., 2000). Consequently, they form a motor vocabulary for mapping several modalities into actions in a common egocentric space. Even when stimuli are removed, these neurons still respond to the position of remembered objects in relation to the body (Graziano and Gross, 1998). The putamen (LS in rodents) also contains these bimodal visuomotor cells (Graziano and Gross, 1996), and therefore has the capacity to mediate lose-shift from a remembered location. In the context of our study, spatial rearrangement of choice targets subverts this motor preparation, requiring choices to be recalculated following stimulus onset. This is evidenced by the increase in response times following local swaps and displacement. Interestingly, local swaps had a larger and more consistent effect on response times than did displacement, suggesting that a greater level of motor recalculation is required. Specifically, we speculate that executive control requires more time to overcome the intrinsic inhibition of a previously unrewarded action in the motor system than to select a novel action.
The influence of local and global changes in choice position also highlights the importance of egocentric and allocentric processing of space. While local changes in choice orientation modulate the lose-shift, these effects persist even when all choices are moved to a new global position relative to the observer. Conversely, the win-stay is much less affected by spatial position. Local changes do have an effect on behavior, but these are eliminated by concurrent global changes. These results highlight the importance of allocentric processing on the lose-shift. Choice positions are calculated relative to one another, allowing their associated values to be maintained across large global movements in choice position. Conversely, while processing of the win-stay is less reliant on spatial information, egocentric reference frames are more important than allocentric, where choice value is calculated relative to the subject. Consequently, local and global changes have a large effect on win-stay behavior.
In addition to driving different decision strategies, the putamen/LS and ventral striatum (VS, including the nucleus accumbens) also respond differently to psychoactive drugs. Relative to the VS, the LS exhibits a much higher density of dopamine transporters (Coulter et al., 1997), endocannabinoid receptors (Herkenham et al., 1991), opioid receptors (Benfenati et al., 1991), and alcohol-sensitive NMDA receptors (Liste et al., 1995). Consequently, THC administration temporarily increases dopamine release throughout the striatum, but particularly in the LS (Sakurai-Yamashita et al., 1989; Jentsch et al., 1998), resulting in diminished loss aversion. The effects of acute ethanol exposure are similar, though greater in the VS (Clarke et al., 2015; Vena et al., 2016). Conversely, long-term sensitization to alcohol and cannabis reduces availability of striatal dopamine receptors (Martinez et al., 2005; Budygin et al., 2007; Albrecht et al., 2013) and cannabinoid receptors (Villares, 2007), especially in the LS. Chronic exposure also inhibits the prefrontal cortex and anterior cingulate (Goldstein et al., 2007, 2009) which may result in an attenuated feedback-related negativity, an error signal generated in the cingulate necessary for behavioral adaptation following losses (Cohen and Ranganath, 2007). Instead, choice control is shifted to the LS (Everitt and Robbins, 2005, 2013; Everitt et al., 2008; Lucantonio et al., 2014). We expect this effect to impair the ability of participants to use executive control to suppress lose-shift responding by the sensorimotor systems, while having little effect on win-stay behavior. We do not have sufficient primary evidence to hypothesize how the change in receptor densities by repeated alcohol/THC exposure affects lose-shift processing within the LS and/or other components of the sensorimotor system.
In the present study, we find that self-reported level of recreational use of cannabis affects task performance, but that this differs on the basis of biological sex. Elevated cannabis use in men decreased spatial modulation of the lose-shift, possibly through dopaminergic desensitization of the LS. As seen in Figure 4, baseline lose-shifting is also reduced, falling below 50% in trial blocks 4 and 5. With this reduction, lose-shift responding after swaps increases to 63%. Therefore, either the calculation of the lose-shift is affected or its suppression by executive systems is enhanced in male cannabis users, while spatial processing remains unaffected. Conversely, female cannabis users exhibit decreased task performance, possibly due to weakened suppression of sensorimotor responding by the prefrontal cortex. Furthermore, they show a moderate and significant increase in baseline lose-shift responding [F(1,51) = 4.109, p = 0.048], revealed by a mixed effects model between female controls and cannabis users in the no change condition (Figure 3C). It thus appears that females with high cannabis use exert less executive control over sensorimotor systems in our task.
While it is tempting to describe this sex difference as a consequence of different drug effects on the brain in males and females, several alternatives are also possible. For example, a confounding factor may be present that promotes high levels of recreational drug use and also impairs sensorimotor regulation. Unfortunately, the WHO ASSIST is not sufficient to infer whether this is the case in the present study. However, it is known that females are more susceptible to drug tolerance (including cannabis) and sensitization than are males (Robinson, 1988; Wakley et al., 2014). Drug use is also comorbid with mood and anxiety disorders, particularly depression (Zilberman et al., 2003), which causes heightened loss aversion (Beevers et al., 2013). These differences are possibly due to the effects of estrogen, which enhances striatal dopamine release in response to psychoactive drugs (Becker, 1999) and alters the effects of drugs on the prefrontal cortex. Female rats with high estrogen levels exhibit dysfunction of the prefrontal cortex relative to males and low-estrogen females when exposed to dopamine-enhancing drugs (Shansky et al., 2004). Estrogen also heightens the effects of cocaine and amphetamine, causing an abnormal BOLD response in rats (Febo et al., 2005; Sárvári et al., 2014). Alcohol and cannabis consumption also increase oestradiol levels, and can inhibit testosterone production in males (Kolodny et al., 1974; Maskarinec et al., 1978; Harclerode, 1984; Purohit, 2000; Yonker et al., 2005). In males, increased estrogen and reduced testosterone levels cause declines in spatial cognitive ability (Janowsky et al., 1994). Therefore, the heightened susceptibility of the PFC to the combined effects of estrogen and drug abuse provides an explanation for why only women with high ASSIST scores show a dominance of sensorimotor control, without compromising the spatial dependence of lose-shift.
Specifically, this population had accelerated decision speeds, lower proportion of wins, and a tendency for lower entropy of response sequences. On the other hand, the lose-shift remained sensitive to swapping cue locations, which is similar to controls, but opposite of what is observed in males with high ASSIST scores.
Our analysis of behavior through a reinforcement learning framework also revealed a cannabis × sex interaction. Whereas the other analysis presented here focuses on the effects of the previous trial, the Q-learning model allowed us to examine effects that span many trials. It was the men who used cannabis who stood out in this analysis; they had increased learning when previous cues were displaced locally, and decreased learning when previous cues were switched globally. We expect such learning is part of a reinforcement learning scheme in “goal-directed” brain circuits linked more closely to executive function than sensorimotor control (Balleine and O'Doherty, 2010; Gruber and McDonald, 2012), suggesting that not only is there an enhanced suppression of sensorimotor control by executive function in male cannabis users, but that adaptation by the executive system is also different than the other groups. It is worth noting, however, that our sample (as is common in the field) was predominantly young university students, who are presumably well-educated and high functioning. We urge caution in extrapolating our findings to the general public.
The interpretation of data in this study faces several challenges besides the aforementioned limitations of the ASSIST. First, alcohol and cannabis use are highly concordant (Spearman's correlation of ρ = 0.533, p < 0.001 in our sample), and likely additive in their effects. Second, the sexually dimorphic effects observed here may be due to confounding interactions between drug use, IQ, and/or psychiatric disorders that have different prevalence among the sexes. However, the sexually dimorphic distribution of endocannabinoid receptors in the striatum and prefrontal cortex (De Fonseca et al., 1994) likely also play an important role. For instance, errors when reconstructing spatio-temporal sequences were reduced in men and increased in women following THC treatment (Makela et al., 2006). We previously reported that lose-shift is decreased by acute administration of THC in female rats (Wong et al., 2017a). It is possible that the down regulation of receptors in heavy chronic users may cause the inverse, which would be consistent with the data here.
In sum, the data presented here indicate that lose-shift responding is a useful gauge of the cognitive control over sensorimotor responding in humans, and that this control is impacted differently in men and women who use cannabis heavily. These linkages are important to consider when accounting for the impact of lose-shift responding in real-world economic decision making, such as gambling (Worthy et al., 2013; Abouzari et al., 2015), as well as in clinical/laboratory testing of cognitive flexibility with tasks, such as the Wisconsin Card Sorting Task, that involve loss-based shifting of response policies.
Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
Ethics statement
The studies involving human participants were reviewed and approved by McMaster University Research Ethics Board. The patients/participants provided their written informed consent to participate in this study.
Author contributions
PBa programmed and conducted the study, performed all statistical analyses, and wrote the first draft of the manuscript. AG conceived of and designed the study, reviewed all statistical methods, and wrote the final version of the manuscript. PBe and AS reviewed the study design, statistical methods, and edited the final version of the paper. All authors contributed to manuscript revision, read, and approved the submitted version.
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC): AG, PBe, and AS and Canada Foundation for Innovation (CFI) and Alberta Innovates - Health Solutions (AIHS): AG. Funding sources had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgments
We would like to thank Donna Waxman for helping with data collection.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
Abouzari, M., Oberg, S., Gruber, A., and Tata, M. (2015). Interactions among attention-deficit hyperactivity disorder (ADHD) and problem gambling in a probabilistic reward-learning task. Behav. Brain Res. 291, 237–243. doi: 10.1016/j.bbr.2015.05.041
Albrecht, D. S., Skosnik, P. D., Vollmer, J. M., Brumbaugh, M. S., Perry, K. M., Mock, B. H., et al. (2013). Striatal D2/D3 receptor availability is inversely correlated with cannabis consumption in chronic marijuana users. Drug Alcohol Depend. 128, 52–57. doi: 10.1016/j.drugalcdep.2012.07.016
Balleine, B. W., and O'Doherty, J. P. (2010). Human and rodent homologies in action control: corticostriatal determinants of goal-directed and habitual action. Neuropsychopharmacology 35, 48–69. doi: 10.1038/npp.2009.131
Banks, P. J., Tata, M. S., Bennett, P. J., Sekuler, A. B., and Gruber, A. J. (2018). Implicit valuation of the near-miss is dependent on outcome context. J. Gambl. Stud. 34, 181–197. doi: 10.1007/s10899-017-9705-3
Barr, D. J., Levy, R., Scheepers, C., and Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: keep it maximal. J. Mem. Lang. 68, 255–278. doi: 10.1016/j.jml.2012.11.001
Barraclough, D. J., Conroy, M. L., and Lee, D. (2004). Prefrontal cortex and decision making in a mixed-strategy game. Nat. Neurosci. 7:404. doi: 10.1038/nn1209
Bates, D., Mächler, M., Bolker, B., and Walker, S. (2014). Fitting linear mixed-effects models using lme4. arXiv preprint arXiv:1406.5823. doi: 10.18637/jss.v067.i01
Becker, J. B. (1999). Gender differences in dopaminergic function in striatum and nucleus accumbens. Pharmacol. Biochem. Behav. 64, 803–812. doi: 10.1016/S0091-3057(99)00168-9
Beevers, C. G., Worthy, D. A., Gorlick, M. A., Nix, B., Chotibut, T., and Maddox, W. T. (2013). Influence of depression symptoms on history-independent reward and punishment processing. Psychiatry Res. 207, 53–60. doi: 10.1016/j.psychres.2012.09.054
Benfenati, F., Pich, E. M., Zoli, M., Grimaldi, R., Fuxe, K., and Agnati, L. F. (1991). Changes in striatal μ and δ opioid receptors after transient forebrain ischemia: a quantitative autoradiographic study. Brain Res. 546, 171–175. doi: 10.1016/0006-8993(91)91175-Z
Boyd, S., Chua, L. O., and Desoer, C. A. (1984). Analytical foundations of Volterra series. IMA J. Math. Control Inform. 1, 243–282. doi: 10.1093/imamci/1.3.243
Brasted, P. J., Robbins, T. W., and Dunnett, S. B. (1999). Distinct roles for striatal subregions in mediating response processing revealed by focal excitotoxic lesions. Behav. Neurosci. 113:253. doi: 10.1037/0735-7044.113.2.253
Budygin, E. A., Oleson, E. B., Mathews, T. A., Läck, A. K., Diaz, M. R., McCool, B. A., et al. (2007). Effects of chronic alcohol exposure on dopamine uptake in rat nucleus accumbens and caudate putamen. Psychopharmacology 193, 495–501. doi: 10.1007/s00213-007-0812-1
Burton, A. C., Nakamura, K., and Roesch, M. R. (2015). From ventral-medial to dorsal-lateral striatum: neural correlates of reward-guided decision-making. Neurobiol. Learn. Mem. 117, 51–59. doi: 10.1016/j.nlm.2014.05.003
Cha, Y. M., Jones, K. H., Kuhn, C. M., Wilson, W. A., and Swartzwelder, H. S. (2007). Sex differences in the effects of Δ9-tetrahydrocannabinol on spatial learning in adolescent and adult rats. Behav. Pharmacol. 18, 563–569. doi: 10.1097/FBP.0b013e3282ee7b7e
Chamberlain, S. R., and Sahakian, B. J. (2007). The neuropsychiatry of impulsivity. Curr. Opin. Psychiatry 20, 255–261. doi: 10.1097/YCO.0b013e3280ba4989
Clarke, R. B., Söderpalm, B., Lotfi, A., Ericson, M., and Adermark, L. (2015). Involvement of inhibitory receptors in modulating dopamine signaling and synaptic activity following acute ethanol exposure in striatal subregions. Alcoholism 39, 2364–2374. doi: 10.1111/acer.12895
Cohen, M. X., and Ranganath, C. (2007). Reinforcement learning signals predict future decisions. J. Neurosci. 27, 371–378. doi: 10.1523/JNEUROSCI.4421-06.2007
Coulter, C. L., Happe, H. K., and Murrin, L. C. (1997). Dopamine transporter development in postnatal rat striatum: an autoradiographic study with [3H] win 35,428. Dev. Brain Res. 104, 55–62. doi: 10.1016/S0165-3806(97)00135-1
Danckert, J., Stöttinger, E., Quehl, N., and Anderson, B. (2011). Right hemisphere brain damage impairs strategy updating. Cereb. Cortex 22, 2745–2760. doi: 10.1093/cercor/bhr351
Darvas, M., and Palmiter, R. D. (2009). Restriction of dopamine signaling to the dorsolateral striatum is sufficient for many cognitive behaviors. Proc. Natl. Acad. Sci. U.S.A. 106, 14664–14669. doi: 10.1073/pnas.0907299106
Darvas, M., and Palmiter, R. D. (2010). Restricting dopaminergic signaling to either dorsolateral or medial striatum facilitates cognition. J. Neurosci. 30, 1158–1165. doi: 10.1523/JNEUROSCI.4576-09.2010
Daunizeau, J., Adam, V., and Rigoux, L. (2014). VBA: a probabilistic treatment of nonlinear models for neurobiological and behavioural data. PLoS Comput. Biol. 10:e1003441. doi: 10.1371/journal.pcbi.1003441
De Fonseca, F. R., Cebeira, M., Ramos, J., Martin, M., and Fernandez-Ruiz, J. (1994). Cannabinoid receptors in rat brain areas: sexual differences, fluctuations during estrous cycle and changes after gonadectomy and sex steroid replacement. Life Sci. 54, 159–170. doi: 10.1016/0024-3205(94)00585-0
de Lange, F. P., Rahnev, D. A., Donner, T. H., and Lau, H. (2013). Prestimulus oscillatory activity over motor cortex reflects perceptual expectations. J. Neurosci. 33, 1400–1410. doi: 10.1523/JNEUROSCI.1094-12.2013
De Leonibus, E., Oliverio, A., and Mele, A. (2005). A study on the role of the dorsal striatum and the nucleus accumbens in allocentric and egocentric spatial memory consolidation. Learn. Mem. 12, 491–503. doi: 10.1101/lm.94805
Deubel, H., Schneider, W. X., and Paprotta, I. (1998). Selective dorsal and ventral processing: evidence for a common attentional mechanism in reaching and perception. Vis. Cogn. 5, 81–107. doi: 10.1080/713756776
Diedenhofen, B., and Musch, J. (2015). cocor: A comprehensive solution for the statistical comparison of correlations. PLoS ONE 10:e0121945. doi: 10.1371/journal.pone.0121945
Dunn, O. J., and Clark, V. (1969). Correlation coefficients measured on the same individuals. J. Am. Stat. Assoc. 64, 366–377. doi: 10.1080/01621459.1969.10500981
Everitt, B. J., Belin, D., Economidou, D., Pelloux, Y., Dalley, J. W., and Robbins, T. W. (2008). Neural mechanisms underlying the vulnerability to develop compulsive drug-seeking habits and addiction. Philos. Trans. R. Soc. B Biol. Sci. 363, 3125–3135. doi: 10.1098/rstb.2008.0089
Everitt, B. J., and Robbins, T. W. (2005). Neural systems of reinforcement for drug addiction: from actions to habits to compulsion. Nat. Neurosci. 8:1481. doi: 10.1038/nn1579
Everitt, B. J., and Robbins, T. W. (2013). From the ventral to the dorsal striatum: devolving views of their roles in drug addiction. Neurosci. Biobehav. Rev. 37, 1946–1954. doi: 10.1016/j.neubiorev.2013.02.010
Fadiga, L., Fogassi, L., Gallese, V., and Rizzolatti, G. (2000). Visuomotor neurons: ambiguity of the discharge or ‘motor' perception? Int. J. Psychophysiol. 35, 165–177. doi: 10.1016/S0167-8760(99)00051-3
Falk, R., and Konold, C. (1997). Making sense of randomness: Implicit encoding as a basis for judgment. Psychol. Rev. 104:301. doi: 10.1037/0033-295X.104.2.301
Febo, M., Ferris, C. F., and Segarra, A. C. (2005). Estrogen influences cocaine-induced blood oxygen level-dependent signal changes in female rats. J. Neurosci. 25, 1132–1136. doi: 10.1523/JNEUROSCI.3801-04.2005
Filbey, F., and Yezhuvath, U. (2013). Functional connectivity in inhibitory control networks and severity of cannabis use disorder. Am. J. Drug Alcohol Abuse 39, 382–391. doi: 10.3109/00952990.2013.841710
Fuster, J. (1989). The prefrontal cortex: anatomy, physiology, and neuropsychology of the frontal lobe. Dissert. Abstracts Int. 60:255.
Goldstein, R. Z., Alia-Klein, N., Tomasi, D., Carrillo, J. H., Maloney, T., Woicik, P. A., et al. (2009). Anterior cingulate cortex hypoactivations to an emotionally salient task in cocaine addiction. Proc. Natl. Acad. Sci. U.S.A. 106, 9453–9458. doi: 10.1073/pnas.0900491106
Goldstein, R. Z., Tomasi, D., Rajaram, S., Cottone, L. A., Zhang, L., Maloney, T., et al. (2007). Role of the anterior cingulate and medial orbitofrontal cortex in processing drug cues in cocaine addiction. Neuroscience 144, 1153–1159. doi: 10.1016/j.neuroscience.2006.11.024
Goodale, M. A., and Milner, A. D. (1992). Separate visual pathways for perception and action. Trends Neurosci. 15, 20–25. doi: 10.1016/0166-2236(92)90344-8
Graziano, M. S., and Gross, C. G. (1996). “Multiple pathways for processing visual space,” in Attention and Performance XVI: Information Integration in Perception and Communication, eds I. Toshio and J. L. McClelland (Cambridge: MIT Press), 181–207.
Graziano, M. S., and Gross, C. G. (1998). Spatial maps for the control of movement. Curr. Opin. Neurobiol. 8, 195–201. doi: 10.1016/S0959-4388(98)80140-2
Gruber, A. J., and McDonald, R. J. (2012). Context, emotion, and the strategic pursuit of goals: interactions among multiple brain systems controlling motivated behavior. Front. Behav. Neurosci. 6:50. doi: 10.3389/fnbeh.2012.00050
Gruber, A. J., and Thapa, R. (2016). The memory trace supporting lose-shift responding decays rapidly after reward omission and is distinct from other learning mechanisms in rats. eNeuro 3. doi: 10.1523/ENEURO.0167-16.2016
Gruber, A. J., Thapa, R., and Randolph, S. H. (2017). Feeder approach between trials is increased by uncertainty and affects subsequent choices. eNeuro 4:ENEURO-0437. doi: 10.1523/ENEURO.0437-17.2017
Harclerode, J. (1984). Endocrine effects of marijuana in the male: preclinical studies. NIDA Res. Monogr. 44, 46–64.
Haxby, J. V., Grady, C. L., Horwitz, B., Ungerleider, L. G., Mishkin, M., Carson, R. E., et al. (1991). Dissociation of object and spatial visual processing pathways in human extrastriate cortex. Proc. Natl. Acad. Sci. U.S.A. 88, 1621–1625. doi: 10.1073/pnas.88.5.1621
Herkenham, M., Lynn, A. B., Johnson, M. R., Melvin, L. S., de Costa, B. R., and Rice, K. C. (1991). Characterization and localization of cannabinoid receptors in rat brain: a quantitative in vitro autoradiographic study. J. Neurosci. 11, 563–583. doi: 10.1523/JNEUROSCI.11-02-00563.1991
Ito, M., and Doya, K. (2009). Validation of decision-making models and analysis of decision variables in the rat basal ganglia. J. Neurosci. 29, 9861–9874. doi: 10.1523/JNEUROSCI.6157-08.2009
Ivan, V. E., Banks, P. J., Goodfellow, K., and Gruber, A. J. (2018). Lose-shift responding in humans is promoted by increased cognitive load. Front. Integr. Neurosci. 12:9. doi: 10.3389/fnint.2018.00009
Janowsky, J. S., Oviatt, S. K., and Orwoll, E. S. (1994). Testosterone influences spatial cognition in older men. Behav. Neurosci. 108:325. doi: 10.1037/0735-7044.108.2.325
Jentsch, J. D., Wise, A., Katz, Z., and Roth, R. H. (1998). α-noradrenergic receptor modulation of the phencyclidine- and δ9-tetrahydrocannabinol-induced increases in dopamine utilization in rat prefrontal cortex. Synapse 28, 21–26. doi: 10.1002/(SICI)1098-2396(199801)28:1<21::AID-SYN3>3.0.CO;2-E
Kamil, A. C., and Hunter, M. W. (1970). Performance on object-discrimination learning set by the greater hill myna (Gracula religiosa). J. Compar. Physiol. Psychol. 73:68. doi: 10.1037/h0029811
Kesner, R., and DiMattia, B. (1987). Neurobiology of an attribute model of memory. Prog. Psychobiol. Physiol. Psychol. 12, 207–277.
Knight, R. T., Staines, W. R., Swick, D., and Chao, L. L. (1999). Prefrontal cortex regulates inhibition and excitation in distributed neural networks. Acta Psychol. 101, 159–178. doi: 10.1016/S0001-6918(99)00004-9
Kolodny, R. C., Masters, W. H., Kolodner, R. M., and Toro, G. (1974). Depression of plasma testosterone levels after chronic intensive marihuana use. N. Engl. J. Med. 290, 872–874. doi: 10.1056/NEJM197404182901602
Liste, I., Rozas, G., Guerra, M., and Labandeira-Garcia, J. (1995). Cortical stimulation induces FOS expression in striatal neurons via NMDA glutamate and dopamine receptors. Brain Res. 700, 1–12. doi: 10.1016/0006-8993(95)00958-S
Lucantonio, F., Caprioli, D., and Schoenbaum, G. (2014). Transition from ‘model-based' to ‘model-free' behavioral control in addiction: involvement of the orbitofrontal cortex and dorsolateral striatum. Neuropharmacology 76, 407–415. doi: 10.1016/j.neuropharm.2013.05.033
Makela, P., Wakeley, J., Gijsman, H., Robson, P. J., Bhagwagar, Z., and Rogers, R. D. (2006). Low doses of δ-9 tetrahydrocannabinol (THC) have divergent effects on short-term spatial memory in young, healthy adults. Neuropsychopharmacology 31, 462–470. doi: 10.1038/sj.npp.1300871
Malone, D. T., and Taylor, D. A. (2006). The effect of δ9-tetrahydrocannabinol on sensorimotor gating in socially isolated rats. Behav. Brain Res. 166, 101–109. doi: 10.1016/j.bbr.2005.07.009
Martinez, D., Gil, R., Slifstein, M., Hwang, D.-R., Huang, Y., Perez, A., et al. (2005). Alcohol dependence is associated with blunted dopamine transmission in the ventral striatum. Biol. Psychiatry 58, 779–786. doi: 10.1016/j.biopsych.2005.04.044
Maskarinec, M., Shipley, G., Novotny, M., Brown, D., and Forney, R. (1978). Endocrine effects of cannabis in male rats. Toxicol. Appl. Pharmacol. 45, 617–628. doi: 10.1016/0041-008X(78)90123-0
Mishkin, M., Ungerleider, L. G., and Macko, K. A. (1983). Object vision and spatial vision: two cortical pathways. Trends Neurosci. 6, 414–417. doi: 10.1016/0166-2236(83)90190-X
Palencia, C. A., and Ragozzino, M. E. (2005). The contribution of NMDA receptors in the dorsolateral striatum to egocentric response learning. Behav. Neurosci. 119:953. doi: 10.1037/0735-7044.119.4.953
Pape, A.-A., and Siegel, M. (2016). Motor cortex activity predicts response alternation during sensorimotor decisions. Nat. Commun. 7:13098. doi: 10.1038/ncomms13098
Parent, A., and Hazrati, L.-N. (1995). Functional anatomy of the basal ganglia. I. The cortico-basal ganglia-thalamo-cortical loop. Brain Res. Rev. 20, 91–127. doi: 10.1016/0165-0173(94)00007-C
Passingham, R. E., and Wise, S. P. (2012). The Neurobiology of the Prefrontal Cortex: Anatomy, Evolution, and the Origin of Insight. Oxford: Oxford University Press. doi: 10.1093/acprof:osobl/9780199552917.001.0001
Paulus, M. P., Hozack, N. E., Zauscher, B. E., Frank, L., Brown, G. G., Braff, D. L., et al. (2002). Behavioral and functional neuroimaging evidence for prefrontal dysfunction in methamphetamine-dependent subjects. Neuropsychopharmacology 26:53. doi: 10.1016/S0893-133X(01)00334-7
Pope, H. G., Jacobs, A., Mialet, J.-P., Yurgelun-Todd, D., and Gruber, S. (1997). Evidence for a sex-specific residual effect of cannabis on visuospatial memory. Psychother. Psychosom. 66, 179–184. doi: 10.1159/000289132
Possin, K. L., Kim, H., Geschwind, M. D., Moskowitz, T., Johnson, E. T., Sharon, J. S., et al. (2017). Egocentric and allocentric visuospatial working memory in premotor Huntington's disease: a double dissociation with caudate and hippocampal volumes. Neuropsychologia 101, 57–64. doi: 10.1016/j.neuropsychologia.2017.04.022
Purohit, V. (2000). Can alcohol promote aromatization of androgens to estrogens? A review. Alcohol 22, 123–127. doi: 10.1016/S0741-8329(00)00124-5
Rae, C. L., Hughes, L. E., Anderson, M. C., and Rowe, J. B. (2015). The prefrontal cortex achieves inhibitory control by facilitating subcortical motor pathway connectivity. J. Neurosci. 35, 786–794. doi: 10.1523/JNEUROSCI.3093-13.2015
Robinson, T. E. (1988). “Stimulant drugs and stress: factors influencing individual differences in the susceptibility to sensitization,” in Sensitization of the Nervous System, eds P. W. Kalivas and C. Barnes (Caldwell, NJ: Telford Press), 145–173.
Sakurai-Yamashita, Y., Kataoka, Y., Fujiwara, M., Mine, K., and Ueki, S. (1989). δ9-tetrahydrocannabinol facilitates striatal dopaminergic transmission. Pharmacol. Biochem. Behav. 33, 397–400. doi: 10.1016/0091-3057(89)90521-2
Sárvári, M., Deli, L., Kocsis, P., Márk, L., Maász, G., Hrabovszky, E., et al. (2014). Estradiol and isotype-selective estrogen receptor agonists modulate the mesocortical dopaminergic system in gonadectomized female rats. Brain Res. 1583, 1–11. doi: 10.1016/j.brainres.2014.06.020
Shansky, R., Glavis-Bloom, C., Lerman, D., McRae, P., Benson, C., Miller, K., et al. (2004). Estrogen mediates sex differences in stress-induced prefrontal cortex dysfunction. Mol. Psychiatry 9:531. doi: 10.1038/sj.mp.4001435
Skelin, I., Hakstol, R., VanOyen, J., Mudiayi, D., Molina, L. A., Holec, V., et al. (2014). Lesions of dorsal striatum eliminate lose-switch responding but not mixed-response strategies in rats. Eur. J. Neurosci. 39, 1655–1663. doi: 10.1111/ejn.12518
Sun, Y., and Wang, H. (2012). “Perception of randomness: subjective probability of alternation,” in Proceedings of the Annual Meeting of the Cognitive Science Society, Vol. 34. Oxford. doi: 10.1017/S0140525X11000653
Thapa, R., and Gruber, A. J. (2018). Lesions of ventrolateral striatum eliminate lose-shift but not win-stay behaviour in rats. Neurobiol. Learn. Mem. 155, 446–451. doi: 10.1016/j.nlm.2018.08.022
Thorndike, E. (2017). Animal Intelligence: Experimental Studies. Oxford: Routledge. doi: 10.4324/9781351321044
Vena, A. A., Mangieri, R., and Gonzales, R. A. (2016). Regional analysis of the pharmacological effects of acute ethanol on extracellular striatal dopamine activity. Alcoholism 40, 2528–2536. doi: 10.1111/acer.13246
Villares, J. (2007). Chronic use of marijuana decreases cannabinoid receptor binding and mRNA expression in the human brain. Neuroscience 145, 323–334. doi: 10.1016/j.neuroscience.2006.11.012
Voorn, P., Vanderschuren, L. J., Groenewegen, H. J., Robbins, T. W., and Pennartz, C. M. (2004). Putting a spin on the dorsal-ventral divide of the striatum. Trends Neurosci. 27, 468–474. doi: 10.1016/j.tins.2004.06.006
Wakley, A. A., Wiley, J. L., and Craft, R. M. (2014). Sex differences in antinociceptive tolerance to delta-9-tetrahydrocannabinol in the rat. Drug Alcohol Depend. 143, 22–28. doi: 10.1016/j.drugalcdep.2014.07.029
Wong, S. A., Randolph, S. H., Ivan, V. E., and Gruber, A. J. (2017a). Acute δ-9-tetrahydrocannabinol administration in female rats attenuates immediate responses following losses but not multi-trial reinforcement learning from wins. Behav. Brain Res. 335, 136–144. doi: 10.1016/j.bbr.2017.08.009
Wong, S. A., Thapa, R., Badenhorst, C. A., Briggs, A. R., Sawada, J. A., and Gruber, A. J. (2017b). Opposing effects of acute and chronic D-amphetamine on decision-making in rats. Neuroscience 345, 218–228. doi: 10.1016/j.neuroscience.2016.04.021
Worthy, D. A., Hawthorne, M. J., and Otto, A. R. (2013). Heterogeneity of strategy use in the Iowa gambling task: a comparison of win-stay/lose-shift and reinforcement learning models. Psychon. Bull. Rev. 20, 364–371. doi: 10.3758/s13423-012-0324-9
Yonker, J. E., Nilsson, L.-G., Herlitz, A., and Anthenelli, R. (2005). Sex differences in spatial visualization and episodic memory as a function of alcohol consumption. Alcohol Alcohol. 40, 201–207. doi: 10.1093/alcalc/agh141
Keywords: cannabis, lose-shift, addiction, executive control, spatial processing, choice, habit, sex differences
Citation: Banks PJ, Bennett PJ, Sekuler AB and Gruber AJ (2022) Cannabis use is associated with sexually dimorphic changes in executive control of visuospatial decision-making. Front. Integr. Neurosci. 16:884080. doi: 10.3389/fnint.2022.884080
Received: 25 February 2022; Accepted: 25 July 2022; Published: 23 August 2022.
Edited by:
Thomas W. James, Indiana University Bloomington, United States
Reviewed by:
Yavor Yalachkov, University Hospital Frankfurt, Germany
James Danckert, University of Waterloo, Canada
Ben Dyson, University of Alberta, Canada
Copyright © 2022 Banks, Bennett, Sekuler and Gruber. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Aaron J. Gruber, aaron.gruber@uleth.ca