- Department of Functional Brain Imaging, National Institute of Radiological Sciences, National Institutes for Quantum and Radiological Science and Technology, Chiba, Japan
Humans and animals show diverse preferences for risks (“trait-like” risk attitude) and shift their preference depending on the state or current needs (“state-dependent” risk attitude). A better understanding of the neural mechanisms underlying risk-sensitive decisions requires useful animal models. Here we examined the risk attitude of three male monkeys in a single-option response task, in which an instrumental lever-release was required to obtain a chance of reward. In each trial, a reward condition, either deterministic (100% of 1, 2, 3, or 4 drops of juice) or probabilistic (25, 50, 75, or 100% of 4 drops of juice), was randomly selected and signaled by a unique visual cue, allowing the monkeys to evaluate the forthcoming reward. The subjective value of the reward was inferred from their performance. Model-based analysis incorporating known economic models revealed non-linear probability distortion in monkeys; unlike in previous studies, they showed a simple convex or concave probability distortion curve. The direction of risk preference was consistent between early and late phases of the testing period, suggesting that our observation reflected the trait-like risk attitude of monkeys, at least under the current experimental setting. Regardless of the baseline risk preference, all monkeys showed an enhancement of risk preference within a session as the satiation level increased (i.e., state-dependent risk attitude). Our results suggest that, without choice or cognitive demand, monkeys show a naturalistic risk attitude that is diverse and flexible, like that of humans. Our novel approach may provide a useful animal model of risk-sensitive decisions, facilitating the investigation of the neural mechanisms of decision-making under risk.
Introduction
In an uncertain environment, one’s preference toward risk biases one’s decisions. Imagine that your friend encouraged you to buy an unlisted stock of a business venture. If you are a conservative person, you may pass on the opportunity to avoid the risk (i.e., risk-averse). However, if you are an adventurous person, you may buy the stock regardless of the risk (i.e., risk-prone). As such, inherent individual risk preference is diverse and determines the basic tendency to take (or not to take) a risky option (“trait-like” risk attitude) (Weber et al., 2002; Huettel et al., 2006; Tobler et al., 2008). In addition, the risk attitude is changeable depending on internal contexts; if you need to make money right away, you may buy the risky stock irrespective of your character (“state-dependent” risk attitude) (Caraco et al., 1980; Stephens and Krebs, 1986; McNamara and Houston, 1992).
Past studies measured the risk preference of human subjects in economic tasks, in which subjects repeatedly chose between a risky option and a safe option, and mathematical models have been proposed to capture these choices. The most influential model, prospect theory, assumes a distortion of probabilities and provides a better explanation of the non-normative choice pattern of human subjects than expected utility theory does (Kahneman, 1979; Tversky and Kahneman, 1992; Prelec, 1998; Gonzalez and Wu, 1999). Calculation of the subjective value based on distorted probability is conceptually analogous to the approach of finance theory, which calculates the subjective value with the mean–variance model (Markowitz, 1952; Levy and Markowitz, 1979; Tobler et al., 2009). These studies revealed various risk preferences of human subjects and further facilitated research into the neural correlates of trait-like risk attitude by coupling with brain imaging techniques (Tom et al., 2007; Takahashi et al., 2010; Gilaie-Dotan et al., 2014). Such an economic approach has also been applied in some animal studies using a liquid reward in place of a monetary reward, and these studies consistently reported non-linear probability distortion in monkeys, as in humans (Stauffer et al., 2015; Chen and Stuphorn, 2018).
Although economic approaches have begun to elucidate the mechanisms of risk-sensitive decisions across species, direct application of economic tasks to animals may pose limitations; for example, the cognitive capacity (e.g., working memory) of animals is not comparable to that of humans, but is largely limited to adaptation to their ecological niche (Krebs et al., 1977; Stevens et al., 2005; Elmore et al., 2011). Such disparity may impose extra task demands on animals even in physically identical task settings (Pearson et al., 2010; Blanchard et al., 2013). Another problem is that making repeated choices among available options is an unfamiliar setting for animals considering their feeding ecology, in which they typically make a cost–benefit decision on a single prey (i.e., non-choice decisions) (Krebs et al., 1977; Kacelnik et al., 2011; Hayden and Walton, 2014). As recently suggested, such non-choice decisions recruit brain circuits distinct from those for two-option choices (Kolling et al., 2012; Shenhav et al., 2016). Moreover, some studies using human subjects emphasized that humans showed distorted risk preference in tasks without choice (Tobler et al., 2008; Levy et al., 2011). Hence, from an ethological perspective, it is worthwhile to test the risk preference of monkeys in a non-choice decision paradigm.
In this study, we aimed to assess the naturalistic risk attitude of monkeys by minimizing undesirable task demands. We adopted a non-choice, instrumental lever-release task, in which a visual cue revealed the size and probability of forthcoming reward condition as being either deterministic or probabilistic. The basic setting of this task was shown to be useful for inferring monkeys’ evaluation of a certain reward value (e.g., reward size) based on their performance (Minamimoto et al., 2009). The inference has been formulated and applied in many studies (Bouret and Richmond, 2015; Eldridge et al., 2016; Nagai et al., 2016; Fujimoto et al., 2019), and can be extended to temporal discounting and workload discounting using the same basic task structure (Minamimoto et al., 2009, 2012). Here, we implemented well-known economic models to assess the trait-like and state-dependent risk attitude of monkeys in a quantitative manner (Stauffer et al., 2015; Chen and Stuphorn, 2018). Our results may fill the gap between human and monkey studies using economic tasks, thus providing a useful animal model to investigate the neural basis of risk-sensitive decision-making.
Materials and Methods
Subjects
Three male macaque monkeys (Macaca mulatta, monkeys ST and KY, 5.3 kg and 6.8 kg; Macaca fuscata, monkey HI, 7.6 kg) were used. All experimental procedures were approved by the Animal Care and Use Committee of the National Institutes for Quantum and Radiological Science and Technology and were in accordance with the guidelines published in the NIH Guide for the Care and Use of Laboratory Animals.
Behavioral Task
The monkeys squatted on a primate chair inside a dark, sound-attenuated, and electrically shielded room. A touch-sensitive lever was mounted on the chair. Visual stimuli were displayed on a computer video monitor in front of the animal. Behavioral control and data acquisition were performed using a real-time experimentation system (REX) (Hays et al., 1982). Presentation software was used to display visual stimuli (Neurobehavioral Systems Inc., Berkeley, CA, United States).
The monkeys performed the single-option response task (Figure 1A). In each trial, the monkey had the same requirement to obtain liquid rewards. A trial began when a monkey gripped a lever. A visual cue and a red spot appeared sequentially, with a 0.4 s interval, at the center of the monitor. After a variable interval (0.5–1.5 s), the central spot turned green (“go” signal), and the monkey had to release the lever within the reaction time (RT) window (0.2–1.0 s). If the monkey released the lever correctly, the spot turned blue (0.2–0.4 s), and then a reward was delivered in accordance with the visual cue. The next trial began following an inter-trial interval (ITI, 1.5 s). When a trial was performed incorrectly, it was terminated immediately (all visual stimuli disappeared), and the next trial began with the same reward condition following the ITI. There were two types of errors: premature lever releases (lever releases before or no later than 0.2 s after the appearance of the go signal, named “early errors”) and failures to release the lever within 1.0 s after the appearance of the go signal (named “late errors”).
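The trial outcome rules above can be sketched as a small classifier. This is an illustrative sketch only: the function name, the use of `None` for a lever that was never released, and the treatment of a release at exactly 0.2 s as an early error (following the "no later than 0.2 s" wording) are assumptions, not details taken from the task-control software.

```python
# Illustrative sketch of the outcome rules described in the text.
# Names and boundary handling are assumptions made for this example.

EARLY_LIMIT_S = 0.2  # releases at or before this time count as early errors
LATE_LIMIT_S = 1.0   # releases after this time count as late errors


def classify_trial(release_time_s):
    """Classify a trial from the lever-release time relative to the go signal.

    release_time_s: seconds from go-signal onset to lever release
        (negative if the lever was released before the go signal,
        None if the lever was never released).
    """
    if release_time_s is None or release_time_s > LATE_LIMIT_S:
        return "late_error"
    if release_time_s <= EARLY_LIMIT_S:
        return "early_error"
    return "correct"
```

For example, `classify_trial(0.45)` falls inside the RT window and is treated as a correct trial, whereas `classify_trial(-0.1)` is a premature release.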
Figure 1. Single-option response task. (A) Sequence of a trial. (B) Cue sets. Left: cue stimuli that predict deterministic reward delivery (deterministic trials). Right: cue stimuli that predict probabilistic reward delivery (probabilistic trials). (C) An example of trial series. Deterministic and probabilistic trials were intermingled in a session.
The combination of reward size and probability was indicated by the visual cue (grayscale images) at the beginning of each trial; four cues were used for the deterministic trials and the other four for the probabilistic trials (Figure 1B). In the deterministic trials, the size of the reward (1, 2, 3, or 4 drops) was chosen randomly, and the reward probability was fixed at 100%. In the probabilistic trials, the size of the reward was fixed at 4 drops and the probability of the reward (25, 50, 75, or 100%) was chosen randomly. Thus, the expected value was matched across the two conditions. The training schedule was as follows. Prior to the experiment with the single-option response task, all monkeys had been trained to perform color discrimination trials in a cued multi-trial reward schedule task for >1 month. Next, the monkeys were trained in the deterministic trials for 3 weeks and subsequently in the probabilistic trials for 3 weeks (“separate” phase). Finally, the monkeys were tested under the condition in which deterministic and probabilistic trials were intermingled, and the test ran for >6 weeks (“mixed” phase; Figure 1C). The data obtained during the mixed phase (43, 53, and 41 sessions for monkeys ST, KY, and HI, respectively) were analyzed in the current study. The number of trials in a session was 1,338 ± 79 for monkey ST, 1,206 ± 300 for monkey KY, and 1,384 ± 109 for monkey HI, and the amount of reward intake in a session was 325 ± 20 ml for monkey ST, 286 ± 75 ml for monkey KY, and 327 ± 38 ml for monkey HI (mean ± SD).
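The matching of expected value across the two trial types can be checked with a few lines of arithmetic. The sketch below uses only the reward sizes and probabilities given above; the variable and function names are introduced here for illustration.

```python
# Expected value (in drops) = reward size x reward probability.
# Conditions are taken from the task description; names are illustrative.

deterministic = [(size, 1.0) for size in (1, 2, 3, 4)]       # (drops, probability)
probabilistic = [(4, p) for p in (0.25, 0.50, 0.75, 1.00)]   # (drops, probability)


def expected_value(size_drops, probability):
    """Expected number of drops for one reward condition."""
    return size_drops * probability


deterministic_ev = [expected_value(s, p) for s, p in deterministic]
probabilistic_ev = [expected_value(s, p) for s, p in probabilistic]
```

Both lists come out as 1, 2, 3, and 4 drops, so each probabilistic cue has a deterministic counterpart with the same expected value.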
Experimental Design and Statistical Analysis
All statistical analyses and model fitting were performed using R statistical software. We analyzed the error rate and RT. The error rate was calculated by dividing the total number of errors (the sum of early and late errors) by the total number of trials in a session. We report the average error rate across sessions and the standard error of the mean (SEM). RT was defined as the duration from the “go” signal to the time point of lever release in a correct trial.
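The error-rate measure and its across-session summary can be sketched as follows. The analyses in the study were run in R; this Python version is only an illustration, the function names are assumptions, and the SEM is computed here from the sample standard deviation (n − 1 denominator), which is also an assumption.

```python
import math
import statistics


def session_error_rate(n_early_errors, n_late_errors, n_trials):
    """Error rate of one session: (early + late errors) / total trials."""
    return (n_early_errors + n_late_errors) / n_trials


def mean_and_sem(session_rates):
    """Mean and standard error of the mean across sessions.

    Assumes at least two sessions and uses the sample standard deviation.
    """
    mean = statistics.mean(session_rates)
    sem = statistics.stdev(session_rates) / math.sqrt(len(session_rates))
    return mean, sem
```

A session with 30 early errors and 20 late errors in 1,000 trials would thus have an error rate of 0.05.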
As previously shown, the error rate in the same paradigm with deterministic reward has an inverse relationship to the subjective value (inverse function, Minamimoto et al., 2009). To infer the subjective reward value in each monkey, we used a modified version of the inverse function:
where E and V represent the error rate and the subjective value, respectively, and c and b are free parameters that represent the reward sensitivity of monkeys. We confirmed that this model fitted the error rates in deterministic trials of the training session well, where V corresponded to the reward size (1, 2, 3, and 4 drops; R² > 0.86). We extended this model to infer the subjective reward value in probabilistic trials using three models: the GW, Prelec, and mean–variance models (see below). For each monkey, parameters c and b were first determined from the best fit of the inverse function (Eq. 1) to the error rate in the deterministic trials. These parameters were then held fixed in Eq. (1), with one of the three subjective-value models substituted for V, and the resulting function was fitted to the error rates in the probabilistic trials.
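The two-stage procedure described above can be sketched in code. Several things here are assumptions made purely for illustration: the form of the inverse function is taken to be E = c / V + b, the probabilistic stage uses the one-parameter weighting w(p) = δp / (δp + 1 − p) with V = 4·w(p), and a simple grid search stands in for the R "optim" routine used in the study. All function names are hypothetical.

```python
# Two-stage fitting sketch. ASSUMPTIONS (not confirmed by the text):
#   - inverse function of the form E = c / V + b
#   - one-parameter weighting w(p) = delta*p / (delta*p + 1 - p)
#   - least-squares criterion and grid search instead of R's optim()


def fit_inverse(values, error_rates):
    """Stage 1: least-squares fit of E = c / V + b (linear in x = 1 / V)."""
    xs = [1.0 / v for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_e = sum(error_rates) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxe = sum((x - mean_x) * (e - mean_e) for x, e in zip(xs, error_rates))
    c = sxe / sxx
    b = mean_e - c * mean_x
    return c, b


def weight(p, delta):
    """Assumed one-parameter probability weighting (gamma fixed at 1)."""
    return delta * p / (delta * p + (1.0 - p))


def predicted_error(p, delta, c, b, magnitude=4.0):
    """Error rate predicted for a probabilistic trial under the assumptions."""
    return c / (magnitude * weight(p, delta)) + b


def fit_delta(probs, error_rates, c, b, lo=0.05, hi=20.0, steps=4000):
    """Stage 2: grid search (log-spaced) for delta with c and b held fixed."""
    best_delta, best_sse = lo, float("inf")
    for i in range(steps + 1):
        delta = lo * (hi / lo) ** (i / steps)
        sse = sum((predicted_error(p, delta, c, b) - e) ** 2
                  for p, e in zip(probs, error_rates))
        if sse < best_sse:
            best_delta, best_sse = delta, sse
    return best_delta
```

In use, `fit_inverse` would be applied to the error rates of the 1-, 2-, 3-, and 4-drop trials, and the returned c and b would then be passed unchanged to `fit_delta` together with the error rates of the 25, 50, 75, and 100% trials.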
GW Model
According to Gonzalez and Wu (1999), probability weighting function, w(p), was formulated as below:
where p represents the probability of winning a reward (25, 50, 75, and 100%), and γ and δ are free parameters that control the curvature and elevation of the function, respectively. This model yields a non-linear probability-weighting function; when γ = 1, the function has no inflection point and is simply concave (δ > 1) or convex (δ < 1). Subjective value V was then calculated by multiplying the reward magnitude m (4 drops) by the subjective probability w(p), in accordance with prospect theory (Kahneman, 1979; Tversky and Kahneman, 1992).
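For reference, the two-parameter weighting function of Gonzalez and Wu (1999) is commonly written in the linear-in-log-odds form below, using the symbols defined above. The second line restates the multiplication of reward magnitude and subjective probability described in the text; identifying these expressions with Eqs (2) and (3) is an inference from the in-text references.

```latex
w(p) = \frac{\delta\, p^{\gamma}}{\delta\, p^{\gamma} + (1 - p)^{\gamma}}

V = m \cdot w(p)
```

With γ = 1 the first expression reduces to w(p) = δp / (δp + 1 − p), which lies above the identity line for δ > 1 and below it for δ < 1.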
Prelec Model
According to Prelec (1998), the probability weighting function was formulated as below:
where α and β are free parameters that control the curvature and elevation of the function, respectively. For the one-parameter Prelec model, β is fixed at 1; this function yields an inverted S-shape when α < 1 and an S-shape when α > 1, with the inflection point and the fixed point (w(p) = p) at p = 1/e. We defined the subjective values with Eqs (3) and (4).
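The behavior of the Prelec function can be checked numerically. The sketch below assumes the standard two-parameter form w(p) = exp(−β(−ln p)^α) from Prelec (1998); the function name is introduced here for illustration.

```python
import math


def prelec_weight(p, alpha, beta=1.0):
    """Assumed standard Prelec form: w(p) = exp(-beta * (-ln p) ** alpha).

    Valid for 0 < p <= 1. math.log(1 / p) equals -ln(p) and avoids a
    negative zero at p = 1.
    """
    return math.exp(-beta * math.log(1.0 / p) ** alpha)


# With beta = 1, alpha < 1 overweights low and underweights high
# probabilities (inverted S-shape); alpha > 1 does the opposite (S-shape).
inverted_s = [prelec_weight(p, alpha=0.5) for p in (0.25, 0.75)]
s_shaped = [prelec_weight(p, alpha=2.0) for p in (0.25, 0.75)]
```

Whatever the value of α, the one-parameter function passes through the point p = w(p) = 1/e, which is why the crossing with the identity line does not move as the curvature changes.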
Mean–Variance Model
According to financial theory, the subjective value is determined by combining the expected value (EV) and variance risk (Var) (Markowitz, 1952; Levy and Markowitz, 1979). First, EV and Var are calculated as follows:
Then, the subjective value is defined as:
where ε is a free parameter that weights the contribution of the variance risk to the subjective value (a bonus when positive).
The models were fitted using the “optim” function implemented in R. The standard error of each estimated parameter was calculated from the Hessian matrix at the optimum. The goodness of fit was assessed with the R² value and the Akaike Information Criterion (AIC) (Akaike, 1973), which is calculated as follows:
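The mean–variance valuation can be sketched for the gambles used in this task. The code assumes that each probabilistic trial is a two-outcome gamble (m drops with probability p, nothing otherwise), so that EV = m·p and Var = m²·p·(1 − p), and that the subjective value is V = EV + ε·Var; these forms and the function name are assumptions consistent with the description above rather than a reproduction of the original equations.

```python
def mean_variance_value(magnitude, probability, epsilon):
    """Subjective value under an assumed mean-variance model.

    EV  = m * p                (expected value of the gamble)
    Var = m**2 * p * (1 - p)   (variance of a two-outcome gamble)
    V   = EV + epsilon * Var   (epsilon > 0 adds a bonus for risk)
    """
    ev = magnitude * probability
    var = magnitude ** 2 * probability * (1.0 - probability)
    return ev + epsilon * var
```

Under these assumptions the variance is largest at p = 0.5 and vanishes at p = 1, so a positive ε raises the value of the 50% cue the most and leaves the 100% cue unchanged.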
where L is the maximum likelihood of the model and k is the number of free parameters in the model. Smaller AIC values indicated a better model fit to the data. A likelihood ratio test was used to compare GW models. The p-value was obtained by the parametric bootstrapping method (n = 10,000).
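The Akaike criterion has a standard definition, written here with the symbols defined above (L, the maximized likelihood, and k, the number of free parameters):

```latex
\mathrm{AIC} = -2 \ln L + 2k
```

Because each additional free parameter adds 2 to the criterion, a model with more parameters is preferred only if it improves the log-likelihood by more than 1 per added parameter.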
The effect of the satiation level on risk attitude was assessed using a measure of accumulated reward level (Minamimoto et al., 2009). Satiation level (S) was defined as the normalized liquid intake that is the ratio between the amount of total reward delivered up to time t, Rcum(t), and the total amount of reward delivered in the entire session, RcumMax:
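The satiation measure can be sketched as a running ratio, S(t) = Rcum(t) / RcumMax. In the sketch below, the function name is hypothetical, and counting the reward of the current trial as already delivered is an assumption about how "up to time t" is read.

```python
def satiation_levels(reward_per_trial_ml):
    """Satiation level after each trial: cumulative reward / session total.

    reward_per_trial_ml: reward delivered on each trial of one session
        (0 for unrewarded or error trials). Assumes a non-zero total.
    """
    total = sum(reward_per_trial_ml)
    levels = []
    cumulative = 0.0
    for reward in reward_per_trial_ml:
        cumulative += reward  # include the current trial (assumption)
        levels.append(cumulative / total)
    return levels
```

By construction the measure rises monotonically from near 0 at the start of a session to exactly 1 at its end, regardless of how much the monkey drank in total.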
The effect of the history of previous reward was also assessed by logistic regression analysis:
where P is the performance (i.e., correct or error), R is the reward size, S is the satiation level, PR is the reward size in the previous trial, β are the regression coefficients, and e is a constant.
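A logistic regression with these three regressors is conventionally written in logit form. The expression below is a sketch under that assumption: P is read as the probability of a correct trial, and the numeric subscripts on β are labels introduced here for illustration.

```latex
\ln \frac{P}{1 - P} = \beta_{1} R + \beta_{2} S + \beta_{3} PR + e
```

In this form, a positive coefficient means that larger values of the regressor increase the odds of a correct trial, with the other regressors held constant.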
Results
Risk Preference in Three Monkeys
The error rate and RT were the two main behavioral measures of the monkeys’ valuation in the current task; the higher the expected reward value, the fewer errors the subjects make and the faster they respond (Minamimoto et al., 2009; Nagai et al., 2016; Fujimoto et al., 2019). We first compared the overall error rate and RT between deterministic (1, 2, or 3 drops) and probabilistic trials (25, 50, or 75%) in each session. For this analysis, we excluded the trials in which the expected value was 4 drops (and the probability was 100%) to focus on the effect of risk. Although expected values were equivalent between the two trial types, the motivation of monkey ST appeared to be higher in probabilistic trials; the overall error rate in the deterministic trials was significantly higher than that in the probabilistic trials (n = 43, p < 0.01, rank-sum test; Figure 2A, left), and RT in the deterministic trials was significantly longer than in the probabilistic trials (n = 43, p < 0.01, rank-sum test; Figure 2B, left). These results indicated a risk-prone tendency of this monkey, which was consistent across sessions. Monkey KY also showed a risk-prone tendency; the error rate was significantly higher and RT significantly longer in the deterministic trials (error rate, n = 53, p = 0.049; RT, n = 53, p < 0.01; Figures 2A,B, middle column). Monkey HI, on the other hand, displayed the opposite pattern; the error rate tended to be higher and RT was longer in the probabilistic trials, although only the RT difference reached significance (error rate, n = 41, p = 0.54; RT, n = 41, p < 0.01; Figures 2A,B, right column), indicating a risk-averse tendency of this monkey. These results demonstrate that our task allowed us to characterize the individual risk preference of monkeys as a consistent behavioral bias across sessions, which was not uniform across the monkeys examined.
Figure 2. Risk-induced behavioral bias. (A) Error rate of each session. Error rates in probabilistic trials (abscissa) and in deterministic trials (ordinate) in each session are plotted for monkeys ST (left), KY (center), and HI (right). The plots in the blue shaded areas indicate risk-prone sessions. Histograms on the right shoulder of panels show the distribution of the distance between each plot and the identity line. Red lines indicate the average of the distance. Asterisks indicate significant difference from zero (**p < 0.01, *p < 0.05, rank-sum test). (B) Reaction time (RT) of each session. Schemas of the figures are the same as in A.
As we reported previously, the error rate in the deterministic trials varied depending on the reward size, with higher error rates for smaller rewards (Figure 3, plots in red), and this relation was well explained by an inverse function (Eq. 1, R² > 0.80) (Minamimoto et al., 2009; Nagai et al., 2016). The error rates in the probabilistic trials also reflected the expected value of reward; however, they were lower (monkeys ST and KY) or higher (monkey HI) than those in deterministic trials for the corresponding expected value (Figure 3, plots in blue). Three-way ANOVA (expected value: 1, 2, 3, and 4 drops × trial type: deterministic or probabilistic × monkey) revealed a significant main effect of the expected value [F(1,1088) = 39.6, p < 0.01] and a significant interaction of the trial type and monkey [F(1,1088) = 4.9, p = 0.027], suggesting the effects of reward expectation and individual risk preference on the subjective valuation of probabilistic rewards.
Figure 3. Change of error rate by reward-expected value and risk. Error rates (mean ± SEM) in deterministic (red) and probabilistic trials (blue) are plotted as a function of expected values for monkeys ST (left), KY (center), and HI (right). The best-fit inverse function (red) is superimposed on the plots (ST: c = 7.3, b = −1.8; KY: c = 26.9, b = 2.3; HI: c = 41.9, b = 5.6) with the goodness of fit (R²) on each panel.
Simulations With Parsimonious Models
To describe the relationship between error rate and reward probability, we used a modified version of the inverse function with the subjective value of probabilistic reward (i.e., subjective-value model). To estimate the subjective valuation of monkeys, we employed the probability-weighting function developed by Gonzalez and Wu (1999) (“GW model,” Eq. 2), a prospect-theory model that is widely used to describe non-linear probability distortion measured in economic tasks. Because both probabilistic and deterministic trials were tested in the same sessions, we used the same monkey-specific parameters c and b in the inverse functions to explain the error rates in two trial types (see the section “Materials and Methods”).
The GW model has two free parameters, γ and δ, which control the curvature and elevation of the function, respectively. First, we simulated how each parameter modifies the probability-weighting function and the error rate by using parsimonious models (“partial GW models”), each of which incorporates one free parameter. When γ in the GW model was fixed [GW (δ| γ = 1)], the probability-weighting function became concave when δ > 1, while it became convex when δ < 1 (Figure 4A). The error rate in the probabilistic trials then simply fell or rose, respectively, compared to that in the deterministic trials (Figure 4B). When δ in the GW model was fixed [GW (γ| δ = 1)], on the other hand, the function became S-shaped when γ > 1, while it became inverted S-shaped when γ < 1 (Figure 4C). Under this condition, the error rates in the two trial types crossed each other; when γ < 1, for instance, the error rate in 25% trials was lower than in 1-drop trials and that in 75% trials was higher than in 3-drop trials (Figure 4D). Because the data demonstrated a simple reduction (monkeys ST and KY) or elevation (monkey HI) of error rate by imposing risk (Figure 3), the simulation suggests that the partial GW model with fixed γ [GW (δ| γ = 1)] may explain the probability distortion of monkeys.
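The qualitative shapes described in this simulation can be reproduced with a short sketch of the weighting function alone. It assumes the standard Gonzalez and Wu form w(p) = δp^γ / (δp^γ + (1 − p)^γ); the parameter values 0.5 and 2 and all names are chosen here for illustration and are not the values used for Figure 4.

```python
def gw_weight(p, gamma, delta):
    """Assumed Gonzalez-Wu weighting: delta*p^gamma / (delta*p^gamma + (1-p)^gamma)."""
    num = delta * p ** gamma
    return num / (num + (1.0 - p) ** gamma)


PROBS = (0.25, 0.50, 0.75, 1.00)

# Partial model with gamma fixed at 1: elevation only (no crossing).
concave = [gw_weight(p, gamma=1.0, delta=2.0) for p in PROBS]  # w(p) >= p
convex = [gw_weight(p, gamma=1.0, delta=0.5) for p in PROBS]   # w(p) <= p

# Partial model with delta fixed at 1: curvature only (crosses identity).
inverted_s = [gw_weight(p, gamma=0.5, delta=1.0) for p in PROBS]
s_shaped = [gw_weight(p, gamma=2.0, delta=1.0) for p in PROBS]
```

Because a higher subjective value corresponds to a lower error rate, a weighting curve that stays above the identity line predicts uniformly fewer errors in probabilistic trials, whereas a curve that crosses the identity line predicts error-rate curves that cross as well.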
Figure 4. Simulation of error rates in probabilistic trials. (A) Simulated probability-weighting function with partial GW model with fixed γ. Colors indicate the value of parameter δ used for the simulation. (B) Simulated error rate in the probabilistic trials with partial GW model (γ = 1) for monkey ST (left), KY (center), and HI (right). As a reference, a best-fit inverse function to error rate in deterministic trials (dashed gray curve) is shown for each monkey. (C,D) Simulated probability-weighting function (C) and error rate in the probabilistic trials (D) with partial GW model with fixed δ (δ = 1). Colors indicate the value of parameter γ used for the simulation. Schemas of the figures are the same as in A and B.
Modeling Individual Risk Preference Reflecting Trait-Like Risk Attitude
The subjective-value model implementing the GW model [GW (γ, δ)] explained the error rate in the probabilistic trials well for all monkeys (R² > 0.75, Figure 5A). As predicted by the simulation, the best-fit probability-weighting function with the GW model showed a simple convex or concave pattern (Figure 5B), demonstrating overweighting of reward probability (monkeys ST and KY) and underweighting of reward probability (monkey HI) in subjective valuation of the probabilistic reward. This result suggests a risk-prone tendency of monkeys ST and KY and a risk-averse tendency of monkey HI, as demonstrated in Figure 2. Then, to validate the parsimonious model, we tested whether the partial GW model with fixed γ [GW (δ| γ = 1), Figure 4A] also fits the data. As expected, the subjective-value model implementing the partial GW model with fixed γ described the error rate in the probabilistic trials well for all monkeys (R² > 0.74, Figure 5C). The best-fit probability-weighting function and estimated parameter δ (Figure 5D) were comparable to those estimated by the full GW model. In contrast, the partial GW model with fixed δ or the simple GW model with fixed γ and δ did not provide good fits to the error rate in the probabilistic trials [GW (γ| δ = 1) and GW (γ = 1, δ = 1), Table 1]. The partial GW model, GW (δ| γ = 1), explained the data significantly better than the simple GW model in all monkeys (p < 0.05, likelihood ratio test), suggesting that a free parameter δ is necessary and sufficient for explaining the individual risk preference of monkeys measured in the single-option response task. We also tested whether the subjective-value model (the inverse function combined with the partial GW model with fixed γ), which incorporated three free parameters c, b, and δ, fits the error rate in both trial types. The model again fitted the data well for all monkeys (R² > 0.81), suggesting the robustness of the modified inverse function in the current task.
Figure 5. Validation of the full and partial GW models. (A,C) Error rate and best-fit function of subjective-value models in the deterministic trials (red) and in the probabilistic trials (blue) for monkeys ST (left), KY (center), and HI (right). Red curve shows the best-fit inverse function for each monkey. Blue curve shows the best-fit function of the subjective-value model with the GW model [GW (γ, δ)] (A) or with the partial GW model with fixed γ [GW (δ| γ = 1)] (C). (B,D) Best-fit probability-weighting function for each monkey. Probability-weighting function is calculated in the GW model (B) or in the partial GW model with fixed γ (D), and the value of estimated parameter δ is shown in each panel. Dashed line indicates the identity line where subjective probability and reward probability are equal.
As shown in Figure 2, the risk in reward outcome biased error rate and RT in the same direction, and the direction of bias was roughly consistent during the testing period. If what we modeled reflects the trait-like risk attitude of monkeys, the direction of risk preference (i.e., risk-prone or risk-averse), in other words, a concave or convex probability weighting pattern, should be stable over a longer time period. To confirm the stability of individual risk preference, we separately calculated δ in the partial GW model for the early (e.g., #1–20 sessions) and late testing sessions (e.g., #21–40 sessions) for each monkey. As expected, risk preference was consistent over the sessions; monkeys ST and KY showed high δ (>1) in both early and late sessions (ST early: 2.8 ± 1.0, ST late: 2.3 ± 0.6; KY early: 1.3 ± 0.8, KY late: 3.2 ± 0.9, mean ± SEM), while monkey HI consistently showed low δ (<1) in the two periods (early: 0.25 ± 0.43, late: 0.51 ± 0.14). These results suggested that we modeled the trait-like risk attitude of the monkeys.
Convex/Concave Probability Distortion Was Not Model-Specific
The error rate was also well explained by other subjective-value models that incorporated the Prelec model (Eq. 4, R² > 0.77, Figure 6A) or the mean–variance model (Eq. 7, R² > 0.71, Figure 6C), which also assume non-linear probability distortion (Markowitz, 1952; Levy and Markowitz, 1979; Prelec, 1998). The best-fit probability-weighting function calculated by the Prelec model (Figure 6B) or the mean–variance model (Figure 6D) showed a convex or concave pattern that was comparable to that calculated by the full or partial GW model (Figures 5B,D). Thus, the individual risk preference assessed in the single-option response task can be modeled reasonably well by economic models with a free parameter that controls the elevation. The goodness of fit (AIC) and estimated parameters are summarized in Table 2.
Figure 6. Validation of the Prelec and mean–variance models. (A,C) Error rate and best-fit functions of subjective-value models for each monkey. Blue curve shows the best-fit function of the subjective-value model implementing the Prelec model (A) or the mean–variance model (C). Schemas of the figures are the same as in Figures 5A,C. (B,D) Best-fit probability-weighting function calculated by the Prelec model (B) or by the mean–variance model (D). Schemas of the figures are the same as in Figures 5B,D.
Assessing State-Dependent Risk Attitude Within a Session
In addition to trait-like risk attitude, physiological drive state can influence risk attitudes; for example, thirsty monkeys became more risk averse (Yamada et al., 2013). To examine the effect of satiation on risk attitude, we analyzed the error rate in the sub-parts of a session according to reward accumulation (satiation level: 0–0.5, 0.25–0.75, 0.5–1.0; see the section “Materials and Methods”). We found that the difference in error rate between deterministic and probabilistic trials varied depending on the satiation level [one-way repeated measures ANOVAs, main effect of satiation level, F(1,409) = 5.9, p = 0.015, Figures 7A–C]. The satiation level also affected RT; the difference in RTs between the two conditions increased according to satiation [main effect of satiation level, F(1,409) = 17, p < 0.01].
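The sub-session analysis can be sketched as a windowed comparison of error rates. The three overlapping windows are those given above; the trial representation, the function names, and the inclusive treatment of window boundaries are assumptions made for this illustration.

```python
# Overlapping satiation windows from the text; labels are illustrative.
WINDOWS = {"early": (0.0, 0.5), "middle": (0.25, 0.75), "late": (0.5, 1.0)}


def error_rate(trials):
    """Fraction of trials marked as errors."""
    return sum(1 for t in trials if t["error"]) / len(trials)


def error_rate_difference(trials, window):
    """Deterministic minus probabilistic error rate within a satiation window.

    trials: list of dicts with keys "satiation" (0-1), "type"
        ("deterministic" or "probabilistic"), and "error" (bool).
    Returns None if either trial type is absent from the window.
    """
    low, high = window
    selected = [t for t in trials if low <= t["satiation"] <= high]
    det = [t for t in selected if t["type"] == "deterministic"]
    prob = [t for t in selected if t["type"] == "probabilistic"]
    if not det or not prob:
        return None
    return error_rate(det) - error_rate(prob)
```

A positive difference means more errors in deterministic than in probabilistic trials, that is, a risk-prone bias within that portion of the session.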
Figure 7. Satiation effect on risk attitude. (A–C) Difference in error rate by trial type. The error rate in the probabilistic trials was subtracted from that in the deterministic trials for each sub-session (early, middle, late). Panels are for monkeys ST (A), KY (B), and HI (C). (D–F) Shifts of probability-weighting functions (Eq. 2) according to satiation for each monkey. The best-fit function for the data of each sub-session (left: 0–0.5, center: 0.25–0.75, right: 0.5–1.0, satiation level) is displayed. (G–I) Parameter δ is plotted for each sub-session and for each monkey. Colors are the same as in D–F.
The satiation effect on risk attitude was further assessed by the modeling approach; we fitted the subjective-value model implementing the partial GW model with fixed γ to the error rate in the probabilistic trials and extracted the best-fit parameter δ from the probability-weighting function for each sub-session (Figures 7D–F). We found that parameter δ tended to increase in the later sub-sessions for all monkeys; the risk-proneness of monkeys ST and KY was evident in the early period and was enhanced thereafter, while monkey HI exhibited weaker risk-averseness as the session progressed and became nearly risk-neutral in the last sub-session (Figures 7G–I). In contrast, the direction of risk attitude was unchanged over a session; δ was always >1 in monkeys ST and KY, whereas it was always <1 in monkey HI. These results demonstrated a state-dependent risk attitude in monkeys; that is, the preference for risk increased with satiation.
Partial Effects of Reward History on Performance
In our task design, the subjective value of probabilistic reward was associated with the cue but was independent from trial sequence or history. However, monkeys could take local contextual reward information into account for the reward expectation that may influence the performance (i.e., correct or error). In other words, the differences in error rate between the deterministic and probabilistic trials could arise from the effect of reward history. If so, the effect should be parallel with the risk preferences of the three monkeys. To address this possibility, we performed logistic regression analysis with three regressors: expected value (1, 2, 3, or 4 drops), satiation level (0–1), and previous reward (0, 1, 2, 3, or 4 drops). Expected value and satiation level significantly contributed to the performance for all monkeys (p < 0.05 with Bonferroni correction; Figure 8). The previous reward, on the other hand, affected only the performance of monkey KY (p < 0.01), but not the other two (p > 0.10, Figure 8). This pattern of individual differences was unrelated to that of risk preference or state-dependent change among the three monkeys. Thus, the effect of reward history was apparently limited and did not correlate with individual risk attitude in our experimental condition.
Figure 8. Effect of expected value, satiation, and reward history on performance. Results of logistic regression for monkeys ST (left), KY (center), and HI (right). Bars indicate the regression coefficient (normalized beta) of expected value (white), satiation level (gray), and previous reward (black). Asterisks indicate statistically significant difference from zero (**p < 0.05, *p < 0.10 with Bonferroni correction).
Discussion
In the present study, monkeys’ risk attitude was assessed by a single-option response task, in which the subjective value of a probabilistic reward was inferred from their performance. To the best of our knowledge, this is the first study to examine risk preference of monkeys in a non-choice paradigm. Model-based analysis revealed non-linear probability distortion and diverse risk preference among three monkeys. The subjective probability weighting of monkeys was well explained by economic models and showed a simple convex/concave pattern over testing sessions. Regardless of baseline risk preference, all monkeys showed an increase in risk preference as satiation increased in a session. The current results thus highlighted the trait-like and state-dependent risk attitude of monkeys in non-choice decisions.
Past studies using economic tasks demonstrated that monkeys show non-linear probability distortion (Stauffer et al., 2015; Chen and Stuphorn, 2018). The present study replicated this in the single-option response task that imposed no choice demand. The basic structure of the current task was shown to be useful to infer the valuation of monkeys when reward size or cost was varied (Minamimoto et al., 2009, 2012; Bouret and Richmond, 2015; Eldridge et al., 2016; Nagai et al., 2016; Fujimoto et al., 2019). By implementing known economic models, the present study extended this basic model to infer the subjective value of probabilistic reward. Our monkeys demonstrated diverse risk preferences; two monkeys were risk-prone, and one was risk-averse. This seems to reflect the trait-like risk attitude of monkeys because their risk preferences were consistent across sessions. Their performance in probabilistic trials was well described by a subjective-value model incorporating a non-linear probability weighting function (Markowitz, 1952; Levy and Markowitz, 1979; Prelec, 1998; Gonzalez and Wu, 1999), and thus the results were largely consistent with the above literature despite differences in task structures and measures of subjective valuations. Our results also suggest that economic models are generalizable for describing the probability distortion in non-choice, ecological decisions (Hayden and Walton, 2014; Pearson et al., 2014).
Unlike the previous studies, our monkeys showed a simple convex or concave probability distortion, and that pattern was well explained by a parsimonious GW model in which one free parameter concerning the elevation of the function was adopted. In the above studies using economic tasks, all monkeys tested showed inverted S-shaped probability distortion (i.e., risk-seeking for low probability and risk-aversion for high probability), which was well explained by Prelec’s function with α < 1, whereas the same model failed to explain the monkeys’ performance in the current study. Such a stereotypical pattern observed in the previous studies may arise from excessive task demand in economic tasks; the cognitive load due to choice demand could diminish sensitivity to the difference in the reward probability and result in the inverted S-shaped probability distortion. Indeed, recent studies showed that manipulation of the task structure (e.g., trial sequence) of economic tasks affected monkeys’ inverted S-shaped probability distortion, potentially due to contamination by reward history (Farashahi et al., 2018; Ferrari-Toniolo et al., 2019). Importantly, the effect of reward history was limited in our paradigm, and hence did not account for the observed individual risk preference. Therefore, the discrepancy could be attributed solely to the task design concerning the ecological decision situation.
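The contrast between the two shapes of distortion can be made concrete with a short sketch. It uses the textbook forms of the two functions: Prelec's one-parameter function, w(p) = exp(−(−ln p)^α), and the Gonzalez and Wu linear-in-log-odds function with its curvature fixed at 1, so that only the elevation parameter δ remains, w(p) = δp / (δp + (1 − p)). That these forms match the parameterization used in this study's Methods is an assumption, and the function names and parameter values below are illustrative only.

```python
import math


def prelec(p, alpha):
    """Prelec (1998) one-parameter weighting: w(p) = exp(-(-ln p)**alpha)."""
    if p <= 0.0:
        return 0.0
    return math.exp(-(math.log(1.0 / p) ** alpha))


def gw_elevation(p, delta):
    """Gonzalez-Wu (1999) form with curvature fixed at 1 (assumed here).

    Only the elevation parameter delta is free:
    w(p) = delta * p / (delta * p + (1 - p)).
    """
    return delta * p / (delta * p + (1.0 - p))


# Reward probabilities used in the probabilistic trials of the task.
probs = [0.25, 0.50, 0.75, 1.00]

curves = {
    "Prelec, alpha = 0.5 (inverted S)": [prelec(p, 0.5) for p in probs],
    "GW, delta = 2.0 (concave)": [gw_elevation(p, 2.0) for p in probs],
    "GW, delta = 0.5 (convex)": [gw_elevation(p, 0.5) for p in probs],
}

for label, weights in curves.items():
    row = ", ".join(f"w({p:.2f}) = {w:.3f}" for p, w in zip(probs, weights))
    print(f"{label}: {row}")
```

With α < 1, the Prelec curve overweights low probabilities and underweights high ones, crossing the diagonal at p = 1/e. The elevation-only function never crosses the diagonal: δ > 1 overweights every intermediate probability (for example, w(0.5) = 2/3 when δ = 2), and δ < 1 underweights every one, which is the simple concave or convex pattern described above.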
Owing to their genetic kinship, humans and monkeys share a large number of cognitive traits. However, because monkeys learn the option value through their experience, the task structure per se would largely influence their task performance and therefore hamper a straightforward interpretation by investigators (Real, 1991). For example, Blanchard et al. (2013) demonstrated that monkeys did not take into account the length of the delay period after reward delivery, which had led earlier researchers to misunderstand the temporal-discounting ability of monkeys. Similarly, economic tasks could contain undesirable confounds, such as working memory, inhibitory control, and value comparison, which may affect decision strategy and obscure natural behavioral traits (Stephens and Krebs, 1986; Elmore et al., 2011; Blanchard et al., 2014; Hayden and Walton, 2014). The current study eliminated such confounding effects by adopting a non-choice decision in the task. In fact, our monkeys quickly learned to perform the single-option response task (<1 month), whereas it usually takes several months for monkeys to learn to perform two-option choices. Unlike in studies using choice tasks, diverse individual differences in trait-like risk attitudes were seen in our monkeys, as observed in human studies (Tom et al., 2007; Tobler et al., 2008; Takahashi et al., 2010; Gilaie-Dotan et al., 2014), and therefore the current task may provide a better opportunity to assess the naturalistic risk attitude of monkeys.
Adapting risk attitude based on current needs is vital for maximizing fitness in an uncertain environment (Stephens and Krebs, 1986). Human studies showed that subjects flexibly modulate risk attitude based on required points or “wealth level” even during a single experimental session (Symmonds et al., 2011; Kolling et al., 2014; Fujimoto and Takahashi, 2016; Juechems et al., 2017). Yamada et al. (2013) directly demonstrated the relationship between risk preference and satiety by monitoring the blood osmolality level within a session in macaque monkeys, which is a physiological form of “wealth level.” Consistently, our monkeys showed enhancement of risk-prone tendency (ST and KY) or suppression of risk-aversion (HI) according to reward accumulation, and our model-based analysis successfully described the satiation effect. Of note, the increase of risk preference reflects state-dependent risk attitude, because it occurred irrespective of baseline risk preference. This change of risk preference within a session is not attributable to the reward history effect, which was limited in the monkeys. Importantly, human studies suggested that state-dependent modulation of risk attitude was not accounted for by change of the physiological state itself either (Symmonds et al., 2011; Kolling et al., 2014; Fujimoto and Takahashi, 2016). Hence, the current approach successfully quantified the trait-like and state-dependent risk attitude of monkeys within one task, suggesting a useful model of risk-sensitive decision for translational research.
What causes the inconsistent risk preference across animals remains unclear. Probably the best-known factors that lead to differences in risk attitude in humans are gender and age (Walker et al., 2017). However, they are unlikely to have a role in the current study because we used only adult male monkeys. Another possible cause is social rank (Davis et al., 2009), but the contribution of this factor is unknown because we have not tested the social relationship of our monkeys. Future studies should validate the exact cause of individual risk preference by employing a larger cohort of animals.
Past studies reported that the trait-like risk attitude correlated with individual differences in monoamine systems (Berridge and Waterhouse, 2003; Roiser et al., 2009; Takahashi et al., 2010), brain structures (Gilaie-Dotan et al., 2014; Leong et al., 2016), and activity patterns (Kuhnen and Knutson, 2005; Huettel et al., 2006; Preuschoff et al., 2008; Levy et al., 2010) of human subjects. However, the neural substrates of individual risk preference in monkeys are largely unknown. Our behavioral assessment, which successfully demonstrated diverse risk attitude in monkeys with a single free parameter (δ), may provide an excellent opportunity to explore the neural basis of individual risk preference, as the animal model allows us to measure neural activities directly and to use neural modulation techniques (cf., Nagai et al., 2016). One of the potential applications is the study of gambling disorder (GD), which is considered to be a dysfunction of risk-sensitive decision (Hodgins et al., 2011; American Psychiatric Association [APA], 2013). Indeed, we recently showed that GD patients had deficits not only in trait-like risk attitude but also in state-dependent risk attitude (Fujimoto et al., 2017). Therefore, future studies should identify the neural substrates of both trait-like and state-dependent risk attitude in monkeys, which may provide therapeutic targets for GD patients.
One of the limitations of the current study was the small sample size. We thus could not address the mechanism behind individual differences in risk attitude. Another limitation was that we used only one reward size for probabilistic trials (4 drops); modifying the range of reward size may influence monkeys’ risk attitude. Further validation with a larger cohort and/or broader reward environments will be needed to generalize our findings and identify other factors that influence the risk attitude of monkeys.
In conclusion, our approach based on economics and behavioral ecology illustrates the trait-like and state-dependent risk attitude of monkeys. Because our model-based analysis employed well-known functions from past human studies, the current animal model may accelerate translational research to determine neural mechanisms underlying risk-sensitive decision-making.
Data Availability
All datasets generated for this study are included in the manuscript and/or the supplementary files.
Ethics Statement
All experimental procedures were approved by the Animal Care and Use Committee of the National Institutes for Quantum and Radiological Science and Technology and were in accordance with the guidelines published in the NIH Guide for the Care and Use of Laboratory Animals.
Author Contributions
AF designed and performed the research, analyzed the data, and wrote the manuscript. TM designed the research and edited the manuscript.
Funding
This work was supported by the JSPS KAKENHI [Grant Numbers JP15H06872 and JP17K13275 (to AF) and JP18H04037 (to TM)] from the Ministry of Education, Culture, Sports, Science, and Technology of Japan (MEXT), by the Takeda Science Foundation Overseas Research Fellowship (to AF) and by the AMED Grant Numbers JP18dm0107146 and JP18dm0207007 (to TM).
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Acknowledgments
We thank Dr. Y. Sakai for his comments on an early version of the manuscript, Dr. T. Suhara for financial support, Dr. K. Mimura for his assistance with statistical analyses, and members from the Neural Systems and Circuits Research Group, QST, for invaluable scientific discussion. We also thank J. Kamei, Y. Matsuda, R. Yamaguchi, Y. Sugii, R. Suma, and A. Maruyama for their technical assistance.
References
Akaike, H. (1973). Maximum likelihood identification of Gaussian autoregressive moving average models. Biometrika 60, 255–265. doi: 10.1093/biomet/60.2.255
American Psychiatric Association [APA] (2013). Diagnostic and Statistical Manual of Mental Disorders (DSM-5®). Arlington, VA: American Psychiatric Association Publishing.
Berridge, C. W., and Waterhouse, B. D. (2003). The locus coeruleus-noradrenergic system: modulation of behavioral state and state-dependent cognitive processes. Brain Res. Brain Res. Rev. 42, 33–84. doi: 10.1016/s0165-0173(03)00143-7
Blanchard, T. C., Pearson, J. M., and Hayden, B. Y. (2013). Postreward delays and systematic biases in measures of animal temporal discounting. Proc. Natl. Acad. Sci. U.S.A. 110, 15491–15496. doi: 10.1073/pnas.1310446110
Blanchard, T. C., Wolfe, L. S., Vlaev, I., Winston, J. S., and Hayden, B. Y. (2014). Biases in preferences for sequences of outcomes in monkeys. Cognition 130, 289–299. doi: 10.1016/j.cognition.2013.11.012
Bouret, S., and Richmond, B. J. (2015). Sensitivity of locus ceruleus neurons to reward value for goal-directed actions. J. Neurosci. 35, 4005–4014. doi: 10.1523/JNEUROSCI.4553-14.2015
Caraco, T., Martindale, S., and Whittam, T. S. (1980). An empirical demonstration of risk-sensitive foraging preferences. Anim. Behav. 28, 820–830. doi: 10.1016/s0003-3472(80)80142-4
Chen, X., and Stuphorn, V. (2018). Inactivation of medial frontal cortex changes risk preference. Curr. Biol. 28, 3114.e4–3122.e4.
Davis, J. F., Krause, E. G., Melhorn, S. J., Sakai, R. R., and Benoit, S. C. (2009). Dominant rats are natural risk takers and display increased motivation for food reward. Neuroscience 162, 23–30. doi: 10.1016/j.neuroscience.2009.04.039
Eldridge, M. A., Lerchner, W., Saunders, R. C., Kaneko, H., Krausz, K. W., Gonzalez, F. J., et al. (2016). Chemogenetic disconnection of monkey orbitofrontal and rhinal cortex reversibly disrupts reward value. Nat. Neurosci. 19, 37–39. doi: 10.1038/nn.4192
Elmore, L. C., Ma, W. J., Magnotti, J. F., Leising, K. J., Passaro, A. D., Katz, J. S., et al. (2011). Visual short-term memory compared in rhesus monkeys and humans. Curr. Biol. 21, 975–979. doi: 10.1016/j.cub.2011.04.031
Farashahi, S., Azab, H., Hayden, B., and Soltani, A. (2018). On the flexibility of basic risk attitudes in monkeys. J. Neurosci. 38, 4383–4398. doi: 10.1523/JNEUROSCI.2260-17.2018
Ferrari-Toniolo, S., Bujold, P. M., and Schultz, W. (2019). Probability distortion depends on choice sequence in rhesus monkeys. J. Neurosci. 39, 2915–2929. doi: 10.1523/JNEUROSCI.1454-18.2018
Fujimoto, A., Hori, Y., Nagai, Y., Kikuchi, E., Oyama, K., Suhara, T., et al. (2019). Signaling incentive and drive in the primate ventral pallidum for motivational control of goal-directed action. J. Neurosci. 39, 1793–1804. doi: 10.1523/JNEUROSCI.2399-18.2018
Fujimoto, A., and Takahashi, H. (2016). Flexible modulation of risk attitude during decision-making under quota. Neuroimage 139, 304–312. doi: 10.1016/j.neuroimage.2016.06.040
Fujimoto, A., Tsurumi, K., Kawada, R., Murao, T., Takeuchi, H., Murai, T., et al. (2017). Deficit of state-dependent risk attitude modulation in gambling disorder. Transl. Psychiatry 7:e1085. doi: 10.1038/tp.2017.55
Gilaie-Dotan, S., Tymula, A., Cooper, N., Kable, J. W., Glimcher, P. W., and Levy, I. (2014). Neuroanatomy predicts individual risk attitudes. J. Neurosci. 34, 12394–12401. doi: 10.1523/JNEUROSCI.1600-14.2014
Gonzalez, R., and Wu, G. (1999). On the shape of the probability weighting function. Cogn. Psychol. 38, 129–166. doi: 10.1006/cogp.1998.0710
Hays, A. Jr., Richmond, B., and Optican, L. (1982). Unix-based multiple-process system, for real-time data acquisition and control. WESCON Conf. Proc. 2, 1–10.
Hayden, B. Y., and Walton, M. E. (2014). Neuroscience of foraging. Front. Neurosci. 8:81. doi: 10.3389/fnins.2014.00081
Hodgins, D. C., Stea, J. N., and Grant, J. E. (2011). Gambling disorders. Lancet 378, 1874–1884. doi: 10.1016/S0140-6736(10)62185-X
Huettel, S. A., Stowe, C. J., Gordon, E. M., Warner, B. T., and Platt, M. L. (2006). Neural signatures of economic preferences for risk and ambiguity. Neuron 49, 765–775. doi: 10.1016/j.neuron.2006.01.024
Juechems, K., Balaguer, J., Ruz, M., and Summerfield, C. (2017). Ventromedial prefrontal cortex encodes a latent estimate of cumulative reward. Neuron 93, 705.e4–714.e4. doi: 10.1016/j.neuron.2016.12.038
Kacelnik, A., Vasconcelos, M., Monteiro, T., and Aw, J. (2011). Darwin’s “tug-of-war” vs. starlings’ “horse-racing”: how adaptations for sequential encounters drive simultaneous choice. Behav. Ecol. Sociobiol. 65, 547–558. doi: 10.1007/s00265-010-1101-2
Kahneman, D. (1979). Prospect theory: an analysis of decisions under risk. Econometrica 47, 263–292.
Kolling, N., Behrens, T. E., Mars, R. B., and Rushworth, M. F. (2012). Neural mechanisms of foraging. Science 336, 95–98. doi: 10.1126/science.1216930
Kolling, N., Wittmann, M., and Rushworth, M. F. (2014). Multiple neural mechanisms of decision making and their competition under changing risk pressure. Neuron 81, 1190–1202. doi: 10.1016/j.neuron.2014.01.033
Krebs, J. R., Erichsen, J. T., Webber, M. I., and Charnov, E. L. (1977). Optimal prey selection in the great tit (Parus major). Anim. Behav. 25, 30–38. doi: 10.1016/0003-3472(77)90064-1
Kuhnen, C. M., and Knutson, B. (2005). The neural basis of financial risk taking. Neuron 47, 763–770. doi: 10.1016/j.neuron.2005.08.008
Leong, J. K., Pestilli, F., Wu, C. C., Samanez-Larkin, G. R., and Knutson, B. (2016). White-matter tract connecting anterior insula to nucleus accumbens correlates with reduced preference for positively skewed gambles. Neuron 89, 63–69. doi: 10.1016/j.neuron.2015.12.015
Levy, H., and Markowitz, H. M. (1979). Approximating expected utility by a function of mean and variance. Am. Econ. Rev. 69, 308–317.
Levy, I., Lazzaro, S. C., Rutledge, R. B., and Glimcher, P. W. (2011). Choice from non-choice: predicting consumer preferences from blood oxygenation level-dependent signals obtained during passive viewing. J. Neurosci. 31, 118–125. doi: 10.1523/JNEUROSCI.3214-10.2011
Levy, I., Snell, J., Nelson, A. J., Rustichini, A., and Glimcher, P. W. (2010). Neural representation of subjective value under risk and ambiguity. J. Neurophysiol. 103, 1036–1047. doi: 10.1152/jn.00853.2009
Markowitz, H. (1952). Portfolio selection. J. Finance 7, 77–91. doi: 10.1111/j.1540-6261.1952.tb01525.x
McNamara, J. M., and Houston, A. I. (1992). Risk-sensitive foraging: a review of the theory. Bull. Math. Biol. 54, 355–378.
Minamimoto, T., Hori, Y., and Richmond, B. J. (2012). Is working more costly than waiting in monkeys? PLoS One 7:e48434. doi: 10.1371/journal.pone.0048434
Minamimoto, T., La Camera, G., and Richmond, B. J. (2009). Measuring and modeling the interaction among reward size, delay to reward, and satiation level on motivation in monkeys. J. Neurophysiol. 101, 437–447. doi: 10.1152/jn.90959.2008
Nagai, Y., Kikuchi, E., Lerchner, W., Inoue, K. I., Ji, B., Eldridge, M. A., et al. (2016). PET imaging-guided chemogenetic silencing reveals a critical role of primate rostromedial caudate in reward evaluation. Nat. Commun. 7:13605. doi: 10.1038/ncomms13605
Pearson, J. M., Hayden, B. Y., and Platt, M. L. (2010). Explicit information reduces discounting behavior in monkeys. Front. Psychol. 1:237. doi: 10.3389/fpsyg.2010.00237
Pearson, J. M., Watson, K. K., and Platt, M. L. (2014). Decision making: the neuroethological turn. Neuron 82, 950–965. doi: 10.1016/j.neuron.2014.04.037
Preuschoff, K., Quartz, S. R., and Bossaerts, P. (2008). Human insula activation reflects risk prediction errors as well as risk. J. Neurosci. 28, 2745–2752. doi: 10.1523/JNEUROSCI.4286-07.2008
Real, L. A. (1991). Animal choice behavior and the evolution of cognitive architecture. Science 253, 980–986. doi: 10.1126/science.1887231
Roiser, J. P., De Martino, B., Tan, G. C., Kumaran, D., Seymour, B., Wood, N. W., et al. (2009). A genetically mediated bias in decision making driven by failure of amygdala control. J. Neurosci. 29, 5985–5991. doi: 10.1523/JNEUROSCI.0407-09.2009
Shenhav, A., Straccia, M. A., Botvinick, M. M., and Cohen, J. D. (2016). Dorsal anterior cingulate and ventromedial prefrontal cortex have inverse roles in both foraging and economic choice. Cogn. Affect. Behav. Neurosci. 16, 1127–1139. doi: 10.3758/s13415-016-0458-8
Stauffer, W. R., Lak, A., Bossaerts, P., and Schultz, W. (2015). Economic choices reveal probability distortion in macaque monkeys. J. Neurosci. 35, 3146–3154. doi: 10.1523/JNEUROSCI.3653-14.2015
Stephens, D. W., and Krebs, J. R. (1986). Foraging Theory. Princeton, NJ: Princeton University Press.
Stevens, J. R., Hallinan, E. V., and Hauser, M. D. (2005). The ecology and evolution of patience in two New World monkeys. Biol. Lett. 1, 223–226. doi: 10.1098/rsbl.2004.0285
Symmonds, M., Wright, N. D., Bach, D. R., and Dolan, R. J. (2011). Deconstructing risk: separable encoding of variance and skewness in the brain. Neuroimage 58, 1139–1149. doi: 10.1016/j.neuroimage.2011.06.087
Takahashi, H., Matsui, H., Camerer, C., Takano, H., Kodaka, F., Ideno, T., et al. (2010). Dopamine D(1) receptors and nonlinear probability weighting in risky choice. J. Neurosci. 30, 16567–16572. doi: 10.1523/jneurosci.3933-10.2010
Tobler, P. N., Christopoulos, G. I., O’Doherty, J. P., Dolan, R. J., and Schultz, W. (2008). Neuronal distortions of reward probability without choice. J. Neurosci. 28, 11703–11711. doi: 10.1523/JNEUROSCI.2870-08.2008
Tobler, P. N., Christopoulos, G. I., O’Doherty, J. P., Dolan, R. J., and Schultz, W. (2009). Risk-dependent reward value signal in human prefrontal cortex. Proc. Natl. Acad. Sci. U.S.A. 106, 7185–7190. doi: 10.1073/pnas.0809599106
Tom, S. M., Fox, C. R., Trepel, C., and Poldrack, R. A. (2007). The neural basis of loss aversion in decision-making under risk. Science 315, 515–518. doi: 10.1126/science.1134239
Tversky, A., and Kahneman, D. (1992). Advances in prospect theory: cumulative representation of uncertainty. J. Risk Uncertain. 5, 297–323. doi: 10.1007/bf00122574
Walker, D. M., Bell, M. R., Flores, C., Gulley, J. M., Willing, J., and Paul, M. J. (2017). Adolescence and reward: making sense of neural and behavioral changes amid the chaos. J. Neurosci. 37, 10855–10866. doi: 10.1523/JNEUROSCI.1834-17.2017
Weber, E. U., Blais, A.-R., and Betz, N. E. (2002). A domain-specific risk-attitude scale: measuring risk perceptions and risk behaviors. J. Behav. Decis. Mak. 15, 263–290. doi: 10.1002/bdm.414
Keywords: risk attitude, subjective value, decision-making, monkeys, economic models
Citation: Fujimoto A and Minamimoto T (2019) Trait and State-Dependent Risk Attitude of Monkeys Measured in a Single-Option Response Task. Front. Neurosci. 13:816. doi: 10.3389/fnins.2019.00816
Received: 25 April 2019; Accepted: 22 July 2019;
Published: 07 August 2019.
Edited by:
Hiroshi Yamada, University of Tsukuba, Japan
Reviewed by:
Masatoshi Yoshida, National Institute for Physiological Sciences (NIPS), Japan
Shunsuke Kobayashi, Fukushima Medical University, Japan
Copyright © 2019 Fujimoto and Minamimoto. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Atsushi Fujimoto, atsushi.fujimoto@mssm.edu; a.fujimoto.jul8@gmail.com; Takafumi Minamimoto, minamimoto.takafumi@qst.go.jp
†Present address: Atsushi Fujimoto, Nash Family Department of Neuroscience and Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, United States