Flexible goal-directed behavior requires a performance monitoring system that tracks behavioral consequences and detects the need for further adjustment and control. When the monitoring system detects a failure in performance, signals are transmitted to the brain structures responsible for control implementation. Evidence points to the anterior cingulate cortex (ACC) (Carter et al., 1998; Gehring and Knight, 2000; MacDonald et al., 2000; Ferdinand et al., 2012) and the lateral prefrontal cortex (lPFC) (MacDonald et al., 2000; Ridderinkhof et al., 2004a,b) as the neural correlates of the performance monitoring and control implementation systems, respectively. The interaction of these two systems appears to modulate some components of event-related brain potentials (ERPs) linked with performance monitoring, such as the error-related negativity (ERN), the N200, and the feedback-related negativity (FRN) (Gruendler et al., 2011). The ERN is an ERP component that begins close to the time of the erroneous response in speeded response time tasks and peaks about 100 ms later (Gehring et al., 1993). The N200 is another negative deflection in the ERP that peaks between 200 and 400 ms after stimulus onset, prior to response execution, on correct trials of cognitive control experiments (Olvet and Hajcak, 2008). The FRN, one of the most studied components, is a negative-going deflection observed 230–330 ms following outcome presentation (Miltner et al., 1997) in gambling and trial-and-error learning tasks (Holroyd et al., 2006). Source localization studies suggest that the neural source of the FRN most probably lies in the ACC (Miltner et al., 1997; Gehring and Willoughby, 2002; Bellebaum and Daum, 2008; Hauser et al., 2014).
The central question in the interaction of performance monitoring and control systems is how the brain determines when control structures need to be recruited. The reinforcement learning (RL) account of performance monitoring and control is one of the influential theories in the field (Holroyd and Coles, 2002; Holroyd et al., 2005). The theory is based on physiological evidence revealing the similarity between the phasic activity of the mesencephalic dopamine system and reward prediction errors (RPEs) in temporal difference models of learning (Suri, 2002). The theory holds that the monitor is located in the basal ganglia, which produces RPE signals that indicate when events are better or worse than expected. These RPEs are used by the ACC to improve performance on the task at hand (Holroyd et al., 2005). According to the RL model, negative RPEs sent to the ACC generate the ERN and the FRN. Another prominent theory, the conflict-monitoring theory (CMT), proposes that the performance monitoring system monitors for the coactivation of mutually incompatible response tendencies, or conflict, during response selection. The CMT suggests that the ACC detects the response-conflict signal and sends this information to the dorsolateral prefrontal cortex for further adjustment and control (Botvinick et al., 2001; Yeung et al., 2004). According to this theory, the N200 (N2) and the ERN can be described in terms of the conflict signal: the CMT argues that the N2 and the ERN are the electrophysiological correlates of pre-response and post-response conflict signals, respectively. However, since no motor response exists after external feedback presentation, the CMT cannot account for phenomena that commence after feedback onset, e.g., the FRN (Ullsperger et al., 2014). In our previous studies, we have explained the significance of integrating the computational models associated with the RL and the CMT (Zendehrouh et al., 2013, 2014).
Since the unification of these two theories depends centrally on how the conflict signal is defined, we propose a hypothetical cost-conflict monitor in the brain that extends the CMT to account for post-feedback activity in feedback-based learning tasks. Based on this proposal, the FRN can be described using a cost-conflict signal.
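For concreteness, the two signals on which these theories rely can be written down in a few lines. The sketch below is only illustrative and is not the model of Zendehrouh et al. (2013, 2014): `td_rpe` is the standard temporal-difference prediction error, and `response_conflict` is an energy-style product measure of the kind used in conflict-monitoring simulations (Botvinick et al., 2001), reduced here to two response units. The function names, the parameters `gamma` and `inhibition`, and all example numbers are assumptions chosen for illustration.

```python
def td_rpe(reward, value_now, value_next, gamma=0.9):
    """Temporal-difference reward prediction error.

    Positive when the outcome is better than expected, negative when worse.
    gamma is an assumed discount factor, not a value from the article.
    """
    return reward + gamma * value_next - value_now


def response_conflict(act_a, act_b, inhibition=1.0):
    """Conflict between two mutually inhibiting response units.

    Energy-style measure: the product of the two activations scaled by the
    strength of their mutual inhibition. It is zero when only one response
    is active and grows when both are strongly coactivated.
    """
    return inhibition * max(act_a, 0.0) * max(act_b, 0.0)


if __name__ == "__main__":
    # A reward of 0.8 was expected, but nothing is delivered at trial end.
    print("RPE:", td_rpe(reward=0.0, value_now=0.8, value_next=0.0))
    # One dominant response versus two competing responses.
    print("low conflict:", response_conflict(0.9, 0.1))
    print("high conflict:", response_conflict(0.6, 0.6))
```

In this toy example the omitted reward yields a negative RPE, and the two equally active responses yield more conflict than the case with one dominant response.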
The basis for our hypothetical cost-conflict monitor is threefold: (1) Theoretically, conflict can occur anywhere within the information processing system (Carter and van Veen, 2007). (2) Conflict-driven control is domain-specific and is suggested to be mediated by multiple, independent, and parallel-operating conflict monitor-controller loops in the brain (Egner, 2008). (3) The appraisal of costs and benefits associated with different candidate actions is a key aspect of decision-making.
Delay-based and effort-based costs (the latter being the effort needed to perform an action in order to obtain a reward) are two types of costs that bias decision making (Floresco et al., 2008). In delay-based tasks, as time passes, the subjective value of a reward is discounted hyperbolically (Green and Myerson, 2004). The aversiveness of a negative event also decreases hyperbolically with time (Murphy et al., 2001). Evidence suggests that discounting occurs across many reward types, reward magnitudes, and timescales, even on the order of tens of milliseconds (Haith et al., 2012). In this paper, it is hypothesized that in feedback-based learning tasks the participants are faced with delay-based evaluations. Therefore, in these tasks, the time interval between response selection and feedback presentation gives rise to a cost. This delay elevates the cost of the rewarded outcome and reduces the cost of the non-rewarded outcome associated with the selected action. Conflict can then be produced by the simultaneous activation of the expected costs of possible outcomes that are mutually exclusive. When a cost-conflict is detected by the monitoring system, the regulatory mechanism implements the required control, e.g., by modifying the excitatory weights to the response units. The cost-conflict signal that may arise between expected costs can index the amount of subjective, transient uncertainty about what will happen, which increases with time (delay) until the actual outcome is received. The cost-conflict signal can also be viewed in the context of the emerging field of neuroeconomics as an ambiguity signal that may be present during decision-making. Ambiguity is defined as a lack of confidence in probability assignment to the possible outcomes (Kishida et al., 2010).
This is consistent with investigations suggesting the existence of an ambiguity-sensitive mechanism in the ventromedial prefrontal cortex (vmPFC) (Glimcher and Rustichini, 2004), and also with the role of this area in delay cost coding (Prévost et al., 2010; Rushworth et al., 2011; Dreher, 2013).
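The delay-based argument above can be made concrete with a small numerical sketch. Only the hyperbolic form, value = amount / (1 + k × delay), comes from the discounting literature cited here (Green and Myerson, 2004); everything else is an assumption made for illustration and is not a definition given in this article. In particular, the cost of the rewarded outcome is taken to be the value lost to the delay, the cost of the non-rewarded outcome is taken to be its discounted aversiveness, and the hypothetical `cost_conflict` is taken to be the product of the two probability-weighted costs, by analogy with coactivation measures of response conflict. The names `p_reward`, `reward`, `loss`, and `k`, and all numbers, are likewise hypothetical.

```python
def hyperbolic_discount(amount, delay, k):
    """Subjective value of `amount` after `delay`: amount / (1 + k * delay)."""
    return amount / (1.0 + k * delay)


def expected_costs(p_reward, reward, loss, delay, k):
    """Expected costs of the two mutually exclusive outcomes (assumed form).

    Rewarded outcome: cost is the value lost to the delay (grows with delay).
    Non-rewarded outcome: cost is the discounted aversiveness (shrinks).
    """
    cost_rewarded = reward - hyperbolic_discount(reward, delay, k)
    cost_non_rewarded = hyperbolic_discount(loss, delay, k)
    return p_reward * cost_rewarded, (1.0 - p_reward) * cost_non_rewarded


def cost_conflict(p_reward, reward, loss, delay, k):
    """Hypothetical cost-conflict: product of the two expected costs."""
    c_rewarded, c_non_rewarded = expected_costs(p_reward, reward, loss, delay, k)
    return c_rewarded * c_non_rewarded


if __name__ == "__main__":
    # Assumed values: delays in seconds, k = 0.5 per second, equal magnitudes.
    for delay in (0.0, 0.25, 0.5, 1.0):
        conflict = cost_conflict(0.5, 1.0, 1.0, delay, k=0.5)
        print(f"delay={delay:.2f} s  cost-conflict={conflict:.4f}")
```

Under these assumptions the signal is zero at zero delay and grows with the response-feedback interval as long as k × delay stays below 1, which is in line with the idea that the transient uncertainty builds up until the outcome arrives. Other functional forms would be equally compatible with the verbal proposal.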
This proposal can be tested with simple gambling games or probabilistic reinforcement learning tasks in which feedback timing is manipulated at the timescale of milliseconds, while brain responses are measured with functional magnetic resonance imaging (fMRI) and electroencephalography (EEG) to identify the contributions of the ACC and the vmPFC in those tasks. In particular, studying the behavior of addicted and depressed individuals in these tasks could be informative, since these groups show anomalies in value-based decision making (Sharp et al., 2012).
In summary, the cost-conflict monitor, as an independent loop operating in parallel with the response-conflict monitor, detects the conflict between the costs of the likely outcomes of the selected action and uses this information to adjust future behavior, thereby implementing trial-by-trial adjustments. Admittedly, this proposal is speculative, and further experimental research is needed to evaluate its merit. However, the proposal can provide promising avenues toward the unification of the computational models associated with the RL and the CMT.
Statements
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
References
Bellebaum, C., and Daum, I. (2008). Learning-related changes in reward expectancy are reflected in the feedback-related negativity. Eur. J. Neurosci. 27, 1823–1835. doi: 10.1111/j.1460-9568.2008.06138.x
Botvinick, M. M., Braver, T. S., Barch, D. M., Carter, C. S., and Cohen, J. D. (2001). Conflict monitoring and cognitive control. Psychol. Rev. 108, 624–652. doi: 10.1037/0033-295X.108.3.624
Carter, C. S., Braver, T. S., Barch, D. M., Botvinick, M. M., Noll, D., and Cohen, J. D. (1998). Anterior cingulate cortex, error detection, and the online monitoring of performance. Science 280, 747–749. doi: 10.1126/science.280.5364.747
Carter, C. S., and van Veen, V. (2007). Anterior cingulate cortex and conflict detection: an update of theory and data. Cogn. Affect. Behav. Neurosci. 7, 367–379. doi: 10.3758/CABN.7.4.367
Dreher, J.-C. (2013). Neural coding of computational factors affecting decision making. Prog. Brain Res. 202, 289–320. doi: 10.1016/B978-0-444-62604-2.00016-2
Egner, T. (2008). Multiple conflict-driven control mechanisms in the human brain. Trends Cogn. Sci. 12, 374–380. doi: 10.1016/j.tics.2008.07.001
Ferdinand, N. K., Mecklinger, A., Kray, J., and Gehring, W. J. (2012). The processing of unexpected positive response outcomes in the mediofrontal cortex. J. Neurosci. 32, 12087–12092. doi: 10.1523/JNEUROSCI.1410-12.2012
Floresco, S. B., St Onge, J. R., Ghods-Sharifi, S., and Winstanley, C. A. (2008). Cortico-limbic-striatal circuits subserving different forms of cost-benefit decision making. Cogn. Affect. Behav. Neurosci. 8, 375–389. doi: 10.3758/CABN.8.4.375
Gehring, W. J., Goss, B., Coles, M. G. H., Meyer, D. E., and Donchin, E. (1993). A neural system for error detection and compensation. Psychol. Sci. 4, 385–390. doi: 10.1111/j.1467-9280.1993.tb00586.x
Gehring, W. J., and Knight, R. T. (2000). Prefrontal-cingulate interactions in action monitoring. Nat. Neurosci. 3, 516–520. doi: 10.1038/74899
Gehring, W. J., and Willoughby, A. R. (2002). The medial frontal cortex and the rapid processing of monetary gains and losses. Science 295, 2279–2282. doi: 10.1126/science.1066893
Glimcher, P. W., and Rustichini, A. (2004). Neuroeconomics: the consilience of brain and decision. Science 306, 447–452. doi: 10.1126/science.1102566
Green, L., and Myerson, J. (2004). A discounting framework for choice with delayed and probabilistic rewards. Psychol. Bull. 130, 769–792. doi: 10.1037/0033-2909.130.5.769
Gruendler, T. O. J., Ullsperger, M., and Huster, R. J. (2011). Event-related potential correlates of performance-monitoring in a lateralized time-estimation task. PLoS ONE 6:e25591. doi: 10.1371/journal.pone.0025591
Haith, A. M., Reppert, T. R., and Shadmehr, R. (2012). Evidence for hyperbolic temporal discounting of reward in control of movements. J. Neurosci. 32, 11727–11736. doi: 10.1523/JNEUROSCI.0424-12.2012
Hauser, T. U., Iannaccone, R., Stämpfli, P., Drechsler, R., Brandeis, D., Walitza, S., et al. (2014). The feedback-related negativity (FRN) revisited: new insights into the localization, meaning and network organization. Neuroimage 84, 159–168. doi: 10.1016/j.neuroimage.2013.08.028
Holroyd, C. B., and Coles, M. G. H. (2002). The neural basis of human error processing: reinforcement learning, dopamine, and the error-related negativity. Psychol. Rev. 109, 679–709. doi: 10.1037/0033-295X.109.4.679
Holroyd, C. B., Hajcak, G., and Larsen, J. T. (2006). The good, the bad and the neutral: electrophysiological responses to feedback stimuli. Brain Res. 1105, 93–101. doi: 10.1016/j.brainres.2005.12.015
Holroyd, C. B., Yeung, N., Coles, M. G. H., and Cohen, J. D. (2005). A mechanism for error detection in speeded response time tasks. J. Exp. Psychol. Gen. 134, 163–191. doi: 10.1037/0096-3445.134.2.163
Kishida, K. T., King-Casas, B., and Montague, P. R. (2010). Neuroeconomic approaches to mental disorders. Neuron 67, 543–554. doi: 10.1016/j.neuron.2010.07.021
MacDonald, A. W. 3rd., Cohen, J. D., Stenger, V. A., and Carter, C. S. (2000). Dissociating the role of the dorsolateral prefrontal and anterior cingulate cortex in cognitive control. Science 288, 1835–1838. doi: 10.1126/science.288.5472.1835
Miltner, W. H. R., Braun, C. H., and Coles, M. G. H. (1997). Event-related brain potentials following incorrect feedback in a time-estimation task: evidence for a generic neural system for error detection. J. Cogn. Neurosci. 9, 788–798. doi: 10.1162/jocn.1997.9.6.788
Murphy, J., Vuchinich, R., and Simpson, C. (2001). Delayed reward and cost discounting. Psychol. Rec. 51, 571–588.
Olvet, D. M., and Hajcak, G. (2008). The error-related negativity (ERN) and psychopathology: toward an endophenotype. Clin. Psychol. Rev. 28, 1343–1354. doi: 10.1016/j.cpr.2008.07.003
Prévost, C., Pessiglione, M., Météreau, E., Cléry-Melin, M.-L., and Dreher, J.-C. (2010). Separate valuation subsystems for delay and effort decision costs. J. Neurosci. 30, 14080–14090. doi: 10.1523/JNEUROSCI.2752-10.2010
Ridderinkhof, K. R., Ullsperger, M., Crone, E. A., and Nieuwenhuis, S. (2004a). The role of the medial frontal cortex in cognitive control. Science 306, 443–447. doi: 10.1126/science.1100301
Ridderinkhof, K. R., van den Wildenberg, W. P. M., Segalowitz, S. J., and Carter, C. S. (2004b). Neurocognitive mechanisms of cognitive control: the role of prefrontal cortex in action selection, response inhibition, performance monitoring, and reward-based learning. Brain Cogn. 56, 129–140. doi: 10.1016/j.bandc.2004.09.016
Rushworth, M. F. S., Noonan, M. P., Boorman, E. D., Walton, M. E., and Behrens, T. E. (2011). Frontal cortex and reward-guided learning and decision-making. Neuron 70, 1054–1069. doi: 10.1016/j.neuron.2011.05.014
Sharp, C., Monterosso, J., and Montague, P. R. (2012). Neuroeconomics: a bridge for translational research. Biol. Psychiatry 72, 87–92. doi: 10.1016/j.biopsych.2012.02.029
Suri, R. E. (2002). TD models of reward predictive responses in dopamine neurons. Neural Netw. 15, 523–533. doi: 10.1016/S0893-6080(02)00046-1
Ullsperger, M., Danielmeier, C., and Jocham, G. (2014). Neurophysiology of performance monitoring and adaptive behavior. Physiol. Rev. 94, 35–79. doi: 10.1152/physrev.00041.2012
Yeung, N., Cohen, J. D., and Botvinick, M. M. (2004). The neural basis of error detection: conflict monitoring and the error-related negativity. Psychol. Rev. 111, 931–959. doi: 10.1037/0033-295X.111.4.931
Zendehrouh, S., Gharibzadeh, S., and Towhidkhah, F. (2013). Modeling error detection in human brain: a preliminary unification of reinforcement learning and conflict monitoring theories. Neurocomputing 103, 1–13. doi: 10.1016/j.neucom.2012.04.026
Zendehrouh, S., Gharibzadeh, S., and Towhidkhah, F. (2014). Reinforcement-conflict based control: an integrative model of error detection in anterior cingulate cortex. Neurocomputing 123, 140–149. doi: 10.1016/j.neucom.2013.06.020
Summary
Keywords
performance monitoring, cognitive control, conflict-driven control, monitor-controller networks, feedback-related negativity
Citation
Zendehrouh S, Gharibzadeh S and Towhidkhah F (2014) The hypothetical cost-conflict monitor: is it a possible trigger for conflict-driven control mechanisms in the human brain? Front. Comput. Neurosci. 8:77. doi: 10.3389/fncom.2014.00077
Received
17 May 2014
Accepted
30 June 2014
Published
21 July 2014
Volume
8 - 2014
Edited by
Tobias Alecio Mattei, Ohio State University, USA
Reviewed by
Tobias Alecio Mattei, Ohio State University, USA; Carlos Rodrigo Goulart, Ohio State University, USA
Copyright
© 2014 Zendehrouh, Gharibzadeh and Towhidkhah.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: sareh.zendehrouh@gmail.com
This article was submitted to the journal Frontiers in Computational Neuroscience.
Disclaimer
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.