Reward system
from Wikipedia

Examples of primary rewards.[1] Clockwise from top left: water, food, parental care, and sex.

The reward system (the mesocorticolimbic circuit) is a group of neural structures responsible for incentive salience (i.e., "wanting": desire or craving for a reward, and motivation), associative learning (primarily positive reinforcement and classical conditioning), and positively-valenced emotions, particularly ones involving pleasure as a core component (e.g., joy, euphoria, and ecstasy).[2][3] Reward is the attractive and motivational property of a stimulus that induces appetitive behavior, also known as approach behavior, and consummatory behavior.[2] A rewarding stimulus has been described as follows: "any stimulus, object, event, activity, or situation that has the potential to make us approach and consume it is by definition a reward".[2] In operant conditioning, rewarding stimuli function as positive reinforcers;[1] the converse also holds true: positive reinforcers are rewarding.[1][4]

The reward system motivates animals to approach stimuli or engage in behavior that increases fitness (sex, energy-dense foods, etc.). Survival for most animal species depends upon maximizing contact with beneficial stimuli and minimizing contact with harmful stimuli. Reward cognition serves to increase the likelihood of survival and reproduction by causing associative learning, eliciting approach and consummatory behavior, and triggering positively-valenced emotions.[1] Reward is thus a mechanism that evolved to help increase the adaptive fitness of animals.[5] In drug addiction, certain substances over-activate the reward circuit, leading to compulsive substance-seeking behavior resulting from synaptic plasticity in the circuit.[6]

Primary rewards are a class of rewarding stimuli which facilitate the survival of one's self and offspring, and they include homeostatic (e.g., palatable food) and reproductive (e.g., sexual contact and parental investment) rewards.[2][7] Intrinsic rewards are unconditioned rewards that are attractive and motivate behavior because they are inherently pleasurable.[2] Extrinsic rewards (e.g., money or seeing one's favorite sports team winning a game) are conditioned rewards that are attractive and motivate behavior but are not inherently pleasurable.[2][8] Extrinsic rewards derive their motivational value as a result of a learned association (i.e., conditioning) with intrinsic rewards.[2] Extrinsic rewards may also elicit pleasure (e.g., euphoria from winning a lot of money in a lottery) after being classically conditioned with intrinsic rewards.[2]

Definition

Unsolved problem in biology: How and where does the brain evaluate reward value and effort (cost) to modulate behavior?

In neuroscience, the reward system is a collection of brain structures and neural pathways that are responsible for reward-related cognition, including associative learning (primarily classical conditioning and operant reinforcement), incentive salience (i.e., motivation and "wanting", desire, or craving for a reward), and positively-valenced emotions, particularly emotions that involve pleasure (i.e., hedonic "liking").[1][3]

Reward-related activities, such as feeding, exercise, sex, substance use, and social interaction, contribute to elevated levels of dopamine, ultimately altering the central nervous system (CNS). Dopamine is a chemical messenger that plays a role in regulating mood, motivation, reward, and pleasure.[9]

Terms that are commonly used to describe behavior related to the "wanting" or desire component of reward include appetitive behavior, approach behavior, preparatory behavior, instrumental behavior, anticipatory behavior, and seeking.[10] Terms that are commonly used to describe behavior related to the "liking" or pleasure component of reward include consummatory behavior and taking behavior.[10]

The three primary functions of rewards are their capacity to:

  1. produce associative learning (i.e., classical conditioning and operant reinforcement);[1]
  2. affect decision-making and induce approach behavior (via the assignment of motivational salience to rewarding stimuli);[1]
  3. elicit positively-valenced emotions, particularly pleasure.[1]

Neuroanatomy


Overview


The brain structures that compose the reward system are located primarily within the cortico-basal ganglia-thalamo-cortical loop;[11] the basal ganglia portion of the loop drives activity within the reward system.[11] Most of the pathways that connect structures within the reward system are glutamatergic interneurons, GABAergic medium spiny neurons (MSNs), and dopaminergic projection neurons,[11][12] although other types of projection neurons contribute (e.g., orexinergic projection neurons). The reward system includes the ventral tegmental area, ventral striatum (i.e., the nucleus accumbens and olfactory tubercle), dorsal striatum (i.e., the caudate nucleus and putamen), substantia nigra (i.e., the pars compacta and pars reticulata), prefrontal cortex, anterior cingulate cortex, insular cortex, hippocampus, hypothalamus (particularly the orexinergic nucleus in the lateral hypothalamus), thalamus (multiple nuclei), subthalamic nucleus, globus pallidus (both external and internal), ventral pallidum, parabrachial nucleus, amygdala, and the remainder of the extended amygdala.[3][11][13][14][15] The dorsal raphe nucleus and cerebellum appear to modulate some forms of reward-related cognition (i.e., associative learning, motivational salience, and positive emotions) and behaviors as well.[16][17][18]

The laterodorsal tegmental nucleus (LDT), pedunculopontine nucleus (PPTg), and lateral habenula (LHb) (both directly and indirectly via the rostromedial tegmental nucleus (RMTg)) are also capable of inducing aversive salience and incentive salience through their projections to the ventral tegmental area (VTA).[19] The LDT and PPTg both send glutamatergic projections to the VTA that synapse on dopaminergic neurons, both of which can produce incentive salience. The LHb sends glutamatergic projections, the majority of which synapse on GABAergic RMTg neurons that in turn drive inhibition of dopaminergic VTA neurons, although some LHb projections terminate on VTA interneurons.
These LHb projections are activated both by aversive stimuli and by the absence of an expected reward, and excitation of the LHb can induce aversion.[20][21][22]

Most of the dopamine pathways (i.e., neurons that use the neurotransmitter dopamine to communicate with other neurons) that project out of the ventral tegmental area are part of the reward system;[11] in these pathways, dopamine acts on D1-like receptors or D2-like receptors to either stimulate (D1-like) or inhibit (D2-like) the production of cAMP.[23] The GABAergic medium spiny neurons of the striatum are components of the reward system as well.[11] The glutamatergic projection nuclei in the subthalamic nucleus, prefrontal cortex, hippocampus, thalamus, and amygdala connect to other parts of the reward system via glutamate pathways.[11] The medial forebrain bundle, which is a set of many neural pathways that mediate brain stimulation reward (i.e., reward derived from direct electrochemical stimulation of the lateral hypothalamus), is also a component of the reward system.[24]

Development of dopamine pathways is of particular importance in adolescence. A number of reward-associated behaviors arise during adolescence partly due to development of the dopaminergic reward system.[25] D1 and D2 receptor levels peak in adolescence as a result of neural maturation processes, including synaptic pruning, which can result in altered reward sensitivity.[26]

Two theories exist with regard to the activity of the nucleus accumbens and the generation of liking and wanting. The inhibition (or hyperpolarization) hypothesis proposes that the nucleus accumbens exerts tonic inhibitory effects on downstream structures such as the ventral pallidum, hypothalamus, or ventral tegmental area, and that inhibiting MSNs in the nucleus accumbens (NAcc) disinhibits these structures, "releasing" reward-related behavior. While GABA receptor agonists are capable of eliciting both "liking" and "wanting" reactions in the nucleus accumbens, glutamatergic inputs from the basolateral amygdala, ventral hippocampus, and medial prefrontal cortex can drive incentive salience. Furthermore, while most studies find that NAcc neurons reduce firing in response to reward, a number of studies find the opposite response. This has led to the disinhibition (or depolarization) hypothesis, which proposes that excitation of NAcc neurons, or at least certain subsets of them, drives reward-related behavior.[3][27][28]


After nearly 50 years of research on brain-stimulation reward, experts have established that dozens of sites in the brain will maintain intracranial self-stimulation. Regions include the lateral hypothalamus and medial forebrain bundle, which are especially effective. Stimulation there activates fibers that form the ascending pathways, which include the mesolimbic dopamine pathway projecting from the ventral tegmental area to the nucleus accumbens. There are several explanations as to why the mesolimbic dopamine pathway is central to circuits mediating reward. First, there is a marked increase in dopamine release from the mesolimbic pathway when animals engage in intracranial self-stimulation.[5] Second, experiments consistently indicate that brain-stimulation reward stimulates the reinforcement of pathways that are normally activated by natural rewards, and drug reward or intracranial self-stimulation can exert more powerful activation of central reward mechanisms because they activate the reward center directly rather than through the peripheral nerves.[5][29][30] Third, when animals are administered addictive drugs or engage in naturally rewarding behaviors, such as feeding or sexual activity, there is a marked release of dopamine within the nucleus accumbens.[5] However, dopamine is not the only reward compound in the brain.

Key pathway

Diagram showing some of the key components of the mesocorticolimbic ("reward") circuit

Ventral tegmental area

  • The ventral tegmental area (VTA) is important in responding to stimuli and cues that indicate a reward is present. Rewarding stimuli (and all addictive drugs) act on the circuit by triggering the VTA to release dopamine signals to the nucleus accumbens, either directly or indirectly.[citation needed] The VTA has two important pathways: the mesolimbic pathway, projecting to limbic (striatal) regions and underpinning motivational behaviors and processes, and the mesocortical pathway, projecting to the prefrontal cortex and underpinning cognitive functions such as learning external cues.[31]
  • Dopaminergic neurons in this region convert the amino acid tyrosine into DOPA using the enzyme tyrosine hydroxylase; DOPA is then converted to dopamine by the enzyme DOPA decarboxylase.[32]

Striatum (Nucleus Accumbens)

  • The striatum is broadly involved in acquiring and eliciting learned behaviors in response to a rewarding cue. The VTA projects to the striatum and activates the GABAergic medium spiny neurons via D1 and D2 receptors within the ventral (nucleus accumbens) and dorsal striatum.[33]
  • The ventral striatum (nucleus accumbens) is broadly involved in acquiring behavior when receiving input from the VTA, and in eliciting behavior when receiving input from the PFC. The NAc shell projects to the pallidum and the VTA, regulating limbic and autonomic functions; this modulates the reinforcing properties of stimuli and the short-term aspects of reward. The NAc core projects to the substantia nigra and is involved in the development and expression of reward-seeking behaviors. It is involved in spatial learning, conditional response, and impulsive choice: the long-term elements of reward.[31]
  • The dorsal striatum is involved in learning: the dorsomedial striatum in goal-directed learning, and the dorsolateral striatum in the stimulus-response learning foundational to the Pavlovian response.[34] On repeated activation by a stimulus, the nucleus accumbens can activate the dorsal striatum via an intrastriatal loop. This transition of signals from the NAc to the dorsal striatum allows reward-associated cues to activate the dorsal striatum without the reward itself being present, which can trigger cravings and reward-seeking behaviors (and is responsible for triggering relapse during abstinence in addiction).[35]

Prefrontal Cortex

  • The VTA dopaminergic neurons project to the PFC, activating glutamatergic neurons that project to multiple other regions, including the dorsal striatum and NAc, ultimately allowing the PFC to mediate salience and conditional behaviors in response to stimuli.[35]
  • Notably, abstinence from addictive drugs activates the PFC's glutamatergic projection to the NAc, which drives strong cravings and mediates the reinstatement of addiction-related behaviors during abstinence. The PFC also interacts with the VTA through the mesocortical pathway and helps associate environmental cues with the reward.[35]
  • There are several parts of the brain related to the prefrontal cortex that help with decision-making in different ways. The dACC (dorsal anterior cingulate cortex) tracks effort, conflict, and mistakes. The vmPFC (ventromedial prefrontal cortex) focuses on what feels rewarding and helps make choices based on personal preferences. The OFC (orbitofrontal cortex) evaluates options and predicts their outcomes to guide decisions. Together, they work with dopamine signals to process rewards and actions.[36]

Hippocampus

  • The hippocampus has multiple functions, including the creation and storage of memories. In the reward circuit, it serves to encode contextual memories and associated cues, and it ultimately underpins the reinstatement of reward-seeking behaviors via cues and contextual triggers.[37]

Amygdala

  • The amygdala receives input from the VTA and outputs to the NAc. The amygdala is important in creating powerful emotional "flashbulb" memories and likely underpins the creation of strong cue-associated memories.[38] It is also important in mediating the anxiety effects of withdrawal and the increased drug intake seen in addiction.[39]

Pleasure centers


Pleasure is a component of reward, but not all rewards are pleasurable (e.g., money does not elicit pleasure unless this response is conditioned).[2] Stimuli that are naturally pleasurable, and therefore attractive, are known as intrinsic rewards, whereas stimuli that are attractive and motivate approach behavior, but are not inherently pleasurable, are termed extrinsic rewards.[2] Extrinsic rewards (e.g., money) are rewarding as a result of a learned association with an intrinsic reward.[2] In other words, extrinsic rewards function as motivational magnets that elicit "wanting", but not "liking" reactions once they have been acquired.[2]

The reward system contains pleasure centers or hedonic hotspots – i.e., brain structures that mediate pleasure or "liking" reactions from intrinsic rewards. As of October 2017, hedonic hotspots have been identified in subcompartments within the nucleus accumbens shell, ventral pallidum, parabrachial nucleus, orbitofrontal cortex (OFC), and insular cortex.[3][15][40] The raphe nucleus has also been implicated.[41] The hotspot within the nucleus accumbens shell is located in the rostrodorsal quadrant of the medial shell, while the hedonic coldspot is located in a more posterior region. The posterior ventral pallidum also contains a hedonic hotspot, while the anterior ventral pallidum contains a hedonic coldspot. In rats, microinjections of opioids, endocannabinoids, and orexin are capable of enhancing liking reactions in these hotspots.[3] The hedonic hotspots located in the anterior OFC and posterior insula have been demonstrated to respond to orexin and opioids in rats, as has the overlapping hedonic coldspot in the anterior insula and posterior OFC.[40] On the other hand, the parabrachial nucleus hotspot has only been demonstrated to respond to benzodiazepine receptor agonists.[3]

Hedonic hotspots are functionally linked, in that activation of one hotspot results in the recruitment of the others, as indexed by the induced expression of c-Fos, an immediate early gene. Furthermore, inhibition of one hotspot results in the blunting of the effects of activating another hotspot.[3][40] Therefore, the simultaneous activation of every hedonic hotspot within the reward system is believed to be necessary for generating the sensation of an intense euphoria.[42]

Wanting and liking

Tuning of appetitive and defensive reactions in the nucleus accumbens shell (above). AMPA blockade requires D1 function in order to produce motivated behaviors, regardless of valence, and D2 function to produce defensive behaviors; GABA agonism, on the other hand, does not require dopamine receptor function. (Below) Under AMPA antagonism, the anatomical regions that produce defensive behaviors expand under stress, and those that produce appetitive behaviors expand in the home environment; this flexibility is less evident with GABA agonism.[27]

Incentive salience is the "wanting" or "desire" attribute, which includes a motivational component, that is assigned to a rewarding stimulus by the nucleus accumbens shell (NAcc shell).[2][43][44] The degree of dopamine neurotransmission into the NAcc shell from the mesolimbic pathway is highly correlated with the magnitude of incentive salience for rewarding stimuli.[43]

Activation of the dorsorostral region of the nucleus accumbens correlates with increases in wanting without concurrent increases in liking.[45] However, dopaminergic neurotransmission into the nucleus accumbens shell is responsible not only for appetitive motivational salience (i.e., incentive salience) towards rewarding stimuli, but also for aversive motivational salience, which directs behavior away from undesirable stimuli.[10][46][47] In the dorsal striatum, activation of D1 expressing MSNs produces appetitive incentive salience, while activation of D2 expressing MSNs produces aversion. In the NAcc, such a dichotomy is not as clear cut, and activation of both D1 and D2 MSNs is sufficient to enhance motivation,[48][49] likely via disinhibiting the VTA through inhibiting the ventral pallidum.[50][51]

Terry Robinson and Kent Berridge's 1993 incentive-sensitization theory proposed that reward contains separable psychological components: wanting (incentive) and liking (pleasure).[52] To explain increasing contact with a certain stimulus such as chocolate, there are two independent factors at work: our desire to have the chocolate (wanting) and the pleasure effect of the chocolate (liking). According to Robinson and Berridge, wanting and liking are two aspects of the same process, so rewards are usually wanted and liked at the same time. However, wanting and liking can also change independently under certain circumstances. For example, rats with disrupted dopamine signaling do not eat (a loss of "wanting" food), yet act as though they still like food when it is given. In another example, activated self-stimulation electrodes in the lateral hypothalamus of rats increase appetite but also cause more aversive reactions to tastes such as sugar and salt; apparently, the stimulation increases wanting but not liking. Such results demonstrate that the reward system of rats includes independent processes of wanting and liking. The wanting component is thought to be controlled by dopaminergic pathways, whereas the liking component is thought to be controlled by opioid-GABA-endocannabinoid systems.[5]

Anti-reward system


Koob and Le Moal proposed that there exists a separate circuit responsible for the attenuation of reward-pursuing behavior, which they termed the anti-reward circuit. This component acts as a brake on the reward circuit, thus preventing the over-pursuit of food, sex, etc. The circuit involves multiple parts of the extended amygdala (the bed nucleus of the stria terminalis and the central nucleus), the nucleus accumbens, and signaling molecules including norepinephrine, corticotropin-releasing factor, and dynorphin.[53] This circuit is also hypothesized to mediate the unpleasant components of stress, and is thus thought to be involved in addiction and withdrawal. While the reward circuit mediates the initial positive reinforcement involved in the development of addiction, the anti-reward circuit later dominates, motivating continued pursuit of the rewarding stimulus via negative reinforcement.[54]

Learning


Rewarding stimuli can drive learning in the form of both classical conditioning (Pavlovian conditioning) and operant conditioning (instrumental conditioning). In classical conditioning, a reward can act as an unconditioned stimulus that, when associated with the conditioned stimulus, causes the conditioned stimulus to elicit both musculoskeletal (in the form of simple approach and avoidance behaviors) and vegetative responses. In operant conditioning, a reward may act as a reinforcer in that it increases or supports actions that lead to itself.[1] Learned behaviors may or may not be sensitive to the value of the outcomes they lead to; behaviors that are sensitive both to the contingency of an outcome on the performance of an action and to the outcome's value are goal-directed, while elicited actions that are insensitive to contingency or value are called habits.[55] This distinction is thought to reflect two forms of learning, model-free and model-based. Model-free learning involves the simple caching and updating of values. In contrast, model-based learning involves the storage and construction of an internal model of events that allows inference and flexible prediction. Although Pavlovian conditioning is generally assumed to be model-free, the incentive salience assigned to a conditioned stimulus is flexible with regard to changes in internal motivational states.[56]
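
The model-free/model-based distinction above maps directly onto standard reinforcement-learning formalisms. The sketch below is a toy illustration (the task, class names, and parameter value are assumptions, not drawn from the literature summarized here): a model-free agent caches one value per action and updates it by a simple delta rule, while a model-based agent stores an internal model of action-outcome relations and revalues actions by lookahead.

```python
# Toy contrast between model-free (cached values) and model-based
# (internal model + lookahead) learning. Task, names, and the
# learning rate are illustrative assumptions.

ALPHA = 0.5  # learning rate for the delta-rule update

class ModelFreeAgent:
    """Caches a value per action; nudges it toward observed reward."""
    def __init__(self, actions):
        self.q = {a: 0.0 for a in actions}

    def update(self, action, reward):
        # Simple caching/updating of values: no model of the world.
        self.q[action] += ALPHA * (reward - self.q[action])

class ModelBasedAgent:
    """Stores action -> outcome and outcome -> reward estimates,
    and derives action values by inference at decision time."""
    def __init__(self):
        self.transition = {}  # action -> outcome state
        self.reward_of = {}   # outcome state -> latest known reward

    def update(self, action, outcome, reward):
        self.transition[action] = outcome
        self.reward_of[outcome] = reward

    def value(self, action):
        # Flexible prediction: if an outcome's value changes, every
        # action leading to it is revalued immediately.
        return self.reward_of.get(self.transition.get(action), 0.0)

mf = ModelFreeAgent(["lever_A", "lever_B"])
mb = ModelBasedAgent()

# Both agents experience: pressing lever_A yields a pellet worth 1.0.
mf.update("lever_A", 1.0)
mb.update("lever_A", "pellet", 1.0)

# The pellet is now devalued (e.g., the animal is sated).
mb.reward_of["pellet"] = 0.0

print(mf.q["lever_A"])      # cached value persists (habit-like): 0.5
print(mb.value("lever_A"))  # revalued via internal model: 0.0
```

Because the model-free agent only changes cached values through direct experience, it keeps responding for the devalued outcome (habit-like), while the model-based agent revalues the action immediately (goal-directed) -- the behavioral signature used experimentally to dissociate the two systems.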

Distinct neural systems are responsible for learning associations between stimuli and outcomes, actions and outcomes, and stimuli and responses. Although classical conditioning is not limited to the reward system, the enhancement of instrumental performance by stimuli (i.e., Pavlovian-instrumental transfer) requires the nucleus accumbens. Habitual and goal directed instrumental learning are dependent upon the lateral striatum and the medial striatum, respectively.[55]

During instrumental learning, opposing changes in the ratio of AMPA to NMDA receptors and in phosphorylated ERK occur in the D1-type and D2-type MSNs that constitute the direct and indirect pathways, respectively.[57][58] These changes in synaptic plasticity, and the accompanying learning, are dependent upon activation of striatal D1 and NMDA receptors. The intracellular cascade activated by D1 receptors involves the recruitment of protein kinase A and, through the resulting phosphorylation of DARPP-32, the inhibition of the phosphatases that deactivate ERK. NMDA receptors activate ERK through a different but interrelated Ras-Raf-MEK-ERK pathway. Alone, NMDA-mediated activation of ERK is self-limiting, as NMDA activation also inhibits the PKA-mediated inhibition of ERK-deactivating phosphatases. However, when the D1 and NMDA cascades are co-activated, they work synergistically, and the resultant activation of ERK regulates synaptic plasticity in the form of spine restructuring, transport of AMPA receptors, regulation of CREB, and increased cellular excitability via inhibition of Kv4.2.[59][60][61]
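
The synergy described above amounts to a molecular coincidence detector: sustained ERK activation requires D1 and NMDA signaling together. A purely qualitative boolean sketch of that logic (an illustrative toy, not a biochemical model; the function name and return labels are assumptions):

```python
def erk_activation(d1_active: bool, nmda_active: bool) -> str:
    """Qualitative sketch of the D1/NMDA coincidence logic.

    - D1 -> PKA -> DARPP-32 phosphorylation -> inhibition of the
      phosphatases that deactivate ERK (disinhibition, no direct drive).
    - NMDA -> Ras-Raf-MEK -> direct ERK drive, but NMDA activity alone
      also blocks the PKA-mediated phosphatase inhibition, so the
      response is self-limiting.
    """
    if nmda_active and d1_active:
        # Drive from NMDA plus phosphatase inhibition from D1:
        # sustained ERK activation -> synaptic plasticity.
        return "sustained"
    if nmda_active:
        # Drive without phosphatase inhibition: transient only.
        return "transient"
    # Phosphatase inhibition without drive (or neither): ERK inactive.
    return "inactive"

print(erk_activation(d1_active=True, nmda_active=True))  # sustained
```

The AND-gate structure is why learning in this circuit requires the conjunction of dopaminergic (D1) and glutamatergic (NMDA) input rather than either alone.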

Disorders


Addiction


Overexpression of ΔFosB (DeltaFosB), a gene transcription factor, in the D1-type medium spiny neurons of the nucleus accumbens is the crucial common factor among virtually all forms of addiction (i.e., behavioral addictions and drug addictions) that induces addiction-related behavior and neural plasticity.[62][63][64][65] In particular, ΔFosB promotes self-administration, reward sensitization, and reward cross-sensitization effects among specific addictive drugs and behaviors.[62][63][64][66][67] Certain epigenetic modifications of histone protein tails (i.e., histone modifications) in specific regions of the brain are also known to play a crucial role in the molecular basis of addictions.[65][68][69][70]

Addictive drugs and behaviors are rewarding and reinforcing (i.e., are addictive) due to their effects on the dopamine reward pathway.[14][71]

The lateral hypothalamus and medial forebrain bundle have been the most frequently studied brain-stimulation reward sites, particularly in studies of the effects of drugs on brain-stimulation reward.[72] The neurotransmitter system that has been most clearly identified with the habit-forming actions of drugs of abuse is the mesolimbic dopamine system, with its efferent targets in the nucleus accumbens and its local GABAergic afferents. The reward-relevant actions of amphetamine and cocaine are in the dopaminergic synapses of the nucleus accumbens and perhaps the medial prefrontal cortex. Rats also learn to lever-press for cocaine injections into the medial prefrontal cortex, which works by increasing dopamine turnover in the nucleus accumbens.[73][74] Nicotine infused directly into the nucleus accumbens also enhances local dopamine release, presumably by a presynaptic action on the dopaminergic terminals of this region. Nicotinic receptors localize to dopaminergic cell bodies, and local nicotine injections increase dopaminergic cell firing that is critical for nicotinic reward.[75][76] Other habit-forming drugs are likely to decrease the output of medium spiny neurons as well, despite activating dopaminergic projections. For opiates, the lowest-threshold site for reward effects involves actions on GABAergic neurons in the ventral tegmental area, with a secondary site of opiate-rewarding actions on the medium spiny output neurons of the nucleus accumbens. Thus the following form the core of the currently characterized drug-reward circuitry: GABAergic afferents to the mesolimbic dopamine neurons (primary substrate of opiate reward), the mesolimbic dopamine neurons themselves (primary substrate of psychomotor stimulant reward), and GABAergic efferents to the mesolimbic dopamine neurons (a secondary site of opiate reward).[72]

Motivation


Dysfunctional motivational salience appears in a number of psychiatric symptoms and disorders. Anhedonia, traditionally defined as a reduced capacity to feel pleasure, has been re-examined as reflecting blunted incentive salience, as most anhedonic populations exhibit intact "liking".[77][78] On the other end of the spectrum, heightened incentive salience that is narrowed to specific stimuli is characteristic of behavioral and drug addictions. In the case of fear or paranoia, dysfunction may lie in elevated aversive salience.[79] In the modern literature, anhedonia is associated with two proposed forms of pleasure, "anticipatory" and "consummatory".

Neuroimaging studies across diagnoses associated with anhedonia have reported reduced activity in the OFC and ventral striatum.[80] One meta-analysis reported that anhedonia was associated with reduced neural response to reward anticipation in the caudate nucleus, putamen, nucleus accumbens, and medial prefrontal cortex (mPFC).[81]

Reward system development is particularly important in adolescents, as they can be prone to increased risk-taking behaviors, substance use disorders, and mood dysregulation.[82] Due to dopamine's role in reward processing, it can be linked to addictive behavior that may arise in adolescence. Risk-taking behaviors and anticipation of monetary reward have been found to increase activity in the ventral striatum, a key region highlighted in the development of reward pathways.[82]

Mood disorders


Certain types of depression are associated with reduced motivation, as assessed by willingness to expend effort for reward. These abnormalities have been tentatively linked to reduced activity in areas of the striatum, and while dopaminergic abnormalities are hypothesized to play a role, most studies probing dopamine function in depression have reported inconsistent results.[83][84] Although postmortem and neuroimaging studies have found abnormalities in numerous regions of the reward system, few findings are consistently replicated. Some studies have reported reduced NAcc, hippocampus, medial prefrontal cortex (mPFC), and orbitofrontal cortex (OFC) activity, as well as elevated basolateral amygdala and subgenual cingulate cortex (sgACC) activity, during tasks related to reward or positive stimuli. These neuroimaging abnormalities are complemented by only a small body of postmortem research, which suggests reduced excitatory synapses in the mPFC.[85] Reduced activity in the mPFC during reward-related tasks appears to be localized to more dorsal regions (i.e., the pregenual cingulate cortex), while the more ventral sgACC is hyperactive in depression.[86]

Attempts to investigate the underlying neural circuitry in animal models have also yielded conflicting results. Two paradigms are commonly used to simulate depression, chronic social defeat stress (CSDS) and chronic mild stress (CMS), although many others exist. CSDS produces reduced preference for sucrose, reduced social interaction, and increased immobility in the forced swim test. CMS similarly reduces sucrose preference and produces behavioral despair, as assessed by the tail suspension and forced swim tests. Animals susceptible to CSDS exhibit increased phasic VTA firing, and inhibition of VTA-NAcc projections attenuates the behavioral deficits induced by CSDS.[87] However, inhibition of VTA-mPFC projections exacerbates social withdrawal. On the other hand, CMS-associated reductions in sucrose preference and immobility were attenuated and exacerbated by VTA excitation and inhibition, respectively.[88][89] Although these differences may be attributable to different stimulation protocols or poor translational paradigms, variable results may also lie in the heterogeneous functionality of reward-related regions.[90]

Optogenetic stimulation of the mPFC as a whole produces antidepressant effects. This effect appears localized to the rodent homologue of the pgACC (the prelimbic cortex), as stimulation of the rodent homologue of the sgACC (the infralimbic cortex) produces no behavioral effects. Furthermore, deep brain stimulation in the infralimbic cortex, which is thought to have an inhibitory effect, also produces an antidepressant effect. This finding is congruent with the observation that pharmacological inhibition of the infralimbic cortex attenuates depressive behaviors.[90]

Schizophrenia


Schizophrenia is associated with deficits in motivation, commonly grouped with other negative symptoms such as reduced spontaneous speech. The experience of "liking" is frequently reported to be intact,[91] both behaviorally and neurally, although results may be specific to certain stimuli, such as monetary rewards.[92] Furthermore, implicit learning and performance on simple reward-related tasks are also intact in schizophrenia.[93] Rather, deficits in the reward system are apparent during reward-related tasks that are cognitively complex. These deficits are associated with abnormal striatal and OFC activity, as well as abnormalities in regions associated with cognitive functions, such as the dorsolateral prefrontal cortex (DLPFC).[94]

Attention deficit hyperactivity disorder


In those with ADHD, core aspects of the reward system are underactive, making it challenging to derive reward from everyday activities. Those with the disorder experience a boost of motivation after a high-stimulation behavior triggers a release of dopamine; in the aftermath of that boost and reward, the return to baseline levels results in an immediate drop in motivation.[95]

People with more ADHD-related behaviors show weaker brain responses to reward anticipation (though not to reward delivery), especially in the nucleus accumbens. While the initial boost of motivation and dopamine release described above still occurs, there is a higher risk of a noticeable subsequent drop in motivation.

Impairments of dopaminergic and serotonergic function are said to be key factors in ADHD.[96] These impairments can lead to executive dysfunction such as dysregulation of reward processing and motivational dysfunction, including anhedonia.[97]

History

Skinner box

The first clue to the presence of a reward system in the brain came with an accidental discovery by James Olds and Peter Milner in 1954. They discovered that rats would perform behaviors, such as pressing a bar, to administer a brief burst of electrical stimulation to specific sites in their brains. This phenomenon is called intracranial self-stimulation or brain stimulation reward. Typically, rats will press a lever hundreds or thousands of times per hour to obtain this brain stimulation, stopping only when exhausted. While trying to teach rats to solve problems and run mazes, Olds and Milner found that stimulation of certain brain regions seemed to give the animals pleasure. Similar experiments with humans produced comparable results. The explanation for why animals engage in a behavior that has no survival value for themselves or their species is that the brain stimulation activates the system underlying reward.[98]

In a fundamental discovery made in 1954, researchers James Olds and Peter Milner found that low-voltage electrical stimulation of certain regions of the brain of the rat acted as a reward in teaching the animals to run mazes and solve problems.[99][failed verification][100] It seemed that stimulation of those parts of the brain gave the animals pleasure,[99] and in later work humans reported pleasurable sensations from such stimulation.[citation needed] When rats were tested in Skinner boxes where they could stimulate the reward system by pressing a lever, the rats pressed for hours.[100] Research in the next two decades established that dopamine is one of the main chemicals aiding neural signaling in these regions, and dopamine was suggested to be the brain's "pleasure chemical".[101]

More recently, in 2018, Ivan De Araujo and colleagues used nutrients inside the gut to stimulate the reward system via the vagus nerve.[102]

Earlier history


Ivan Pavlov was a physiologist who used rewards to study classical conditioning in the late 19th century. Pavlov rewarded dogs with food after they had heard a bell or another stimulus, so that the dogs associated the food (the reward) with the bell (the stimulus).[103] Around the same time, Edward Thorndike used rewards to study operant conditioning. He began by putting cats in a puzzle box with food placed outside, so that the cats were motivated to escape. The cats worked to get out of the puzzle box to reach the food. Thorndike found that the cats continued to attempt to escape the box even when no food was present, and he used the rewards of food and freedom to study how the cats learned to escape.[104]

Other species


Animals quickly learn to press a bar to obtain an injection of opiates directly into the midbrain tegmentum or the nucleus accumbens. The same animals do not work to obtain the opiates if the dopaminergic neurons of the mesolimbic pathway are inactivated. From this perspective, animals, like humans, engage in behaviors that increase dopamine release.

Kent Berridge, a researcher in affective neuroscience, found that sweet (liked) and bitter (disliked) tastes produced distinct orofacial expressions, and that these expressions were similarly displayed by human newborns, orangutans, and rats. This was evidence that pleasure (specifically, liking) has objective features and is essentially the same across various animal species. Most neuroscience studies had suggested that the more dopamine a reward releases, the more effective the reward is; this effectiveness is termed the hedonic impact, which can vary with the effort required for the reward and with the reward itself. Berridge discovered, however, that blocking dopamine systems did not change the positive reaction to something sweet (as measured by facial expression); in other words, the hedonic impact remained unchanged. This discounted the conventional assumption that dopamine mediates pleasure, and the data remained constant even with more intense dopamine alterations.[105] However, a clinical study from January 2019 that assessed the effect of a dopamine precursor (levodopa), an antagonist (risperidone), and a placebo on reward responses to music – including the degree of pleasure experienced during musical chills, as measured by changes in electrodermal activity as well as subjective ratings – found that manipulating dopamine neurotransmission bidirectionally regulates pleasure cognition (specifically, the hedonic impact of music) in human subjects.[106][107] This research demonstrated that increased dopamine neurotransmission acts as a sine qua non condition for pleasurable hedonic reactions to music in humans.[106][107]

Berridge developed the incentive salience hypothesis to address the wanting aspect of rewards. It explains the compulsive drug use of addicts even when the drug no longer produces euphoria, and the cravings experienced even after withdrawal is complete. Some addicts respond to certain stimuli because of drug-induced neural changes; this sensitization of the brain resembles the effects of dopamine in that both wanting and liking reactions occur. Human and animal brains and behaviors show similar changes in their reward systems because these systems are so prominent.[105]

from Grokipedia
The reward system, also known as the mesolimbic system, is a network of interconnected structures and neural pathways that processes rewards, motivates behavior, and reinforces learning by associating environmental stimuli with pleasurable outcomes, primarily through the release of the neurotransmitter dopamine. This system evolved to promote survival-enhancing actions, such as seeking food, water, and social bonds, by generating feelings of pleasure and satisfaction in response to natural reinforcers. Central to its function is the modulation of goal-directed behavior, where dopamine signals not only the receipt of rewards but also their anticipation, enabling adaptive learning and habit formation.

Key components of the reward system include the ventral tegmental area (VTA) in the midbrain, which serves as the primary source of dopamine-producing neurons, and the nucleus accumbens (NAc) in the ventral striatum, a major target region where dopamine release culminates in the subjective experience of reward. The system encompasses two primary pathways: the mesolimbic pathway, connecting the VTA to the NAc and facilitating immediate reward processing and motivation, and the mesocortical pathway, linking the VTA to the prefrontal cortex to support executive functions like planning and impulse control. Additional structures, such as the basolateral amygdala and hippocampus, integrate sensory information, emotional context, and memory to refine reward valuation and prediction error signaling.

Beyond natural rewards, the system plays a critical role in addiction when dysregulated; for instance, addictive substances hijack these pathways by inducing excessive dopamine surges, leading to compulsive behaviors and tolerance. Conversely, underactivation contributes to conditions like depression and anhedonia, impairing motivation, while optimal functioning enhances stress resilience and overall well-being by buffering negative emotional states. Research highlights the system's plasticity, influenced by genetics, environment, and experience, underscoring its importance both in therapeutic interventions and in understanding human behavior.

Introduction

Definition

The reward system is a group of interconnected structures and neural pathways responsible for detecting and processing rewarding stimuli, which in turn reinforce behaviors essential for survival and reproduction by eliciting sensations of pleasure and satisfaction. This system evolved to prioritize actions that promote adaptive outcomes, such as seeking food or social bonds, by associating them with positive affective states. At a high level, the system involves the mesolimbic pathway as a primary circuit, where dopamine acts as the key neuromodulator signaling the salience and incentive value of rewards. Dopamine release in this pathway enhances the drive to pursue rewarding experiences without directly encoding the hedonic pleasure itself.

The psychological foundations of rewards as reinforcers originated in behavioral psychology during the mid-20th century, rooted in operant conditioning theories. Pioneering work by B.F. Skinner in the 1930s and 1940s formalized the idea that positive reinforcers, or rewards, increase the likelihood of repeated actions, laying the groundwork for later neuroscientific explorations of the underlying brain mechanisms. This framework intersected with neuroscience in the 1950s, notably through experiments by James Olds and Peter Milner demonstrating that rats would avidly self-administer electrical stimulation to specific brain sites, such as the septal area, revealing a centralized reward architecture.

Functions and significance

The reward system plays a fundamental biological role in promoting essential behaviors through reinforcement mechanisms. It drives individuals to seek out and repeat actions associated with positive outcomes, such as consuming nutritious food, engaging in reproductive activities, and forming social bonds, thereby enhancing fitness and perpetuating survival. For instance, activation of this system reinforces feeding behaviors by associating nutrient intake with pleasurable sensations, motivating foraging and energy acquisition in resource-scarce environments. Similarly, it facilitates reproduction by linking sexual interactions to rewarding experiences, increasing the likelihood of procreation, while social bonding rewards, such as those from affiliation and cooperation, support group cohesion and protection against threats.

Psychologically, the reward system underpins hedonic experiences, goal-directed behavior, and emotional regulation, shaping how individuals perceive and pursue objectives. It generates feelings of satisfaction from rewarding stimuli, which in turn fuels motivation to anticipate and achieve future goals, as seen in the system's role in encoding reward predictions that guide adaptive decision-making. This process also aids emotional regulation by modulating responses to stressors, promoting resilience through positive reinforcement that buffers against negative affect. For example, neural responses to rewards can predict reductions in depressive symptoms over time, highlighting the system's significance in maintaining psychological well-being.

From an evolutionary perspective, the reward system evolved to provide adaptive value in ancestral environments, particularly in foraging and mating contexts, but can lead to maladaptive outcomes in modern settings. In foraging, it incentivizes efficient resource acquisition by rewarding successful hunts or gatherings, optimizing energy balance and survival in variable habitats. For mating, it reinforces mate selection and pair bonding, ensuring genetic propagation through pleasurable associations with reproductive cues.
However, in contemporary environments abundant with artificial rewards, this system can be pushed beyond its adaptive limits, contributing to overconsumption and dependency. Societally, the reward system's influence manifests in consumerism, technology engagement, and public health issues like obesity, often amplifying maladaptive behaviors through engineered stimuli. In consumerism, marketing exploits reward pathways to foster compulsive purchasing, akin to addiction models in which dopamine surges from acquisitions drive repeated engagement, as evidenced in compulsive buying disorders. Technology, particularly social media, creates reinforcement loops via unpredictable notifications and likes, promoting excessive use and dependency that mirrors substance reward patterns. These dynamics contribute to obesity epidemics by heightening the appeal of hyper-palatable foods, leading to overeating despite satiety signals and posing significant public health challenges.

Neuroanatomy

Core structures

The core structures of the brain's reward system include the ventral tegmental area (VTA), nucleus accumbens (NAc), prefrontal cortex (PFC), amygdala, and hippocampus, which form an interconnected network primarily within the midbrain and forebrain.

The VTA is situated in the midbrain, dorsomedial to the substantia nigra and near the midline on the floor of the midbrain tegmentum. It comprises a heterogeneous population of dopamine, GABA, and glutamate neurons. The NAc occupies the ventral striatum in the basal forebrain, ventromedial to the caudate-putamen. It is divided into core and shell subregions, with the core featuring more structured neuronal layering. The PFC, particularly its orbital and medial divisions, lies at the anterior portion of the frontal lobe, encompassing Brodmann areas 10, 11, 12, 13, 14, 24, 25, 32, and 47. These regions integrate sensory and reward-related inputs for higher-order processing. The amygdala is an almond-shaped nuclear complex embedded in the medial temporal lobe, forming part of the limbic system with basolateral, central, and medial nuclei; it resides anterior to the hippocampus. The hippocampus is a curved structure within the medial temporal lobe that includes the dentate gyrus, cornu ammonis fields, and subiculum. It lies adjacent to the entorhinal cortex and fimbria.

These structures exhibit basic connectivity, such as dense projections from the VTA to the NAc shell and core, as well as to the PFC, amygdala, and hippocampus, forming the foundational mesolimbic and mesocortical links. The NAc receives inputs from the PFC and amygdala, while the hippocampus sends efferents to the NAc and VTA.

Major pathways

The mesolimbic pathway constitutes a core neural circuit in the reward system, originating from dopamine neurons in the ventral tegmental area (VTA) and projecting primarily to the nucleus accumbens (NAc) in the ventral striatum. This pathway facilitates the transmission of reward signals, particularly in the anticipation of pleasurable stimuli, by modulating activity in limbic structures that integrate sensory and motivational inputs. Dopamine serves as the primary neurotransmitter along this route, enabling phasic bursts that encode predictive reward value. Connectivity between the VTA and NAc involves dense axonal projections that synapse onto medium spiny neurons, allowing for rapid signal propagation essential to goal-directed behaviors.

The mesocortical pathway extends from the VTA to various regions of the prefrontal cortex (PFC), including the orbitofrontal and anterior cingulate cortices, forming a circuit that links reward processing with higher-order cognitive functions. This pathway supports executive control over reward evaluation, such as assessing long-term outcomes and inhibiting impulsive responses, through reciprocal connections that allow feedback from cortical areas to modulate VTA activity. Signal flow in this circuit emphasizes top-down regulation, where PFC neurons influence dopamine release to refine decision-making based on contextual reward information.

The nigrostriatal pathway arises from dopamine neurons in the substantia nigra and targets the dorsal striatum, comprising the caudate nucleus and putamen, to coordinate motor and associative aspects of reward-guided actions. This circuit plays a key role in habit formation by strengthening stimulus-response associations that become automated over repeated reward experiences, with projections forming loops that integrate sensory cues from the cortex and thalamus. Unlike the mesolimbic route, its connectivity prioritizes sensorimotor circuitry, enabling the consolidation of rewarded behaviors into efficient routines.
Dynamics within these major pathways are governed by synaptic plasticity mechanisms, such as long-term potentiation (LTP), which enhance connectivity and signal efficacy in response to reward-related activity. LTP in the mesolimbic pathway, for instance, occurs at glutamatergic synapses onto NAc neurons, driven by coincident dopamine and glutamate release, strengthening the encoding of reward prediction errors. Similar plasticity in the mesocortical and nigrostriatal pathways supports adaptive modifications, allowing circuits to recalibrate based on experience without altering core anatomical projections.

Neurotransmitters involved

Dopamine serves as the primary neurotransmitter in the brain's reward circuitry, synthesized in dopaminergic neurons of the ventral tegmental area (VTA) from the amino acid tyrosine via the rate-limiting enzyme tyrosine hydroxylase, followed by aromatic L-amino acid decarboxylase. These VTA neurons release dopamine into key reward-related regions such as the nucleus accumbens through the mesolimbic pathway. Dopamine signaling occurs via two main receptor families: D1-like receptors (D1 and D5), which are Gs-coupled and excitatory, and D2-like receptors (D2, D3, and D4), which are Gi-coupled and inhibitory. Dopamine release patterns include tonic release, which maintains baseline extracellular levels for sustained modulation, and phasic release, characterized by brief bursts in response to salient stimuli, enabling rapid signaling of reward prediction errors. D2 autoreceptors, located on somata, dendrites, and terminals, provide negative feedback by inhibiting further dopamine synthesis and release upon activation, thereby regulating overall dopaminergic tone in reward processing.

Endogenous opioids, such as enkephalins, contribute to the hedonic aspect of reward by binding to mu- and delta-opioid receptors, primarily in the nucleus accumbens, to enhance pleasurable sensations during reward consumption. Serotonin modulates reward valuation by influencing the perceived value of rewards, with serotonergic neurons in the raphe nuclei projecting to reward areas to adjust motivational responses through 5-HT1B and 5-HT2A receptors. Glutamate acts as the principal excitatory neurotransmitter in reward circuits, driving dopamine neuron activity in the VTA via ionotropic receptors (AMPA and NMDA) that facilitate synaptic potentiation and prediction error signals. In contrast, GABA maintains inhibitory balance within the reward system, with GABAergic interneurons in the VTA and nucleus accumbens suppressing excessive excitation to prevent overactivation during reward processing.
Recent studies highlight the role of endocannabinoids, such as 2-arachidonoylglycerol (2-AG) and anandamide, in fine-tuning reward signals through CB1 receptors on presynaptic terminals, where they modulate GABA and glutamate release in the VTA to refine the dopaminergic encoding of reward value and motivation.

Mechanisms of reward processing

Wanting and liking distinction

The distinction between "wanting" and "liking" represents a core dissociation in reward processing, where "wanting" refers to the incentive motivation or desire to approach and pursue a reward, primarily driven by dopamine signaling in the mesolimbic pathway. In contrast, "liking" denotes the hedonic pleasure or sensory enjoyment derived from consuming the reward itself, mediated mainly by opioid systems within specific hedonic hotspots. This framework, developed by Berridge and colleagues, underscores that while wanting and liking often co-occur for natural rewards like food, they can be neurologically and behaviorally separated.

Neurologically, wanting is attributed to the attribution of incentive salience via the mesolimbic dopamine system, originating from the ventral tegmental area and projecting to the nucleus accumbens (NAc) and beyond, which amplifies the motivational pull of reward cues without necessarily enhancing pleasure. Liking, however, arises from a more restricted set of opioid-sensitive hedonic hotspots, including the medial shell of the NAc and the posterior ventral pallidum, where mu-opioid receptor stimulation dramatically increases affective reactions to sweetness, elevating hedonic impact by up to 1000% in localized sites. These hotspots form a functional circuit, with reciprocal interactions between the NAc and ventral pallidum required to generate and sustain liking responses.

Experimental evidence for this dissociation comes prominently from studies using taste reactivity tests in rodents, which measure innate facial expressions of pleasure (e.g., tongue protrusions for liking) versus aversion to bitter tastes. In rats depleted of nearly all dopamine via 6-hydroxydopamine (6-OHDA) lesions in the NAc and neostriatum, hedonic liking reactions to sweetness remain intact and even normal in intensity, demonstrating preserved sensory pleasure despite the absence of dopamine.
However, these same dopamine-depleted rats exhibit profound deficits in wanting, such as aphagia (refusal to eat) and a lack of approach behavior toward food rewards, even when hungry, highlighting that dopamine is essential for motivational pursuit but not for hedonic experience.

In humans, functional magnetic resonance imaging (fMRI) studies corroborate this dissociation, showing distinct neural patterns for wanting (craving or anticipation) versus liking (enjoyment during consumption) in contexts like food reward. For instance, exposure to food odors activates wanting-related regions such as the ventral striatum for motivational craving, while actual tasting engages liking-specific areas like the insula and anterior cingulate for hedonic pleasure, with minimal overlap. A 2022 meta-analysis of fMRI data further supports this by distinguishing "wanting" activations in cue-driven incentive networks from homeostatic "needing" signals, aligning with Berridge's model in which hedonic liking remains separable from wanting. Recent 2024 fMRI research on food preferences demonstrates that self-reported craving (wanting) correlates with ventral striatum activity, whereas explicit liking ratings activate distinct hedonic regions like the mid-insula, reinforcing the cross-species validity of the framework.

Anti-reward system

The anti-reward system comprises neural and hormonal mechanisms that counteract excessive activation of the brain's reward circuitry, promoting homeostasis by inducing aversive states during prolonged or intense reward exposure. Key components include the hypothalamic-pituitary-adrenal (HPA) axis, which orchestrates stress responses through glucocorticoid release; the kappa-opioid receptor (KOR) system, activated by endogenous ligands like dynorphin; and the lateral habenula (LHb), a diencephalic structure that signals negative reward prediction errors. The extended amygdala, encompassing the central nucleus of the amygdala and the bed nucleus of the stria terminalis, further integrates these elements to amplify stress-induced aversion.

These components function primarily to generate dysphoria, an unpleasant emotional state that discourages overindulgence in rewarding stimuli and restores behavioral balance. For instance, during stress, dynorphin is released from neurons in the central amygdala and nucleus accumbens, binding to KORs distributed across limbic and midbrain regions, thereby evoking aversive responses that limit reward pursuit. The LHb contributes by inhibiting dopamine neurons upon detection of unfavorable outcomes, enhancing the salience of potential harms over benefits. This system thus serves as an inhibitory counterweight to the facilitatory processes of wanting and liking in reward processing.

Interactions between the anti-reward system and dopamine signaling involve negative feedback loops that dampen reward signaling to foster tolerance. Activation of KORs in the VTA suppresses dopamine release in target regions like the nucleus accumbens, reducing the motivational impact of rewards and promoting aversion. Similarly, HPA axis-mediated glucocorticoid surges enhance KOR expression and LHb excitability, further attenuating dopaminergic transmission and contributing to diminished reward sensitivity over time. These mechanisms ensure that repeated reward exposure leads to adaptive downregulation, preventing unchecked escalation.
To specifically counteract dopamine surges, the brain activates opposing anti-reward systems, including downregulation of dopamine receptors or signaling, which reduces sensitivity to pleasure and contributes to tolerance. Additionally, stress pathways are recruited, involving dynorphin to promote aversion, cortisol release via the HPA axis, and pain signals in overlapping brain regions such as the extended amygdala and nucleus accumbens. These processes lead to negative affective states including anxiety, irritability, restlessness, or intensified craving, thereby balancing excessive reward activation and restoring homeostasis.

Recent studies from 2022 to 2025 have elucidated the anti-reward system's role in addiction and withdrawal states, emphasizing neuroplastic changes in the extended amygdala. For example, persistent opioid exposure upregulates KOR signaling in the central amygdala, intensifying dynorphin-mediated aversion during withdrawal and altering circuit connectivity to heighten negative affective states. In chronic pain models, LHb hyperactivity, driven by HPA axis hyperactivity, amplifies anti-reward signals via projections to the extended amygdala, sustaining dysphoric responses that interfere with pain modulation. These findings highlight the extended amygdala's integration of stress and anti-reward pathways, offering insights into therapeutic targets for rebalancing reward processing.

Role in learning and behavior

Reinforcement and learning

The reward system facilitates associative learning by reinforcing behaviors that lead to positive outcomes or the avoidance of negative ones, primarily through classical and operant conditioning mechanisms. In operant conditioning, positive reinforcement occurs when the presentation of a rewarding stimulus, such as food or social approval, increases the likelihood of a preceding behavior, as demonstrated in foundational experiments with animals in which lever-pressing was strengthened by food rewards. Negative reinforcement, conversely, strengthens behaviors by removing or preventing an aversive stimulus, like terminating an electric shock upon a specific action, thereby associating the behavior with relief and promoting its repetition. These processes underpin how the reward system shapes adaptive behavior by linking actions or cues to their hedonic consequences.

A core principle of reward-driven learning is the prediction error hypothesis, which posits that dopamine neurons signal discrepancies between anticipated and actual rewards, guiding updates to behavioral expectations. This mechanism adapts the Rescorla-Wagner model of associative learning, in which learning depends on the difference between predicted and received outcomes, originally formulated to explain how associations form between conditioned stimuli and unconditioned rewards. In neurophysiological terms, unexpected rewards elicit phasic dopamine bursts, fully predicted rewards at the expected time elicit little change in firing, and omitted rewards produce dips in activity, thereby encoding positive and negative prediction errors that refine future predictions. This dopamine-mediated signal propagates through the reward circuitry to facilitate plasticity in downstream regions, enabling organisms to adjust strategies based on environmental feedback.

At the neural level, these prediction errors induce synaptic changes in the nucleus accumbens (NAc) and prefrontal cortex (PFC) through Hebbian learning principles, where correlated pre- and postsynaptic activity strengthens connections.
In the NAc, dopamine modulates long-term potentiation (LTP) and long-term depression (LTD) at glutamatergic synapses from the PFC, allowing reward-associated cues to enhance behavioral responses over time. Hebbian plasticity in these circuits integrates temporal contiguity between stimuli and rewards, as dopamine timing aligns presynaptic inputs with postsynaptic activity to tag eligible synapses for modification. Such adaptations support the consolidation of reward contingencies, transforming transient experiences into enduring behavioral habits.

Human studies using electroencephalography (EEG) and computational modeling provide evidence for these processes in reward prediction during learning tasks. In the EEG literature, event-related potentials (ERPs) like the reward positivity component reflect prediction errors, with larger amplitudes for unexpected gains versus predicted ones, aligning with Rescorla-Wagner model fits to participant choices. Recent EEG investigations of gambling simulations (2023) reveal dynamic sub-second shifts in frontal theta and delta oscillations tied to evolving reward expectations, where mismatches amplify learning signals and influence subsequent bets. Computational models incorporating these EEG-derived prediction errors accurately predict individual learning rates, underscoring the reward system's role in associative plasticity up to 2025.
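The Rescorla-Wagner update described above can be sketched in a few lines of code. This is a minimal illustration, not a model from the cited studies; the learning rate and reward values are arbitrary assumptions.

```python
def rescorla_wagner(v0, rewards, alpha=0.1):
    """Track a cue's predicted value V across trials.

    v0:      initial predicted value of the cue
    rewards: received outcomes per trial (1 = reward, 0 = omission)
    alpha:   learning rate scaling how much each error updates V
    """
    v, history = v0, []
    for r in rewards:
        delta = r - v        # prediction error: received minus expected
        v += alpha * delta   # positive error raises V, negative error lowers it
        history.append(v)
    return history

# A cue repeatedly paired with reward: V climbs toward its asymptote,
# mirroring how phasic dopamine bursts shrink as the reward becomes predicted.
acquisition = rescorla_wagner(0.0, [1] * 20)

# Omitting the reward after training yields negative errors ("dips"),
# and V declines again (extinction).
extinction = rescorla_wagner(acquisition[-1], [0] * 20)
```

After 20 rewarded trials with alpha = 0.1, V reaches 1 − 0.9^20 ≈ 0.88, so the error on the next rewarded trial is small, consistent with the attenuated dopamine response to fully predicted rewards.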

Motivation and decision-making

The reward system plays a pivotal role in distinguishing between intrinsic and extrinsic motivation, where intrinsic motivation arises from the inherent satisfaction of an activity, while extrinsic motivation stems from external incentives like monetary rewards. Neuroimaging studies demonstrate that extrinsic rewards can undermine intrinsic motivation by altering striatal activity, reducing voluntary engagement in tasks once incentives are removed. In sustaining effort toward delayed rewards, the reward system facilitates persistence through dopamine-mediated signaling that enhances the perceived value of future outcomes, countering the tendency to devalue them over time. This process is evident in tasks requiring cognitive effort, where repeated exposure to rewarding outcomes increases the intrinsic valuation of demanding activities, independent of external payoffs.

Decision-making models integrate reward system dynamics with economic theories, such as prospect theory, which posits that individuals weigh potential gains and losses asymmetrically, with losses looming larger than equivalent gains. Dopamine neurons in the reward circuitry encode these valuations by signaling prediction errors that adjust subjective value functions, aligning neural responses with prospect theory's reference-dependent evaluations. Temporal discounting further shapes these choices, as dopamine modulates the preference for immediate smaller rewards over larger delayed ones, reflecting a hyperbolic decline in the perceived value of future rewards. For instance, ventral striatal activity diminishes with increasing delays, prioritizing short-term gratification in value-based selections.

Neural integration between the prefrontal cortex (PFC) and nucleus accumbens (NAc) underpins cost-benefit analysis in motivation and decision-making, forming bidirectional loops that evaluate effort, risks, and rewards. Dopamine efflux in these circuits dynamically fluctuates to represent the net value of options, with PFC-NAc interactions enabling the suppression of impulsive choices in favor of adaptive, goal-directed behaviors.
This circuitry supports effort-related decisions by integrating sensory cues with internal motivational states, ensuring actions align with long-term benefits despite immediate costs. Recent research from 2023 to 2025 highlights the reward system's involvement in social rewards during decision-making, particularly in economic games assessing fairness. Functional MRI studies show that social reward processing, such as receiving equitable distributions in ultimatum games, activates NAc and PFC regions, influencing choices toward fairness over pure self-interest. These findings underscore how social contexts enhance reward valuation, promoting prosocial decisions through integrated neural reward mechanisms.
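The two valuation models mentioned above, hyperbolic temporal discounting and prospect theory's asymmetric value function, can be made concrete with a short sketch. The parameter values here (k, the curvature alpha, and the loss-aversion coefficient) are illustrative assumptions, not estimates from the studies cited.

```python
def hyperbolic_value(amount, delay, k=0.05):
    """Hyperbolic discounting: subjective value falls as delay grows."""
    return amount / (1.0 + k * delay)

def prospect_value(x, alpha=0.88, loss_aversion=2.25):
    """Prospect-theory value function: losses loom larger than equal gains."""
    if x >= 0:
        return x ** alpha
    return -loss_aversion * ((-x) ** alpha)

# Temporal discounting: a larger-later reward can be valued below a
# smaller-sooner one, matching the preference for immediate gratification.
sooner = hyperbolic_value(30, delay=0)    # 30.0
later = hyperbolic_value(100, delay=60)   # 100 / (1 + 0.05 * 60) ≈ 25.0

# Loss aversion: the same magnitude feels worse as a loss than good as a gain.
gain = prospect_value(50)
loss = prospect_value(-50)
```

With these parameters the delayed reward of 100 is valued below the immediate 30, and |value(−50)| exceeds value(+50), reproducing the asymmetry that prospect theory describes.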

Clinical and pathological aspects

Addiction

Addiction arises from the dysregulation of the brain's reward system, where repeated exposure to substances or behaviors hijacks the mesolimbic pathway, leading to compulsive use despite adverse consequences. This process transforms natural reward processing into a pathological cycle characterized by three main stages: binge/intoxication, withdrawal/negative affect, and preoccupation/anticipation.

In the binge/intoxication stage, drugs or addictive behaviors trigger a surge in dopamine release within the nucleus accumbens (NAc), producing intense euphoria and reinforcing the behavior through enhanced incentive salience. This dopamine surge from drugs is substantially larger than those elicited by natural rewarding activities such as sex, resulting in a more intense short-term sensation of pleasure than these natural experiences provide. However, this heightened pleasure is short-lived and leads to rapid tolerance, as the brain adapts by reducing dopamine sensitivity, requiring progressively higher doses to achieve similar effects. Over time, chronic drug use downregulates the dopamine system, impairing the brain's ability to derive pleasure from normal activities like sex, food, or social interactions, which come to feel dull or unpleasurable, and contributes to long-term risks including addiction, anxiety, depression, organ damage, and death.

The withdrawal/negative affect stage involves activation of the anti-reward system, primarily in the extended amygdala, resulting in dysphoria, anxiety, and aversion that drives further consumption to alleviate discomfort. Finally, the preoccupation/anticipation stage is marked by intense craving, mediated by the "wanting" mechanism in prefrontal and striatal circuits, where cues associated with the reward elicit persistent anticipation and relapse vulnerability. Chronic addiction induces profound neuroadaptations in reward circuitry, altering sensitivity to both natural and drug-induced rewards.
A key change is the downregulation of dopamine D2 receptors in the striatum, including the NAc, which reduces the brain's responsiveness to non-drug rewards and perpetuates reliance on the addictive stimulus to achieve pleasure. Concurrently, repeated drug exposure leads to sensitization of glutamatergic transmission in the NAc, particularly involving AMPA receptor trafficking and synaptic strengthening, which enhances cue-induced craving and compulsive seeking behaviors. These adaptations shift the reward system from homeostatic balance to a hypodopaminergic state, in which tolerance develops and the threshold for reward activation rises, contributing to the persistence of addiction. Behavioral addictions, such as gambling disorder and internet gaming disorder, exhibit neurobiological parallels to substance use disorders, involving similar dysregulation of dopamine-mediated reward anticipation and habit formation in the ventral striatum. In the DSM-5-TR, gambling disorder is classified as the sole formal behavioral addiction, reflecting its alignment with substance use disorder criteria through shared features like tolerance, withdrawal, and loss of control, while internet gaming disorder remains a condition for further study pending additional validation. Recent 2024 reviews highlight ongoing refinements in diagnostic criteria, emphasizing functional impairments and cue-reactivity in reward circuits for these non-substance conditions. Treatment strategies targeting reward system dysregulation offer promising interventions, particularly pharmacotherapies that modulate key circuits to restore balance. For opioid addiction, naltrexone, an opioid receptor antagonist, blocks the rewarding effects of opioids by inhibiting mu-opioid receptor signaling in the NAc, thereby reducing craving and relapse rates without itself producing euphoria. Similar approaches, including dopamine modulators and glutamate stabilizers, aim to counteract neuroadaptations across addiction types, though efficacy varies by stage and individual factors.

Mood and anxiety disorders

Anhedonia, defined as a diminished capacity for experiencing pleasure and motivation toward rewards, is a central feature of major depressive disorder (MDD) and is closely tied to dysfunction in the brain's reward circuitry. In MDD, this manifests as reduced "liking" (the hedonic impact of rewards) and "wanting" (the incentive salience driving pursuit of rewards), primarily due to blunted dopamine release from the ventral tegmental area (VTA) and its projections to limbic structures like the nucleus accumbens. Neuroimaging studies, including functional MRI, have consistently shown hypoactivation of the VTA-striatal pathway during reward anticipation and consumption tasks in individuals with MDD, correlating with anhedonia severity and overall depressive symptoms. This dopaminergic hypofunction contributes to impaired reward learning, with patients exhibiting slower acquisition of reward-associated behaviors than healthy controls. In bipolar disorder, reward system alterations display state-dependent patterns, in contrast to the more uniform hypoactivity seen in unipolar MDD. During manic or hypomanic phases, individuals often exhibit reward hypersensitivity, characterized by exaggerated dopamine signaling in the VTA-nucleus accumbens pathway, which drives heightened reward pursuit, risk-taking, and goal-directed activity. This aligns with the Behavioral Approach System (BAS) dysregulation model, in which over-responsivity to reward cues precipitates manic episodes. Conversely, during depressive episodes in bipolar disorder, reward processing mirrors MDD, with VTA hypoactivity and reduced striatal responses to positive stimuli, leading to anhedonia and motivational deficits that exacerbate mood lows. These bipolar-specific dynamics highlight the reward system's role in mood polarity, with dopaminergic fluctuations underpinning the disorder's cyclical nature. Anxiety disorders involve an imbalance in which the anti-reward system dominates, suppressing positive reward signals and amplifying aversive learning.
Structures such as the lateral habenula and extended amygdala activate in response to negative outcomes, inhibiting VTA dopamine neurons and promoting avoidance behaviors over reward-seeking. This leads to enhanced conditioning to aversive stimuli, as seen in generalized anxiety disorder, where patients show heightened sensitivity to punishment cues and reduced differentiation between rewards and threats in decision-making tasks. Consequently, the overpowering anti-reward mechanisms contribute to persistent worry and behavioral inhibition, with neuroimaging revealing reduced ventral striatal activation during mixed reward-aversion paradigms. Longitudinal studies from 2021 to 2025 using positron emission tomography (PET) have provided evidence that reward blunting serves as a biomarker of relapse risk in mood disorders.

Neurodevelopmental disorders

In attention-deficit/hyperactivity disorder (ADHD), dysregulation of dopamine signaling contributes to reduced activity in brain reward centers, producing a reward deficiency that manifests as intolerance of delayed rewards. This altered sensitivity to reward timing impairs sustained attention and motivation: children with ADHD exhibit abnormal responses to delayed reinforcement compared with neurotypical peers, linked to disruptions in dopamine signaling dynamics. Stimulant medications, such as methylphenidate and amphetamines, address this by inhibiting dopamine reuptake and enhancing release in reward pathways, thereby improving behavioral symptoms and reward processing efficiency in affected individuals. In autism spectrum disorder (ASD), reward system alterations particularly affect social processing, with diminished activation in circuits connecting the temporoparietal junction (TPJ) to the nucleus accumbens (NAc), resulting in impaired valuation of social stimuli like faces or voices. The TPJ, a key node in social cognition, fails to integrate with NAc-mediated reward signals, reducing the motivational drive for interpersonal interactions and contributing to core social deficits. Functional imaging studies confirm hypoactivation in these pathways during social reward tasks, distinguishing ASD from other conditions by its specificity to human-related rewards rather than reward processing in general. Although schizophrenia is typically adult-onset, early neurodevelopmental disruptions in reward prediction error (RPE) signaling serve as risk factors, with aberrant responses to unexpected rewards evident in individuals at clinical high risk for psychosis. These early RPE abnormalities, detectable in adolescence, reflect immature wiring in meso-cortico-striatal circuits that heightens vulnerability to later psychotic symptoms by misassigning salience to neutral stimuli. Prenatal and perinatal factors that exacerbate these prediction errors during critical developmental windows further link them to schizophrenia's neurodevelopmental origins.
Recent genetic research, including 2025 studies, has identified variants in reward-related genes such as DRD4 as shared risk factors across ADHD, ASD, and schizophrenia, influencing dopamine receptor function and early reward circuitry development. For instance, the DRD4 7-repeat allele correlates with heightened susceptibility to autistic traits in ADHD populations and broader neuropsychiatric overlaps, underscoring polygenic influences on reward hypersensitivity or deficiency. These findings highlight how common genetic variants disrupt reward circuitry during neurodevelopment, increasing disorder risk.

Historical development

Early discoveries

Foundational behavioral investigations into the brain's reward mechanisms began in the mid-20th century with experiments demonstrating that direct electrical stimulation of specific brain regions could serve as a powerful reinforcer of voluntary actions. In 1954, psychologists James Olds and Peter Milner at McGill University implanted electrodes in the brains of rats and observed that animals with placements in the septal area would repeatedly press a lever to self-administer brief pulses of electrical stimulation, often thousands of times per hour, forgoing food, water, or rest. This serendipitous finding, initially encountered during studies of avoidance learning, revealed discrete "pleasure centers" where stimulation elicited approach behaviors and reinforced learning, contrasting sharply with non-rewarding or aversive sites elsewhere in the brain. Subsequent mapping experiments confirmed that self-stimulation thresholds were lowest in the septal region, establishing it as a core substrate for positive reinforcement and laying the groundwork for understanding intrinsic reward pathways. Building on these behavioral observations, early anatomical studies in the 1950s and 1960s delineated the neural structures involved in reward processing, focusing on subcortical regions before dopamine's central role was identified. Olds extended his work to systematically explore the hypothalamus, finding that electrical stimulation of its lateral portions not only sustained self-stimulation but also elicited consummatory behaviors such as eating and drinking, suggesting an integration of drive and reward functions. The septal area and lateral hypothalamus emerged as key nodes, with lesions in these regions disrupting reward-seeking without broadly impairing motor function, as shown in maze-learning tasks in which animals failed to pursue rewarded goals. These pre-dopamine-era findings highlighted the limbic system's role in mediating the motivational effects of rewards, influencing later conceptualizations of distributed reward circuits.
Pharmacological probes in the 1960s further illuminated the neurochemical underpinnings of reward by linking catecholamines, particularly norepinephrine, to motivational enhancement. Researchers demonstrated that amphetamines, which increase catecholamine release, potently facilitated intracranial self-stimulation rates in rats, with effects most pronounced at low doses that selectively boosted hypothalamic and septal responding. Studies by Larry Stein and others showed that amphetamine's rewarding properties mimicked those of electrical stimulation, suggesting catecholaminergic systems as excitatory modulators of the brain's reward circuitry, independent of peripheral effects. This work shifted attention from purely electrical to biochemical mechanisms, establishing amphetamines as tools for dissecting reward pathways and foreshadowing the involvement of monoamines in reward processing. In the late 1960s and 1970s, pivotal research identified dopamine as the primary neurotransmitter mediating reward. Early pharmacological evidence showed that dopamine agonists enhanced self-stimulation while antagonists reduced it, challenging the initial emphasis on norepinephrine. Key studies using 6-hydroxydopamine (6-OHDA) to selectively deplete dopamine neurons, such as those by Ulf Ungerstedt in 1971, demonstrated that damage to mesolimbic dopaminergic pathways abolished intracranial self-stimulation and impaired reward-seeking behaviors, with no equivalent effects from noradrenergic depletion. This established the mesolimbic dopamine system, originating in the ventral tegmental area and projecting to the nucleus accumbens, as the core neural substrate for reward, integrating prior behavioral findings into a neurochemical framework. A pivotal milestone of the 1970s was the discovery of endogenous opioid peptides, which provided a biochemical basis for natural reward and analgesia. In 1975, John Hughes and Hans Kosterlitz isolated enkephalins from porcine brain tissue, identifying them as pentapeptides that bound opiate receptors and produced morphine-like effects in behavioral assays, including antinociception and reward facilitation.
Concurrently, Choh Hao Li's group purified beta-endorphin from pituitary extracts, revealing its potent activity in modulating pain and pleasure responses, as evidenced by its ability to substitute for exogenous opiates in self-administration paradigms. These findings integrated opioid signaling into the reward system, explaining phenomena such as the euphoric effects of stress or exercise and expanding the framework beyond catecholamines to include peptidergic mechanisms.

Key theoretical advancements

In the 1990s, a pivotal advancement came from Wolfram Schultz's work, which proposed that dopamine neurons function as a "teaching signal" by encoding reward prediction errors, the difference between expected and actual rewards, to guide learning and adaptation. This theory, rooted in reinforcement learning principles, demonstrated that phasic dopamine bursts occur at unexpected rewards, while dips signal negative prediction errors, thereby updating value representations in downstream circuits such as the striatum. Empirical evidence from primate recordings showed dopamine responses shifting from reward delivery to predictive cues over learning trials, establishing this as a core mechanism for associative learning beyond mere hedonic signaling. Building on this, Kent Berridge and colleagues introduced the wanting/liking framework in the mid-1990s, dissociating motivation ("wanting") from sensory pleasure ("liking") in reward processing. Dopamine was implicated primarily in "wanting," driving pursuit of rewards through attribution of incentive salience, while opioids mediated "liking" via hedonic hotspots in the nucleus accumbens. This dissociation was supported by lesion and pharmacological studies showing that dopamine depletion impairs motivation without abolishing pleasure reactions, such as affective facial expressions in rats, thus refining the understanding of reward as multifaceted rather than unitary. Computational models further advanced the field by integrating these neurobiological insights with reinforcement learning algorithms, notably temporal difference (TD) learning, to simulate reward circuit dynamics. In TD models, agents update state- and action-value functions based on prediction errors, mirroring dopamine's role in cortico-striatal loops to optimize behavior under uncertainty. Computational applications from the 2000s onward fitted these models to electrophysiological data, revealing how dopamine modulates striatal learning rates; recent 2020s integrations with deep reinforcement learning have extended this to hierarchical and model-based control, enhancing predictions of complex behaviors like habit formation.
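The temporal difference account above can be illustrated with a minimal toy simulation (state names, learning rate, and discount factor here are arbitrary illustrative choices, not values from the literature). It shows the defining Schultz-style result: the prediction error δ = r + γV(s′) − V(s) is large at reward delivery early in training, and after learning it shifts to the unexpected predictive cue while the reward-time error falls toward zero.

```python
GAMMA, ALPHA = 0.95, 0.1          # discount factor and learning rate (arbitrary)

# Learned state values for one conditioning trial: cue -> delay -> reward.
# The terminal (post-reward) state has value 0 by definition.
V = {"cue": 0.0, "delay": 0.0}

def run_trial(V):
    """Run one trial, updating V in place; return the prediction errors."""
    deltas = {}
    # Cue arrives unexpectedly from an inter-trial baseline whose value
    # is pinned at 0 (the cue itself is never predicted):
    deltas["cue_onset"] = 0.0 + GAMMA * V["cue"] - 0.0
    # cue -> delay transition, no reward yet:
    d = 0.0 + GAMMA * V["delay"] - V["cue"]
    V["cue"] += ALPHA * d
    # delay -> terminal transition, reward of 1.0 delivered:
    d = 1.0 + GAMMA * 0.0 - V["delay"]
    deltas["reward"] = d
    V["delay"] += ALPHA * d
    return deltas

first = run_trial(V)              # naive animal: delta fires at reward only
for _ in range(500):
    last = run_trial(V)           # trained animal: delta has moved to the cue

print(first)  # cue_onset ~ 0.0, reward = 1.0
print(last)   # cue_onset ~ 0.9,  reward ~ 0.0
```

The per-transition update `V[s] += ALPHA * d` is the whole learning rule; the migration of δ from reward to cue falls out of it without any extra machinery, which is why this model maps so cleanly onto the recorded dopamine responses.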
From 2022 to 2025, theoretical emphases have shifted toward multi-modal rewards, incorporating social and cognitive dimensions beyond primary reinforcers, with optogenetics providing causal evidence for circuit-specific integrations. Studies using optogenetic manipulation in mice have shown that VTA dopamine projections to the nucleus accumbens encode social rewards, such as affiliation, by modulating incentive salience in a manner distinct from food-based signals, supporting hybrid models of valuation. Similarly, optogenetic tagging has shown that ventral hippocampal inputs integrate cognitive context with reward history, enabling flexible adaptation to multi-modal contingencies like effortful social interactions. These advances underscore a broader, distributed reward architecture, informed by precise neural control techniques.

Comparative and evolutionary perspectives

In non-human animals

In non-human animals, the reward system has been extensively studied using model organisms to elucidate conserved neural mechanisms underlying motivation and learning. Rodent models, particularly rats, have been pivotal through intracranial self-stimulation (ICSS) paradigms, where animals voluntarily press levers to electrically stimulate brain regions like the medial forebrain bundle, demonstrating robust reward-seeking behavior driven by dopaminergic pathways. This technique, pioneered in the mid-20th century, reveals how activation of the ventral tegmental area (VTA) and nucleus accumbens (NAc) sustains operant responding, providing insights into the circuitry's role in reinforcement without external incentives. In primates, such as rhesus monkeys, social reward studies highlight dopamine's involvement in processing interpersonal interactions; for instance, dopamine neurons in the VTA encode the value of social cues like gaze or grooming, modulating responses in the striatum during cooperative tasks. These findings underscore parallels to human social bonding, with phasic dopamine release signaling unexpected social rewards to reinforce affiliative behaviors. Behavioral parallels across species illustrate the conservation of dopamine-mediated reward processing, from simple foraging to complex cognitive feats. In social insects like ants and bees, dopamine regulates foraging decisions by modulating risk assessment and activity levels; for example, elevated dopamine titers in ant foragers increase trip frequency and exploration of food sources, adapting colony-wide resource acquisition to environmental demands. This mirrors dopaminergic influences in vertebrates, where dopamine facilitates motivated search behaviors. In corvids, such as American crows, tool use for obtaining rewards involves activation of neural circuits, including the ventral tegmental area (a key reward-related region), in proficient individuals, analogous to mammalian reward centers. 
These examples demonstrate dopamine's conserved function in value-based decision-making, scaling from invertebrate appetitive drives to avian problem-solving. Post-2010 advancements in experimental techniques, particularly optogenetics, have enabled causal dissection of reward circuits in rodents such as mice. By expressing light-sensitive opsins in VTA neurons, researchers can precisely activate or inhibit projections to the NAc, revealing how phasic dopamine release drives real-time reward seeking, such as increased sucrose consumption or place preference. These manipulations confirm that dopamine release in the NAc shell causally reinforces behaviors, while its inhibition disrupts them, highlighting circuit-specific contributions to reward processing. Optogenetics has further clarified interactions between the VTA and downstream targets, showing how balanced excitation and inhibition fine-tune reward valuation in freely moving animals. Species variations in reward system anatomy and function are evident when comparing avian and mammalian models, reflecting divergent evolutionary paths yet functional convergence. In mammals, the basal ganglia, including the nucleus accumbens, serve as primary reward hubs with dense dopaminergic innervation from the VTA, whereas in birds the nidopallium caudolaterale (NCL) functions as an analog of the mammalian prefrontal cortex, processing reward predictions through similar dopamine-modulated loops. Avian reward centers exhibit higher neuronal density and more compact circuitry than their mammalian counterparts, enabling efficient integration of sensory and motivational signals in smaller brains. Recent genomic work, including single-cell multiome analyses, has identified conserved enhancer codes in pallial regions across birds and mammals, suggesting shared regulatory mechanisms despite structural differences. These insights point to deep conservation of reward processing, with implications for adaptive strategies across diverse taxa.

Evolutionary origins

The reward system, particularly its dopaminergic components, traces its origins to the emergence of early free-moving animals in the oceans approximately 540 million years ago, during the Cambrian period, when it facilitated essential survival behaviors such as foraging for food, securing territory, and reproducing, enhancing fitness in resource-scarce environments. In early vertebrates, such as the lamprey lineage that diverged over 500 million years ago, these circuits evolved to integrate sensory cues with motivational drive, promoting energy-efficient actions by balancing exploration for potential rewards against conservation of limited caloric resources. Dopamine signaling played a pivotal role in this adaptation, modulating arousal and movement to favor exploitation of reliable food sources while minimizing unnecessary energy expenditure in unpredictable ancestral habitats. Across phyla, the core machinery of the reward system exhibits remarkable genetic conservation, with dopaminergic pathways showing homology from invertebrates such as Caenorhabditis elegans to mammals, underscoring a shared evolutionary blueprint for reward-seeking and learning. In C. elegans, eight dopaminergic neurons regulate behaviors akin to reward prediction and aversion, mirroring the mesolimbic system's functions in higher animals and highlighting how these ancient pathways enabled adaptive responses to environmental stimuli long before the diversification of complex brains. This homology suggests that the reward system's foundational role in motivating survival-oriented actions predates the vertebrate lineage, evolving incrementally to support increasingly complex behavior as nervous systems grew more sophisticated. In humans, the reward system underwent significant expansion, particularly in the prefrontal cortex (PFC), which enlarged dramatically in parallel with other association areas during hominin evolution, enabling the processing of abstract rewards beyond immediate survival needs.
This granular PFC development, unique among primates, facilitated higher-order cognition such as abstract reasoning and social cooperation, integrating reward signals with long-term planning to underpin culture and cumulative knowledge transmission. Such adaptations allowed human ancestors to value symbolic or deferred rewards, like tool-making or alliance-building, which amplified group-level fitness in complex social environments. Contemporary maladaptations of the reward system, including vulnerability to addiction, are explained by the evolutionary mismatch hypothesis, whereby mechanisms honed for scarce ancestral resources are hijacked by abundant modern cues such as calorie-dense foods and psychoactive substances. In calorie-rich environments, hyperstimulation of dopamine pathways overrides self-regulation, leading to compulsive behaviors that were adaptive for energy acquisition in famine-prone settings but are detrimental today.

References
