Abstract
Pavlovian fear conditioning is a fundamental process in both health and disease. We investigate its neural correlates and sources of variability using harmonized functional magnetic resonance imaging data from 2199 individuals across nine countries, including 1888 healthy individuals and 311 with anxiety-related or depressive disorders. Using mega-analysis and normative modeling, we show that fear conditioning consistently engages brain regions within the “central autonomic–interoceptive” or “salience” network. Several task variables strongly modulate activity in these regions, contributing to variability in neural responses. Additionally, brain activation patterns differ between healthy individuals and those with anxiety-related or depressive disorders, with distinct profiles characterizing specific disorders such as post-traumatic stress disorder and obsessive-compulsive disorder. While the neural correlates of fear conditioning are highly generalizable at the population level, variability arises from differences in task design and clinical status, highlighting the importance of methodological diversity in capturing fear learning mechanisms.
Similar content being viewed by others
Introduction
Fear conditioning, also known as threat conditioning, is a psychological paradigm developed over a century ago to study associative learning mechanisms. It remains one of the most widely used and productive experimental models for investigating both normal and pathological fear and anxiety in humans1. Fear conditioning models how the association between an initially neutral stimulus (conditioned stimulus (CS)) and an innately aversive stimulus (unconditioned stimulus, US) is learned. The success of learning in fear conditioning is typically assessed by comparing responses to the fear cue (CS+, paired with the US) and the safety cue (CS-, not paired with the US) across subjective, autonomic, or neural domains. Successful conditioning is indicated by greater responses to the CS+ than to the CS-2. In the brain, this involves activity changes within a “central autonomic–interoceptive” or “salience” network, which in humans includes functionally and anatomically connected regions like the dorsal anterior cingulate cortex (dACC) and the anterior insular cortex (AIC)3. Additionally, fear conditioning has been linked to decreased activity in regions like the ventromedial prefrontal cortex (vmPFC), although such decreases have been less extensively studied3. Although the amygdala plays a crucial role in fear conditioning in rodents4,5,6, and classical lesion studies have implicated the amygdala in human fear conditioning7, this relationship has not been consistently identified in human fMRI studies3,8,9,10,11,12.
Limitations in prior research on the neural correlates of human fear conditioning include the use of small sample sizes (typically n < 30) and the reliance on heterogeneous neuroimaging processing and analytical methods3,13. While group-level meta-analyses can partially address the sample size issue3, individual-level mega-analyses offer additional advantages. These include enhanced statistical power, more precise effect size estimation, standardized preprocessing and analysis techniques, and substantially improved power to detect whether activation is modulated by individual variability -one of the primary goals of the current study14,15,16.
Individual differences, such as sociodemographic factors (e.g., age) and trait variables (e.g., trait anxiety), are likely to modulate the neural correlates of fear conditioning, potentially affecting the generalizability of findings across groups, such as younger versus older adults or individuals with high versus low anxiety13. However, existing research on individual differences has been inconsistent and often hampered by limited sample sizes (n < 5013) or sampling biases17. Moreover, task-specific variables, such as task instructions or characteristics of the US, may also influence conditioning at the behavioral or neural level2,13,18,19. For example, compared to other USs, a tactile electric shock may elicit greater activation in the dACC and the ventral supplementary motor area3. A primary challenge in this field is integrating prior data to accurately assess how individual differences and task variables affect neural outcomes. This complexity arises from variations in adjustable factors and sampling across studies and participants, highlighting the need for methods that can account for and isolate specific sources of variation—such as the normative modeling approach used here. Normative modeling allows us to integrate multiple smaller-scale studies into a common reference space—a standardised baseline against which to statistically quantify individual variations. This approach allows for meaningful comparisons across diverse studies by controlling for multiple sources of variation. As a result, the variance associated with specific variables and individuals can be isolated, quantified, and systematically analysed20. (For details on the underlying mathematics, see refs. 21,22,23; for applications, see refs. 24,25,26,27,28,29).
Fear conditioning has also been used to study the development and persistence of mental health disorders marked by pathological fear, such as anxiety-related disorders1,30,31,32,33, which are highly prevalent and rank among the leading causes of disability worldwide33. However, there is ongoing debate over whether anxiety-related disorders consistently show abnormal fear conditioning at behavioral or neural levels34,35 or if these abnormalities are specific to certain clinical groups—such as post-traumatic stress disorder (PTSD36) but not others, like social anxiety disorder (SAD)37. Inconsistencies maybe due in part to small sample sizes (ns < 100 for anxiety-related disorders as a group, ns < 25 for comparisons among clinical groups). Furthermore, most research in this area has relied on case-control designs and traditional analysis techniques, both of which have limitations that could be addressed through normative modeling. This framework enables statistical inference for individual subjects relative to an expected population pattern, providing a more detailed examination of the heterogeneity underlying group-level analyses20.
In this study with pre-registered hypotheses and analyses (cf. Materials and Methods), we used both mega-analysis and normative modelling to analyse individual-level, harmonized fMRI data acquired during fear-conditioning from 43 samples from 21 laboratories across nine countries (total n = 2199), including both healthy participants and individuals diagnosed with anxiety-related and depressive disorders. First, we assessed the overall neural correlates of fear conditioning in healthy participants to provide a comprehensive delineation of the brain regions underlying human fear conditioning. Based on previous studies, we hypothesized that during fear conditioning, the CS + > CS- contrast would be associated with robust activations in regions such as the dACC, AIC, pre/supplementary motor areas (SMA), and dorsolateral prefrontal cortex (dlPFC), whereas the CS + < CS- contrast would be associated with deactivations in the vmPFC and hippocampus. We expected the mega-analysis to be more sensitive than previous studies in detecting subtle effects in other brain regions not previously (or not consistently) identified. Second, we assessed variation among healthy participants. Given their role in mediating subjective arousal and autonomic expression of fear38, we hypothesised that regions including the vmPFC and the anterior-to-mid cingulate cortex would show the greatest between-subject heterogeneity. Third, we examined how individual differences (e.g., age, trait anxiety) and task variables (e.g., task instructions) influenced this variation. Finally, we explored differences in the neural correlates of fear conditioning between individuals with anxiety-related and depressive disorders and healthy controls, as well as among clinical subgroups (e.g., PTSD, SAD). We show that fear conditioning is consistently associated with brain activation in regions of the central autonomic-interoceptive network, despite methodological variations. However, specific task variables also influence the responses of these regions during conditioning. Additionally, brain activation patterns during conditioning differ between healthy individuals and those with anxiety-related or depressive disorders, with certain groups displaying distinct activation profiles.
Results
All results -including effect sizes for the linear models- are available in a free open-access repository (see Data availability statement).
Conditioning is associated with extensive brain (de)activations
In the mega-analysis (Fig. 1a), we included data from 1888 healthy individuals (42 experiment samples) and used linear mixed-effect models (LMMs) to perform a mega-analysis of whole-brain activation during fear conditioning (CS + > CS− contrast). We observed significant activation encompassing clusters within the bilateral anterior and mid insular cortices; the secondary somatosensory cortices (SII); the dlPFC; the lateral premotor cortices; and the dorsal and lateral cerebellum (Fig. 1b). Significant activation was also observed in multiple regions across the cortical midline, including the dACC extending to the pre-supplementary and SMA, ventral posterior cingulate cortex (PCC), and dorsal precuneus (dPrec).
Schematic indicating the levels of analysis (a). Significant brain functional activation (b) and deactivation (c) to the CS+ versus CS− determined by mega-analysis (n = 1888 healthy controls). Schematic of normative modelling framework (d). Normative probability maps illustrate the percentage of participants in the healthy control test sample who had positive (hot colours -right) or negative deviations (cool colours - left) >±2.6 within each voxel. Circle highlights frequent large deviations (both positive and negative) within the most ventral region of the vmPFC (e). AIC anterior insular cortex, AG angular gyrus, CN caudate nucleus, dACC dorsal anterior cingulate cortex, dlPFC dorsolateral prefrontal cortex, dPFC dorsal prefrontal cortex, dPons dorsal pons, dPrec dorsal precuneus, Hipp hippocampus, HYP hypothalamus, lOFC lateral orbitofrontal cortex, PCC posterior cingulate cortex, SI primary somatosensory cortex, SII secondary somatosensory cortex, SMA supplementary motor area, TG temporal gyrus, Thal thalamus, vmPFC ventromedial prefrontal cortex. Source data are provided as a Source Data file.
Additionally, the CS + > CS- mega-analysis revealed the broad activation of subcortical regions, particularly the thalamus and basal ganglia. The largest of these activation patterns were observed in the dorsal striatum, specifically the caudate nucleus (CN); the globus pallidus extending to the striatum; the ventral tegmental area extending to the habenula; the mediodorsal thalamus (Thal); and the midbrain tegmentum. Activation of the midbrain was noted generally across the dorsal midbrain ( ~ substantia nigra/red nucleus and pretectal nuclei) (Supplementary Fig. S1). To specifically assess the role of the amygdala, we conducted a Region of Interest (ROI) mega-analysis focusing on this region (see Materials and Methods), which indicated that neither the left (t = 1.93, p = 0.054, Cohen’s d = 0.129, 95% CI [–0.002, 0.260]) nor the right amygdala (t = 1.57, p = 0.116, Cohen’s d = 0.117, 95% CI [–0.029, 0.264]) showed significant activation during fear conditioning.
We also observed significant deactivations (CS + < CS- contrast) during fear conditioning, predominantly in regions of the default mode network (Fig. 1c). This included the PCC and precuneus; the vmPFC extending to the mPFC and subgenual cingulate cortex medially, as well as the left dorsal prefrontal cortex (dPFC); the bilateral angular gyri; and the parahippocampi and hippocampi (Hipp). Additional deactivation was observed in the lateral orbitofrontal cortex; the primary somatosensory cortex (SI); as well as the left temporal (TG) and fusiform gyri (see Supplementary Fig. S2 for detailed activation and deactivation across axial, sagittal, and coronal slices).
Heterogeneity in the neural correlates of conditioning
We estimated voxel-wise normative models of fear-conditioning related activation using the CS + > CS- contrast from 894 controls (training sample), and specifying age, biological sex, sample, and task variables as covariates (see Materials and Methods for all variables. The normative modeling sample is smaller than the mega-analysis due to the requirement for participants to have data on all covariates used in model construction). Testing these models with a held-out test sample (n = 646) showed good model fit with explained variance reaching 0.3 in regions that showed activation during fear conditioning (Fig. 1b), and skew and kurtosis within acceptable limits (Supplementary Fig. S3). For each participant in our held-out test sample, we then calculated a deviation score (z-score) within each voxel. In other words, for each participant, we quantified the distance from the predicted mean activation of each voxel, relative to the normative reference distribution for that voxel (Fig. 1d). While almost every voxel had at least 5 participants with large deviations (deviations > ±2.6), including areas, such as the bilateral insula and expanses of the cingulate cortex extending to the medial prefrontal cortex (Supplementary Fig. S4), controls most frequently had large deviations (both positive and negative) within the most ventral region of the vmPFC and inferior temporal pole. As this ventral region is notoriously prone to signal drop out, we interpret this result as most likely reflecting varying signal intensity rather than individual differences, and thus chose to interpret deviations within this region with caution (Fig. 1e).
Individual differences have small associations with conditioning
We examined the role of the following individual differences variables using LMMs and normative models (Fig. 1a): age, biological sex, and self-reported trait anxiety and depressive symptoms. In normative models, we analyzed both regression coefficients, reflecting each variable’s contribution to the regression equation, and structure coefficients, indicating the direct bivariate relationship between a variable and brain activity without accounting for other predictors.
In LMMs, age (n = 1884 controls) and biological sex (n = 1888 controls) showed a significant association with brain activation or deactivation during fear conditioning (Supplementary Fig. S5). However, the effect sizes were small (Supplementary Discussion). Additionally, the age range was restricted (see Table 1). Regression and structure coefficients also showed minimal effects of age and biological sex (n = 646 controls) (Supplementary Fig. S5). Neither trait anxiety (n = 1402 controls), using either harmonised or non-harmonised scores (Supplementary Methods), nor depressive symptoms (n = 213 controls) were significantly associated with brain activation or deactivation during fear conditioning in LMMs. Similarly, elastic net regressions showed that whole-brain deviation scores derived from normative models could not explain the variance in individual levels of trait anxiety (n = 751 controls and cases; r2 = –0.095, p = 0.459), nor depressive symptoms (n = 152 controls and cases; r2 = –0.257, p = 0.605). See Methods for a note on negative r2 values and Supplementary Table S1 for trait anxiety and depressive symptoms scores.
Task variables have a robust effect on conditioning
The influence of task variables on brain activation during fear conditioning was also examined using LMMs and structure coefficients from normative models in healthy controls. Several task variables were associated with robust effects across individuals, showing at least moderate effect sizes in LMMs and reaching significance in normative modeling analyses. These included instructions given to the participant about contingency prior to the task, the type of US, the use of a paradigm with multiple CSs (i.e., more than one CS+ or CS-), the pairing rate (i.e., percentage of CS+ followed by a US), and potential US confounding (i.e. whether trials followed by the US were included in the CS+ vs CS- contrast, and therefore the effects of the US may confound the effects of the CS+).
Partial instructions about CS-US contingency (n = 1388) were associated with significantly increased activation in the supplementary motor area and superior parietal lobule compared to no instructions about contingency (n = 500) in LMMs. Structure coefficients from the normative models (n = 646) showed that partial instructions (as compared to no instructions) produced a model predicting more activation in the bilateral anterior insula, the thalamus, the left caudate, clusters within the dorsomedial prefrontal cortex, the dorsolateral precuneus, and in the posterior region of the vmPFC. The model also predicted less activation within the bilateral visual cortex, the anterior medial temporal gyrus, and in the anterior vmPFC with the use of partial instructions (Fig. 2a). Note that we excluded instructed conditioning studies (Materials and Methods).
Maps show the influence of pre-task instructions about CS-US contingency (a), type of US (b), number of CS used in paradigm (i.e. multiple CS+ or CS- or single CS+ or CS-) (c), pairing rate (d), and potential US confounding in CS + > CS- contrast (e) on mean activation (left; mega-analysis linear mixed-effects models) and relation to predicted activation (right; normative model structure coefficients). For the mega-analysis, warm colours indicate positive correlations (i.e., higher variable values associated with greater activation), while cool colours indicate negative correlations (i.e., higher variable values associated with reduced activation). For normative modeling, structure coefficient maps show the correlation coefficients (rho) thresholded by their respective coefficients of determination (rho2 > 0.3) of selected task variables. This can be interpreted as showing how predicted activation to the CS + > CS- contrast relates to the task variables included in the building of the normative models. Positive correlations (warm colours) indicate greater activation for higher values of the input variable and negative correlations (cool colours) greater activation for lower values of the input variable (note that some variables are dummy coded, e.g., pre-task instructions, type of US). CS Conditioned Stimulus; US Unconditioned Stimulus. For Pairing Rate (RR) in linear mixed-effects models, the figure shows significant results in the ANOVA comparing four categories (RR30, RR50, RR62, RR100). For the results of post-hoc tests, see Supplementary Figs. S6 and S7. Source data are provided as a Source Data file.
Compared with an auditory US (n = 337), a tactile electric shock US (n = 1472) produced significantly greater activation in bilateral dorsal mid-insula, dorsal medial thalamus, and pre-supplementary motor area, extending to the dACC (n = 337) in LMMs. In normative modelling analyses, a tactile electric shock US predicted increased activation within the dACC extending to the pre-supplementary motor area, the dorsal precuneus, secondary somatosensory cortex, the bilateral dorsal mid- to- posterior insula, the midbrain and pons, and the superior cerebellum, and less activation (i.e., more deactivation) within an expanse of the vmPFC, and SI. Moreover, the use of an auditory US was significantly associated with increased activation in the left auditory cortex and was predictive of increased activation in the bilateral auditory cortex (superior temporal lobe) and less deactivation (i.e., more differential activation) within an expanse of the vmPFC extending to the dorsomedial prefrontal cortex, PCC, angular gyrus, and SI (Fig. 2b).
In LMMs, compared to paradigms with a single CS+ (n = 1283), paradigms with multiple CS+ (n = 605) produced increased activation in the left supplementary motor area (SMA) and left dorsal precuneus and widespread increased deactivation in regions including the bilateral temporal poles, the right parahippocampal gyrus extending to the fusiform gyrus, the left visual association cortex extending to the angular gyrus, and the right primary motor and somatosensory cortex. Comparing paradigms with multiple CS- (n = 302) and those with a single CS- (n = 1586) revealed identical regions with increased activation to paradigms with multiple CS + . Conversely, increased deactivation was shown in the bilateral anterior hippocampus, ventral PCC, primary motor and somatosensory cortex, precuneus, and right mid-insula. In normative models, this was modelled using two variables (multiple CS+ and multiple CS-). Multiple CS+ predicted less activation within the bilateral amygdala, a bilateral expanse of SI the angular gyrus, the PCC, the bilateral putamen and caudate, and the lingual gyrus. Similarly, multiple CS- predicted decreased activation within a bilateral expanse of SI and the lingual gyrus (Fig. 2c).
Pairing rate, treated as a continuous variable, did not relate to brain activation during conditioning in LMMs. However, due to the non-normal distribution of pairing rates across studies and individuals, we categorized pairing rates (e.g., 30, 50, and 100%) and conducted ANOVA-like LLMs followed by pairwise comparisons with Holm-Bonferroni correction, which revealed significant effects (Fig. 2d). In particular, the comparisons involving the 50% pairing rate category was the category where significant differences between categories occurred most frequently. The significant differences between the pairing rate categories occurred both with (Supplementary Fig. S6) and without (Supplementary Fig. S7) US confounding. The structure coefficients for pairing rate (as a linear association) showed that a higher pairing rate predicted greater activation within visual regions (calcarine, lingual gyrus and cuneus), the precuneus, the left dorsolateral prefrontal cortex, the superior gyrus of the temporal lobe, and (less deactivation of) an anterior region of the vmPFC. Conversely, a higher pairing rate predicted less activation within the mid-cingulate cortex, the bilateral anterior insula, a posterior region of the vmPFC as well as the thalamus and caudate (Fig. 2d).
Finally, potential US confounding (n = 997), compared to no confounding (n = 891), was associated with significantly increased widespread activation during fear conditioning (CS + > CS- contrast). This activation was observed across the bilateral fusiform and lingual gyri, temporal poles, angular gyri, posterior insula, primary motor cortex, retrosplenial cortex (extending to the posterior hippocampus), and right amygdala, predominantly in the superficial amygdala, in linear mixed models (LMMs). Similarly, structure coefficients from the normative models showed that the model predicted greater activation within the bilateral mid-cingulate cortex extending to the dorsomedial prefrontal cortex and pre-supplementary motor area, angular gyri, mid-to-posterior insula, superior temporal gyrus and temporal poles, fusiform gyri and lateral mid-occipital gyrus, amygdala, caudate, dorsal thalamus, and dorsolateral cerebellum with potential US confounding (Fig. 2e).
None of the above results were affected by excessive multicollinearity, except for the association between pairing rate and the potential US confound (Supplementary Tables S5–S8). We identified six small clusters where the effects of both variables overlapped. To further disentangle their contributions, we conducted mixed-effects models within these clusters, including both variables as predictors. Results indicated that both variables exerted statistically significant effects in all clusters except for one small cluster in the right middle occipital region, where the effect of the US confound was no longer significant. Given the absence of multicollinearity (Variance Inflation Factor = 1.8), we concluded that activation in this region is specifically modulated by the pairing rate, rather than by the US confound.
The remaining task variables (for example, the number of trials during preconditioning) showed weaker effects or were not significantly associated with brain (de)activation during conditioning in the mega-analysis or normative modelling analyses (Supplementary Figs. S8 and S9 and Supplementary Discussion).
Cases and controls show differences in conditioning
In the mega-analysis, individuals with anxiety-related and depressive disorders (cases, n = 311) showed significantly increased activation in the right ventrolateral prefrontal cortex (anterior pars orbitalis), dorsal frontal pole, PCC, left temporal pole, and bilateral primary motor areas compared to controls (n = 1888) (Fig. 3a). Similar results were found when comparing individuals with anxiety-related disorders (i.e., excluding major depressive disorder; remaining n = 297) and controls, with additional clusters observed in the dorsal prefrontal cortex, visual association cortex, and primary somatosensory cortex (Supplementary Fig. S10). After excluding individuals who were taking medication at the time of the scan, those with anxiety-related and depressive disorders (n = 221) still showed significantly increased activation in the dorsal medial prefrontal cortex, dorsal PCC extending to the superior parietal lobule, left medial TG and bilateral ventrolateral prefrontal cortex compared to controls (Supplementary Fig. S11).
Regions wherein individuals with anxiety-related and depressive disorders (n = 311) showed significantly increased functional activation to the CS+ versus CS − , as compared to healthy controls (a). Normative probability maps illustrate the percentage of participants in the sample (test controls – top; individuals with anxiety-related and depressive disorders – bottom) who had positive (hot colours – right) or negative deviations (cool colours – left) > ±2.6 within each voxel (b). Box plots show the distribution of the total number of large deviations (> ±2.6) per group. The centre line indicates the median; box bounds represent the 25 and 75th percentiles (interquartile range, IQR); whiskers extend to the smallest and largest values within 1.5 × IQR from the lower and upper quartiles. Sample sizes: control group n = 646; PTSD n = 55; OCD n = 68; GAD n = 48; SAD n = 31; total clinical group n = 202 (c). Normative probability maps illustrate the percentage of each clinical group who had positive (hot colours – right) or negative deviations (cool colours – left) > ±2.6 within each voxel (d). Confusion matrix for multi-class support vector machine differentiating patterns of deviations among clinical groups (e). ARDD anxiety-related and depressive disorders, GAD Generalised Anxiety Disorder, OCD Obsessive Compulsive Disorder, PTSD Post-traumatic Stress Disorder, SAD Social Anxiety Disorder. * = p < 0.05, ** = p < 0.01, *** = p < 0.0001. Kruskal–Wallis H-tests were used to test for main group effects (cases vs controls), with follow-up Mann–Whitney U tests false discovery rate (FDR) corrected at α = 0.05. Source data are provided as a Source Data file.
In normative modelling, we tested our clinical test sample (260 controls + 222 cases) against our reference normative models. This analysis compared the individuals’ deviation scores (z-score) within each voxel, and quantified, as a percentage of the sample, the frequency of participants with large positive or large negative deviations (Fig. 3b). We compared the frequency of extreme deviations throughout the whole brain (Normative Probability Maps thresholded at > ±2.6), and found that cases had, on average, a greater frequency of extreme deviations than controls (Mann–Whitney U test = 111,167.5, p = 0.014; Fig. 3c). Qualitatively, cases showed a different pattern of deviation frequency than controls. Large deviations (i.e., more activity than would be predicted by the model) were common across cases within the dorsomedial prefrontal cortex, the primary somatomotor cortex, precuneus, the bilateral primary visual cortex (medial occipital lobe extending to the inferior medial and inferior lateral lobe) extending to the lingual and fusiform gyrus. As with controls, cases frequently had large negative deviations within the most ventral region of the vmPFC.
PTSD and OCD show distinct activation patterns and deviations
We divided our patient sample by primary diagnosis (PTSD, n = 141; OCD, n = 68; GAD, n = 48; and SAD, n = 31; other diagnoses were not included due to small sample size). ANOVA-like LMMs indicated that there were significant differences in brain activation during conditioning among patient groups. Post-hoc pairwise comparisons corrected for multiple comparisons showed that the most significant differences occurred between individuals with PTSD and OCD with respect to individuals with GAD and SAD (Supplementary Fig. S12).
Similarly, normative modelling analyses identified a significant difference in the frequency of large deviations among patient groups (Kruskal–Wallis H test = 71.529, p = 1.98 × 10−13; Fig. 3c). Follow-up Mann-Whitney U tests (FDR corrected for multiple comparisons) clarified, for example, that extreme deviations occurred most frequently in individuals with PTSD, as compared to other disorders, followed by OCD. We then illustrated the location of these extreme deviations at the voxel level to determine whether they were spatially overlapping within and between patient groups (Fig. 3d and Supplementary Fig. S13). Individuals with PTSD showed frequent large positive deviations within the bilateral medial occipital lobe extending to the inferior temporal lobe and lingual gyrus, bilateral vlPFC, an expanse of the dmPFC, precuneus, and bilateral amygdala. They also showed frequent large negative deviations within an expanse of the vmPFC (posterior vmPFC focus), precuneus, and a focus of the lingual gyrus and fusiform gyrus. There were very few regions wherein individuals with GAD showed overlapping large deviations, and similarly for SAD except for a small region of the bilateral lingual gyrus frequently found to have large positive deviations. Individuals with OCD showed frequent large deviations within the inferior parietal cortex, and temporal pole. A support vector machine could not classify cases from controls better than chance using whole-brain deviation maps (mean AUC = 0.44, p = 1.0, 95% CI [0.30, 0.58]). However, a multi-class support vector classifier confirmed a unique pattern of deviations among cases (Fig. 3e). More specifically, PTSD, on average, was accurately classified 54.55% of the time (mean F1 score = 0.58; p = 2.06 × 10−23; balanced accuracy = 0.43, where chance level across four classes = 0.25). Interestingly, despite fewer overlapping extreme deviations within the OCD group, the classifier was able to accurately label individuals with OCD 73.74% of the time (mean F1 score: 0.57; p = 1.71 × 10−7, 95% CI [0.54, 0.60]). GAD and SAD were only accurately classified 31.78% (mean F1 score: 0.35) and 13.33% (mean F1 score: 0.17) of the time, respectively, and were often misclassified as OCD. The mean voxel-wise coefficient weights and frequency of contribution (in penalised permutations) to this classification are displayed in Supplementary Fig. S14.
Sample size for future studies
We conducted a series of sample size analyses to guide the design of future fMRI fear-conditioning studies (Supplementary Methods). To detect activation or deactivation in 50% of the brain regions identified in the mega-analysis (based on the AAL atlas39), a sample size of 122 was required, while detecting 80% of these regions required 243 participants (Supplementary Fig. S15). When considering activations only, the required sample sizes were slightly smaller: 100 participants to detect 50% and 199 participants to detect 80% of the mega-analytical findings. In contrast, substantially larger samples were needed to detect deactivations: 263 for 50% detection and 522 for 80%. The overall false positive rate was 9, 8, and 3% when activations and deactivations were assessed separately, averaging 5%. Additional sample sizes results are presented in Supplementary Figs. S16–S18.
Early and late conditioning
Given the importance of accounting for temporal dynamics in brain activity during human fear conditioning8, we compared neural activation during the early and late phases of conditioning in a subset of participants (n = 634 controls; Supplementary Table S2). Consistent with the effects observed across the entire task, both phases showed significant activation in the CS + > CS- contrast across several brain regions. These included the insular cortices, SII, dlPFC, lateral premotor cortices, dorsal and lateral cerebellum, dACC extending to the pre-supplementary motor area and SMA, and the dPrec (Supplementary Fig. S19). Notably, there were several significant differences between the phases. The early phase showed greater activation in the bilateral fusiform gyrus, SMA, right amygdala, left frontal eye fields, and left motor cortex compared to the late phase (Supplementary Fig. S19). Additionally, significant differences were also observed in the left angular gyrus; dorsal, medial, and ventral anterior prefrontal cortices; and lateral orbitofrontal cortex. However, as these regions were implicated in the CS + < CS- contrast, this suggests that they exhibited reduced deactivation during the late phase.
Discussion
We compiled the largest (n = 2199) sample of individual-level fear conditioning fMRI data to date to comprehensively delineate the neural correlates of human fear conditioning, to assess the influence of several relevant sources of variation - including individual differences and task variables- and to evaluate potential differences in fear conditioning at the neural level between individuals with anxiety-related and depressive disorders and controls.
Our individual-level mega-analysis mapped fear conditioning activation to the “central autonomic–interoceptive” or “salience” network. As hypothesised, fear conditioning was associated with robust activations in the anterior insula, ventral striatum, pre-supplementary /supplementary motor areas, dorsal anterior cingulate cortex, and dorsolateral prefrontal cortex. It was also associated with activation in several subcortical regions, particularly the thalamus and basal ganglia. While many of the observed effects replicated previous findings3, the increased statistical power provided by our analyses suggests that peak effects in the dorsal midbrain may originate in the substantia nigra/red nucleus and pretectal nuclei. Future work with a specific focus on these nuclei may aid in disentangling their specific contributions to fear conditioning. Also, as hypothesised, fear conditioning was associated with robust deactivations in the ventromedial prefrontal cortex and hippocampus. Other brain regions that were deactivated during conditioning included primarily regions of the default mode network (e.g., PCC and precuneus).
By incorporating a large sample from multiple laboratories worldwide, our study underscores the generalizability of the neural correlates of conditioning at the population level. At the same time, the methodological diversity across laboratories and studies suggests that our findings extend beyond specific experimental conditions, reinforcing their relevance to the broader fear conditioning process. Notably, at a time when neuroimaging research is increasingly emphasizing sample sizes in the thousands40, our analyses show that studies with 100 participants can still reliably detect the neural correlates of fear conditioning, at least when considering activations only. Furthermore, our findings highlight that a significant source of variability in neural responses during fear conditioning stems from differences in task design. This insight is crucial for future human fMRI studies, as it enables researchers to anticipate the expected effects of various task and contrast design choices, along with their magnitudes, at the neural level. By making these adjustments in advance, researchers can strike a balance between the advantages of large, standardized studies and those of smaller studies with greater methodological diversity. Moreover, our normative modeling results underscore the potential of fear conditioning paradigms for informing the development of targeted interventions. Specifically, normative models can identify brain regions with atypical activation during conditioning, providing valuable guidance for interventions, such as neuromodulatory treatments aimed at these regions41. Additionally, by pinpointing abnormal activation patterns, normative models enable clinicians to tailor treatments more precisely to address these specific neural deviations.
The amygdala was not robustly activated during fear conditioning in either our mega-analysis or ROI-based mega-analysis for the contrast averaging across all trials, consistent with our previous group-level meta-analysis3. However, and in line with a recent study by Wen and colleagues8 (n = 601, including individuals with anxiety-related disorders and controls), our analysis of early versus late trials in a large subsample of participants (n = 634 controls) revealed significantly greater activation in the right amygdala during early compared to late trials.
Inconsistencies regarding amygdala involvement in human fMRI conditioning studies have been attributed to several factors, including small sample sizes and limited anatomical specificity. The amygdala consists of distinct subregions, such as the basolateral (BLA) and centromedial (CMA) amygdala, and averaging responses may mask specific activations8,10. Moreover, the amygdala’s subcortical and ventral location, its small size, and the susceptibility artifacts and low signal-to-noise ratio associated with traditional imaging techniques can further hinder detection of significant effects42. Ultra-high field imaging has been shown to reduce these limitations and allows for more precise investigation of amygdala subnuclei43,44, making it a valuable tool for future human fear conditioning studies.
Taken together with the findings of Wen and colleagues, our results highlight the importance of considering temporal dynamics when assessing amygdala activity during fear conditioning8. Specifically, they confirm that amygdala activation is strongest during early trials and habituates thereafter45,46, suggesting that averaging across all conditioning trials may obscure these effects. In the current study, we also identified specific task features- such as the use of paradigms with multiple CS+ stimuli or US-related confounds- and diagnostic categories (e.g., PTSD; see also ref. 36) that modulate amygdala activity during conditioning. These findings underscore that both clinical and task-related variables may also contribute to the inconsistencies observed in the literature.
Biological sex had only minor effects, suggesting that fear conditioning mechanisms are relatively stable at the neural level between sexes. Additionally, none of our analyses found significant associations between brain activation during conditioning and levels of trait anxiety or depressive symptoms. While some mental health frameworks suggest that dimensional constructs of psychopathology, like trait anxiety, may better reflect neural activation patterns47, the variability and complexity in the neural states underlying these constructs and their lack of direct mapping to neural processes makes it challenging to identify clear linear relationships48,49.
The brain activation differences during conditioning between individuals with anxiety-related and depressive disorders and healthy controls, observed in the mega-analysis, aligned with normative modeling results, showing a higher frequency of large deviations in cases. Importantly, these differences remained significant even after excluding medicated cases, suggesting that the observed effects are not due to medication. This is crucial, as commonly used treatments like selective serotonin reuptake inhibitors (SSRIs) can influence brain activation patterns observed with fMRI50. When the analysis was limited to anxiety-related disorders, significant differences in brain activation persisted, indicating that individuals with pathological anxiety are characterized by abnormal neural responses during fear conditioning. These findings suggest that such abnormalities could eventually serve as neural markers for anxiety-related disorders51,52.
Among individuals with anxiety-related disorders, those with PTSD and OCD showed distinct patterns of brain activation and had distinct patterns of voxel-wise deviations that can be used to distinguish them from other anxiety-related disorders. This provides neurobiological support for the decision of current diagnostic classifications to separate these conditions53. In addition, it may provide valuable insights into the underlying mechanisms of psychopathology. The sample of individuals with PTSD was still relatively heterogeneous, with data from three independent samples, and yet there were often overlapping extreme positive deviations. Furthermore, using the derived deviation scores we were able to differentiate and classify individuals with PTSD and OCD with striking precision, compared to GAD and SAD. This is consistent with the previous literature that used mean averaging methods and reported differences in activation levels between groups of individuals with PTSD, compared to controls36,54. Taken together, this suggests that the neural mechanisms engaged during a fear conditioning paradigm are specifically relevant to the psychopathology of, and to some extent, similarly altered across individuals with PTSD; reinforcing the notion that fear conditioning is a foundational process in PTSD psychopathology, and as such, related tasks are a useful clinical model31. The accurate differentiation of OCD, despite few regions of overlapping large deviations, appeared to be driven by consistent coefficient weights with a region of the bilateral superior temporal gyrus and right vlPFC. Combined with no strong behavioural evidence55, mixed imaging evidence of differences in fear conditioning tasks in this population56,57,58,59, and evidence of altered baseline activity within the superior temporal region60, this finding may be interpreted as capturing compensatory mechanisms that individuals with OCD engage to overcome obsessions and achieve the same behavioural output55,60,61. Despite significant differences in the frequency of extreme deviations between individuals with GAD and SAD compared to controls, their limited spatial overlap and less accurate classifications, suggest that there is significant heterogeneity in fear conditioning among individuals with these diagnoses. Thus, while we suggest that the psychopathology of PTSD is uniquely related to fear or threat processing as captured by fear conditioning tasks, we propose that other anxiety-related disorders, particularly GAD and SAD are less so.
Our study has several limitations. First, despite using harmonized pre-processing pipelines and statistical models to account for site differences, variations in diagnostic routines and imaging acquisition contributed to sample heterogeneity, particularly among individuals with anxiety and depressive disorders (a label that includes already heterogenous disorders). Second, mega-analyses may have limited power to detect effects in small subgroups (e.g., SAD patients). Third, for participants with a mental health diagnosis, we focused on primary diagnoses and we could not assess (or control for) comorbidity. Fourth, while our normative models adjusted for site, age, biological sex, and task influences on brain activity, future studies should explore the impact of adding more variables in the model construction. It is possible that alternative model structures could have increased the explained variance in the relatively noisy BOLD activation (where other literature has explained up to 51.3% of the variance25). However, care must be taken not to overfit or reduce the generalisability of models to ensure their wider utility. Fifth, we were unable to include other individual-level measures of conditioning (e.g., psychophysiological data) in our analyses, as this would have required separate collection and harmonization procedures. Finally, cross-sectional data on brain activation during fear conditioning raises concerns about the reliability of outcome measures. Although fMRI-based fear conditioning shows limited test-retest reliability at the whole-brain level, significant within-subject similarity across repeated time points has been observed62, suggesting that large test-retest samples could help further validate the normative modeling approach, as demonstrated in other tasks25.
With this work, we provide the largest analysis of the neural correlates of human fear conditioning and potential sources of variation to date. Our results confirm that human fear conditioning is a robust phenomenon at the neural level, consistently engaging multiple brain regions within the central autonomic-interoceptive or salience network. Our comprehensive review of the influence of task design choices on elicited and predicted brain activation can be used to help interpret differences in the previous literature and should remind researchers of the potentially significant influence of task design choices. Finally, we found that there are overall differences in fear conditioning at the neural level between individuals with anxiety-related and depressive disorders and controls, and that a unique mechanism of PTSD psychopathology is well captured by fear conditioning paradigms, supporting the use of these models to study this disorder.
Methods
The current manuscript combines two pre-registered analyses of individual-level fear conditioning fMRI data (https://osf.io/7n953; https://osf.io/w74bt). Data were collated from 43 samples originating from 23 sites in nine countries. Collation was coordinated by the lead group (IDIBAPS Barcelona). ENIGMA Fear Conditioning is part of the larger ENIGMA-Anxiety Working Group63. Table 1 summarizes the descriptive information on the samples. Informed consent was obtained from all participants by the sites providing their data. Some site-specific data have been reported previously, but no reports have examined all individual data together.
Inclusion and ethics
This study involved secondary analyses of previously collected human neuroimaging datasets. No new data were acquired specifically for the purposes of this study. All original studies received approval from their respective institutional ethics committees and were conducted in accordance with the Declaration of Helsinki. Informed consent was obtained from all participants in the original studies. The following Ethics Committees approved the individual studies: Ethics Committee of Ulm University, Ulm, Germany; Regional Ethics Review Board in Uppsala; Institutional Review Board of University Hospital of Bellvitge, Barcelona, Spain and Institutional Review Board of Hospital del Mar, Barcelona, Spain; Institutional Review Board University of Wisconsin; Institutional Review Board University of Arkansas for Medical Sciences; Comissão de Análise de Projetos de Pesquisa do Hospital das Clínicas da Faculdade de Medicina da Universidade de São Paulo (CAPPesq); Institutional Review Board University of Texas at Austin; Ethics Committee of the University Hospital Essen, Germany; University of Southern California Institutional Review Board; Louisiana State University Institutional Review Board; The University of Melbourne Human Research Ethics Committee; Institutional Review Board University of Minnesota; Institutional Review Board University of Florida; Ethics Committee of Klinikum Rechts der Isar, Technische Universität München; Duke University Health System Institutional Review Board; Ethics Committee of the General Medical Council Hamburg; University of Washington and Harvard University Institutional Review Boards; Ethics Committee of the Faculty of Medicine of the Ruhr University Bochum; Ethics Committee of the Faculty of Psychology and Sport Science, University of Giessen; University Research Ethics Committee of the University of Reading, UK; Clinical Research Ethics Committee (CEIC) of the Bellvitge University Hospital; Ethics Committee of the German Psychological Society; Partners HealthCare Institutional Review Board; Ethics Review Board (ERB) University of Amsterdam; Institutional Review Board, New York State Psychiatric Institute.
Inclusion and exclusion criteria were pre-specified within each dataset and were applied consistently during data aggregation. Sex was either self-reported by participants or recorded as sex assigned at birth, depending on the protocol of the original study. Sex was considered in the study design and used for descriptive statistics and group-level comparisons. Gender identity was not systematically collected across datasets and was therefore not analysed.
Fear conditioning task
We included data from participants who completed a fear conditioning experiment during an fMRI scan. There are several human fear conditioning paradigms, which vary based on the time elapsed between the CS and the US (e.g., delay, trace, simultaneous, or backward conditioning), the use of one (single-cue) versus two or more (differential-cue) CSs, and the instructions given to participants2: 1) No instructions: For example, “During this experiment, you will see various images and might experience mild electric shocks at certain times”; 2) Partial instructions: For example, “During this experiment, you may see a particular image sometimes followed by a mild electric shock. However, the shock won’t happen every time you see the image, and sometimes it might not appear at all. Pay attention to the images, as they might give you some indication of when the shock could occur”; 3) Full instructions (instructed conditioning): For example, “During this experiment, you will see the image X, which is always followed by a mild electric shock. Whenever this image appears, it will be followed by the shock shortly afterward. No other images will be associated with the shock”.
We focused on delay differential cue-conditioning paradigms with no or partial instructions (i.e., excluded instructed conditioning studies), and focused our analysis on comparing the response to a CS+ compared to a CS-. Table 2 summarises information on the fear conditioning tasks included in this manuscript.
Non-imaging data: sociodemographics and individual differences
All sites were asked to provide information regarding sociodemographics (age, biological sex) and individual differences: trait anxiety, assessed with the Trait subscale of the State-Trait Anxiety Inventory (STAI-T)64; and depressive symptoms, assessed with the Beck Depression Inventory (BDI)65 (Supplementary Table S1). For individuals with anxiety-related and depressive disorders, sites were asked about principal mental health diagnosis and psychotropic medication use at the time of the scan (Supplementary Table S3). Previous normative studies of trait anxiety (STAI-T) have shown additive and multiplicative differences across countries, for which we harmonised trait anxiety scores across countries using ComBat14 (Supplementary Methods) and conducted subsequent analyses twice: once with the raw scores and once with the country-harmonised scores.
Non-imaging data: task-related variables
We collected information about the following task variables: instructions given to the participant about contingency prior to the task (partial versus no information); type of US (e.g., electric shock versus aversive sound); number of trials during pre-conditioning; use of a paradigm with multiple CSs (i.e., more than one CS+ or CS-) during conditioning; type of CS (e.g. geometrical figures, faces, etc); number of CS+ and CS- trials during conditioning; average ITI (inter-trial interval); average ISI (inter-stimulus interval, i.e., between the CS+ and the US); pairing rate (percentage of CS+ followed by a US); potential US confounding; and the number of CS+ trials and CS- trials included in the fMRI contrast. For studies assessing awareness (conscious recognition of the association between the CS+ and the US, after the task), we also asked about participant´s contingency awareness (yes vs. no). Task variables were not explicitly listed in the pre-registration. The decision to include these variables was based on previous research2,13.
Processing of neuroimaging data
We included only neuroimaging data acquired with whole-brain coverage. Individual-level raw task-based fMRI data were processed using the Harmonized Analysis of Functional MRI pipeline (HALFpipe, version 1.2.2)66, a tool developed within the ENIGMA consortium to harmonise fMRI analyses across sites and facilitate reproducible analyses. HALFpipe provides a standardised workflow that extends fMRIprep67 with several additional preprocessing steps, including spatial smoothing, grand mean scaling, temporal filtering, and confound regression. Moreover, HALFpipe generates a standardised quality assessment of the preprocessing outputs and imaging raw data (Supplementary Table S4). We used HALFPIPE default parameters (smoothing using 6 mm FWHM; confound removals using ICA-AROMA; and a high-pass filter of 125 s).
For the current study, each site was provided with a comprehensive manual to perform image pre-processing and quality control with HALFpipe in a fully harmonised manner, and each group shared the HALFPIPE output files for each individual along with the non-imaging data for each individual. The lead group (IDIBAPS-Barcelona) processed five sites, aggregated all the data, and carried out additional quality control procedures and measures to ensure the comparability of the data, as described in the Supplementary Methods).
Statistical analyses
We conducted two types of statistical analyses: mega-analyses and normative modelling analyses.
Mega-analyses
Participants
We included data from 2199 participants (M Age = 25.26, SD = 5.47; 57.2% female) comprising 1888 healthy controls (M Age = 25.85, SD = 8.51; 51.53% female) and 311 individuals with a primary diagnosis of an anxiety-related or depressive disorder (M Age = 29.91, SD = 10.75; 58.84% female) (Table 1 and Table 3). Diagnoses were established with structured clinical interviews.
Pre-scaling
Although we used the exact same processing protocol and conducted extensive quality control (see above), we observed differences in the BOLD response between samples, most likely due to varying units of measurement (note that MRI scans are acquired in arbitrary units68. To address these differences, we pre-scaled the images for healthy controls so that, for each sample, the voxel-wise-median standard deviation (after removing the effects of covariates) was 1 (see Supplementary Methods). We then applied the pre-scaling parameters obtained from the healthy controls to the cases (individuals with a primary diagnosis of an anxiety-related or depressive disorder). This approach differs from using the individual z-statistic images (i.e., dividing the BOLD response by its standard error), which we did not adopt for the mega-analysis. The reason is that the standard error may differ between cases and controls, and thus, differences in z-statistics between groups could reflect differences in the standard error rather than in the BOLD response (for more details, see Supplementary Methods).
Analyses
Differences in brain coverage across sites prevented us from using the standard ComBat method, which determines the harmonisation parameters using all voxels14. Additionally, there was no need to remove differences in scaling because we had already pre-scaled the images as described above. Thus, we used LMMs (with the sample as a random intercept) to investigate: 1st the pattern of brain activation during fear conditioning in healthy controls and in cases (individuals with anxiety-related and depressive disorders), which tested whether the mean activation in each voxel was non-null; 2nd the pattern of differences in brain activation during fear conditioning between cases and controls, which tested whether activation in each voxel was different between cases and controls; 3rd the pattern of differences in brain activation during fear conditioning among patient groups (PTSD, OCD, GAD, SAD), testing whether activation in each voxel differed among patient groups; 4th the potential influence of individual differences and task variables (see above) on brain activation during fear conditioning in healthy controls, which tested whether activation in each voxel was significantly associated with each individual differences or task variable. In all models, we incorporated age and sex as covariates. Significant LMMs comparing three or more groups (analog to ANOVAs) were followed by pairwise comparisons with Holm-Bonferroni correction.
We also conducted an ROI mega-analysis focusing on the amygdala. For this analysis, we extracted the pre-scaled BOLD response in the left and right amygdala based on the Automated Anatomical Labeling atlas39. We used an LMM, with age and sex as covariates, to test whether the mean activation significantly differed from zero. Potential differences between early and late conditioning were also analyzed using an LMM, with age and sex as covariates in a subsample of controls (n = 679; Supplementary Table S2).
We fitted the LMMs using custom functions (included in ‘combat.enigma’ R package) that call the ‘nlme’ R package voxel-wise and address voxel-specific details (e.g., varying collinearity due to differing brain coverage; see Supplementary Methods). FSL was then used to derive cluster-based corrected p-values using Gaussian Random Field (GRF) theory.
Analyses of multicollinearity
Given the diverse range of variables examined in this study—many of which may be influenced by methodological factors (e.g., pairing rate, type of conditioned stimuli) or sample characteristics (e.g., patient vs. control group)—there is a potential risk of confounding. That is, the observed effects attributed to one task variable may partially or wholly reflect the influence of another. To address this possibility, we systematically assessed interrelationships among all methodological and clinical variables using correlation analysis and evaluated multicollinearity using variance inflation factors (VIF)(Supplementary Tables S5-S8). For pairs of variables with correlation coefficients exceeding 0.5 (or η and Cramér’s V when involving categorical variables), we further examined whether their associated activation maps exhibited spatial overlap. Overlap was defined as clusters of at least 10 contiguous voxels showing significant activation for both task contrasts. This approach was guided by the rationale that classical confounding requires both variables to be associated with activation in the same brain region. For any pair of correlated variables with overlapping activation, we re-estimated the mixed-effects linear models within the overlapping clusters, this time including both variables as predictors, to determine whether their effects remained statistically significant. A reduction to non-significance upon joint inclusion could indicate either collinearity (as suggested by the VIF) or potential confounding.
Effect sizes
To compare the effect sizes of different variables and to exclude findings with negligible or very small effects, we converted the regression coefficients of the peaks into correlation coefficients (Pearson r). For variables comparing two groups (e.g., cases vs. controls), we also calculated the corresponding standardised mean differences (Cohen’s d). We considered effects with r < 0.2 (roughly equivalent to d < 0.4 for balanced binary variables) to be small, and only highlighted larger effects (i.e., r > 0.2, i.e., at least moderate) in the main text. It is important to note that peak effect sizes should be interpreted with caution, as they correspond to the peaks of clusters of statistical significance and are, therefore, larger than those obtained by other methods. Effect sizes for all the LMMs can be found at https://figshare.com/s/d44cc1390711bad3c147
Normative modelling analyses
Participants
We included data from 2022 participants; 1800 healthy controls (age range 8–66 years, M Age: 25.66 ± 8.4, 53.05% female) and 222 individuals with anxiety-related and depressive disorders (age range 9–63, M Age: 28.27 ± 11.06, 54.95% female) to build and test the normative models. See Table 1 note to explain discrepancy in participant numbers from mega-analysis.
Generating Normative Models of Activation to the CS + > CS- contrast
The z-statistic maps (files) from the CS + > CS- contrast for each participant were used as inputs (response variables) for the normative models. We created a normative model of fear-related activation per voxel, as a function of age, sex, and task variables (the same reported in the Non-imaging data: task-related variables section, except contingency awareness) by training a Gaussian Bayesian Linear Regression (BLR) model to predict activation for the CS + > CS- contrast22. We included dummy-coded site-related variables (sample, and MR strength) and a b-spline basis expansion as additional covariates of no-interest. This was performed in the Predictive Clinical Neuroscience toolkit (PCNtoolkit) software v0.26 (https://pcntoolkit.readthedocs.io/en/latest) implemented in Python 3.8. Generalisability was assessed by using a stratified train-test sample (train: 894, control test sample: 646).
Quantifying voxel-wise deviations from the normative model
To estimate a pattern of regional deviations from typical brain function for each participant in the control test sample (n = 646, mean age: 25.45 ± 7.19 years, 52.16% female), we derived a normative probability map (NPM) that quantifies the voxel-wise deviation from the normative model. The subject-specific Z-score indicates the difference between the predicted activation and true activation scaled by the prediction variance. This was repeated for the clinical test sample (n = 482, 260 controls + 222 cases, mean age: 26.76 ± 10.94 years, 54.97% female). We thresholded participants’ NPM at Z = ± 2.6 (i.e., p < 0.005) as in previous work69,70,71 and summed the number of significantly deviating voxels for each participant. Kruskal-Wallis H-tests were used to test for group (cases or controls) and diagnosis effects and, when applicable, follow-up Mann–Whitney U tests were False Discovery Rate (FDR) corrected at α = 0.0572.
Normative models: individual differences and task variables
Model Coefficients
To probe the magnitude of the influence of individual differences (sociodemographics) and task variables on the predicted brain activation, we examined both the regression coefficients and the structure coefficients (correlation coefficients) of all sociodemographic and task variables input variables. Structure coefficients are preferable to regression coefficients when variables are collinear73. Note that negative r2 values(“negative” explained variance) are a possible outcome when the model fails to generalize effectively to new data, despite in-sample performance yielding non-negative explained variance (which is always positive or zero by construction). This phenomenon is not uncommon and arises when the model’s predictions result in a residual sum of squares that exceeds the variance of the true values.
Linear Regression (Elastic Net) and Support Vector Classification (SVC)
We applied an elastic net linear regression as implemented in the scikit-learn package (version 1.0.2)74 with 10 repeats of nested 5-fold cross validation [alphas = 0.0001, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 1; 90% train, 10% test split] to predict trait anxiety as measured by the STAI-T (n = 751), or depressive symptoms as measured by the BDI (n = 440) from participants’ whole brain (unthresholded) deviation maps. The mean coefficient values and the frequency of the voxel’s contribution (in other words, how many of the cross-folds split found this voxel to be important) indicate which voxel contributed to the prediction. The statistical significance of these results was tested against a 1000-fold nested 5-fold test for each variable. To classify participants (n = 703) who were contingency aware from those who were not based on their unthresholded whole-brain deviation maps, we used an SVC model with a linear kernel, regularisation parameter set to 1.0, and balanced class weights as implemented in the scikit-learn package (version 1.0.2).
Quantifying patterns of deviations between cases and controls
To classify individuals with anxiety-related or mood disorders and controls based on their whole brain unthresholded deviation maps, we used a SVC model with a linear kernel, regularisation parameter set to 1.0, as is common in neuroimaging, and balanced class weights (i.e. adjusted inversely proportional to class frequencies in the input data) as implemented in the scikit-learn package (version 1.0.2)74. The evaluation metric for the classification is area under the receiving operator curve (AUC) averaged across all folds within a 10-fold cross validation framework.
Quantifying patterns of deviations among patient groups
We used a one versus rest support vector classifier (SVC OvR) model as implemented in the scikit-learn package (sklearn.multiclass.OneVsRestClassifier version 1.0.2) to determine if there were quantifiably differentiable patterns within the whole brain unthresholded deviation maps among patient groups. Due to the small number of individuals with major depressive disorder (n = 11), specific phobia (n = 7), and panic disorder (n = 2), this analysis only included individuals diagnosed with PTSD (n = 55), OCD (n = 68), GAD (n = 48), and SAD (n = 31) (total n = 202). The model classes corresponded to the participants’ diagnoses. The model classes were the participants’ diagnosis. The evaluation metric for the classification was the F1-metric (the harmonic mean of precision and recall, also known as the balanced F-score, where values closer to 1 indicate greater classification success) per class within a 5-fold cross-validation framework, and the statistical significance was tested against a 1000-fold nested 5-fold test.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The results data generated in this study have been deposited at: https://doi.org/10.6084/m9.figshare.28540580.v1 The individual-level fMRI processed data (HALFIPE results files) are available for secondary data analysis. Access can be obtained by submitting an analysis plan to the ENIGMA-Anxiety Working Group (http://enigma.ini.usc.edu/ongoing/enigma-anxiety/). Data access is contingent on approval by PIs from contributing samples. The raw individual fMRI data are protected and are not available due to data privacy laws. Source data are provided with this paper.
Code availability
All code to reproduce the analyses in this manuscript is available at: https://doi.org/10.6084/m9.figshare.28540580.v1 The functions needed to conduct the mega-analysis are also included in the ‘combat.enigma’ R package.
References
Beckers, T. et al. Understanding clinical fear and anxiety through the lens of human fear conditioning. Nat. Rev. Psychol. 2, 233–245 (2023).
Lonsdorf, T. B. et al. Don’t fear ‘fear conditioning’: methodological considerations for the design and analysis of studies on human fear acquisition, extinction, and return of fear. Neurosci. Biobehav. Rev. 77, 247–285 (2017).
Fullana, M. A. et al. Neural signatures of human fear conditioning: an updated and extended meta-analysis of fMRI studies. Mol. Psychiatry 21, 500–508 (2016).
Tovote, P., Fadok, J. P. & Lüthi, A. Neuronal circuits for fear and anxiety. Nat. Rev. Neurosci. 16, 317–331 (2015).
LeDoux, J. The amygdala. Curr. Biol. 17, R868–R874 (2007).
Johansen, J. P., Cain, C. K., Ostroff, L. E. & LeDoux, J. E. Molecular mechanisms of fear learning and memory. Cell 147, 509–524 (2011).
Bechara, A. et al. Double dissociation of conditioning and declarative knowledge relative to the amygdala and hippocampus in humans. Science 269, 1115–1118 (1995).
Wen, Z. et al. Temporally and anatomically specific contributions of the human amygdala to threat and safety learning. Proc. Natl. Acad. Sci. USA 119, e2204066119 (2022).
Visser, R. M., Bathelt, J., Scholte, H. S. & Kindt, M. Robust BOLD responses to faces but not to conditioned threat: challenging the amygdala’s reputation in human fear and extinction learning. J. Neurosci. 41, 10278–10292 (2021).
Bach, D. R., Weiskopf, N. & Dolan, R. J. A stable sparse fear memory trace in human amygdala. J. Neurosci. 31, 9383–9389 (2011).
Fullana, M. A. et al. Amygdala where art thou? Neurosci. Biobehav Rev. 102, 430–431 (2019).
Morriss, J., Hoare, S. & van Reekum, C. M. It’s time: a commentary on fear extinction in the human brain using fMRI. Neurosci. Biobehav Rev. 94, 321–322 (2018).
Lonsdorf, T. B. & Merz, C. J. More than just noise: Inter-individual differences in fear acquisition, extinction and return of fear in humans - biological, experiential, temperamental factors, and methodological pitfalls. Neurosci. Biobehav. Rev. 80, 703–728 (2017).
Radua, J. et al. Increased power by harmonizing structural MRI site differences with the ComBat batch adjustment method in ENIGMA. Neuroimage 218, 116956 (2020).
Müller, V. I. et al. Ten simple rules for neuroimaging meta-analysis. Neurosci. Biobehav. Rev. 84, 151–161 (2018).
Thompson, P. M. et al. ENIGMA and the individual: Predicting factors that affect the brain in 35 countries worldwide. Neuroimage 145, 389–408 (2017).
Sjouwerman, R., Scharfenort, R. & Lonsdorf, T. B. Individual differences in fear acquisition: multivariate analyses of different emotional negativity scales, physiological responding, subjective measures, and neural activation. Sci. Rep. 10, 15283 (2020).
Smith, M. C. CS-US interval and US intensity in classical conditioning of the rabbit’s nictitating membrane response. J. Comp. Physiol. Psychol. 66, 679–687 (1968).
Garcia, J. & Koelling, R. A. Relation of cue to consequence avoidance learning. Psychon. Sci. 4, 123–124 (1966).
Segal, A. et al. Embracing variability in the search for biological mechanisms of psychiatric illness. Trends Cogn. Sci. 29, 85–99 (2025).
Marquand, A. F. et al. Conceptualizing mental disorders as deviations from normative functioning. Mol. Psychiatry 24, 1415–1424 (2019).
Fraza, C. J., Dinga, R., Beckmann, C. F. & Marquand, A. F. Warped Bayesian linear regression for normative modelling of big data. Neuroimage 245, 118715 (2021).
Marquand, A. F., Rezek, I., Buitelaar, J. & Beckmann, C. F. Understanding heterogeneity in clinical cohorts using normative models: beyond case-control studies. Biol. Psychiatry 80, 552–561 (2016).
Rutherford, S. et al. Evidence for embracing normative modeling. Elife 12, e85082 (2023).
Savage, H. S. et al. Dissecting task-based fMRI activity using normative modelling: an application to the emotional face matching task. Commun. Biol. 7, 814–888 (2024).
Wolfers, T. et al. Individual differences v. the average patient: mapping the heterogeneity in ADHD using normative models. Psychol. Med. 50, 314–323 (2020).
Zabihi, M. et al. Fractionating autism based on neuroanatomical normative modeling. Transl. Psychiatry 10, 384 (2020).
Tian, Y. E., Cole, J. H., Bullmore, E. T. & Zalesky, A. Brain, lifestyle and environmental pathways linking physical and mental health. Nat. Ment. Health 2, 1250–1261 (2024).
Kjelkenes, R. et al. Deviations from normative brain white and gray matter structure are associated with psychopathology in youth. Dev. Cogn. Neurosci. 58, 101173 (2022).
Pittig, A., Treanor, M., LeBeau, R. T. & Craske, M. G. The role of associative fear and avoidance learning in anxiety disorders: gaps and directions for future research. Neurosci. Biobehav. Rev. 88, 117–140 (2018).
Fullana, M. A. et al. Human fear conditioning: from neuroscience to the clinic. Behav. Res. Ther. 124, 103528 (2020).
Greco, J. A. & Liberzon, I. Neuroimaging of fear-associated learning. Neuropsychopharmacology 41, 320–334 (2016).
Craske, M. G. et al. Anxiety disorders. Nat. Rev. Dis. Prim. 3, 17024 (2017).
Marin, M.-F., Hammoud, M. Z., Klumpp, H., Simon, N. M. & Milad, M. R. Multimodal categorical and dimensional approaches to understanding threat conditioning and its extinction in individuals with anxiety disorders. JAMA Psychiatry 77, 618–627 (2020).
Kausche, F. M., Carsten, H. P., Sobania, K. M. & Riesel, A. Fear and safety learning in anxiety- and stress-related disorders: an updated meta-analysis. Neurosci. Biobehav. Rev. 169, 105983 (2024).
Suarez-Jimenez, B. et al. Neural signatures of conditioning, extinction learning, and extinction recall in posttraumatic stress disorder: a meta-analysis of functional magnetic resonance imaging studies. Psychol. Med. 50, 1442–1451 (2020).
Savage, H. S., Davey, C. G., Fullana, M. A. & Harrison, B. J. Threat and safety reversal learning in social anxiety disorder - an fMRI study. J. Anxiety Disord. 76, 102321 (2020).
Savage, H. S. et al. Neural mediators of subjective and autonomic responding during threat learning and regulation. Neuroimage 245, 118643 (2021).
Tzourio-Mazoyer, N. et al. Automated anatomical labeling of activations in spm using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. NeuroImage 15, 273–289 (2002).
Marek, S. et al. Reproducible brain-wide association studies require thousands of individuals. Nature 603, 654–660 (2022).
Likhtik, E. & Johansen, J. P. Neuromodulation in circuits of aversive emotional learning. Nat. Neurosci. 22, 1586–1597 (2019).
Boubela, R. N. et al. fMRI measurements of amygdala activation are confounded by stimulus correlated signal fluctuation in nearby veins draining distant brain regions. Sci. Rep. 5, 10499 (2015).
Sladky, R. et al. High-resolution functional MRI of the human amygdala at 7 T. Eur. J. Radio. 82, 728–733 (2013).
Kirstein, C. F., Güntürkün, O. & Ocklenburg, S. Ultra-high field imaging of the amygdala – A narrative review. Neurosci. Biobehav. Rev. 152, 105245 (2023).
Blackford, J. U., Allen, A. H., Cowan, R. L. & Avery, S. N. Amygdala and hippocampus fail to habituate to faces in individuals with an inhibited temperament. Soc. Cogn. Affect Neurosci. 8, 143–150 (2013).
Bas-Hoogendam, J. M. et al. Impaired neural habituation to neutral faces in families genetically enriched for social anxiety disorder. Depress. Anxiety 36, 1143–1153 (2019).
Morris, S. E. et al. Revisiting the seven pillars of RDoC. BMC Med. 20, 220 (2022).
Pessoa, L. How many brain regions are needed to elucidate the neural bases of fear and anxiety?. Neurosci. Biobehav. Rev. 146, 105039 (2023).
Poldrack, R. A. Inferring mental states from neuroimaging data: from reverse inference to large-scale decoding. Neuron 72, 692–697 (2011).
Armand, S. et al. Functional brain responses to emotional faces after three to five weeks of intake of escitalopram in healthy individuals: a double-blind, placebo-controlled randomised study. Sci. Rep. 14, 3149 (2024).
Shackman, A. J. & Fox, A. S. Two decades of anxiety neuroimaging research: New insights and a look to the future. Am. J. Psychiatry 178, 106–109 (2021).
Wen, Z., Marin, M.-F., Blackford, J. U., Chen, Z. S. & Milad, M. R. Fear-induced brain activations distinguish anxious and trauma-exposed brains. Transl. Psychiatry 11, 46 (2021).
American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders. (American Psychiatric Association, 2013).
Kaczkurkin, A. N. et al. Neural substrates of overgeneralized conditioned fear in PTSD. Am. J. Psychiatry 174, 125–134 (2017).
Cooper, S. E. & Dunsmoor, J. E. Fear conditioning and extinction in obsessive-compulsive disorder: a systematic review. Neurosci. Biobehav. Rev. 129, 75–94 (2021).
Milad, M. R. et al. Deficits in conditioned fear extinction in obsessive-compulsive disorder and neurobiological changes in the fear circuit. JAMA Psychiatry 70, 608–554 (2013).
Apergis-Schoute, A. M. et al. Neural basis of impaired safety signaling in obsessive compulsive disorder. Proc. Natl. Acad. Sci. USA 114, 3216–3221 (2017).
Cano, M. et al. Neural correlates of fear conditioning and fear extinction and its association with cognitive-behavioral therapy outcome in adults with obsessive-compulsive disorder. Behav. Res. Ther. 144, 103927 (2021).
Hearne, L. J. et al. Revisiting deficits in threat and safety appraisal in obsessive-compulsive disorder. Hum. Brain Mapp. 44, 6418–6428 (2023).
Fan, J. et al. Spontaneous neural activity in the right superior temporal gyrus and left middle temporal gyrus is associated with insight level in obsessive-compulsive disorder. J. Affect. Disord. 207, 203–211 (2017).
Westlin, C. et al. Improving the study of brain-behavior relationships by revisiting basic assumptions. Trends Cogn. Sci. 27, 246–257 (2023).
Klingelhöfer-Jens, M., Ehlers, M. R., Kuhn, M., Keyaniyan, V. & Lonsdorf, T. B. Robust group- but limited individual-level (longitudinal) reliability and insights into cross-phases response prediction of conditioned fear. Elife 11, e78717 (2022).
Bas-Hoogendam, J. M. et al. ENIGMA-anxiety working group: Rationale for and organization of large-scale neuroimaging studies of anxiety disorders. Hum. Brain Mapp. 43, 83–112 (2022).
Spielberger, C. D., Gorsuch, R. L., Lushene, R., Vagg, P. R. & Jacobs, G. A. Manual for the State-Trait Anxiety Inventory. (Consulting Psychologists Press, 1983).
Beck, A. T., Ward, C. H., Mendelson, M., Mock, J. & Erbaugh, J. An inventory for measuring depression. Arch. Gen. Psychiatry 4, 561–571 (1961).
Waller, L. et al. ENIGMA HALFpipe: interactive, reproducible, and efficient analysis for resting-state and task-based fMRI data. Hum. Brain Mapp. 43, 2727–2742 (2022).
Gorgolewski, K. et al. Nipype: a flexible, lightweight and extensible neuroimaging data processing framework in Python. Front. Neuroinform. 5, 13 (2011).
Shinohara, R. T. et al. Statistical normalization techniques for magnetic resonance imaging. Neuroimage Clin. 6, 9–19 (2014).
Wolfers, T. et al. Mapping the heterogeneous phenotype of schizophrenia and bipolar disorder using normative models. JAMA Psychiatry 75, 1146–1155 (2018).
Holz, N. E. et al. A stable and replicable neural signature of lifespan adversity in the adult brain. Nat. Neurosci. 26, 1603–1612 (2023).
Floris, D. L. et al. Atypical brain asymmetry in autism—a candidate for clinically meaningful stratification. Biol. Psychiatry Cogn. Neurosci. Neuroimag. 6, 802–812 (2021).
Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B Stat. Methodol. 57, 289–300 (1995).
Kraha, A., Turner, H., Nimon, K., Zientek, L. R. & Henson, R. K. Tools to support interpreting multiple regression in the face of multicollinearity. Front Psychol. 3, 44 (2012).
Pedregosa, F. et al. Scikit-learn: machine Learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2012).
Acknowledgements
The research was supported by Secretaria d’Universitats i Recerca del Departament d’Economia i Coneixement de la Generalitat de Catalunya (No. 2021 SGR 1128, J.R.; M.A.F.), the Swedish Research Council (No. 2014-01160, F.Å.), the European Research Council (No. 648176, T.B.; ERC-2018 CoG-816564, D.B.; 101001118, A.F.M.), the National Institute of Mental Health (No. R01MH119132 and MH108753, J.C.; R01MH095904 and K23MH076054, D.J.H.; R01MH125615, A.K.; R01-MH103291, K.A.M.; 1R61MH129559-02, Y.N.; R01-AA031261, A.S.; K01MH122774, X.Z.; R01MH131532, B.S.J; R01MH125615, L.A.), the German Research Foundation (No. 316803389 - SFB 1280, S.E., C.J.M. and A.I.; LO 1980/1-1 and CRC-TRR58 subproject B07, T.B.L.; LO 1980/4-1, T.B.L. and M.R.E.; 521379614 – TRR393, B.S., U.D. and T.K.; 44541416 CRC-TRR58 T.S., K.R., K.D. and M.J.; 442075332 (RU 5187) and 461947532 (RU 5389), U.L.; WA 1539/11-1, H.W.;), the Economic and Social Research Council UK (No. ES/W000776/1, D.B.), the NSERC Discovery (No. RGPIN-2021-02906, S.G.), the National Institute of Health (No. R01 MH106574, C.L.; R01 MH122387, J.D.), the United states-Israel binational science foundation and NIA (No. 2P30AG064198, Y.N.), the National Institute of Mental Health Intramural Research (No. ZIA-MH-002781, D.P.), the Biotechnology and Biological Sciences Research Council (No. BB/L02697X/1, C.vR.), NIAAA (No. R01-AA030042, A.S.), the Instituto de Salud Carlos III (ISCIII) (No. PI16/00889 and PI19/01171, C.S.M.; FI22/00219, M.O.), Fundació Marató de TV3 (No. 202201-31, C.S.M.), Agència de Gestió d’Ajuts Universitaris i de Recerca (No. 2021SGR01017, C.S.M.), the Ministerio de Ciencia, Innovación y Universidades, Spain (No. PID2022-139081OB-C22, C.S.M.), the Department of Veteran Affairs (No. I01-CX002760, D.M.S.), the German Federal Ministry of Education and Research (No. 01EE1402E, B.S., T.K., A.P., U.D., K.D. and H-U.W.; 01EE2101, U.L.), Brain and Behavior Research Foundation NARSAD Young Investigator Award (X.Z.), “la Caixa” Foundation (No. LCF/BQ/IN17/11620071, V.P.A.), Spanish Ministry of Science, Innovation and Universities (No. JDC2022–048445-I, V.P.A), the National Eye Institute Core Grant (No. P30 EY001319, B.S.J.), the Medical Research Council (No. MR/J003980/1, J.M.), National Science Foundation Graduate Research Fellowship (No. DGE1745303, S.N.D.), the South African Medical Research Council (D.J.S.), the EU Innovative Medicines Initiative (IMI) 2 Joint Undertaking (IMI2 JU) (No. 101034377, NJVW), the Dutch Research Agenda (NeuroLabNL–Small Projects for NWA routes 21/22) (No. NWA.1418.22.025, JM.B-H.), the Talent Acceleration grant (Medical Delta) (JM.B-H.), the NIH Big Data to Knowledge (BD2K) award (No. U54 EB020403 and R01MH131806, P.M.T.), the National Health and Medical Research Council of Australia (NHMRC) Project Grants (No. 1161897 and 1145010, B.J.H.).
Author information
Authors and Affiliations
Contributions
J.R., H.S.S., E.V., and M.A.F. designed and performed analyses. J.R., H.S.S., A.J., and. M.A.F. wrote the manuscript. J.M.B-H., N.A.G., D.J.S., N.J.V.W., J.D., A.F.M., and. B.J.H. discussed the results. M.A.F. supervised research. J.R., H.S.S., E.V.,. A.J., B.A., F.Å., T.B., N.C., J.M.C., J.B.D., D.R.B., S.E., S.G., D.J.H., A.N.K., A.K., M.K., K.K., K.S.L., C.L.L., Cr.L.L., T.B.L., C.J.M., K.A.M., Y.N., D.S.P., C.R., A.J.S.,. C.S-M., V.I.S., D.M.S., B.S., T.S., L.T., R.M.V., L.A., V.A., M.C.B., P.R.B., E.E.B.,. M.C., P.C-E., S.E.C., U.D., V.P-A., S.N.D., K.D., M.R.E., J.L.G., A.O.H., M.J.H.,. A.A.H., A.I., A.J-S., M.J., T.K., K.K., M.K., F.L., S.M.L., M.L., U.L., J.M., I.M.-Z., R.M., J.M., M.O., A.P., D.P-C., J.R., I.C.R., W.R., K.R., J.R., A.N.R., R.S., J.S., A.S., B.S-J., M.U., H.U.W., X.Z., L.W., H.W., P.M.T., J.M.B-H., N.A.G., D.J.S., N.J.V.W.,. J.E.D., A.F.M., B.J.H., and M.A.F. commented on the manuscript.
Corresponding author
Ethics declarations
Competing interests
Dr Stein has received consultancy honoraria from Discovery Vitality, Johnson & Johnson,. Kanna, L’Oreal,Lundbeck, Orion, Sanofi, Servier, Takeda and Vistagen. The other authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Ajay Satpute, Candace Raio, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Source data
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Radua, J., Savage, H.S., Vilajosana, E. et al. Neural correlates of human fear conditioning and sources of variability in 2199 individuals. Nat Commun 16, 7869 (2025). https://doi.org/10.1038/s41467-025-63078-x
Received:
Accepted:
Published:
Version of record:
DOI: https://doi.org/10.1038/s41467-025-63078-x