Main

Obesity is a major public health concern that causes or exacerbates many chronic diseases and leads to reduced life expectancy1,2,3. By 2035, more than half of the global population is projected to be living with overweight or obesity4. Although intensive lifestyle interventions (ILIs), bariatric surgery and weight loss medications are effective treatment options5,6, they are not without risk and are likely to remain inaccessible to most people. Thus, preventing obesity remains paramount.

In contrast to many other chronic conditions, obesity often manifests itself during childhood and tends to persist into adulthood7,8,9. Therefore, predictors available in early life, such as genetic variants, which are fixed at conception, could be of particular value. In recent years, polygenic scores (PGSs), which capture an individual’s inherited polygenic susceptibility to a trait or disease, have shown great promise in enhancing disease risk prediction and population screening10,11. However, it remains unclear how, when and under what circumstances PGSs for obesity might demonstrate utility for risk prediction.

The widely used PGS for obesity by Khera et al.12, based on a genome-wide association study (GWAS) of BMI in over 339,000 people of predominantly European ancestry, explains approximately 8.5% of the variation in BMI in adults. However, as a PGS based on one ancestry population may have weak transferability to other ancestry populations13,14, there is a growing recognition that PGSs that represent a broad range of populations are needed to ensure quality healthcare for all15.

By leveraging the results of the largest GWAS meta-analyses for BMI from the Genetic Investigation of ANthropometric Traits (GIANT) consortium and 23andMe, we derived ancestry-specific and multi-ancestry PGSs for BMI and obesity to examine their performance (1) across diverse adult populations, (2) across childhood and adolescence and (3) in the context of ILIs aimed at weight loss (Fig. 1).

Fig. 1: Study overview.

PGSs were constructed from ancestry-specific GWAS summary statistics, using ancestry-specific (PRS-CS) and ancestry-combined (PRS-CSx) approaches. Tuning of the global shrinkage parameter (ϕ) and optimal weights for the linear combination version of PRS-CSx was performed in the UKBB. The best-performing score across multiple ancestries was taken forward to independent validation studies (linear combination version of PRS-CSx with ϕ = 1 × 10⁻²). Population descriptors shown in the figure reflect a combination of self-identified ethnicity and genetic similarity. Created in BioRender: Smit, R. (2025): https://BioRender.com/fglrflj.

Results

Selecting the best-performing PGS

To develop a PGS for BMI, we used GWAS meta-analysis summary statistics for BMI from over 200 studies from the GIANT consortium and 23andMe (Supplementary Tables 1 and 2), excluding eight studies used for tuning parameters and for testing performance (Methods). The GWAS summary statistics included contributions from over 5.1 million people of diverse populations based on a combination of self-identified ethnicity and genetic similarity: 71.1% of participants were of predominantly European ancestry; 14.4% were of Hispanic ethnicity with typically admixed ancestries; 8.4% were of predominantly East Asian ancestry; 4.6% were of predominantly African ancestry (primarily admixed African American populations); and 1.5% were of predominantly South Asian ancestry. We refer to these groups, and to the most closely aligning, genetically inferred population groups from our PGS tuning and testing studies, as being of European-like ancestry (EUR), American-like ancestry (AMR), East Asian-like ancestry (EAS), African-like ancestry (AFR) and South Asian-like ancestry (SAS), respectively, while acknowledging that these groupings oversimplify the actual genetic diversity among participants (see Supplementary Tables 2 and 3 for study-specific population descriptors). Using PRS-CS(x)16,17, we created ancestry-specific and multi-ancestry PGSs leveraging up to 1.3 million common variants. We first identified the optimal genome-wide shrinkage parameter and linear combination weights for PRS-CS(x) that achieved the highest explained variance for BMI in six ancestry subpopulations of the UK Biobank (UKBB)18, including individuals of Middle Eastern-like ancestry (MID). For the EUR-tuning population, we selected a random subset of 20,000 unrelated individuals (Methods and Supplementary Tables 3 and 4).

Overall, a multi-ancestry PGS consisting of a linear combination of five ancestry-specific PGSs (PGSLC) was the best-performing score (Methods, Fig. 2a and Supplementary Tables 5 and 6). In absolute terms, the explained variance of this multi-ancestry PGS for BMI ranged between 7.2% (AFR) and 17.5% (EUR), with a median of 14.0% (Fig. 2a and Supplementary Table 5). The multi-ancestry PGS resulted in a higher explained variance than the PGSs trained only with GWAS summary statistics most closely corresponding to the target population (‘ancestry-matched’) (Supplementary Table 5). This was particularly evident for the populations of African-like and Central/South Asian-like ancestry (2.6-fold and 2.8-fold increase, respectively), consistent with the smaller GWAS sample sizes available for these populations. The performance of a PGS consisting of near-independent, genome-wide significant variants from the overall multi-ancestry GWAS meta-analysis was generally intermediate to that of the ancestry-matched and multi-ancestry PGSs, with the exception of populations of East Asian-like and European-like ancestry, in whom it performed worse than either (Fig. 2a).
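To make the idea of a linear combination of ancestry-specific scores concrete, the following is a minimal sketch and not the PRS-CSx implementation. It assumes that the combination weights are obtained by ordinary least squares of the phenotype on the five ancestry-specific scores in a tuning sample; the function names, the simulated scores and the `true_weights` values are all hypothetical.

```python
import random

def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_combination_weights(scores, phenotype):
    """Least-squares weights (intercept first) for phenotype ~ scores.

    scores: one row per individual, one column per ancestry-specific PGS.
    """
    rows = [[1.0] + list(s) for s in scores]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, phenotype)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def combine_scores(scores, weights):
    """Apply intercept + weighted sum to each individual's scores."""
    return [weights[0] + sum(w * s for w, s in zip(weights[1:], row))
            for row in scores]

# Toy tuning sample (hypothetical): five simulated scores per individual and
# a noise-free phenotype, so the fitted weights recover true_weights.
rng = random.Random(42)
true_weights = [0.0, 0.5, 0.2, 0.1, 0.1, 0.1]
tuning_scores = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(500)]
tuning_bmi = combine_scores(tuning_scores, true_weights)

weights = fit_combination_weights(tuning_scores, tuning_bmi)
pgs_lc = combine_scores(tuning_scores, weights)
```

In real data the ancestry-specific scores are correlated with one another and the phenotype is noisy, so the fitted weights would be estimates specific to the tuning population rather than exact values.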

Fig. 2: Explained variance for BMI in adult populations.

a, Explained variance for BMI in UKBB tuning populations, defined as adjusted R² of the rank-based inverse-normal transformed BMI (by sex) predicted by the PGS, incremental to age, genotyping array and ancestry principal components. Error bars represent 95% confidence intervals (CIs) from 1,000 bootstrap resamples. The ancestry-matched PGS is the best-performing PRS-CS score using ancestry-specific GWAS summary statistics. The genome-wide significant score reflects a weighted sum of near-independent SNPs obtained from approximate COJO multi-SNP analyses of a fixed-effect meta-analysis of all contributing GWASs. The multi-ancestral PGSLC reflects the best-performing PRS-CSx score consisting of a linear combination of five ancestry-specific scores, with weights being specific to the validation population (for example, AFR). Population labels follow PAN-UKBB assignment of genetically determined ancestry. Sample sizes (distinct individuals): African 6,154; Admixed American 971; Middle Eastern 1,553; East Asian 2,660; Central/South Asian 8,005; European 20,000. b, Explained variance for BMI within validation populations, comparing the multi-ancestry PGSLC to a previously published score (PGSKhera) based on a smaller BMI GWAS meta-analysis. Same R² definition and CI estimation as in a. Population descriptors reflect a combination of self-identified ethnicity and genetic similarity. For the MVP’s non-Hispanic Asian (AS) group, the result shown is for the PGSLC using the linear combination weights derived from UKBB-EAS. Sample sizes (distinct individuals), from left to right: AFR 12,263, 2,332, 18,701; AMR 10,281, 8,096; AS 4,201; EAS 1,359; SAS 1,177; EUR 13,673, 69,828, 340,224. c, Separation in BMI, body fat percentage (BF%) and waist-to-hip ratio (WHR) across deciles of the PGSLC within the validation subset of the UKBB participants of European-like ancestry (n ≈ 340,000). All traits were rank-based inverse-normal transformed by sex.
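The explained-variance metric in this caption combines a rank-based inverse-normal transformation with an incremental adjusted R². The sketch below illustrates both pieces under stated assumptions: the rank offset of 3/8 (Blom) is an assumption, as the text does not specify one, and the helper names and BMI values are hypothetical. In practice the transformation would be applied separately within each sex.

```python
from statistics import NormalDist

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def inverse_normal_transform(values, offset=0.375):
    """Map ranks to standard-normal quantiles (offset is an assumption)."""
    n = len(values)
    nd = NormalDist()
    return [nd.inv_cdf((r - offset) / (n - 2 * offset + 1))
            for r in average_ranks(values)]

def adjusted_r2(r2, n, n_predictors):
    """Adjusted R² for a linear model with n_predictors covariates."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)

def incremental_adjusted_r2(r2_full, r2_base, n, p_full, p_base):
    """Adjusted R² of covariates + PGS minus that of covariates alone."""
    return adjusted_r2(r2_full, n, p_full) - adjusted_r2(r2_base, n, p_base)

# Hypothetical BMI values for one sex.
bmi = [31.2, 22.4, 27.9, 24.5, 35.0]
transformed = inverse_normal_transform(bmi)
```

The transformed values keep the ordering of the raw measurements, and the median observation maps to zero.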

Predicting prevalent obesity in adulthood

Having tuned parameters in a subset of the UKBB, we took PGSLC forward to independent validation populations, estimating the prediction accuracy for BMI and obesity in 482,135 participants from the UKBB, the Million Veteran Program (MVP)19,20, the BioMe Biobank21 and the Uganda General Population Cohort (GPC-UGR)22. Individuals were grouped by population group (AMR 22,612, AFR 29,454, EAS 1,617, EUR 423,420, SAS 1,164 and non-Hispanic Asian (AS; an MVP-specific population label) 4,201) (Methods and Supplementary Tables 3 and 4). The prevalence of obesity varied substantially across populations and cohorts, with those having obesity class I or higher (that is, BMI ≥ 30 kg m⁻² in AFR, AMR and EUR populations and BMI ≥ 27.5 kg m⁻² for Asian-like ancestry populations23) ranging from 4.3% for the GPC-UGR to more than 45% for all ancestry subgroups in the MVP and mean BMI ranging from 22.2 kg m⁻² to 30.6 kg m⁻² (Extended Data Fig. 1 and Supplementary Table 4).

The performance of the PGSLC was highest in individuals of European-like ancestry from the UKBB, with an explained variance of 17.6%. Performance was markedly lower for populations with greater proportions of African-like ancestry, with the explained variance being 6.3% and 5.1% in African American populations (from BioMe and MVP, respectively) and 2.2% in the GPC-UGR population from rural southwestern Uganda (Fig. 2b and Supplementary Table 7). Overall, we observed a median explained variance of 10.3% across our testing populations.

Compared to the previously reported PGS by Khera et al. (PGSKhera)12, which was based on a smaller BMI GWAS of up to 339,224 individuals of primarily European ancestry24, we observed a 1.9–2.6-fold increase in explained variance for BMI (Fig. 2b).

Within the UKBB participants of European-like ancestry, the explained variance was marginally higher in males than in females (males: 17.9%; females: 17.3%) and higher in younger compared to older participants (≤50 years: 18.8%; >50 years: 17.0%) (Supplementary Table 8), both directionally consistent with sex-specific and age-specific genetic differences previously observed for BMI25,26. Within this same cohort, an s.d. increase of PGSLC was associated with a 1.98 kg m⁻² higher BMI (equivalent to 5.7 kg of body weight for a 1.7-m tall person) (Supplementary Table 9). The corresponding mean separation across deciles of PGSLC was less pronounced for body fat percentage and waist-to-hip ratio compared to BMI (Fig. 2c), indicating that the score does not equally capture differences in body composition.
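The body-weight equivalent quoted above follows from the definition of BMI as weight divided by height squared; at a fixed height h, a difference in BMI converts to a difference in weight ΔW as follows (a worked check of the stated figures, with ΔW and h as illustrative symbols):

```latex
\Delta W = \Delta\mathrm{BMI} \times h^{2}
         = 1.98\ \mathrm{kg\,m^{-2}} \times (1.7\ \mathrm{m})^{2}
         = 1.98 \times 2.89\ \mathrm{kg}
         \approx 5.7\ \mathrm{kg}
```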

We further evaluated the overall performance and discrimination of the PGS for prevalent obesity. Among European-like ancestry population groups, the PGSLC showed an improved capacity to differentiate between participants with and without obesity (BMI ≥ 30 kg m⁻²) compared to PGSKhera. Specifically, the prevalence in the top 1% of the PGSLC was 69.5% versus 54.9% for PGSKhera and, in the bottom 1%, 1.7% versus 5.1%, respectively (Fig. 3a and Extended Data Fig. 2). Across populations, an s.d. increase in PGSLC was associated with a median 1.9–2.6-fold increase in odds of obesity class I or higher (Fig. 3b). We observed larger effects for more severe obesity (Extended Data Fig. 3 and Supplementary Tables 9 and 10). The area under the receiver operating characteristic curve (AUC) similarly increased with the severity of obesity and neared 0.80 for severe obesity in multiple populations (Fig. 3c). The AUC for PGSLC on its own was consistently larger than those for age and sex, for PGSKhera and, within the UKBB, for self-reported comparative body size at age 10 (Fig. 3c and Supplementary Table 11).

Fig. 3: Prediction of prevalent obesity outcomes in adults.

a, Separation in prevalence of obesity (BMI ≥ 30 kg m⁻²) across 1% groups of PGSKhera and PGSLC within the validation subset of the UKBB participants of European-like ancestry (n ≈ 340,000), with reference lines for the bottom and top 1% groups. Error bars show 95% confidence intervals (CIs) based on the normal approximation to the binomial distribution. The horizontal lines correspond to the average prevalence (black, dotted) and the prevalence of obesity within the top and bottom 1% of PGSKhera and PGSLC (red and blue, respectively). b, Odds ratios with 95% CIs for prevalent obesity class I or higher, per s.d. of PGSKhera and PGSLC, adjusted for age, sex, principal components of ancestry and genotyping array. All PGSs were standardized using the mean and s.d. of the PGS within individuals who did not have obesity class I or higher, to account for differences in prevalence across validation populations. Sample sizes (distinct individuals), from left to right: AFR 12,263, 2,332, 18,701; AMR 10,281, 8,096; AS 4,201; EAS 1,359; SAS 1,177; EUR 13,673, 69,828, 340,224. c, AUC classification of prevalent obesity outcomes in the BioMe Biobank, the MVP and the UKBB. Models including PGSs (included as a continuous predictor) additionally include principal components of ancestry. CBS10, self-reported comparative body size at age 10 years. Restricted to estimates where the number of individuals with the obesity outcome was at least 50.
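The error bars in panel a use the normal approximation to the binomial distribution. A minimal sketch of that calculation is given below; the group size of 3,400 is an assumption (roughly 1% of about 340,000 participants), and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def binomial_normal_ci(prevalence, n, level=0.95):
    """Wald interval: p ± z * sqrt(p * (1 - p) / n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    se = sqrt(prevalence * (1.0 - prevalence) / n)
    return prevalence - z * se, prevalence + z * se

# Top 1% of PGSLC: reported prevalence 69.5%; n = 3,400 is an assumed
# group size, not a figure taken from the paper.
lo, hi = binomial_normal_ci(0.695, 3400)
```

Under these assumptions the interval spans roughly 68.0% to 71.0%. The approximation is less reliable when the prevalence is close to 0 or 1, as in the bottom 1% group.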

After an ancestry adjustment (Methods and Extended Data Fig. 4), the median odds ratio for obesity class I or higher across BioMe’s population groups was 3.6 for those in the top 10% of PGSLC (compared to bottom 90%) and 4.1 for the top 5% (versus those in the bottom 95%) (Supplementary Table 12), with greater separation between these two tail estimates for European-like and Asian-like ancestry groups.

Childhood and adolescence

We investigated whether PGSLC has predictive value at an early age within the Avon Longitudinal Study of Parents and Children (ALSPAC), a geographically homogeneous prospective birth cohort from the southwest of England with follow-up until early adulthood (Methods and Supplementary Table 3)27,28,29. Repeated cross-sectional associations with BMI showed small effects for PGSLC soon after birth with much stronger effects emerging in early childhood, from 0.12 s.d. per s.d. of PGSLC at 12 months to nearly quadruple that size (0.45 s.d. per s.d. of PGSLC) by age 12 years, after which effects plateaued (Fig. 4a). The PGS was also associated, albeit less strongly, with height in early childhood with effects increasing until age 12 years, after which they returned to zero by mid-adolescence, suggesting that genetic predisposition for higher BMI early in life promotes increased body size in general, including postnatal linear growth, but that this early growth does not translate into differences in height after puberty (Extended Data Fig. 5). This is in line with childhood overweight and obesity’s known associations with earlier pubertal timing30. Over time, BMI in children with a higher genetic predisposition (PGS ≥90th percentile) increased at a faster rate than in those with a lower genetic predisposition, most evident after age 2.5 years (Methods, Fig. 4b and Supplementary Table 13). Both boys and girls with a very early age of adiposity rebound (≤43 months), a well-established predictor of future obesity risk31, had a higher mean PGS than those with later ages of adiposity rebound (Supplementary Table 14).

Fig. 4: PGSLC performance during childhood and adolescence.

a, Repeated cross-sectional linear regression associations of standardized PGSLC with BMI and height, with both standardized within sample by sex and timepoint. Data are presented as regression coefficient with 95% confidence interval (CI). Ponderal index was used instead of BMI at birth. Associations were adjusted for age and principal components of ancestry. Sample sizes, based on repeated measurements, from left to right, for BMI: 4,740, 638, 847, 814, 769, 41, 732, 729, 725, 720, 699, 5,816, 4,863, 5,570, 5,368, 5,187, 4,910, 4,556, 4,024, 3,603, 2,780; and height: 4,802, 638, 847, 814, 771, 741, 734, 729, 726, 723, 701, 5,820, 5,160, 5,572, 5,379, 5,188, 4,956, 4,561, 4,032, 3,606, 2,782. b, Sequentially plotted mean BMI trajectories from the age of 4 months to 24 years with knot points from linear spline multilevel models, accounting for sex and principal components of ancestry, of PGSLC (bottom 10%, middle 80%, top 10%). c, Contribution of PGSLC to explained variance (adjusted R²) for BMI, rank-based inverse-normal transformed by sex and timepoint. Ponderal index was used instead of BMI at birth. Data are presented as R² values computed from the original dataset, with error bars representing 95% confidence intervals (2.5th–97.5th percentiles) estimated from 1,000 bootstrap resamples. Predictors available at birth were birthweight, maternal education, pre-pregnancy maternal BMI, maternal age at date of birth and household social status. The left panel shows explained variance for BMI at the timepoint shown on the x axis. In contrast, the right panel shows explained variance for BMI measured at 18 years, with early-life BMI measurement shown on the x axis used as predictors. Sample sizes, based on repeated measurements, from left to right: 3,800, 839, 714, 1,192, 4,062, 3,044, 940, 594, 737, 725, 890, 3,310.
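The percentile bootstrap described for panel c can be sketched as follows. This is an illustration only: it uses the squared Pearson correlation between a single predictor and the outcome as a stand-in for the adjusted R² of the full models, and the simulated data, effect size and function names are hypothetical.

```python
import random
from statistics import fmean

def pearson_r2(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

def bootstrap_percentile_ci(x, y, stat, n_resamples=1000, alpha=0.05, seed=0):
    """Resample individuals with replacement; return the 2.5th and 97.5th
    percentiles of the statistic (for the default alpha of 0.05)."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(stat([x[i] for i in idx], [y[i] for i in idx]))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Simulated standardized PGS and outcome (hypothetical effect size of 0.4).
rng = random.Random(1)
pgs = [rng.gauss(0.0, 1.0) for _ in range(200)]
outcome = [0.4 * g + rng.gauss(0.0, 1.0) for g in pgs]

point_estimate = pearson_r2(pgs, outcome)   # computed on the original data
ci_low, ci_high = bootstrap_percentile_ci(pgs, outcome, pearson_r2)
```

As in the caption, the point estimate is taken from the original dataset and only the interval comes from the resamples.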

We then examined whether the PGS adds predictive value over and above clinically available predictors of obesity. When added to predictors that are measurable at birth (birthweight, maternal education, pre-pregnancy maternal BMI, maternal age at date of birth and household social status), the PGS showed no clear added value for predicting ponderal index at birth or BMI at ages 3 years and 5 years (Methods and Fig. 4c, left panel). However, for predicting BMI at later ages (8, 11 and 15 years), the contribution of the predictors at birth plateaued, whereas inclusion of the PGS roughly doubled the total explained variance from 11% to 21% at age 8 and from 13% to 26% at age 15 (Fig. 4c, left panel).

For the prediction of BMI in early adulthood, the contribution of the PGS was most apparent in the first few years after birth. BMI measured at age 8 explained 44% of the variation in BMI at age 18, and adding the PGS to the prediction model only raised this to 49%. In contrast, at younger ages, adding the PGS to measured BMI led to larger relative increases in the explained variance for BMI at 18 years, from 1.5-fold at age 5 (from 22% to 35%) to more than three-fold (from 8% to 26%) at 1 year of age (Fig. 4c, right panel, and Supplementary Table 15).

From early adulthood to middle life

Within the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial (Methods)32, both males and females with a higher PGS experienced a greater change in self-reported weight and BMI between age 20 and age 50. Across populations, the per-5-year change per s.d. in PGSLC was consistently larger for females than for males (median of 0.35 kg for females versus 0.21 kg for males and 0.14 kg m⁻² versus 0.07 kg m⁻² for BMI), including after correction for initial weight or BMI (Supplementary Table 16).

The addition of the PGS to a model with birth year, sex and BMI at age 20 was associated with modest increases in discriminative ability for predicting obesity at age 50 (incremental AUC ranged from 0.01 to 0.03; Fig. 5). There was evidence of statistical interaction between the PGS and BMI at age 20 for obesity at age 50 (Methods and Supplementary Table 17), which translated into a larger incremental AUC of the PGS for those who did not have overweight or obesity at age 20 (range, 0.02–0.05; Fig. 5), without strong evidence of sex specificity (Extended Data Fig. 6).
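An incremental AUC is the difference in AUC between a model that includes the PGS and the same model without it. The sketch below computes the AUC as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case (the Mann-Whitney formulation, with ties counted as one half); the predicted risks and labels are hypothetical and do not come from the PLCO data.

```python
def auc(scores, labels):
    """AUC as the fraction of case/non-case pairs ranked correctly.

    scores: predicted risks; labels: 1 for obesity at follow-up, else 0.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = sum(1.0 if c > d else 0.5 if c == d else 0.0
               for c in cases for d in controls)
    return wins / (len(cases) * len(controls))

def incremental_auc(scores_with_pgs, scores_without_pgs, labels):
    """AUC of the model with the PGS minus AUC of the base model."""
    return auc(scores_with_pgs, labels) - auc(scores_without_pgs, labels)

# Hypothetical predicted risks for four individuals.
labels = [0, 0, 1, 1]
base_model = [0.10, 0.40, 0.35, 0.80]   # e.g. birth year, sex, BMI at 20
full_model = [0.10, 0.30, 0.35, 0.80]   # base model plus PGS

base_auc = auc(base_model, labels)
delta = incremental_auc(full_model, base_model, labels)
```

The pairwise loop is quadratic in sample size, which is adequate for illustration; a rank-based computation would be preferable for cohorts of realistic size.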

Fig. 5: AUC for obesity classification at age 50, stratified by having overweight or obesity at age 20.

Analyses performed within the PLCO Cancer Screening Trial. Restricted to estimates where the number of individuals with the obesity outcome was at least 50. Population descriptors were provided by the PLCO investigators and reflect genetically determined ancestry using Genetic Relationship and Fingerprinting (GRAF). PCs, principal components; y, years; M1, model with birth year, sex, BMI at 20y, and PCs.

Response to ILIs

Lifestyle modification remains a cornerstone of weight management, with multicomponent ILIs having shown success in achieving clinically relevant weight loss33. However, ILIs require substantial, sustained investment from both healthcare providers and patients, with large variation in weight loss and unintentional weight regain being common34,35,36. To examine whether polygenic predisposition to obesity might modify the effectiveness of such interventions, we examined whether the PGS was associated with weight loss during the first year of an ILI, and with weight regain thereafter, in two randomized controlled trials with similar intervention arms aimed at achieving 7% weight loss: the Diabetes Prevention Program (DPP33) and the Look AHEAD (Action for Health in Diabetes37) studies (Methods).

Among 3,909 participants (Supplementary Table 4), individuals with a higher PGSLC lost more weight during the first year in response to the ILI compared to the control group (−0.55 kg per s.d. in PGS, 95% confidence interval: −0.94 to −0.16) (Fig. 6), when adjusting for starting weight. In addition, among those who lost at least 3% of their baseline weight during the first year, a higher PGS was associated with more weight regain in the following years (up to 3 years) (0.48 kg per PGS s.d., 95% confidence interval: 0.00–0.95) (Fig. 6). These findings were directionally consistent across (self-)reported population groups.

Fig. 6: Impact of PGSLC on weight change due to ILIs.

Interaction effect between PGSLC and trial arm (ILI versus comparison arm) for weight change within the first year of follow-up (left) and weight change after the first year in the subset of individuals who had lost ≥3% of baseline weight at year 1 (right). Data are presented as study-specific and study-combined interaction regression coefficients (with 95% confidence interval (CI)), pooled through inverse-variance weighted fixed-effect meta-analysis. Weight change was assessed by using weight at follow-up timepoint(s) as outcome in the linear mixed models, adjusting for initial weight (baseline or year 1, respectively). LA, Look AHEAD. Population descriptors reflect race and ethnicity, as reported by dbGAP variable phv00201855.v2.p1 (DPP) or self-report by participants (LA). Study-specific sample sizes (distinct individuals, with pooled sample sizes for ‘All’), from top to bottom, for forest plot on the left: 295, 374, 251, 428, 839, 1,722, 1,385, 2,524; and on the right: 118, 186, 128, 233, 428, 951, 674, 1,370.
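The pooling step named in this caption, inverse-variance weighted fixed-effect meta-analysis, can be sketched as follows. The per-study coefficients and standard errors in the example are hypothetical and are not the DPP or Look AHEAD estimates; the function names are likewise illustrative.

```python
from math import sqrt
from statistics import NormalDist

def se_from_ci(lower, upper, level=0.95):
    """Recover a standard error from a symmetric confidence interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return (upper - lower) / (2 * z)

def fixed_effect_meta(betas, ses, level=0.95):
    """Pool estimates with weights 1 / se^2; return (beta, se, (lo, hi))."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = sqrt(1.0 / sum(weights))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return pooled, pooled_se, (pooled - z * pooled_se, pooled + z * pooled_se)

# Two hypothetical study-level interaction coefficients (kg per s.d. of PGS).
betas = [-0.4, -0.6]
ses = [0.2, 0.2]
pooled, pooled_se, ci = fixed_effect_meta(betas, ses)

# The pooled first-year CI reported in the text (-0.94 to -0.16) implies a
# standard error of about 0.20 kg per s.d.
implied_se = se_from_ci(-0.94, -0.16)
```

With equal standard errors the pooled estimate is the simple average of the two coefficients, and its standard error shrinks by a factor of √2.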

Discussion

We used GWAS summary statistics from the GIANT consortium and 23andMe, encompassing over 5.1 million people, to create ancestry-specific and multi-ancestry PGSs capturing genetic predisposition to weight gain and obesity. Our multi-ancestry PGS more than doubled the explained variance for BMI compared to the widely used PGS by Khera et al.12. Additionally, we demonstrate the potential added value of the PGS in two distinct clinical applications: predicting adult BMI at an early age and weight change in response to ILIs.

Across diverse populations, the increases in explained variance for BMI were accompanied by substantial improvements in effect sizes and discrimination metrics for obesity, with the PGSs’ standalone AUC nearing 0.80 for severe obesity. Consistent with other polygenic traits, our results underscore the importance of training sample size in driving PGS performance improvements. Nonetheless, our results also show diminishing returns of sample size increases, as has been observed for other polygenic traits38, with a 15-fold increase in total sample size (compared to PGSKhera, ~10-fold in EUR alone) leading to a 2–3-fold increase in prediction accuracy. We leveraged PRS-CSx’s unique ability to integrate GWAS summary statistics from all five broad ancestry population groups from the meta-analyses from GIANT and 23andMe17. However, recent Bayesian methods have shown promising results with smaller sample sizes by including additional variants and functional annotation, meriting consideration for future comparisons39,40. Although our results highlight the value of multi-ancestry PGS to increase PGS performance, a considerable performance gap persists for populations with substantial African-like genetic ancestry compared to other populations. This discrepancy is likely due to the underrepresentation of individuals with African-like ancestry in the training GWAS, particularly from continental Africa41, and differences in minor allele frequency (MAF) and linkage disequilibrium (LD) patterns42,43. The lower performance of the PGS in the GPC-UGR is in line with that seen therein for PGSs for other traits and with the limited transferability of PGS performance between African ancestry populations, reflecting genetic and environmental differences as well as their interactions22,44,45,46,47. This emphasizes that much remains to be done to improve the performance of genetic risk prediction across the genetically diverse populations within Africa. For example, it remains unclear whether a finer classification of broad ancestry groups in GWAS meta-analyses and PGS development could improve downstream PGS generalizability, especially given the dominant role of GWAS sample size. Given the observed performance gap, careful implementation of the score is essential to avoid potentially introducing public health disparities. For instance, establishing a standardized ancestry threshold14,48 may improve the accuracy of population-specific risk stratification and ensure that the score is returned only in populations where it meets an acceptable performance threshold.

Within ALSPAC, the PGS, which was developed for adult BMI, already shows an effect on BMI early in life. Effect sizes rapidly increased with age and were accompanied by clear divergence of BMI trajectories throughout childhood and adolescence. As a result, the PGS showed added value beyond clinically available predictors assessed at birth for predicting BMI measured after age 5. This is in line with previous research on the dynamic relationship between genetic variation and adiposity in early life12,49,50. Particularly elucidating are the observations from the Norwegian Mother, Father and Child Cohort Study (MoBa) with repeated measurements across narrow age windows51. They show that BMI has a rapidly changing genetic architecture during the first 8 years of life with a ‘late rise’ cluster of variants, emerging in late childhood, which show limited association before the adiposity rebound but have persistent effects on BMI into adult life51. This may also explain why our PGS shows its value for predicting adult BMI during the earliest years of life, up to age 5, when measured BMI has limited value as a predictor. With a growing number of obesity prevention trials commencing in early life52,53, our results provide important context on the age(s) at which PGSs for obesity, potentially in tandem with PGSs of obesity-related complications, could be considered promising candidate predictors to help guide risk stratification and the implementation of such interventions.

In the PLCO study, a higher PGS was associated with weight gain over a 30-year period from early adulthood to midlife. Additionally, for individuals aged 20, knowing their PGS modestly improved the prediction of obesity at age 50, more so for those without overweight or obesity at age 20. This suggests that individuals with an early adulthood BMI below their genetically predicted BMI are particularly at risk of gaining weight to match their innate predisposition. Our findings are consistent with those from the CARDIA study54, showing that PGSs for obesity only marginally improve the prediction of midlife BMI when combined with early adulthood BMI measurements, contrasting with their predictive value in early life.

Analyzing clinical trial data of ILIs, we observed that individuals with a higher PGS lost modestly more weight during the first year. However, this group was also at higher risk of weight regain after this most intensive portion of the intervention had concluded. This may seem counterintuitive given the expectation that those with a higher genetic risk will benefit less from weight loss interventions, but the observation is supported by a strong body of literature reporting that those most genetically predisposed to obesity are also those most responsive to changes in an obesogenic environment55,56,57,58,59,60,61,62,63. The literature on the role of genetics in modifying response to weight loss interventions nevertheless remains sparse and ambiguous, reflecting differences in variants considered, methods and study design, highlighting the need for well-powered PGSs64,65,66,67,68,69,70,71,72,73. It will be of interest to investigate whether PGSs for obesity have predictive utility for the fast-growing arsenal of pharmacotherapies aimed at weight loss. Separately, the current results offer crucial context when conveying genetic results of obesity risk to participants, which has generally shown either no or even short-lived adverse effects on risk-reducing behavior74,75,76,77. Rather, our findings emphasize that individuals with a high genetic predisposition to obesity may respond more to lifestyle changes and, thus, contrast with the determinist view that genetic predisposition is unmodifiable78. This is further reinforced by evidence that polygenic susceptibility can mitigate or exacerbate the impact of pathogenic variants in MC4R, underscoring the complexity of genetic influences on obesity79. We look forward to the findings of the Electronic MEdical Records and GEnomics (eMERGE) network’s PGS-based genome-informed risk assessment being returned to 25,000 diverse adults and children for 11 conditions, including obesity/BMI14.

Taken together, we show that BMI PGSs can be used for prediction of adult obesity throughout the life course, particularly in early life, and for severe obesity. This PGS represents a substantial improvement compared to previous scores and may help to identify individuals at high risk to allow for timely prevention or treatment of obesity, such as through integration in a broader predictive framework jointly modeling genetic and environmental risk.

Methods

Study populations

The list of studies contributing to the GWAS meta-analyses for BMI, which served as the training data for the presented PGSs, is provided in Supplementary Table 1. The study populations described below are those that contributed as a PGS tuning or validation population.

The UKBB is a prospective cohort study that enrolled approximately 500,000 people from across the UK, aged 40–69 years at recruitment, between 2006 and 2010 (ref. 18). At recruitment, participants completed detailed questionnaires, underwent a range of physical measures and provided blood, urine and saliva samples. BMI was calculated using height (measured in whole centimeters) and weight (to the nearest 0.1 kg). Females who were pregnant at the time of assessment were excluded from our analysis. As a measure of comparative childhood body size, participants were asked: ‘When you were 10 years old, compared to average, would you describe yourself as: (i) thinner, (ii) plumper, (iii) about average?’.

The MVP has recruited over 1 million people from Veteran Affairs (VA) Medical Centers across the United States since 2011 (ref. 19). Veterans who volunteer provide a blood sample for biobanking, complete baseline and lifestyle questionnaires and consent to allow access to clinical data from VA electronic health records (EHRs). For the current study, PGS performance was assessed in the MVP’s third release of genomic data (R3), using additional participants who did not overlap with the earlier data releases that contributed to the GIANT GWAS meta-analyses. Phenotypic and genomic data related to the association analysis of BMI were previously described20.

The Institute for Personalized Medicine BioMe Biobank, founded in 2007, is an ancestrally and culturally highly diverse EHR-linked biorepository enrolling participants non-selectively from across the Mount Sinai Health System in New York City21. At enrollment, participants consent to link their DNA and plasma samples to deidentified EHRs. The clinical and EHR information is complemented by a baseline questionnaire that gathers demographic and lifestyle information. BMI was calculated using weight and height from baseline and outpatient EHR measurements, using median height and weight measured at or within 60 days before/after the enrollment visit, after several cleaning steps (Supplementary Table 3).

The GPC-UGR is a population-based open cohort study established in 1989 by the Medical Research Council (UK) in collaboration with the Uganda Virus Research Institute (UVRI) to monitor the HIV epidemic and its determinants in neighboring villages in rural southwestern Uganda, in the Kyamulibwa subcounty of the Kalungu district, approximately 120 km from Entebbe town. Since 2010, its mandate has expanded to incorporate the epidemiology and genetics of both communicable and non-communicable diseases22. The GPC-UGR was initially recruited and assessed through annual house-to-house census and survey rounds until 2012, when biannual surveys commenced. For the current study, we included individuals aged 18 years and older who have been either whole-genome genotyped or whole-genome sequenced80,81.

The ALSPAC is a prospective birth cohort from the southwest of England established to investigate environmental and genetic characteristics that influence health, development and growth of children and their parents27,28,29. Full details of the cohort and study design are available at http://www.alspac.bris.ac.uk. In brief, pregnant females residing in Avon, UK, with expected dates of delivery between 1 April 1991 and 31 December 1992 were invited to take part in the study. The initial number of pregnancies enrolled was 14,541, with 13,988 children who were alive at 1 year of age. The children resulting from these pregnancies have been followed-up to date with measures obtained through regular questionnaires and clinical visits, providing information on a range of behavioral, lifestyle and biological data. More specifically, a 10% sample of the ALSPAC cohort, known as the Children in Focus (CiF) group, attended clinics at the University of Bristol at various time intervals between 4 months and 61 months of age, whereas the entire ALSPAC cohort was invited to attend regular research clinics from age 7. When the oldest children were approximately age 7, an attempt was made to bolster the initial sample with eligible individuals who had failed to join the study originally. The total sample size available for analyses using any data collected after the age of 7 is, therefore, 15,447 pregnancies, with 14,901 children who were alive at 1 year of age. Study data were collected and managed using Research Electronic Data Capture (REDCap) tools hosted at the University of Bristol82. REDCap is a secure, web-based software platform designed to support data capture for research studies. Details on sample selection for the current analyses are presented in Supplementary Table 3. 
Please note that the study website contains details of all data that are available through a fully searchable data dictionary and variable search tool (http://www.bristol.ac.uk/alspac/researchers/our-data/).

The PLCO Cancer Screening Trial was a multicenter randomized controlled trial in the United States that enrolled males and females aged 55–74 years from 1993 to 2001 to evaluate the effectiveness of different screening programs on cancer mortality32. In the baseline questionnaire, participants self-reported their weight at age 20 years, age 50 years and baseline as well as their height. For the current study, we restricted the analyses to genotyped participants who had not contributed to the underlying GWAS meta-analyses and had information available on their weight and BMI at both age 20 and age 50.

The DPP was a 27-site parallel-arm randomized controlled trial designed to determine whether either the oral diabetes drug metformin or an ILI (primarily fat gram, calorie and physical activity goals) aimed at approximately 7% weight loss, compared to inactive tablets and standard lifestyle recommendations, could prevent or delay type 2 diabetes onset in ethnically diverse high-risk individuals with prediabetes and overweight or obesity33. For the current study, we excluded individuals from the metformin arm as well as those from a fourth intervention arm, troglitazone, which was discontinued at an early stage. Moreover, we focused on the individuals comprising the three largest race and ethnicity categories (White, Black and Hispanic, as reported by database of Genotypes and Phenotypes (dbGAP) variable phv00201855.v2.p1) who had consented to being genotyped.

Look AHEAD was a 16-site parallel-arm randomized controlled trial that assessed the long-term effects of an ILI in ethnically diverse patients with overweight or obesity and type 2 diabetes on cardiovascular morbidity and mortality37. The lifestyle intervention was modeled after the DPP lifestyle intervention and similarly aimed at 7% weight loss but with more ambitious individual goals for several intervention components83. Although those in both trial arms were provided one session of education on diabetes and cardiovascular risk at baseline, the comparison arm thereafter received the option of attending only three sessions per year on nutrition, physical activity and social support, with no explicit weight loss goals. As for the DPP, we restricted our analyses to genotyped participants who self-reported as being of African American/Black, Hispanic and White race and ethnicity on the baseline questionnaire.

Population descriptors and study-provided population labels and genotyping, phenotyping and participant characteristics are presented in Supplementary Tables 3 and 4. No new data (that is, measurements) were collected for this study, and no participant compensation was provided for the analyses conducted.

Our research complies with all relevant ethical regulations. The UKBB study was approved by the North West Multi-Centre Research Ethics Committee (ref. 11/NW/0382), and all participants provided written informed consent to participate in the UKBB study. The VA central institutional review board (IRB) approved the MVP study protocol in accordance with the principles outlined in the Declaration of Helsinki. Informed consent was obtained from all participants. The BioMe Biobank Program (IRB no. 07-0529) operates under a Mount Sinai IRB-approved research protocol. All study participants provided written informed consent. The Uganda Genome Resource was approved by the Science and Ethics Committee of the UVRI Research and Ethics Committee (UVRI-REC no. HS 1978), the Uganda National Council for Science and Technology (UNCST no. SS 4283) and the East of England-Cambridge South (formerly Cambridgeshire 4) NHS Research Ethics Committee UK. Ethical approval for the study was obtained from the ALSPAC Ethics and Law Committee and the local research ethics committees. Consent for biological samples has been collected in accordance with the Human Tissue Act (2004). Informed consent for the use of data collected via questionnaires and clinics was obtained from participants following the recommendations of the ALSPAC Ethics and Law Committee at the time. The PLCO study was approved by the human subjects review boards at the National Cancer Institute and at the 10 study centers (National Institutes of Health IRB no. OH97CN041). Written informed consent was given by all participants. The DPP and Look AHEAD trials were approved by the IRB at each center, and all participants gave written informed consent.

PGS construction and tuning

To derive our scores, we used summary statistics from the ongoing GWAS meta-analyses for BMI conducted by the GIANT consortium (https://portals.broadinstitute.org/collaboration/giant/index.php/GIANT_consortium) in collaboration with 23andMe. Prior to each GWAS, BMI (kg m−2) underwent rank-based inverse-normal transformation by sex, similarity to major ancestry group (AFR, AMR, EAS, EUR, SAS) and case–control status as appropriate, with age, age², principal components of population structure and study-specific covariates regressed out. Ancestry group assignment was defined independently by each contributing study, as previously described84. Additive GWAS analyses were performed primarily using RVTESTS for 1000 Genomes Project (1000G) phase 3 or Haplotype Reference Consortium (HRC)-imputed data85. For 23andMe, which performed its GWAS on BMI using self-reported height and weight, see Supplementary Table 2 for details on phenotyping, ancestry group assignment and GWAS methodology. Quality control of the study-specific files was conducted using EasyQC86, followed by fixed-effect meta-analysis by ancestry group using RAREMETAL87. When performing the GWAS meta-analyses used to train the PGSs, we purposely excluded data from several populations (ALSPAC, BioMe and UKBB). No additional genomic control was applied to the resulting summary statistics.
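As an illustration of the rank-based inverse-normal transformation applied within strata, a minimal Python sketch is given here. The function names, the rank offset `c = 0.5` and the averaging of tied ranks are assumptions made for illustration and are not taken from the contributing studies' pipelines; the covariate regression step is not shown.

```python
from statistics import NormalDist


def rank_inverse_normal(values, c=0.5):
    """Rank-based inverse-normal transform; tied values share their average rank.

    The offset c is an assumption: c=0.5 maps rank r to (r - 0.5) / n,
    whereas the Blom variant would use c=3/8.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the block of values tied with position i.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    nd = NormalDist()
    return [nd.inv_cdf((r - c) / (n - 2 * c + 1)) for r in ranks]


def transform_by_group(values, groups):
    """Apply the transform separately within each stratum (for example, sex)."""
    out = [None] * len(values)
    for g in sorted(set(groups)):
        idx = [i for i, x in enumerate(groups) if x == g]
        z = rank_inverse_normal([values[i] for i in idx])
        for i, zi in zip(idx, z):
            out[i] = zi
    return out
```

Transforming within strata means that each stratum ends up with an approximately standard normal phenotype distribution, so between-stratum differences in mean BMI do not enter the association analysis.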

For PGS derivation, we employed PRS-CS and its multi-ancestral extension PRS-CSx (versions 1.0.0)16,17. In short, PRS-CS(x) uses a Bayesian regression framework and assumes a continuous shrinkage prior on single-nucleotide polymorphism (SNP) effects. Whereas PRS-CS focuses on creating ancestry-specific scores, informed only by the GWAS summary statistics and LD information of a given population, PRS-CSx uses a shared prior to couple SNP effects across populations and explicitly models population-specific allele frequencies and LD patterns to create multi-ancestry scores. Both methods were implemented using the software’s default settings (that is, gamma–gamma priors set to a = 1 and b = 0.5, Markov chain Monte Carlo (MCMC) total and burn-in iterations set at 1,000 times and 500 times the number of discovery populations and MC thinning factor set at 5), with input being ancestry-matched summary statistics and PRS-CS(x) developer-provided 1000G LD reference panels of common HapMap3 variants (AFR, AMR, EAS, EUR, SAS). PRS-CS(x) uses a global scaling parameter (Φ), which requires tuning. In addition to small-scale grid testing of Φ (1 × 10−6, 1 × 10−4, 1 × 10−2, 1), we ran PRS-CS(x) using its ‘auto’ option, which learns Φ automatically from the GWAS summary statistics. Finally, PRS-CSx was performed separately using its default ‘linear combination’ as well as its ‘meta’ version. In the first, a set of population-specific PGSs is outputted (that is, five in our case), for which the optimal linear combination needs to be derived for each target population (for example, AFR) in a population-matched tuning dataset. For ‘meta’, the population-specific posterior SNP effects are integrated using an inverse-variance weighted meta-analysis in the Gibbs sampler, producing a single score that can theoretically be applied regardless of the target population.

The variants considered for PGS derivation were those present in the 1000G reference panels mentioned above (HapMap3 variants (n = 1–1.2 million across panels), which were common in 1000G populations, excluding ambiguous A/T or G/C variants), which additionally had an INFO > 0.3 (that is, imputation quality) in the overall UKBB. Ancestry-specific summary statistics were further restricted to the variants with a variant-specific sample size of at least a third of the maximum sample size for that ancestry.
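The filters above can be sketched as a single pass over variant records. This is a simplified illustration only: the record fields (`a1`, `a2`, `info`, `n`) are hypothetical names, and in the actual workflow the strand-ambiguity exclusion was a property of the reference panel rather than a separate step.

```python
# Strand-ambiguous allele pairs (A/T and G/C in either order).
AMBIGUOUS = {("A", "T"), ("T", "A"), ("C", "G"), ("G", "C")}


def filter_variants(rows, min_info=0.3):
    """Keep variants that pass the three filters described in the text.

    rows: list of dicts with hypothetical keys 'a1', 'a2', 'info' and 'n'
    (variant-specific sample size in the ancestry-specific GWAS).
    """
    max_n = max(r["n"] for r in rows)
    kept = []
    for r in rows:
        if (r["a1"], r["a2"]) in AMBIGUOUS:
            continue  # strand-ambiguous variant
        if r["info"] <= min_info:
            continue  # imputation quality must exceed the threshold
        if 3 * r["n"] < max_n:
            continue  # sample size below a third of the maximum
        kept.append(r)
    return kept
```

The sample-size comparison is written with integers (`3 * n < max_n`) to avoid floating-point rounding at the one-third boundary.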

Tuning of Φ and derivation of PRS-CSx’s linear combination weights were performed in the same subsets of the UKBB. More specifically, we used genetically determined relatedness and ancestry assignments provided by the Pan-UKBB Team (https://pan.ukbb.broadinstitute.org, UKBB return 2442) to subset the UKBB into groups of unrelated individuals of African-like, Admixed American-like, Central/South Asian-like, East Asian-like, European-like (random subset of 20,000 individuals) and Middle Eastern-like ancestry. The target-population-specific (for example, AFR) linear combination weights for PRS-CSx (for each Φ) were derived from a joint linear regression model of rank-based inverse-normal transformed BMI (by sex and population) on the five population-specific scores (each standardized to mean zero and unit variance in the corresponding tuning dataset—for example, UKBBAFR), age, 10 principal components and genotyping array.

Explained variance for BMI was defined as the incremental adjusted R2 from linear regression for rank-based inverse-normal transformed BMI (by sex and population), when adding the PGS predictor to a model containing age, principal components and genotyping array. Confidence intervals (95%) were determined using bootstrapping with 1,000 repetitions. As a comparator, we created a score of quasi-independent associations after performing approximate conditional and joint (COJO) multiple-SNP analysis88, as implemented in GCTA89, using the results from the multi-ancestral GWAS summary statistics when leaving out only the UKBB. We used genotypes from a previously described set of 50,000 unrelated UKBBEUR participants84 as our LD reference panel for GCTA-COJO, with parameters set to: --diff-freq 0.1; --maf 0.01; --cojo-collinear 0.9; --cojo-p 5 × 10−9. The scores with the highest explained variance across multiple tuning population datasets (PRS-CSx linear combination and meta versions, with Φ 1 × 10−2) were then taken forward to the validation populations, which performed study-level variant filtering on imputation quality when constructing each PGS (Supplementary Table 3). The results of PGSmeta are presented throughout the supplementary tables.
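For reference, the incremental adjusted R2 can be computed from the R2 values of the two nested models as sketched below. The function names are hypothetical, and the sketch assumes that the R2 of the covariate-only and covariate-plus-PGS models have already been obtained from the fitted regressions; the bootstrap step is not shown.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R2 for a linear model with n observations and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


def incremental_adjusted_r2(r2_base, r2_full, n, p_base, n_added=1):
    """Gain in adjusted R2 when n_added predictors (here, the PGS) are added
    to a base model with p_base covariates."""
    return adjusted_r2(r2_full, n, p_base + n_added) - adjusted_r2(r2_base, n, p_base)
```

With 1,001 participants, 10 covariates, a base R2 of 0.02 and a full-model R2 of 0.12, the incremental adjusted R2 is approximately 0.100, that is, about 10% of the variance.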

Of note, in August 2023, a new version of PRS-CSx was released (version 1.1.0) to address potential reductions in performance due to truncating of GWAS summary statistics P values below 1 × 10−323. Given our inclusion of several variants with P values below this threshold (among the PGS variants: n = 154 for EUR, n = 12 for AMR), we reran PRS-CSx with Φ 1 × 10−2 but did not observe markedly different results in PGS performance within the UKBB tuning populations (Supplementary Table 18).

PGS performance in adulthood

In addition to explained variance (%) for BMI, several population-stratified metrics for overall performance and discrimination were calculated in the validation studies with adult participants:

A. Nagelkerke R2, with a null model of age, sex, principal components and genotyping array, for prevalent obesity categories based on World Health Organization (WHO) BMI cutoff points (membership of category or above versus rest), using Asian-specific cutoffs where applicable (Supplementary Table 3)23. Confidence intervals (95%) were calculated using bootstrapping with 1,000 repetitions.

B. Mean difference in BMI, both untransformed and after rank-based inverse-normal transformation by sex, per s.d. increase in PGS. Covariates included age, sex (where applicable), principal components and genotyping array.

C. Odds ratio per s.d. of PGS for prevalent WHO obesity outcome. Same covariates as under (B). Here, the PGSs were standardized to mean zero and unit variance using the mean and s.d. of the PGS observed within participants without the outcome (for example, individuals who do not fall within obesity class I (or higher), where this was the outcome), to account for prevalence differences across studies.

D. AUC statistic for prevalent WHO obesity outcomes. We compare models including (i) age and sex, (ii) genotyping array, principal components and PGSKhera, (iii) genotyping array, principal components and PGSLC and (iv) models (i)/(iii) combined. In the UKBB, we additionally ran models including and excluding self-reported comparative body size at age 10 (UKBB field ID 1687). Confidence intervals (95%) were calculated using DeLong’s method.
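The standardization used for metric (C) can be sketched as follows. The function name is hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, as the text does not specify which estimator was used.

```python
from statistics import mean, stdev


def standardize_on_controls(pgs, has_outcome):
    """Scale all scores by the mean and s.d. observed in participants
    without the outcome, so that 'per s.d.' refers to the same reference
    distribution regardless of outcome prevalence."""
    controls = [s for s, y in zip(pgs, has_outcome) if not y]
    mu, sd = mean(controls), stdev(controls)
    return [(s - mu) / sd for s in pgs]
```

Because cases are excluded when estimating the mean and s.d., a study with a high prevalence of obesity does not inflate the scaling unit relative to a study with a low prevalence.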

In addition, the explained variance for BMI was calculated separately in age strata (≤50 years or >50 years, with adjustment for residual age differences within age strata) and sex strata of the European UKBB participants. The main comparator for all analyses was a previously published LDpred-derived PGS for BMI (PGSKhera)12, which was based on the results from a GIANT consortium GWAS meta-analysis including up to 339,224 individuals of primarily European ancestry24. As BioMe had contributed to this meta-analysis, metrics for PGSKhera were determined in individuals who had not contributed nor were related to those who did (second or stronger degree of relatedness, based on KING-derived kinship coefficients). We chose PGSKhera as our comparator as it is the most widely recognized and extensively used PGS for BMI within the obesity research community, thereby serving as a well-established benchmark.

Tail comparisons after ancestry correction

Due to on-average differences in LD structure and allele frequencies, the distribution of a PGS (for example, mean and variance) can differ across ancestral populations. This was also observed for the five population-specific PGSs that were linearly combined to create PGSLC (Extended Data Fig. 4). As performance metrics that rely on thresholding the PGS distribution (for example, top 10%) may be impacted by such on-average differences90, particularly in admixed populations, we modeled the mean and variance of the population-specific PGSs through principal component analysis of the 1000G reference panel. For this, we applied a correction described by the eMERGE network14,91, which represents a modified version of the approach by Khera et al.92. Starting with a publicly available, curated version of the 1000G reference panel (https://broadinstitute.github.io/warp/docs/Pipelines/Imputation_Pipeline/references_overview; gs://broad-gotc-test-storage/imputation/1000G_reference_panel), we first excluded variants with MAF < 1% across 1000G superpopulations, before determining the overlap with genotyped variants available for both the UKBB and BioMe. After excluding long-range LD regions, we pruned the list of variants using plink (--indep-pairwise 1000 50 0.05) and ran principal component analysis with flashpca93, whereafter we projected UKBB and BioMe participants to the same 1000G principal component space. For each of the five population-specific PGSs underlying PGSLC (that is, AFR, AMR, EAS, EUR and SAS), we regressed the PGS of 1000G participants against the first 10 principal components of ancestry:

$${PGS}={\alpha }_{0}\,+\,\mathop{\sum }\limits_{i=1}^{10}{\alpha }_{i}\times {{PC}}_{i}$$

In addition, we modeled its residual variance (\({\delta }^{2}\)) as a function of the same principal components:

$${\delta }^{2}={\beta }_{0}\,+\,\mathop{\sum }\limits_{i=1}^{10}{\beta }_{i}\times {{PC}}_{i}$$

Then, using the projected principal components, the predicted mean and residual s.d. were calculated and used to create ancestry-adjusted z-scores of each PGS for UKBB and BioMe participants:

$${adjusted\; zPGS}=\frac{\mathop{\sum}\nolimits_{j=1}^{M}{w}_{\!j}\times {{dosage}}_{j}\,-\,({\alpha }_{0}\,+\,\mathop{\sum}\nolimits_{i=1}^{10}{\alpha }_{i}\times {{PC}}_{i})}{\sqrt{{\beta }_{0}\,+\,\mathop{\sum }\nolimits_{i=1}^{10}{\beta}_{i}\times {{PC}}_{i}}}$$

with \({\sum }_{j=1}^{M}{w}_{\!j}\times {{dosage}}_{j}\) representing the raw PGS.
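The adjustment can be illustrated with a short Python sketch that evaluates the formula for one participant. The function names are hypothetical, and the coefficient vectors `alpha` and `beta` are assumed to have been estimated beforehand in the 1000G reference panel, each ordered as intercept followed by one coefficient per principal component.

```python
from math import sqrt


def raw_pgs(weights, dosages):
    """Raw PGS: weighted sum of allele dosages over the M score variants."""
    return sum(w * d for w, d in zip(weights, dosages))


def adjusted_z_pgs(raw, pcs, alpha, beta):
    """Ancestry-adjusted z-score for one participant.

    pcs:   projected principal components (10 in the text)
    alpha: [alpha_0, alpha_1, ...] from the regression of the PGS on the PCs
    beta:  [beta_0, beta_1, ...] from the regression of the residual variance
    """
    predicted_mean = alpha[0] + sum(a * pc for a, pc in zip(alpha[1:], pcs))
    predicted_var = beta[0] + sum(b * pc for b, pc in zip(beta[1:], pcs))
    if predicted_var <= 0:
        # A linear variance model can yield non-positive values at extreme PCs.
        raise ValueError("predicted variance must be positive")
    return (raw - predicted_mean) / sqrt(predicted_var)
```

The explicit check on the predicted variance is a design choice of this sketch: because the variance is modeled as a linear function of the principal components, it is not guaranteed to be positive for participants far from the reference panel.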

We observed that mean and variance differences of scores across ancestry populations were largely, but not fully, resolved through this correction in both study populations (Extended Data Fig. 4). In keeping with PRS-CSx’s linear combination approach, weights to linearly combine scores were derived within each UKBB tuning population and applied to BioMe validation populations. The ancestry-adjusted PGSLC was then thresholded at its top 5% and 10% separately (versus bottom 95% and 90%, respectively), and odds ratios for prevalent obesity were calculated, adjusting for age, sex and genotyping array. Given the relatively lower number of East Asian and South Asian individuals available for these tail comparisons, we present pooled estimates.

Multi-ancestry PGS trained only with GIANT consortium data

Although we primarily report on the development and performance of the PGSLC trained with the larger GIANT+23andMe BMI GWAS meta-analyses results, we initially developed a multi-ancestry PGS with the GIANT-only results. For this, we used PRS-CS, combining GIANT’s combined multi-ancestral GWAS summary statistics (including BioMe but excluding the UKBB and the ALSPAC study) with a previously described multi-ancestry 1000G-based LD reference panel created following the same protocol as described by the PRS-CS authors94. We restricted variants to those present in the LD reference panel, which additionally had MAF ≥ 0.1% and INFO ≥ 0.8 in the overall UKBB, resulting in a total of 1,217,710 variants included. We did not specify a genome-wide shrinkage parameter for PRS-CS, which it therefore learned from the training data using a fully Bayesian approach (its ‘auto’ option). In collaboration with the eMERGE network, this GIANT-only PGS was included among a select group of trait-specific scores currently being returned as part of a genome-informed risk assessment (GIRA) to 25,000 diverse adults and children and their healthcare providers in an ongoing prospective cohort study across 10 clinical sites, as recently described14,48. In line with the aim of returning a high-risk versus a not-high-risk GIRA status to participants, we estimated the odds ratio for obesity class I (or higher) for being in the top 3% of the PGS distribution (versus bottom 97%) within the subset of BioMe participants genotyped with the Global Screening Array (with 1000G imputation) who had not contributed to the underlying GIANT GWAS meta-analysis nor were related to those who did. 
Accounting for age, sex and four principal components, we observed that values within the top 3% of this PGS were associated with odds ratios for prevalent obesity of 4.08 (95% confidence interval: 3.02–5.52) in individuals of self-reported European descent, 2.54 (95% confidence interval: 1.55–3.98) in individuals of self-reported African descent, 2.33 (95% confidence interval: 1.64–3.31) in individuals of self-reported Hispanic/Latino descent and 5.73 (95% confidence interval: 2.28–14.57) in individuals of self-reported Asian descent. The weights for this GIANT-only PGS are made available through the PGS Catalog alongside the weights for PGSLC and PGSmeta (see ‘Data availability’).

Childhood and adolescence

Within the ALSPAC study, repeated anthropometric measurements are available from birth to age 24. We employed both cross-sectional and multilevel analyses to provide complementary perspectives: cross-sectional analyses facilitate granular comparisons at each timepoint, whereas multilevel models capture longitudinal trajectories and population-level divergence over time. First, we assessed repeated cross-sectional associations between the standardized PGS (PGSLC using linear combination weights from UKBBEUR) and all available measures of BMI (or ponderal index at birth) and height. Anthropometric measurements were rank-based inverse-normal transformed separately by sex, at each timepoint, and analyses were adjusted for age at timepoint and principal components.

Longitudinal analyses using linear spline multilevel models were conducted to examine the association between the PGS (bottom 10%, middle 80%, top 10%) and change in untransformed BMI and height between 4 months and 24 years. Multilevel models estimate the mean trajectories of each anthropometric trait while accounting for non-independence of repeated measures within individuals, change in scale and variance of measures over time and differences in the number and timing of measurements between individuals (using all available data from all eligible participants under a missing-at-random assumption). Linear splines allow knot points to be fitted at different ages to derive periods of change that are approximately linear. All participants with at least one measure of the anthropometric traits were included under a missing-at-random assumption to minimize selection bias in trajectories estimated using linear spline multilevel models (with two levels of random effects: measurement occasion (that is, age to the nearest integer in years) and individual), allowing individuals to have different intercepts and slopes and, thus, their own trajectories. Knot points were placed as follows for each anthropometric trait based on the distribution and longitudinal pattern of measures between the earliest measure and 24 years: at ages 4 months and 8 months and at ages 2.5 years, 5 years, 8 years and 15 years for BMI and at ages 1 year, 5 years and 15 years for height. Interaction terms between PGS and each spline were included in the models to estimate the difference in the intercepts (earliest anthropometric trait measurement) and slopes (change in anthropometric trait from the earliest measure to 24 years) between the three PGS categories. 
Additionally, interaction terms between sex and the first 10 genetic principal components with each spline were included to estimate the difference in intercepts and slopes between males and females; therefore, models were adjusted for sex and principal components. All longitudinal models were created in MLwiN version 3.04 called from Stata version 15 using the ‘runmlwin’ command95.
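To illustrate how knot points define periods of approximately linear change, the following Python sketch constructs linear spline terms for a given age. The parameterization (one term per period, so that each coefficient is the slope within that period) and the `start` argument are assumptions for illustration; the sketch does not reproduce the MLwiN model specification.

```python
def linear_spline_terms(age, knots, start=0.0):
    """Split an age into time spent in each period delimited by the knots.

    Returns one value per period; a model coefficient on term k is then the
    slope (change per year) within period k.
    """
    bounds = [start] + list(knots) + [float("inf")]
    return [
        min(max(age - lo, 0.0), hi - lo)
        for lo, hi in zip(bounds[:-1], bounds[1:])
    ]
```

For example, with the height knots at 1, 5 and 15 years, a measurement at age 7 contributes 1 year to the first period, 4 years to the second, 2 years to the third and none to the last, and the terms sum to the time elapsed since `start`.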

In the CiF subset of the cohort, age at adiposity rebound was available, categorized as very early (at/before 43 months), early (from 49 months but before 61 months) and later (after 61 months)96. These children represent a 10% sample of the cohort, randomly selected for more detailed investigations, and are representative of the entire cohort (https://www.bristol.ac.uk/alspac/researchers/cohort-profile/). One-way ANOVA was used to test for differences in mean PGS across categories, in boys and girls separately and combined.

Added value of the PGS for prediction of BMI (rank-based inverse-normal transformed by sex) at given timepoints was quantified as incremental explained variance, when adding the PGS to different sets of predictors. First, we examined R2 for BMI at birth and at 3, 5, 8, 11 and 15 years, when adding the PGS to a model containing multiple relatively easily obtainable clinical variables that would be available at birth (birthweight, maternal education, pre-pregnancy maternal BMI, maternal age at date of birth and household social status). Details regarding phenotyping of these variables can be found in Supplementary Table 3. Separately, for R2 for BMI at age 18, we added the PGS to separate models containing BMI from a given early-life timepoint (1, 2, 3, 4, 5 and 8 years). Confidence intervals (95%) were calculated using bootstrapping with 1,000 repetitions.

From early adulthood to middle age

Within PLCO, we assessed the population-stratified and sex-stratified association between the standardized PGSLC (using linear combination weights from the UKBB most closely corresponding to each testing population) and per-5-year change between ages 20 and 50 in weight and BMI, with and without adjustment for the initial measurement at age 20. These analyses were additionally adjusted for birth year, sex (where appropriate) and principal components.

We calculated the population-stratified AUC of the PGS for obesity outcomes at age 50. For this, we ran models including (i) birth year and sex, (ii) birth year, sex, principal components and PGSLC, (iii) birth year, sex, principal components and BMI at age 20 and (iv) birth year, sex, principal components, BMI at age 20 and PGSLC. We separately ran model (iv) when including an interaction term between the PGS and BMI at age 20. Based on evidence for statistical interaction between the PGS and BMI at age 20 for predicting obesity at age 50 across multiple populations (Supplementary Table 16), we stratified our analyses on the presence of overweight or obesity at age 20 and, additionally, on sex. Confidence intervals (95%) for AUC were calculated using DeLong’s method.
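As a reminder of what the AUC measures, a minimal Python sketch of its rank-based (Mann–Whitney) definition is given here. The function name is hypothetical; `scores` is assumed to hold the predicted values from one of the fitted models, and confidence intervals by DeLong’s method are not shown.

```python
def auc(scores, labels):
    """Probability that a randomly chosen case has a higher score than a
    randomly chosen non-case, counting ties as one half."""
    cases = [s for s, y in zip(scores, labels) if y]
    controls = [s for s, y in zip(scores, labels) if not y]
    wins = sum(
        (c > k) + 0.5 * (c == k)
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))
```

The pairwise loop is quadratic in sample size and is intended for clarity only; a rank-based implementation would be preferable for cohort-scale data.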

PGS and weight change due to ILIs

Although neither intervention nor comparison arms were identical across DPP and Look AHEAD, both intervention arms were aimed at 7% weight loss while focused on the same process features83, and both represent a large shift in lifestyle behavior relative to the comparison arms. As such, we decided to pool study-specific effect estimates via inverse-variance weighted meta-analysis. In each study, we ran analyses after pooling the three largest population groups present in both trials: Black and/or African American, Hispanic and White (Supplementary Table 3). Population-stratified analyses were separately run as well. Within each study-specific population, we applied the linear combination weights for PGSLC most closely aligned with the target population (AFR, AMR and EUR, respectively). These scores were then standardized within the matching population, ignoring trial arm, whereafter we standardized again across all individuals for the population-pooled analyses. Due to restricting analyses to genotyped individuals from the three largest population groups, trial randomization was, by definition, broken. To investigate potential shifts in the PGS, we plotted its distribution by trial arm, which showed minimal shifts in distribution (Extended Data Fig. 7).

To investigate weight change, we focused on annual weight measurements. The weight loss nadir in both trials is known to occur at the 1-year mark, after which many individuals start regaining weight. As such, we separately examined the association of the PGS with (i) weight at year 1 (adjusting for baseline weight) and (ii) repeated weight measurements beyond year 1 up to year 4 (adjusting for weight at year 1) in the subset of individuals who had lost ≥3% of their initial body weight in the first year. Analyses were run by trial arm using linear mixed models with random intercepts, with additional covariates being age, sex, principal components, the initial weight measurement (baseline or year 1) and random slopes for time (years). In addition, we ran trial–arm combined models in which we examined interaction effects between trial arm and PGS. These interaction terms are our main estimates of interest, as they reflect how genetic predisposition to obesity modifies the intervention’s effect on weight change, relative to the comparison arm. Study-specific main and interaction effect estimates were separately pooled by means of inverse-variance weighted fixed-effect meta-analysis. Participant characteristics are presented in Supplementary Table 4.
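The pooling step can be illustrated with a minimal Python sketch of inverse-variance weighted fixed-effect meta-analysis. The function name is hypothetical, and the inputs are assumed to be the study-specific effect estimates and their standard errors (for example, one pair each from DPP and Look AHEAD).

```python
from math import sqrt


def ivw_fixed_effect(estimates, standard_errors):
    """Pool study-specific estimates, weighting each by 1 / SE^2.

    Returns the pooled estimate and its standard error.
    """
    weights = [1 / se ** 2 for se in standard_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    pooled_se = sqrt(1 / total)
    return pooled, pooled_se
```

With two studies, the more precisely estimated effect dominates: estimates of 1.0 (s.e. 1.0) and 3.0 (s.e. 2.0) pool to 1.4 rather than to their simple average of 2.0.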

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.