Body mass index stratification optimizes polygenic prediction of type 2 diabetes in cross-biobank analyses

Ojima, Takafumi; Namba, Shinichi; Suzuki, Ken; Yamamoto, Kenichi; Sonehara, Kyuto; Narita, Akira; Kamatani, Yoichiro; Tamiya, Gen; Yamamoto, Masayuki; Yamauchi, Toshimasa; Kadowaki, Takashi; Okada, Yukinori

doi:10.1038/s41588-024-01782-y

Article
Published: 11 June 2024


Nature Genetics volume 56, pages 1100–1109 (2024)

Abstract

Type 2 diabetes (T2D) shows heterogeneous body mass index (BMI) sensitivity. Here, we performed stratification based on BMI to optimize predictions for BMI-related diseases. We obtained BMI-stratified datasets using data from more than 195,000 individuals (n_T2D = 55,284) from BioBank Japan (BBJ) and UK Biobank. T2D heritability in the low-BMI group was greater than that in the high-BMI group. Polygenic predictions of T2D toward low-BMI targets had pseudo-R² values that were more than 22% higher than those for BMI-unstratified targets. Polygenic risk scores (PRSs) from low-BMI discovery outperformed PRSs from high-BMI discovery, while PRSs from BMI-unstratified discovery performed best. Pathway-specific PRSs demonstrated the biological contributions of pathogenic pathways. Low-BMI T2D cases showed higher rates of neuropathy and retinopathy. When BMI stratification was combined with a method integrating cross-population effects, T2D predictions improved by more than 37% over unstratified-matched-population prediction. We replicated findings in the Tohoku Medical Megabank (n = 26,000) and the second BBJ cohort (n = 33,096). Our findings suggest that target stratification based on existing traits can improve the polygenic prediction of heterogeneous diseases.


**Fig. 1: Overview of the study design.**

**Fig. 2: Variation in heritability based on BMI-stratified groups.**

**Fig. 3: BMI-stratified PRS heatmaps for T2D within each biobank.**

**Fig. 4: BMI-stratified PRS heatmaps for CAD within each biobank.**

**Fig. 5: Odds ratio curve of T2D prevalence for each BMI-stratified PRS.**

**Fig. 6: BMI-stratified PRS heatmaps for T2D across biobanks.**

Data availability

The genotype data of BBJ are available at the NBDC Human Database (research IDs hum0014 and hum0311). The UKBB analysis was conducted under application number 47821 (https://www.ukbiobank.ac.uk). Individual genotyping results and other cohort data used for the polygenic prediction are stored in Tohoku Medical Megabank Organization (ToMMo). In response to reasonable requests for these data (contact us at dist@megabank.tohoku.ac.jp), we will share the stored data after assembling the dataset following approval of the Ethics Committee and Materials and Information Distribution Review Committee of ToMMo. The PRS weights developed for TMM and BBJ-2nd have been released through the PGS Catalog (https://www.pgscatalog.org) with publication ID PGP000593 and score IDs PGS004615-PGS004620. The PRS weights can also be accessed through application to the NBDC Human Database with accession code hum0197 (https://humandbs.dbcls.jp/en/hum0197-latest).

Code availability

We used publicly available software for the analysis, as described in the Methods section. The code used in this study is available at https://github.com/takafumiojima/BMI_stratified_T2D_PRS and https://doi.org/10.5281/zenodo.11057931 (ref. ⁷⁷).

References

1. Chatterjee, N., Shi, J. & García-Closas, M. Developing and evaluating polygenic risk prediction models for stratified disease prevention. Nat. Rev. Genet. 17, 392–406 (2016).
2. Khera, A. V. et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat. Genet. 50, 1219–1224 (2018).
3. Ruan, Y. et al. Improving polygenic prediction in ancestrally diverse populations. Nat. Genet. 54, 573–580 (2022).
4. Choi, S. W., Mak, T. S. H. & O’Reilly, P. F. Tutorial: a guide to performing polygenic risk score analyses. Nat. Protoc. 15, 2759–2772 (2020).
5. Haslam, D. W. & James, W. P. T. Obesity. Lancet 366, 1197–1209 (2005).
6. Bulik-Sullivan, B. et al. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47, 1236–1241 (2015).
7. Kanai, M. et al. Genetic analysis of quantitative traits in the Japanese population links cell types to complex human diseases. Nat. Genet. 50, 390–400 (2018).
8. Mansour Aly, D. et al. Genome-wide association analyses highlight etiological differences underlying newly defined subtypes of diabetes. Nat. Genet. 53, 1534–1542 (2021).
9. Ahlqvist, E. et al. Novel subgroups of adult-onset diabetes and their association with outcomes: a data-driven cluster analysis of six variables. Lancet Diabetes Endocrinol. 6, 361–369 (2018).
10. Li, L. et al. Identification of type 2 diabetes subgroups through topological analysis of patient similarity. Sci. Transl. Med. 7, 311ra174 (2015).
11. Langenberg, C. & Lotta, L. A. Genomic insights into the causes of type 2 diabetes. Lancet 391, 2463–2474 (2018).
12. Udler, M. S. Type 2 diabetes: multiple genes, multiple diseases. Curr. Diab. Rep. 19, 55 (2019).
13. Martin, A. R. et al. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat. Genet. 51, 584–591 (2019).
14. Mahajan, A. et al. Multi-ancestry genetic study of type 2 diabetes highlights the power of diverse populations for discovery and translation. Nat. Genet. 54, 560–572 (2022).
15. Agrawal, S. et al. Inherited basis of visceral, abdominal subcutaneous and gluteofemoral fat depots. Nat. Commun. 13, 3771 (2022).
16. Mavaddat, N. et al. Polygenic risk scores for prediction of breast cancer and breast cancer subtypes. Am. J. Hum. Genet. 104, 21–34 (2019).
17. Zhang, H. et al. Genome-wide association study identifies 32 novel breast cancer susceptibility loci from overall and subtype-specific analyses. Nat. Genet. 52, 572–581 (2020).
18. Lewis, K. J. S. et al. Comparison of genetic liability for sleep traits among individuals with bipolar disorder I or II and control participants. JAMA Psychiatry 77, 303–310 (2020).
19. Wang, Y. et al. Theoretical and empirical quantification of the accuracy of polygenic scores in ancestry divergent populations. Nat. Commun. 11, 3865 (2020).
20. Durvasula, A. & Lohmueller, K. E. Negative selection on complex traits limits phenotype prediction accuracy between populations. Am. J. Hum. Genet. 108, 620–631 (2021).
21. Ma, R. C. W. & Chan, J. C. N. Type 2 diabetes in East Asians: similarities and differences with populations in Europe and the United States. Ann. N. Y. Acad. Sci. 1281, 64–91 (2013).
22. Sone, H. et al. Obesity and type 2 diabetes in Japanese patients. Lancet 361, 85 (2003).
23. Nagai, A. et al. Overview of the BioBank Japan Project: study design and profile. J. Epidemiol. 27, S2–S8 (2017).
24. Hirata, M. et al. Cross-sectional analysis of BioBank Japan clinical data: a large cohort of 200,000 patients with 47 common diseases. J. Epidemiol. 27, S9–S21 (2017).
25. Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).
26. Kuriyama, S. et al. The Tohoku Medical Megabank Project: design and mission. J. Epidemiol. 26, 493–511 (2016).
27. Fuse, N. et al. Establishment of integrated biobank for precision medicine and personalized healthcare: The Tohoku Medical Megabank Project. JMA J. 2, 113–122 (2019).
28. Udler, M. S. et al. Type 2 diabetes genetic loci informed by multi-trait associations point to disease mechanisms and subtypes: a soft clustering analysis. PLoS Med. 15, e1002654 (2018).
29. Graham, S. E. et al. The power of genetic diversity in genome-wide association studies of lipids. Nature 600, 675–679 (2021).
30. Dudbridge, F. Power and predictive accuracy of polygenic risk scores. PLoS Genet. 9, e1003348 (2013).
31. Sakaue, S. et al. Trans-biobank analysis with 676,000 individuals elucidates the association of polygenic risk scores of complex traits with human lifespan. Nat. Med. 26, 542–548 (2020).
32. Fisher, R. A. Frequency distribution of the values of the correlation coefficient in samples from an indefinitely large population. Biometrika 10, 507–521 (1915).
33. Silver, N. C. & Dunlap, W. P. Averaging correlation coefficients: should Fisher’s z transformation be used? J. Appl. Psychol. 72, 146–148 (1987).
34. Ge, T., Chen, C. Y., Ni, Y., Feng, Y. C. A. & Smoller, J. W. Polygenic prediction via Bayesian regression and continuous shrinkage priors. Nat. Commun. 10, 1776 (2019).
35. Willer, C. J., Li, Y. & Abecasis, G. R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190–2191 (2010).
36. Bulik-Sullivan, B. et al. LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).
37. Voight, B. F. et al. Twelve type 2 diabetes susceptibility loci identified through large-scale association analysis. Nat. Genet. 42, 579–589 (2010).
38. Mahajan, A. et al. Fine-mapping type 2 diabetes loci to single-variant resolution using high-density imputation and islet-specific epigenome maps. Nat. Genet. 50, 1505–1513 (2018).
39. Cho, Y. S. et al. Meta-analysis of genome-wide association studies identifies eight new loci for type 2 diabetes in East Asians. Nat. Genet. 44, 67–72 (2012).
40. Dennis, J. M. et al. Sex and BMI alter the benefits and risks of sulfonylureas and thiazolidinediones in type 2 diabetes: a framework for evaluating stratification using routine clinical and individual trial data. Diabetes Care 41, 1844–1853 (2018).
41. Bouchi, R. et al. A consensus statement from the Japan Diabetes Society: a proposed algorithm for pharmacotherapy in people with type 2 diabetes. J. Diabetes Investig. 14, 151–164 (2023).
42. Deutsch, A. J., Ahlqvist, E. & Udler, M. S. Phenotypic and genetic classification of diabetes. Diabetologia 65, 1758–1769 (2022).
43. Dicorpo, D. et al. Type 2 diabetes partitioned polygenic scores associate with disease outcomes in 454,193 individuals across 13 cohorts. Diabetes Care 45, 674–683 (2022).
44. Nishida, C. et al. Appropriate body-mass index for Asian populations and its implications for policy and intervention strategies. Lancet 363, 157–163 (2004).
45. Examination Committee of Criteria for ‘Obesity Disease’ in Japan & Japan Society for the Study of Obesity. New criteria for ‘obesity disease’ in Japan. Circ. J. 66, 987–992 (2002).
46. Wainberg, M. et al. Homogeneity in the association of body mass index with type 2 diabetes across the UK Biobank: a Mendelian randomization study. PLoS Med. 16, e1002982 (2019).
47. Dennis, J. M., Shields, B. M., Henley, W. E., Jones, A. G. & Hattersley, A. T. Clusters provide a better holistic view of type 2 diabetes than simple clinical features—authors’ reply. Lancet Diabetes Endocrinol. 7, 669 (2019).
48. Yengo, L. et al. A saturated map of common genetic variants associated with human height. Nature 610, 704–712 (2022).
49. Akiyama, M. et al. Characterizing rare and low-frequency height-associated variants in the Japanese population. Nat. Commun. 10, 4393 (2019).
50. Das, S. et al. Next-generation genotype imputation service and methods. Nat. Genet. 48, 1284–1287 (2016).
51. Auton, A. et al. A global reference for human genetic variation. Nature 526, 68–74 (2015).
52. Okada, Y. et al. Deep whole-genome sequencing reveals recent selection signatures linked to evolution and disease risk of Japanese. Nat. Commun. 9, 1631 (2018).
53. Sakaue, S. et al. A cross-population atlas of genetic associations for 220 human phenotypes. Nat. Genet. 53, 1415–1424 (2021).
54. Kawai, Y. et al. Japonica array: improved genotype imputation by designing a population-specific SNP array with 1070 Japanese individuals. J. Hum. Genet. 60, 581–587 (2015).
55. Tadaka, S. et al. 3.5KJPNv2: an allele frequency panel of 3552 Japanese individuals including the X chromosome. Hum. Genome Var. 6, 28 (2019).
56. Tadaka, S. et al. JMorp: Japanese multi omics reference panel. Nucleic Acids Res. 46, D551–D557 (2018).
57. Delaneau, O., Zagury, J. F., Robinson, M. R., Marchini, J. L. & Dermitzakis, E. T. Accurate, scalable and integrative haplotype estimation. Nat. Commun. 10, 5436 (2019).
58. Sonehara, K. et al. Genetic architecture of microRNA expression and its link to complex diseases in the Japanese population. Hum. Mol. Genet. 31, 1806–1820 (2022).
59. Tomofuji, Y. et al. Reconstruction of the personal information from human genome reads in gut metagenome sequencing data. Nat. Microbiol. 8, 1079–1094 (2023).
60. Suzuki, K. et al. Identification of 28 new susceptibility loci for type 2 diabetes in the Japanese population. Nat. Genet. 51, 379–386 (2019).
61. Eastwood, S. V. et al. Algorithms for the capture and adjudication of prevalent and incident diabetes in UK Biobank. PLoS One 11, e0162388 (2016).
62. Yeung, S. L. A., Luo, S. & Schooling, C. M. The impact of glycated hemoglobin (HbA1c) on cardiovascular disease risk: a Mendelian randomization study using UK Biobank. Diabetes Care 41, 1991–1997 (2018).
63. Ogishima, S. et al. dbTMM: an integrated database of large-scale cohort, genome and clinical data for the Tohoku Medical Megabank Project. Hum. Genome Var. 8, 44 (2021).
64. Itabashi, F. et al. Combined associations of liver enzymes and obesity with diabetes mellitus prevalence: the Tohoku Medical Megabank community-based cohort study. J. Epidemiol. 32, 221–227 (2022).
65. Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
66. Lee, S. H., Wray, N. R., Goddard, M. E. & Visscher, P. M. Estimating missing heritability for disease from genome-wide association studies. Am. J. Hum. Genet. 88, 294–305 (2011).
67. Lee, S. H., Goddard, M. E., Wray, N. R. & Visscher, P. M. A better coefficient of determination for genetic profile analysis. Genet. Epidemiol. 36, 214–224 (2012).
68. Robin, X. et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics 12, 77 (2011).
69. Delong, E. R., Delong, D. M. & Clarke-Pearson, D. L. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 44, 837–845 (1988).
70. Savitzky, A. & Golay, M. J. E. Smoothing and differentiation of data by simplified least squares procedures. Anal. Chem. 36, 1627–1639 (1964).
71. Warrier, V. et al. Genome-wide meta-analysis of cognitive empathy: heritability, and correlates with sex, neuropsychiatric conditions and cognition. Mol. Psychiatry 23, 1402–1409 (2018).
72. Pencina, M. J., D’Agostino, R. B. Sr., D’Agostino, R. B. Jr. & Vasan, R. S. Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat. Med. 27, 157–172 (2008).
73. Ghouse, J. et al. Genome-wide meta-analysis identifies 93 risk loci and enables risk prediction equivalent to monogenic forms of venous thromboembolism. Nat. Genet. 55, 399–409 (2023).
74. McKearnan, S. B., Wolfson, J., Vock, D. M., Vazquez-Benitez, G. & O’Connor, P. J. Performance of the net reclassification improvement for nonnested models and a novel percentile-based alternative. Am. J. Epidemiol. 187, 1327–1335 (2018).
75. Pencina, M. J., D’Agostino, R. B. Sr. & Steyerberg, E. W. Extensions of net reclassification improvement calculations to measure usefulness of new biomarkers. Stat. Med. 30, 11–21 (2011).
76. Kundu, S., Aulchenko, Y. S., Van Duijn, C. M. & Janssens, A. C. J. W. PredictABEL: an R package for the assessment of risk prediction models. Eur. J. Epidemiol. 26, 261–264 (2011).
77. Ojima, T. Takafumiojima/BMI_stratified_T2D_PRS: v1.0.0. Zenodo https://doi.org/10.5281/zenodo.11057931 (2024).

Acknowledgements

We acknowledge the participants and investigators of BioBank Japan, UK Biobank and Tohoku Medical Megabank. T.O. was supported by JST SPRING (JPMJSP2138) and the Osaka University Transdisciplinary Program for Biomedical Entrepreneurship and Innovation (WISE program). S.N. and K. Sonehara were supported by the Takeda Science Foundation. Y.O. was supported by JSPS KAKENHI (22H00476), AMED (JP23km0405211, JP23km0405217, JP23ek0109594, JP23ek0410113, JP23kk0305022, JP223fa627002, JP223fa627010, JP233fa627011, JP23zf0127008, JP23tm0524002), JST Moonshot R&D (JPMJMS2021, JPMJMS2024), the Takeda Science Foundation, the Bioinformatics Initiative of Osaka University Graduate School of Medicine, the Institute for Open and Transdisciplinary Research Initiatives, the Center for Infectious Disease Education and Research (CiDER) and the Center for Advanced Modality and DDS (CAMaD), Osaka University.

Author information

A full list of members and their affiliations appears in the Supplementary Information.

Authors and Affiliations

Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, Japan
Takafumi Ojima, Shinichi Namba, Ken Suzuki, Kenichi Yamamoto, Kyuto Sonehara & Yukinori Okada
Graduate School of Medicine, Tohoku University, Sendai, Japan
Takafumi Ojima, Gen Tamiya & Masayuki Yamamoto
Laboratory for Systems Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
Takafumi Ojima, Kyuto Sonehara & Yukinori Okada
Center for Advanced Intelligence Project, RIKEN, Tokyo, Japan
Takafumi Ojima & Gen Tamiya
Department of Genome Informatics, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
Shinichi Namba, Kyuto Sonehara & Yukinori Okada
Department of Diabetes and Metabolic Diseases, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
Ken Suzuki & Toshimasa Yamauchi
Department of Pediatrics, Osaka University Graduate School of Medicine, Suita, Japan
Kenichi Yamamoto
Laboratory of Statistical Immunology, Immunology Frontier Research Center (WPI-IFReC), Osaka University, Suita, Japan
Kenichi Yamamoto & Yukinori Okada
Laboratory of Children’s Health and Genetics, Division of Health Science, Osaka University Graduate School of Medicine, Osaka, Japan
Kenichi Yamamoto
Tohoku Medical Megabank Organization, Tohoku University, Sendai, Japan
Akira Narita, Gen Tamiya & Masayuki Yamamoto
Laboratory of Complex Trait Genomics, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
Yoichiro Kamatani
Toranomon Hospital, Tokyo, Japan
Takashi Kadowaki
Premium Research Institute for Human Metaverse Medicine (WPI-PRIMe), Osaka University, Osaka, Japan
Yukinori Okada

Consortia

the Tohoku Medical Megabank Project Study Group

Masayuki Yamamoto, Gen Tamiya & Akira Narita

the Biobank Japan Project

Yukinori Okada & Yoichiro Kamatani

Contributions

T.O. and Y.O. conceived and designed the study. T.O. and Y.O. wrote the manuscript with critical input from S.N., K. Suzuki, K.Y., K. Sonehara and A.N. T.O., S.N., K. Suzuki and A.N. conducted the data analysis. T.O., S.N., K. Suzuki, K.Y., K. Sonehara, A.N., Y.K., M.Y., G.T., T.Y., T.K., Y.O. and the members of the Tohoku Medical Megabank Project Study Group and the BioBank Japan Project contributed to the collection of samples and management of genotype data and clinical information. G.T., T.Y., T.K. and Y.O. supervised the study. All authors contributed to the article and approved the submitted version.

Corresponding author

Correspondence to Yukinori Okada.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Genetics thanks Cassandra Spracklen and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Performances of BMI-stratified polygenic prediction of T2D.

Schematic representation of the relative levels of the performances of BMI-stratified polygenic prediction of T2D based on the analysis results. Each block in the heatmaps above corresponds to the bar graph below. Red, high polygenic prediction accuracy; white, low polygenic prediction accuracy. All, group consisting of all samples obtained by stratified random sampling; High, high BMI group; Mid, middle BMI group; Low, low BMI group.

Extended Data Fig. 2 BMI-stratified PRS heatmaps for T2D evaluated using AUC in BBJ.

a,b, The values in the PRS heatmaps were calculated by AUC in two ways. The AUC calculated with the model with full covariates was used in the upper heatmap (a), and the AUC calculated with the model with no covariates was used in the lower heatmap (b).
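As a rough illustration of the no-covariate AUC mentioned in this caption, the following Python sketch computes the rank-based (Mann-Whitney) AUC of a score against case/control labels. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `rank_auc` and the toy inputs are hypothetical, and the full-covariate variant would instead use fitted predictions from a regression model that is not shown here.

```python
import numpy as np


def rank_auc(scores, labels):
    """Mann-Whitney AUC: probability that a random case outscores a random control.

    Ties between a case and a control count as 0.5.
    `scores` would be the raw PRS in a no-covariate setting (an assumption here).
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    cases = scores[labels == 1]
    controls = scores[labels == 0]
    if cases.size == 0 or controls.size == 0:
        raise ValueError("need at least one case and one control")
    # All case-control pairs; memory is O(n_cases * n_controls), so this
    # is only suitable for small illustrative inputs.
    diff = cases[:, None] - controls[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / diff.size)


# Toy example with made-up scores (not study data).
toy_auc = rank_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

For biobank-scale data a sort-based implementation or an established package would be the more practical choice; the pairwise form is used here only because it makes the definition explicit.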

Extended Data Fig. 3 BMI-stratified PRS heatmaps for T2D evaluated using AUC in UKBB.

a,b, The values in the PRS heatmap were calculated by AUC in two ways. The AUC calculated with the model with full covariates was used in the upper heatmap (a), and the AUC calculated with the model with no covariates was used in the lower heatmap (b).

Extended Data Fig. 4 BMI-stratified PRS heatmaps for T2D with BMI adjustment.

PRS heatmaps with and without BMI adjustment as covariates can be compared vertically. The values in PRS heatmaps are Fisher’s z-transformation averages of pseudo-R² calculated by the LOGO-PRS method. Red, high polygenic prediction accuracy; white, low polygenic prediction accuracy. The upper description of the heatmap represents the BMI group of the discovery population, and the left side represents that of the target population. a, PRS heatmaps in BBJ. b, PRS heatmaps in UKBB.
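The Fisher's z-transformation average named in this caption can be sketched in Python as follows. This is a hedged illustration only: the caption does not say how the transformation is applied to pseudo-R², so the sketch assumes it acts on r = sqrt(R²) and that the averaged value is squared again afterwards. The function name `fisher_z_mean_r2` and the example inputs are hypothetical.

```python
import numpy as np


def fisher_z_mean_r2(r2_values):
    """Average R^2 estimates on Fisher's z scale and map back to R^2.

    Assumption (not stated in the source): z = arctanh(sqrt(R^2)).
    """
    r2 = np.asarray(r2_values, dtype=float)
    if r2.size == 0:
        raise ValueError("need at least one value")
    if np.any(r2 < 0) or np.any(r2 >= 1):
        raise ValueError("R^2 values must lie in [0, 1)")
    r = np.sqrt(r2)            # correlation-scale values
    z = np.arctanh(r)          # Fisher's z-transformation
    r_mean = np.tanh(z.mean()) # back-transform the mean z
    return float(r_mean ** 2)


# Made-up values for illustration only (not results from the study).
averaged = fisher_z_mean_r2([0.01, 0.09])
```

Averaging on the z scale generally gives a different value from the arithmetic mean of the R² estimates, because both the square root and the arctanh transformation are nonlinear.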

Extended Data Fig. 5 BMI-stratified PRS heatmaps for T2D with BMI adjustment evaluated using AUC.

PRS heatmaps with and without BMI adjustment as covariates can be compared vertically. The values in PRS heatmaps are calculated by simple model AUC. Red, high polygenic prediction accuracy; white, low polygenic prediction accuracy. The upper description of the heatmap represents the BMI group of the discovery population, and the left side represents that of the target population. a, PRS heatmaps in BBJ. b, PRS heatmaps in UKBB.

Extended Data Fig. 6 BMI-stratified PRS heatmaps for T2D across biobanks targeting BBJ evaluated using AUC.

PRS heatmaps for the same target (BBJ) with different discovery populations can be compared vertically. a,b, The values in the PRS heatmaps were calculated by AUC in two ways. The AUC calculated with the model with full covariates was used in the upper heatmap (a), and the AUC calculated with the model with no covariates was used in the lower heatmap (b).

Extended Data Fig. 7 BMI-stratified PRS heatmaps for T2D across biobanks targeting UKBB evaluated using AUC.

PRS heatmaps for the same target (UKBB) with different discovery populations can be compared vertically. a,b, The values in the PRS heatmaps were calculated by AUC in two ways. The AUC calculated with the model with full covariates was used in the upper heatmap (a), and the AUC calculated with the model with no covariates was used in the lower heatmap (b).

Extended Data Fig. 8 Odds ratio curve of T2D prevalence for each BMI-stratified PRS across biobanks.

Odds ratio curves for each BMI-stratified PRS in each box are locally estimated scatterplot smoothing (LOESS) curves. On the left, the comparison is made with combinations of discovery and target populations; in the middle, two BMI-stratified groups in the meta-populational method; on the right, three BMI-stratified groups in the meta-populational method. a, Odds ratio curves for T2D with BBJ as target population. b, Odds ratio curves for T2D with UKBB as target population. High BMI, high BMI group; Mid BMI, middle BMI group; Low BMI, low BMI group; Total, total dataset including both male and female samples.

Supplementary information

Supplementary Information

Supplementary Note, Figs. 1–19 and Tables 1–14.

Reporting Summary

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ojima, T., Namba, S., Suzuki, K. et al. Body mass index stratification optimizes polygenic prediction of type 2 diabetes in cross-biobank analyses. Nat Genet 56, 1100–1109 (2024). https://doi.org/10.1038/s41588-024-01782-y

Received: 17 July 2022
Accepted: 26 April 2024
Published: 11 June 2024
Version of record: 11 June 2024
Issue date: June 2024
DOI: https://doi.org/10.1038/s41588-024-01782-y
