Abstract
Genome-wide association studies have identified approximately 200 genetic risk loci for breast cancer, but the causal variants and target genes are mostly unknown. We sought to fine-map all known breast cancer risk loci using genome-wide association study data from 172,737 female breast cancer cases and 242,009 controls of African, Asian and European ancestry. We identified 332 independent association signals for breast cancer risk, including 131 signals not reported previously, and for 50 of them, we narrowed the credible causal variants down to a single variant. Analyses integrating functional genomics data identified 195 putative susceptibility genes, enriched in PI3K/AKT, TNF/NF-κB, p53 and Wnt/β-catenin pathways. Single-cell RNA sequencing or in vitro experimental data provided additional functional evidence for 105 genes. Our study uncovered large numbers of association signals and candidate susceptibility genes for breast cancer, advanced the understanding of breast cancer genetics and biology, and supported the value of including multi-ancestry data in fine-mapping analyses.
Data availability
Summary-level statistics data from ABCC have been uploaded to the GWAS Catalog (GCST90429846 for overall breast cancer, GCST90429847 for ER+ breast cancer and GCST90429848 for ER− breast cancer). Summary-level statistics data from AABCG are available in the GWAS Catalog (GCST90296719 for overall breast cancer, GCST90296720 for ER+ breast cancer, GCST90296721 for ER− breast cancer and GCST90296722 for TNBC). Summary-level statistics data from BCAC are available in the GWAS Catalog (GCST90454341 for overall breast cancer, GCST004988 for ER+ breast cancer, GCST005076 for ER− breast cancer and GCST90454344 for TNBC). The genomic and transcriptomic data for eQTL analyses in this study have been uploaded to the database of Genotypes and Phenotypes (dbGaP) (accession no. phs003535). Data from the 1000 Genomes Project can be obtained from www.internationalgenome.org/data/. Single-cell RNA sequencing data from GTEx are available under dbGaP accession phs000424. The RoadMap ChromHMM annotations are available from https://egg2.wustl.edu/roadmap/web_portal/chr_state_learning.html. The Cistrome datasets are available from http://cistrome.org/. The MSigDB hallmark gene sets can be obtained from https://www.gsea-msigdb.org/gsea/msigdb/human/collections.jsp#H. Data for ChIA-PET, in situ Hi-C and IM-PET are accessible from NCBI/GEO (accession nos. GSE18046, GSE33664 and GSE63525). Processed Capture Hi-C data are available from https://osf.io/2cnw7/. Data from EnhancerAtlas are available from http://www.enhanceratlas.org/downloadv2.php. Data from Super-Enhancer are available from https://bio.liclab.net/. Data from FANTOM are available from https://fantom.gsc.riken.jp/. Data for topologically associating domains can be obtained from http://3dgenome.fsm.northwestern.edu/publications.html.
Code availability
The data analysis code relevant to this paper is available via GitHub at https://github.com/Damon0212/Multi-ancestry_finemapping_breast_cancer. The code has also been uploaded to the Zenodo repository at https://doi.org/10.5281/zenodo.12574126 (ref. 80).
References
Sung, H. et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 71, 209–249 (2021).
Michailidou, K. et al. Association analysis identifies 65 new breast cancer risk loci. Nature 551, 92–94 (2017).
Zhang, H. et al. Genome-wide association study identifies 32 novel breast cancer susceptibility loci from overall and subtype-specific analyses. Nat. Genet. 52, 572–581 (2020).
Shu, X. et al. Identification of novel breast cancer susceptibility loci in meta-analyses conducted among Asian and European descendants. Nat. Commun. 11, 1217 (2020).
Jia, G. et al. Genome- and transcriptome-wide association studies of 386,000 Asian and European-ancestry women provide new insights into breast cancer genetics. Am. J. Hum. Genet. 109, 2185–2195 (2022).
Fejerman, L. et al. Genome-wide association study of breast cancer in Latinas identifies novel protective variants on 6q25. Nat. Commun. 5, 5260 (2014).
Guo, X. et al. Fine-scale mapping of the 4q24 locus identifies two independent loci associated with breast cancer risk. Cancer Epidemiol. Biomarkers Prev. 24, 1680–1691 (2015).
Shi, J. et al. Fine-scale mapping of 8q24 locus identifies multiple independent risk variants for breast cancer. Int. J. Cancer 139, 1303–1317 (2016).
Zeng, C. et al. Identification of independent association signals and putative functional variants for breast cancer risk through fine-scale mapping of the 12p11 locus. Breast Cancer Res. 18, 1–21 (2016).
Guo, X. et al. A comprehensive cis-eQTL analysis revealed target genes in breast cancer susceptibility loci identified in genome-wide association studies. Am. J. Hum. Genet. 102, 890–903 (2018).
French, J. D. et al. Functional variants at the 11q13 risk locus for breast cancer regulate cyclin D1 expression through long-range enhancers. Am. J. Hum. Genet. 92, 489–503 (2013).
Lin, W.-Y. et al. Identification and characterization of novel associations in the CASP8/ALS2CR12 region on chromosome 2 with breast cancer risk. Hum. Mol. Genet. 24, 285–298 (2015).
Kumaran, M. et al. Fine-mapping of a novel premenopausal breast cancer susceptibility locus at Chr4q31.22 in Caucasian women and validation in African and Chinese women. Int. J. Cancer 146, 1219–1229 (2020).
Zheng, Y. et al. Fine mapping of breast cancer genome-wide association studies loci in women of African ancestry identifies novel susceptibility markers. Carcinogenesis 34, 1520–1528 (2013).
Fachal, L. et al. Fine-mapping of 150 breast cancer risk regions identifies 191 likely target genes. Nat. Genet. 52, 56–73 (2020).
Martin, A. R., Teferra, S., Möller, M., Hoal, E. G. & Daly, M. J. The critical needs and challenges for genetic architecture studies in Africa. Curr. Opin. Genet. Dev. 53, 113–120 (2018).
Mahajan, A. et al. Multi-ancestry genetic study of type 2 diabetes highlights the power of diverse populations for discovery and translation. Nat. Genet. 54, 560–572 (2022).
Passaro, A., Jänne, P. A., Mok, T. & Peters, S. Overcoming therapy resistance in EGFR-mutant lung cancer. Nat. Cancer 2, 377–391 (2021).
Loibl, S. & Gianni, L. HER2-positive breast cancer. Lancet 389, 2415–2429 (2017).
Yuan, K. et al. Fine-mapping across diverse ancestries drives the discovery of putative causal variants underlying human complex traits and diseases. Nat. Genet. 56, 1841–1850 (2024).
McLaren, W. et al. The Ensembl Variant Effect Predictor. Genome Biol. 17, 1–14 (2016).
Mavaddat, N. et al. Polygenic risk scores for prediction of breast cancer and breast cancer subtypes. Am. J. Hum. Genet. 104, 21–34 (2019).
Zheng, R. et al. Cistrome Data Browser: expanded datasets and new tools for gene regulatory analysis. Nucleic Acids Res. 47, D729–D735 (2019).
Mei, S. et al. Cistrome Data Browser: a data portal for ChIP-Seq and chromatin accessibility data in human and mouse. Nucleic Acids Res. 45, D658–D662 (2017).
Wen, W. et al. Genetic variations of DNA bindings of FOXA1 and co-factors in breast cancer susceptibility. Nat. Commun. 12, 5318 (2021).
Roadmap Epigenomics Consortium et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
Zhu, Z. et al. Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat. Genet. 48, 481–487 (2016).
Wu, L. et al. A transcriptome-wide association study of 229,000 women identifies new candidate susceptibility genes for breast cancer. Nat. Genet. 50, 968–978 (2018).
Feng, H. et al. Transcriptome-wide association study of breast cancer risk by estrogen-receptor status. Genet. Epidemiol. 44, 442–468 (2020).
Ferreira, M. A. et al. Genome-wide association and transcriptome studies identify target genes and risk loci for breast cancer. Nat. Commun. 10, 1741 (2019).
Martínez-Jiménez, F. et al. A compendium of mutational cancer driver genes. Nat. Rev. Cancer 20, 555–572 (2020).
Meyers, R. M. et al. Computational correction of copy number effect improves specificity of CRISPR–Cas9 essentiality screens in cancer cells. Nat. Genet. 49, 1779–1784 (2017).
Eraslan, G. et al. Single-nucleus cross-tissue molecular reference maps toward understanding disease gene function. Science 376, eabl4290 (2022).
Shu, X. et al. Associations between circulating proteins and risk of breast cancer by intrinsic subtypes: a Mendelian randomisation analysis. Br. J. Cancer 127, 1507–1514 (2022).
Shu, X. et al. Evaluation of associations between genetically predicted circulating protein biomarkers and breast cancer risk. Int. J. Cancer 146, 2130–2138 (2020).
Jia, G. et al. Identification of target proteins for breast cancer genetic risk loci and blood risk biomarkers in a large study by integrating genomic and proteomic data. Int. J. Cancer 152, 2314–2320 (2023).
Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA 102, 15545–15550 (2005).
Liberzon, A. et al. The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Syst. 1, 417–425 (2015).
Huang, R. et al. The NCATS BioPlanet – An integrated platform for exploring the universe of cellular signaling pathways for toxicology, systems biology, and chemical genomics. Front. Pharmacol. 10, 445 (2019).
Martens, M. et al. WikiPathways: connecting communities. Nucleic Acids Res. 49, D613–D621 (2021).
Gene Ontology Consortium. The Gene Ontology resource: enriching a GOld mine. Nucleic Acids Res. 49, D325–D334 (2021).
Ashburner, M. et al. Gene Ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 25, 25–29 (2000).
Wojcik, G. L. et al. Genetic analyses of diverse populations improves discovery for complex traits. Nature 570, 514–518 (2019).
Bojesen, S. E. et al. Multiple independent variants at the TERT locus are associated with telomere length and risks of breast and ovarian cancer. Nat. Genet. 45, 371–384 (2013).
Killedar, A. et al. A common cancer risk-associated allele in the hTERT locus encodes a dominant negative inhibitor of telomerase. PLoS Genet. 11, e1005286 (2015).
Miricescu, D. et al. PI3K/AKT/mTOR signaling pathway in breast cancer: from molecular landscape to clinical aspects. Int. J. Mol. Sci. 22, 173 (2021).
Ortega, M. A. et al. Signal transduction pathways in breast cancer: the important role of PI3K/Akt/mTOR. J. Oncol. 2020, 9258396 (2020).
Wu, Y. & Zhou, B. P. TNF-α/NF-κB/Snail pathway in cancer cell migration and invasion. Br. J. Cancer 102, 639–644 (2010).
Mercogliano, M. F., Bruni, S., Elizalde, P. V. & Schillaci, R. Tumor necrosis factor α blockade: an opportunity to tackle breast cancer. Front. Oncol. 10, 584 (2020).
Liu, J., Zhang, C., Hu, W. & Feng, Z. Tumor suppressor p53 and metabolism. J. Mol. Cell. Biol. 11, 284–292 (2019).
Jia, G. et al. Genome-wide association analyses of breast cancer in women of African ancestry identify new susceptibility loci and improve risk prediction. Nat. Genet. 56, 819–826 (2024).
Zheng, W. et al. Common genetic determinants of breast-cancer risk in East Asian women: a collaborative study of 23 637 breast cancer cases and 25 579 controls. Hum. Mol. Genet. 22, 2539–2550 (2013).
Ishigaki, K. et al. Large-scale genome-wide association study in a Japanese population identifies novel susceptibility loci across different diseases. Nat. Genet. 52, 669–679 (2020).
Zheng, W. et al. Genome-wide association study identifies a new breast cancer susceptibility locus at 6q25.1. Nat. Genet. 41, 324–328 (2009).
Cai, Q. et al. Genome-wide association analysis in East Asians identifies breast cancer susceptibility loci at 1q32.1, 5q14.3 and 15q26.1. Nat. Genet. 46, 886–890 (2014).
Michailidou, K. et al. Genome-wide association analysis of more than 120,000 individuals identifies 15 new susceptibility loci for breast cancer. Nat. Genet. 47, 373–380 (2015).
Cai, Q. et al. Genome-wide association study identifies breast cancer risk variant at 10q21.2: results from the Asia Breast Cancer Consortium. Hum. Mol. Genet. 20, 4991–4999 (2011).
Long, J. et al. Genome-wide association study in East Asians identifies novel susceptibility loci for breast cancer. PLoS Genet. 8, e1002532 (2012).
Han, M.-R. et al. Genome-wide association study in East Asians identifies two novel breast cancer susceptibility loci. Hum. Mol. Genet. 25, 3361–3371 (2016).
Zhang, Y. et al. Rare coding variants and breast cancer risk: evaluation of susceptibility loci identified in genome-wide association studies. Cancer Epidemiol. Biomarkers Prev. 23, 622–628 (2014).
Kim, H. C. et al. A genome-wide association study identifies a breast cancer risk variant in ERBB4 at 2q34: results from the Seoul Breast Cancer Study. Breast Cancer Res. 14, 1–12 (2012).
Willer, C. J., Li, Y. & Abecasis, G. R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190–2191 (2010).
Mägi, R. et al. Trans-ethnic meta-regression of genome-wide association studies accounting for ancestry increases power for discovery and improves fine-mapping resolution. Hum. Mol. Genet. 26, 3639–3650 (2017).
Pepe, M. S., Longton, G. & Janes, H. Estimation and comparison of receiver operating characteristic curves. Stata J. 9, 1 (2009).
Yuan, Y. et al. Multi-omics analysis to identify susceptibility genes for colorectal cancer. Hum. Mol. Genet. 30, 321–330 (2021).
Chen, Z. et al. Identifying putative susceptibility genes and evaluating their associations with somatic mutations in human cancers. Am. J. Hum. Genet. 105, 477–492 (2019).
Beesley, J. et al. Chromatin interactome mapping at 139 independent breast cancer risk signals. Genome Biol. 21, 1–19 (2020).
Fullwood, M. J. et al. An oestrogen-receptor-α-bound human chromatin interactome. Nature 462, 58–64 (2009).
Li, G. et al. Extensive promoter-centered chromatin interactions provide a topological basis for transcription regulation. Cell 148, 84–98 (2012).
Rao, S. S. P. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).
He, B., Chen, C., Teng, L. & Tan, K. Global view of enhancer–promoter interactome in human cells. Proc. Natl Acad. Sci. USA 111, E2191–E2199 (2014).
Teng, L., He, B., Wang, J. & Tan, K. 4DGenome: a comprehensive database of chromatin interactions. Bioinformatics 31, 2560–2564 (2015).
Andersson, R. et al. An atlas of active enhancers across human cell types and tissues. Nature 507, 455–461 (2014).
Gao, T. & Qian, J. EnhancerAtlas 2.0: an updated resource with enhancer annotation in 586 tissue/cell types across nine species. Nucleic Acids Res. 48, D58–D64 (2020).
Hnisz, D. et al. Super-enhancers in the control of cell identity and disease. Cell 155, 934–947 (2013).
Jiang, Y. et al. SEdb: a comprehensive human super-enhancer database. Nucleic Acids Res. 47, D235–D243 (2019).
Jian, X., Boerwinkle, E. & Liu, X. In silico tools for splicing defect prediction: a survey from the viewpoint of end users. Genet. Med. 16, 497–503 (2014).
Jaganathan, K. et al. Predicting splicing from primary sequence with deep learning. Cell 176, 535–548.e24 (2019).
Gusev, A. et al. A transcriptome-wide association study of high-grade serous epithelial ovarian cancer identifies new susceptibility genes and splice variants. Nat. Genet. 51, 815–823 (2019).
Jia, G. Damon0212/Multi-ancestry_finemapping_breast_cancer: Multi-ancestry_finemapping_breast_cancer (fine-mapping). Zenodo https://doi.org/10.5281/zenodo.12574126 (2024).
Acknowledgements
The content is solely the responsibility of the authors and does not necessarily represent the official views of the funding agencies. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. This research was supported in part by the US National Institutes of Health grant nos R01CA202981, R01CA235553, R01CA148667 and R01CA124558. Sample preparation and genotyping assays at Vanderbilt were conducted at the Survey and Biospecimen Shared Resources and Vanderbilt Technologies for Advanced Genomics, which are supported in part by the Vanderbilt-Ingram Cancer Center (P30CA068485). Data analyses were conducted using the Advanced Computing Center for Research and Education (ACCRE) at Vanderbilt University. Biospecimens from the Susan G. Komen Tissue Bank at the IU Simon Cancer Center were used in this study. We thank the contributors, including Indiana University, who collected data used in this study, as well as the donors and their families, whose help and participation made this work possible. Additional information, including grant support information for participating studies of the ABCC and AABCG, is provided in the Supplementary Note.
Author information
Contributions
G.J. and W.Z. conceived and designed the study. Q.C., S.A., M.E.B., Y.C., J.-Y.C., Y.-T.G., M.G.-C., J.G., J.J.H., M.I., E.M.J., S.-S.K., C.I.L., K. Matsuda, K. Matsuo, K.L.N., B.N., O.I.O., T.P., S.K.P., B.P., M.F.P., M.S., D.P.S., C.-Y.S., M.A.T., S.Y., Y.Z., T.A., A.M.B., A.F., A.J.M.H., H.I., M.K., E.-S.L., T.M., P.N., D.-Y.N., K.M.O., O.O., A.F.O., M.-H.P., S.R., T.Y., G.Z., E.N.B., M.H., S.-K.L., J.O., C.R.W., M.L.C., C.B.A., D.H., D.K., J.R.P., X.-O.S., C.B.A.H., J.L. and W.Z. recruited the study participants and collected the data and specimens. G.J., J.P., Q.C., J.-Y.C., Y.-T.G., M.G.-C., J.G., M.I., E.M.J., S.-S.W., K. Matsuda, K. Matsuo, S.K.P., B.P., C.-Y.S., M.A.T., S.Y., Y.Z., T.A., H.I., M.K., E.-S.L., D.-Y.N., A.F.O., M.-H.P., T.Y., E.N.B., M.H., S.-K.L., C.R.W., H. Zhang, H. Zhao, M.L.C., C.B.A., D.H., D.K., X.-O.S. and X.G. managed sample and data preparation or carried out quality control. G.J., Z.C., J.P., R.T., C.L., J.A.B., Y.X., B.L. and X.G. analyzed the data. G.J., Z.C., J.P., B.L., X.-O.S. and W.Z. interpreted the findings. G.J., Z.C., R.T., Y.X., H. Zhang, H. Zhao, J.R.P., C.B.A.H., X.G., J.L. and W.Z. drafted or substantively revised the paper.
Ethics declarations
Competing interests
O.I.O. is a co-founder of CancerIQ, serves as Scientific Advisor at Tempus and is on the Board of 54gene. The other authors declare no competing interests.
Peer review
Peer review information
Nature Genetics thanks Paul Pharoah and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Genomic features of CCVs and general genome using RoadMap ChromHMM 15-state models in human mammary epithelial cells (HMEC).
a, Credible causal variants (CCVs) identified by fine-mapping. b, Variants across the general genome.
Extended Data Fig. 2 Genomic features of CCVs and general genome using RoadMap ChromHMM 15-state models in breast myoepithelial primary cells.
a, Credible causal variants (CCVs) identified by fine-mapping. b, Variants across the general genome.
Supplementary information
Supplementary Information
Supplementary methods, acknowledgement and references.
Supplementary Tables
Supplementary Tables 1–13.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Jia, G., Chen, Z., Ping, J. et al. Refining breast cancer genetic risk and biology through multi-ancestry fine-mapping analyses of 192 risk regions. Nat Genet 57, 80–87 (2025). https://doi.org/10.1038/s41588-024-02031-y
DOI: https://doi.org/10.1038/s41588-024-02031-y