Abstract
Atrial fibrillation (AF) is a prevalent and morbid abnormality of the heart rhythm with a strong genetic component. Here, we meta-analyzed genome and exome sequencing data from 36 studies that included 52,416 AF cases and 277,762 controls. In burden tests of rare coding variation, we identified novel associations between AF and the genes MYBPC3, LMNA, PKP2, FAM189A2 and KDM5B. We further identified associations between AF and rare structural variants owing to deletions in CTNNA3 and duplications of GATA4. We broadly replicated our findings in independent samples from MyCode, deCODE and UK Biobank. Finally, we found that CRISPR knockout of KDM5B in stem-cell-derived atrial cardiomyocytes led to a shortening of the action potential duration and widespread transcriptomic dysregulation of genes relevant to atrial homeostasis and conduction. Our results highlight the contribution of rare coding and structural variants to AF, including genetic links between AF and cardiomyopathies, and expand our understanding of the rare variant architecture for this common arrhythmia.
Data availability
Summary results for the main analyses have been made available through the Cardiovascular Disease Knowledge Portal (https://cvd.hugeamp.org), including single variant association statistics (https://personal.broadinstitute.org/ryank/Choi_Jurgens_2023_AF_WES_WGS_2022_summary_stats_singlevariants_moderatehigh_MAC20.tsv.gz), gene-based rare coding variant association statistics (https://personal.broadinstitute.org/ryank/Choi_Jurgens_2023_AF_WES_WGS_2022_summary_stats_genebased_LOFpDM_cMAC20_withFirth.tsv.gz) and gene-based rare structural variant statistics (https://personal.broadinstitute.org/ryank/Choi_Jurgens_2023_AF_WES_WGS_2022_summary_stats_genebased_SVc45_cMAC10_withFirth.tsv.gz). Access to individual-level UK Biobank data, both phenotypic and genetic, is available to approved researchers through application on the UK Biobank website (https://www.ukbiobank.ac.uk). The UK Biobank exome sequencing data can be found in the UK Biobank showcase portal (https://biobank.ndph.ox.ac.uk/showcase/label.cgi?id=170). Additional information about registration for access to the data is available at http://www.ukbiobank.ac.uk/register-apply. Use of UK Biobank data was approved under application number 17488. TOPMed genomic data and pre-existing parent study phenotypic data are made available to the scientific community in study-specific accessions in the database of Genotypes and Phenotypes (dbGaP; https://www.ncbi.nlm.nih.gov/gap/advanced_search/?TERM=topmed) and in the NHLBI BioData Catalyst cloud platform (https://biodatacatalyst.nhlbi.nih.gov/). Individual-level data (including raw sequencing data) for several of the CCDG-WES sub-cohorts have been deposited to dbGaP/AnVIL under restricted access (ENGAGE-TIMI: phs002774; PEGASUS-TIMI: phs002243; MGB: phs002018; MGH_AF: phs001062; TMDU: phs002985; BioVu: phs001624; SWISS_AF: phs002242; DECAF/TCAI: phs001546; VAFAR: phs000997; UCSF_AF: phs001933; GENAF: phs001547; GGAF: phs001725).
Raw and processed RNA-seq data have been deposited at the NCBI Gene Expression Omnibus under accession number GSE225290. Other datasets used in this manuscript include the Human Genome assembly GRCh38 (https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_000001405.26), the dbNSFP database (v.4.1a) (https://sites.google.com/site/jpopgen/dbNSFP), gnomAD exomes (v.2.1) (https://gnomad.broadinstitute.org/downloads) and Ensembl (release 95) (https://www.ensembl.org/info/data/index.html).
Code availability
For most datasets (UK Biobank, TOPMed-CCDG, MyCode and deCODE), data collection and data pre-processing were performed centrally, and therefore no commercial software was needed to collect data specific to the present study. Pre-processing of sequencing data for the CCDG-WES and FOURIER datasets was performed using various versions of the Genome Analysis Toolkit (https://github.com/broadinstitute/gatk/releases), as described in Supplementary Note. Quality control (QC) of individual-level data was performed using Hail (v.0.2) (https://hail.is), PLINK (v.2.0.a) (https://www.cog-genomics.org/plink/2.0) and KING (v.2.2.5) (https://www.kingrelatedness.com/Download.shtml). Variant annotation was performed using VEP (v.95) (https://github.com/Ensembl/ensembl-vep) including the LOFTEE plug-in (https://github.com/Ensembl/ensembl-vep) run within Hail, and using AnnotSV (v.3.0.6) (https://github.com/lgmgeo/AnnotSV). Association tests were performed using an adaptation of the R package GENESIS (v.2.18) (https://rdrr.io/bioc/GENESIS/man/GENESIS-package.html), which we have previously made available through a GitHub repository (https://github.com/seanjosephjurgens/UKBB_200KWES_CVD). All analyses in R were run using R version 4.0 (https://www.r-project.org). Parts of the analyses were run on the NHLBI BioData Catalyst Ecosystem, powered by Terra and Gen3. For RNA-seq analyses, all libraries were sequenced on an Illumina NovaSeq machine; sequenced reads were aligned to the human genome (GRCh38) using STAR (v.2.7.9a) (https://github.com/alexdobin/STAR); differential expression analysis was performed with DESeq2 (v.1.30.1) (https://github.com/thelovelab/DESeq2); and pathway enrichment analysis was carried out with Metascape (https://metascape.org/gp/index.html#/menu/msbio).
References
Kornej, J., Börschel, C. S., Benjamin, E. J. & Schnabel, R. B. Epidemiology of atrial fibrillation in the 21st Century: novel methods and new insights. Circ. Res. 127, 4–20 (2020).
Benjamin, E. J. et al. Impact of atrial fibrillation on the risk of death: the Framingham Heart Study. Circulation 98, 946–952 (1998).
Tanaka, Y. et al. Trends in cardiovascular mortality related to atrial fibrillation in the United States, 2011 to 2018. J. Am. Heart Assoc. 10, e020163 (2021).
Weng, L. C. et al. Heritability of atrial fibrillation. Circ. Cardiovasc. Genet. 10, e001838 (2017).
Roselli, C. et al. Multi-ethnic genome-wide association study for atrial fibrillation. Nat. Genet. 50, 1225–1233 (2018).
Nielsen, J. B. et al. Biobank-driven genomic discovery yields new insight into atrial fibrillation biology. Nat. Genet. 50, 1234–1239 (2018).
Roselli, C., Rienstra, M. & Ellinor, P. T. Genetics of atrial fibrillation in 2020: GWAS, genome sequencing, polygenic risk, and beyond. Circ. Res. 127, 21–33 (2020).
Wang, Q. et al. Rare variant contribution to human disease in 281,104 UK Biobank exomes. Nature 597, 527–532 (2021).
Gudbjartsson, D. F. et al. A frameshift deletion in the sarcomere gene MYL4 causes early-onset familial atrial fibrillation. Eur. Heart J. 38, 27–34 (2017).
Lubitz, S. A. et al. Whole exome sequencing in atrial fibrillation. PLoS Genet. 12, e1006284 (2016).
Taliun, D. et al. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature 590, 290–299 (2021).
Sudlow, C. et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 12, e1001779 (2015).
Szustakowski, J. D. et al. Advancing human genetics research and drug discovery through exome sequencing of the UK Biobank. Nat. Genet. 53, 942–948 (2021).
Choi, S. H. et al. Association between Titin loss-of-function variants and early-onset atrial fibrillation. JAMA 320, 2354–2364 (2018).
Ahlberg, G. et al. Rare truncating variants in the sarcomeric protein titin associate with familial and early-onset atrial fibrillation. Nat. Commun. 9, 4316 (2018).
Choi, S. H. et al. Monogenic and polygenic contributions to atrial fibrillation risk: results from a national biobank. Circ. Res. 126, 200–209 (2020).
Marian, A. J. & Braunwald, E. Hypertrophic cardiomyopathy: genetics, pathogenesis, clinical manifestations, diagnosis, and therapy. Circ. Res. 121, 749–770 (2017).
McNally, E. M. & Mestroni, L. Dilated cardiomyopathy: genetic determinants and mechanisms. Circ. Res. 121, 731–748 (2017).
Rankin, J. & Ellard, S. The laminopathies: a clinical review. Clin. Genet. 70, 261–274 (2006).
James, C. A. et al. International evidence based reappraisal of genes associated with arrhythmogenic right ventricular cardiomyopathy using the clinical genome resource framework. Circ. Genom. Precis. Med. 14, e003273 (2021).
Xhabija, B. & Kidder, B. L. KDM5B is a master regulator of the H3K4-methylome in stem cells, development and cancer. Semin. Cancer Biol. 57, 79–85 (2019).
Lebrun, N. et al. Novel KDM5B splice variants identified in patients with developmental disorders: functional consequences. Gene 679, 305–313 (2018).
Faundes, V. et al. Histone lysine methylases and demethylases in the landscape of human developmental disorders. Am. J. Hum. Genet. 102, 175–187 (2018).
Coste Pradas, J. et al. Identification of genes and pathways regulated by Lamin A in heart. J. Am. Heart Assoc. 9, e015690 (2020).
Ji, W. et al. De novo damaging variants associated with congenital heart diseases contribute to the connectome. Sci. Rep. 10, 7046 (2020).
Audain, E. et al. Integrative analysis of genomic variants reveals new associations of candidate haploinsufficient genes with congenital heart disease. PLoS Genet. 17, e1009679 (2021).
Zaidi, S. et al. De novo mutations in histone-modifying genes in congenital heart disease. Nature 498, 220–223 (2013).
Carey, D. J. et al. The Geisinger MyCode community health initiative: an electronic health record-linked biobank for precision medicine research. Genet. Med. 18, 906–913 (2016).
Gudbjartsson, D. F. et al. Large-scale whole-genome sequencing of the Icelandic population. Nat. Genet. 47, 435–444 (2015).
Norland, K. et al. Sequence variants with large effects on cardiac electrophysiology and disease. Nat. Commun. 10, 4803 (2019).
Halldorsson, B. V. et al. The sequences of 150,119 genomes in the UK Biobank. Nature 607, 732–740 (2022).
van Hengel, J. et al. Mutations in the area composita protein αT-catenin are associated with arrhythmogenic right ventricular cardiomyopathy. Eur. Heart J. 34, 201–210 (2013).
Li, J. et al. Loss of αT-catenin alters the hybrid adhering junctions in the heart and leads to dilated cardiomyopathy and ventricular arrhythmia following acute ischemia. J. Cell Sci. 125, 1058–1067 (2012).
Roselli, C. et al. Meta-analysis of genome-wide associations and polygenic risk prediction for atrial fibrillation in more than 180,000 cases. Nat. Genet. (in the press).
Garg, V. et al. GATA4 mutations cause human congenital heart defects and reveal an interaction with TBX5. Nature 424, 443–447 (2003).
Zhang, W. et al. GATA4 mutations in 486 Chinese patients with congenital heart disease. Eur. J. Med. Genet. 51, 527–535 (2008).
Yang, Y. Q. et al. GATA4 loss-of-function mutations in familial atrial fibrillation. Clin. Chim. Acta 412, 1825–1830 (2011).
van Ouwerkerk, A. F. et al. Epigenetic and transcriptional networks underlying atrial fibrillation. Circ. Res. 127, 34–50 (2020).
Raudvere, U. et al. g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update). Nucleic Acids Res. 47, W191–W198 (2019).
Lombardi, R. et al. Nuclear plakoglobin is essential for differentiation of cardiac progenitor cells to adipocytes in arrhythmogenic right ventricular cardiomyopathy. Circ. Res. 109, 1342–1353 (2011).
Mahajan, R. & Wong, C. X. Obesity and metabolic syndrome in atrial fibrillation: cardiac and noncardiac adipose tissue in atrial fibrillation. Card. Electrophysiol. Clin. 13, 77–86 (2021).
Nyström, P. K. et al. Obesity, metabolic syndrome and risk of atrial fibrillation: a Swedish, prospective cohort study. PLoS ONE 10, e0127111 (2015).
Harper, A. R. et al. Common genetic variants and modifiable risk factors underpin hypertrophic cardiomyopathy susceptibility and expressivity. Nat. Genet. 53, 135–142 (2021).
Tadros, R. et al. Shared genetic pathways contribute to risk of hypertrophic and dilated cardiomyopathies with opposite directions of effect. Nat. Genet. 53, 128–134 (2021).
Tucker, N. R. et al. Transcriptional and cellular diversity of the human heart. Circulation 142, 466–482 (2020).
Zhou, Q. et al. Inhibition of the histone demethylase Kdm5b promotes neurogenesis and derepresses Reln (reelin) in neural stem cells from the adult subventricular zone of mice. Mol. Biol. Cell 27, 627–639 (2016).
Li, X. et al. Histone demethylase KDM5B is a key regulator of genome stability. Proc. Natl Acad. Sci. USA 111, 7096–7101 (2014).
Song, H., Conte, J. V., Foster, A. H., McLaughlin, J. S. & Wei, C. Increased p53 protein expression in human failing myocardium. J. Heart Lung Transplant. 18, 744–749 (1999).
Zhang, T., Yong, S. L., Tian, X. L. & Wang, Q. K. Cardiac-specific overexpression of SCN5A gene leads to shorter P wave duration and PR interval in transgenic mice. Biochem. Biophys. Res. Commun. 355, 444–450 (2007).
Weeke, P. et al. Whole-exome sequencing in familial atrial fibrillation. Eur. Heart J. 35, 2477–2483 (2014).
Jiang, F. et al. The mechanosensitive Piezo1 channel mediates heart mechano-chemo transduction. Nat. Commun. 12, 869 (2021).
Luo, R. et al. Identification of potential candidate genes and pathways in atrioventricular nodal reentry tachycardia by whole-exome sequencing. Clin. Transl. Med. 10, 238–257 (2020).
Bingen, B. O. et al. Atrium-specific Kir3.x determines inducibility, dynamics, and termination of fibrillation by regulating restitution-driven alternans. Circulation 128, 2732–2744 (2013).
Grubb, S., Calloe, K. & Thomsen, M. B. Impact of KChIP2 on cardiac electrophysiology and the progression of heart failure. Front. Physiol. 3, 118 (2012).
Chen, Y. H. et al. KCNQ1 gain-of-function mutation in familial atrial fibrillation. Science 299, 251–254 (2003).
Yamada, N. et al. Mutant KCNJ3 and KCNJ5 potassium channels as novel molecular targets in bradyarrhythmias and atrial fibrillation. Circulation 139, 2157–2169 (2019).
Pikkarainen, S., Tokola, H., Kerkelä, R. & Ruskoaho, H. GATA transcription factors in the developing and adult heart. Cardiovasc. Res. 63, 196–207 (2004).
Delmar, M. & McKenna, W. J. The cardiac desmosome and arrhythmogenic cardiomyopathies: from gene to disease. Circ. Res. 107, 700–714 (2010).
Harris, S. P. et al. Hypertrophic cardiomyopathy in cardiac myosin binding protein-C knockout mice. Circ. Res. 90, 594–601 (2002).
Vikhorev, P. G. et al. Abnormal contractility in human heart myofibrils from patients with dilated cardiomyopathy due to mutations in TTN and contractile protein genes. Sci. Rep. 7, 14829 (2017).
Heijman, J., Voigt, N., Nattel, S. & Dobrev, D. Cellular and molecular electrophysiology of atrial fibrillation initiation, maintenance, and progression. Circ. Res. 114, 1483–1499 (2014).
Orr, N. et al. A mutation in the atrial-specific myosin light chain gene (MYL4) causes familial atrial fibrillation. Nat. Commun. 7, 11303 (2016).
Vad, O. B. et al. Loss-of-function variants in cytoskeletal genes are associated with early-onset atrial fibrillation. J. Clin. Med. 9, 372 (2020).
Pinto, Y. M. et al. Proposal for a revised definition of dilated cardiomyopathy, hypokinetic non-dilated cardiomyopathy, and its implications for clinical practice: a position statement of the ESC working group on myocardial and pericardial diseases. Eur. Heart J. 37, 1850–1858 (2016).
Seferović, P. M. et al. Heart failure in cardiomyopathies: a position paper from the Heart Failure Association of the European Society of Cardiology. Eur. J. Heart Fail. 21, 553–576 (2019).
McDonagh, T. A. et al. 2021 ESC Guidelines for the diagnosis and treatment of acute and chronic heart failure. Eur. Heart J. 42, 3599–3726 (2021).
Weischenfeldt, J., Symmons, O., Spitz, F. & Korbel, J. O. Phenotypic impact of genomic structural variation: insights from and for human disease. Nat. Rev. Genet. 14, 125–138 (2013).
Beyter, D. et al. Long-read sequencing of 3,622 Icelanders provides insight into the role of structural variants in human diseases and other traits. Nat. Genet. 53, 779–786 (2021).
Aguirre, M., Rivas, M. A. & Priest, J. Phenome-wide burden of copy-number variation in the UK Biobank. Am. J. Hum. Genet. 105, 373–383 (2019).
Tsai, C. T. et al. Genome-wide screening identifies a KCNIP1 copy number variant as a genetic predictor for atrial fibrillation. Nat. Commun. 7, 10190 (2016).
Jurgens, S. J. et al. Analysis of rare genetic variation underlying cardiometabolic diseases and traits among 200,000 individuals in the UK Biobank. Nat. Genet. 54, 240–250 (2022).
Flannick, J. et al. Exome sequencing of 20,791 cases of type 2 diabetes and 24,440 controls. Nature 570, 71–76 (2019).
Thorolfsdottir, R. B. et al. Coding variants in MYZAP and RPL3L increase risk of atrial fibrillation. Commun. Biol. 1, 68 (2018).
Ghoussaini, M. et al. Open Targets Genetics: systematic identification of trait-associated genes using large-scale genetics and functional genomics. Nucleic Acids Res. 49, D1311–D1320 (2021).
Christophersen, I. E. et al. Large-scale analyses of common and rare variants identify 12 new loci associated with atrial fibrillation. Nat. Genet. 49, 946–952 (2017).
Sabatine, M. S. et al. Evolocumab and clinical outcomes in patients with cardiovascular disease. N. Engl. J. Med. 376, 1713–1722 (2017).
Kelly, M. A. et al. Leveraging population-based exome screening to impact clinical care: the evolution of variant assessment in the Geisinger MyCode research project. Am. J. Med. Genet. C Semin. Med. Genet. 187, 83–94 (2021).
Jónsson, H. et al. Whole genome characterization of sequence diversity of 15,220 Icelanders. Sci. Data 4, 170115 (2017).
Backman, J. D. et al. Exome sequencing and analysis of 454,787 UK Biobank participants. Nature 599, 628–634 (2021).
McLaren, W. et al. The Ensembl Variant Effect Predictor. Genome Biol. 17, 122 (2016).
Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).
Liu, X., Wu, C., Li, C. & Boerwinkle, E. dbNSFP v3.0: a one-stop database of functional predictions and annotations for human nonsynonymous and splice-site SNVs. Hum. Mutat. 37, 235–241 (2016).
Geoffroy, V. et al. AnnotSV: an integrated tool for structural variations annotation. Bioinformatics 34, 3572–3574 (2018).
Gogarten, S. M. et al. Genetic association testing using the GENESIS R/Bioconductor package. Bioinformatics 35, 5346–5348 (2019).
Zhao, Z. et al. UK Biobank whole-exome sequence binary phenome analysis with robust region-based rare-variant test. Am. J. Hum. Genet. 106, 3–12 (2020).
Zhou, W. et al. Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies. Nat. Genet. 50, 1335–1341 (2018).
Tang, Z. Z. & Lin, D. Y. MASS: meta-analysis of score statistics for sequencing studies. Bioinformatics 29, 1803–1805 (2013).
Heinze, G. A comparative investigation of methods for logistic regression with separated or nearly separated data. Stat. Med. 25, 4216–4226 (2006).
Povysil, G. et al. Rare-variant collapsing analyses for complex traits: guidelines and applications. Nat. Rev. Genet. 20, 747–759 (2019).
Leyton-Mange, J. S. et al. Rapid cellular phenotyping of human pluripotent stem cell-derived cardiomyocytes using a genetically encoded fluorescent voltage sensor. Stem Cell Reports 2, 163–170 (2014).
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
Zhou, Y. et al. Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nat. Commun. 10, 1523 (2019).
Acknowledgements
We gratefully thank all CCDG, TOPMed, UK Biobank, MyCode and deCODE participants, as this study would not have been possible without their contributions. This work was supported by grants from the National Institutes of Health (NIH) to P.T.E. (1R01HL092577, K24HL105780 and R01HL157635) and S.A.L. (1R01HL139731 and R01HL157635). This work was also supported by grants from the American Heart Association (AHA) to P.T.E. (18SFRN34110082 and 961045) and to S.A.L. (18SFRN34250007). P.T.E. is supported by MAESTRIA (965286). D.D. is supported by R01 HL138737 and T32 HL139439. L.-C.W. is supported by an AHA Strategically Focused Research Networks (SFRN) postdoctoral fellowship (18SFRN34110082) and an NIH grant (1R01HL139731). J.P.P. is supported by the NIH (K08HL159346) and a Sarnoff Cardiovascular Research Foundation Scholar Award. S.J.J. was supported by the Junior Clinical Scientist Fellowship (03-007-2022-0035) from the Dutch Heart Foundation and by an Amsterdam UMC Doctoral Fellowship. S.H.C. is supported by the BioData Ecosystem fellowship program. C.M.H. is supported by an NIH grant (R01HL141901). L.X. is supported by an AHA Career Development Award 20CDA35260081. D.F. is supported by the NSW Health and Heart Foundation. S.R. is supported by HL43680, R35HL135818 and HL113338. N.A.M. is supported by NIH (K08HL153950). M.H.C. is supported by R01HL137927, R01HL147148 and R01HL089856. E.J.B. is supported by 75N92019D00031, R01HL092577 and 1R01HL128914 and the American Heart Association (AF AHA_18SFRN34110082). K.L.L. is supported by R01HL092577 and AHA 18SFRN34230127. J.G.S. was supported by grants from the Swedish Heart Lung Foundation (2019-0526), the Swedish Research Council (2021-02273), the European Research Council (ERC-STG-2015-679242), Gothenburg University, Skåne University Hospital, governmental funding of clinical research within the Swedish National Health Service, a generous donation from the Knut and Alice Wallenberg foundation to the Wallenberg Center for Molecular Medicine in Lund and funding from the Swedish Research Council (Linnaeus grant Dnr 349-2006-237, Strategic Research Area Exodiab Dnr 2009-1039) and Swedish Foundation for Strategic Research (Dnr IRC15-0067) to the Lund University Diabetes Center. S.M.D. is supported by IK2-CX001780; this publication does not represent the views of the Department of Veterans Affairs or the United States Government. P.K. was partially supported by European Union BigData@Heart (grant agreement EU IMI 116074), AFFECT-AF (grant agreement 847770) and MAESTRIA (grant agreement 965286), the British Heart Foundation (PG/17/30/32961, PG/20/22/35093 and AA/18/2/34218), German Centre for Cardiovascular Research supported by the German Ministry of Education and Research (DZHK), Deutsche Forschungsgemeinschaft (Ki 731/4-1), and Leducq Foundation. C.M.A. was supported by the National Heart, Lung and Blood Institute (HL-093613 and HL116690) for AF endpoint confirmation in WGHS. D.C. is supported by the Canadian Institutes of Health Research. M.K. is supported by grants from the Swiss National Science Foundation (grant numbers 33CS30_148474, 33CS30_177520, 32473B_176178, 32003B_197524), the Swiss Heart Foundation, the Foundation for Cardiovascular Research Basel and the University of Basel. T.T. was supported in part by JSPS KAKENHI (grant number JP18H02804). Y.E. was supported by JSPS KAKENHI (grant number 17K07251). J.D.R. is supported by the Marianne Barrie Philanthropic Fund, the Heart and Stroke Foundation of Canada and the Cardiac Arrhythmia Network of Canada (CANet). Z.W.M.L. is supported by UBC Cardiology Academic Partnership Plan Pilot Project Funding. M.B.S. is supported by the NIH (R01HL155197). J.B., M.K.C., J.D.S. and D.R.V.W. are supported by the NIH (R01HL111314 and P01HL158505) and by the AHA 18SFRN34110067. R.S. has received funding from the European Research Council under the European Union’s Horizon 2020 Research and Innovation Programme under the grant agreement no. 648131, from the European Union’s Horizon 2020 Research and Innovation Programme under grant agreement no. 847770 (AFFECT-EU) and German Center for Cardiovascular Research (DZHK e.V.) (81Z1710103); German Ministry of Research and Education (BMBF 01ZX1408A) and ERACoSysMed3 (031L0239). M.S.O. is supported by The Hallas-Møller Emerging Investigator Novo Nordisk grant (NNF17OC0031204). L.S.B.J. is supported by the Swedish Society for Medical Research and the Swedish Heart and Lung Foundation (2019-0354). N.S. is supported by NIH R01HL141989, AHA 19SFRN34830063 and the Laughlin family. M.R. acknowledges support from the Netherlands Cardiovascular Research Initiative: an initiative supported by the Netherlands Heart Foundation, CVON 2014–9: “Reappraisal of Atrial Fibrillation: interaction between hyperCoagulability, Electrical remodelling, and Vascular destabilization in the progression of AF (RACE V)”. Additional acknowledgements for specific study sites are presented in Supplementary Note.
Author information
Contributions
S.H.C., S.J.J., S.A.L. and P.T.E. conceived and designed the study. S.H.C., S.J.J., V.N.M., N.A.M., L.-C.W., J.P.P., G.J., C.R. and J.L.H. performed data curation and data processing for the discovery datasets. S.H.C. and S.J.J. performed the main statistical and bioinformatic analyses in the discovery datasets. A.S.v.F., A.N., A.T., B.G., D.R.V.W., D.D., D.H., E.Z.S., G.E.D., H.C., J.L.A., J.A.B., J.B., J.E.H., J.D.S., J.C.B., K.Y., L.S.B.J., L.R., L.J.G., L.C.K., M.K., M.P., N.A.N., N.L.S., P.M.N., P.v.d.H., Q.S.W., R.L.J., R.S., R.J., R.A.J.S., S. Knight, T.F., T.W.B., Y.-I.M., Z.T.Y., Z.W.M.L., C.R.B., A.A., B.M.P., C.M.A., D.E.A., D.M.R., D.I.C., D.J.R., D.C., D.D.M., D.F., E.J.B., E.B., G.M.M., I.E.C., J.G.S., J.D.R., L.M.R., M.B.S., M.H.C., M.J.C., M.R., M.K.C., M.S.O., M.F.S., N.S., P.K., R.J.F.L., S.N., S.M., S.M.D., S. Kaab, S.R.H., S.R., S.H.S., T.T., Y.E., C.T.R., M.S.S., K.L.L., S.A.L. and P.T.E. were involved in site-specific sample ascertainment, phenotype data collection, DNA collection for sequencing, data processing or supervision for the TOPMed-CCDG, CCDG and FOURIER datasets. N.G. and S.G. were involved in the coordination of sequencing at the Broad Institute for various discovery datasets. V.N.M., N.A.M., L.-C.W., J.P.P., C.R., M.D.C., V.N. and X.W. contributed critically to the analysis plan. C.M.H. performed data curation and bioinformatic analyses for the Geisinger MyCode dataset, with contributions from Regeneron Genetics Center authors, as specified in Supplementary Note. G.S., D.O.A. and D.F.G. were responsible for data curation and bioinformatic analyses in the deCODE dataset. H.H. and K.S. were responsible for coordination and supervision of the deCODE dataset. L.X., M.C.H. and H.M. were responsible for the functional studies of KDM5B, including CRISPR knockout, optical mapping and RNA-seq analyses. K.L.L., S.A.L. and P.T.E. supervised the overall study. S.H.C., S.J.J. and P.T.E. wrote the manuscript. All authors critically revised and approved the manuscript.
Ethics declarations
Competing interests
P.T.E. has received sponsored research support from Bayer AG, Bristol Myers Squibb, Pfizer and Novo Nordisk, and he has consulted for Bayer. S.A.L. is a full-time employee of Novartis Institutes of BioMedical Research as of 18 July 2022. S.A.L. previously received sponsored research support from Bristol Myers Squibb, Pfizer, Boehringer Ingelheim, Fitbit, Medtronic, Premier and IBM and has consulted for Bristol Myers Squibb, Pfizer, Blackstone Life Sciences and Invitae. B.M.P. serves on the Steering Committee of the Yale Open Data Access Project funded by Johnson & Johnson. M.H.C. has received grant funding from GSK and Bayer, and speaking or consulting fees from AstraZeneca, Illumina and Genentech. C.M.H. receives research support from Tempus Labs, outside the scope of the present work. S.M.D. receives in-kind research support from Novo Nordisk and personal consulting fees, outside the scope of the current research. D.C. receives consulting fees from Roche Diagnostics and Trimedics, and speaker fees from Servier and BMS/Pfizer. M.K. receives consulting fees from Roche Diagnostics. N.A.M. reports involvement in clinical trials with Amgen, Ionis, Pfizer, Novartis and AstraZeneca without personal fees, payments or increase in salary. C.T.R. provides consultancies with Anthos, Bayer, Bristol Myers Squibb, Boehringer Ingelheim, Daiichi Sankyo, Janssen and Pfizer. M.S.S. provides consultancies with Amgen, Anthos Therapeutics, AstraZeneca, Bristol Myers Squibb, DalCor, Dr. Reddy’s Laboratories, IFM Therapeutics, Intarcia, Merck, Moderna, Novo Nordisk and Silence Therapeutics. R.S. has received lecture fees and advisory board fees from BMS/Pfizer outside this work. L.M.R. is a consultant for the TOPMed Administrative Coordinating Center through WeStat. M.K. reports personal fees from Bayer, Boehringer Ingelheim, Pfizer BMS, Daiichi Sankyo, Medtronic, Biotronik, Boston Scientific, Johnson & Johnson and Roche, and grants from Bayer, Pfizer, Boston Scientific, BMS, Biotronik and Daiichi Sankyo. L.-C.W. has received research support from IBM to the Broad Institute. C.R. is supported by a grant from Bayer to the Broad Institute focused on the development of therapeutics for cardiovascular disease. The remaining authors declare no competing interests.
Peer review
Peer review information
Nature Genetics thanks Julien Barc, Rafik Tadros and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Forest plots for novel associations from gene-based analyses of loss-of-function and predicted-deleterious missense variants.
This forest plot shows the contributions of each cohort and the meta-analysis results for gene-based analysis of TTN, MYBPC3, LMNA, PKP2 and KDM5B; center measures show odds ratios, while error bars represent 95% confidence intervals. The odds ratio and 95% confidence interval of each study were estimated from Firth’s logistic regression, and the meta-analysis odds ratios and confidence intervals were estimated from an inverse-variance weighted fixed effect approach. The P-values of the meta-analysis were computed from a score-based meta-analysis approach; P-values are two-sided and unadjusted for multiple testing. For TTN, all datasets contributed (52,416 cases and 267,772 controls); for the other genes, TOPMed-CCDG, CCDG-WES and UKBB contributed (51,019 cases and 253,267 controls). The cMAC shows the cumulative number of alternative alleles observed in the meta-analysis. cMAC, cumulative minor allele count; Combined, meta-analysis results.
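The "Combined" row in these forest plots is described as an inverse-variance weighted fixed effect estimate. The sketch below illustrates that general technique on the log odds ratio scale. It is a generic illustration and not the authors' code: the function name `ivw_fixed_effect` and all input numbers are hypothetical, and each study's standard error is recovered from its 95% confidence interval under a normal approximation.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def ivw_fixed_effect(studies):
    """Pool per-study odds ratios with inverse-variance weights.

    studies: list of (odds_ratio, ci_low, ci_high) tuples with 95% CIs.
    Returns (pooled_or, pooled_ci_low, pooled_ci_high).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for odds_ratio, ci_low, ci_high in studies:
        beta = math.log(odds_ratio)
        # Assumption: the CI is symmetric on the log scale, so its width
        # equals 2 * Z95 * standard error.
        se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z95)
        weight = 1.0 / se ** 2
        weighted_sum += weight * beta
        weight_total += weight
    beta_pooled = weighted_sum / weight_total
    se_pooled = math.sqrt(1.0 / weight_total)
    return (
        math.exp(beta_pooled),
        math.exp(beta_pooled - Z95 * se_pooled),
        math.exp(beta_pooled + Z95 * se_pooled),
    )


# Hypothetical per-cohort estimates, for illustration only.
example = [(2.0, 1.0, 4.0), (2.0, 1.0, 4.0)]
pooled_or, pooled_low, pooled_high = ivw_fixed_effect(example)
```

With two identical studies the pooled odds ratio is unchanged and the interval narrows, which is the behaviour expected of a fixed effect model. Note that the captions state the meta-analysis P-values come from a separate score-based approach, which this sketch does not cover.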
Extended Data Fig. 2 Heatmaps for the associations between loss-of-function and predicted-deleterious missense variants for novel AF genes and a range of cardiovascular phenotypes in the UK Biobank.
a, Heatmap for four binary cardiovascular phenotypes. b, Heatmap for six quantitative cardiovascular endophenotypes. A red color indicates OR > 1 for a binary phenotype or beta > 0 for a quantitative trait, while blue indicates OR < 1 for a binary phenotype or beta < 0 for a quantitative trait. Nominally significant associations at P < 0.05 are indicated with an asterisk (*). Data were extracted from a previous UK Biobank PheWAS described in Jurgens et al. (ref. 71). Results for binary phenotypes were derived from logistic mixed-effects models, while results for quantitative traits were derived from linear mixed-effects models; P-values are two-sided and unadjusted for multiple testing.
Extended Data Fig. 3 Forest plots for novel associations from single variant analyses of high/moderate impact variants.
This forest plot shows the contributions of each cohort and meta-analysis results for single variants rs147301839_C, rs147972626_A, rs202011870_C, rs928785947_G and rs182906685_A; center measures show odds ratios, while the error bars represent 95% confidence intervals. The odds ratio and 95% confidence interval of each study were estimated from Firth’s logistic regression and the meta-analysis odds ratios and confidence intervals were estimated from an inverse-variance weighted fixed effect approach. The P-values of the meta-analysis were computed from a score-based meta-analysis approach; P-values are two-sided and unadjusted for multiple-testing. For MYZAP and RPL3L, all datasets contributed (52,416 cases and 267,772 controls); for SH3PXD2A, CCDG-WES and UKBB contributed (39,853 cases and 227,461 controls); for FAM189A2, TOPMed-CCDG, CCDG-WES, and UKBB contributed (51,019 cases and 253,267 controls); for ZFC3H1, TOPMed and UKBB contributed (23,428 cases and 213,715 controls). The MAC shows the number of alternative alleles observed in the meta-analysis. MAC, minor allele count; MAF, minor allele frequency; Combined, meta-analysis results.
Extended Data Fig. 4 Forest plots for loss-of-function and predicted-deleterious missense variants in FAM189A2, CTNNA3 and GATA4.
This forest plot shows the contributions of each cohort and meta-analysis results for gene-based analysis of LOF and deleterious missense variants in FAM189A2, CTNNA3, and GATA4; center measures show odds ratios, while the error bars represent 95% confidence intervals. The odds ratio and 95% confidence interval of each study were estimated from Firth’s logistic regression, and the meta-analysis odds ratios and confidence intervals were estimated from an inverse-variance weighted fixed effect approach. The P-values of the meta-analysis were computed from a score-based meta-analysis approach; P-values are two-sided and unadjusted for multiple-testing. For CTNNA3, all datasets contributed (52,416 cases and 267,772 controls); for the other genes, TOPMed-CCDG, CCDG-WES, and UKBB contributed (51,019 cases and 253,267 controls). The cMAC shows the cumulative number of alternative alleles observed in the meta-analysis. cMAC, cumulative minor allele count; Combined, meta-analysis results.
Extended Data Fig. 5 Percentage of carriers among cases stratified by age at AF diagnosis including rare coding and structural variants among WGS samples.
Each square represents the percentage of carriers among AF cases within the samples with WGS from TOPMed-CCDG. The age at AF diagnosis was stratified by ≤35, ≤45, ≤55, ≤65, and any. The numbers at the bottom give the total number of carriers and the total number of cases, counted among AF cases whose age at AF diagnosis is available. The center measures represent proportions, while the error bars represent 95% binomial confidence intervals. TTNpsi90%: variants located at exons with percent spliced-in index > 90% for TTN within left ventricular tissue.
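As an illustration of how a carrier proportion and its 95% binomial confidence interval could be computed, the sketch below uses the Wilson score interval. The legend does not state which binomial interval was used, so the Wilson method is an assumption here, and the function name `wilson_interval` and the example counts are hypothetical.

```python
import math


def wilson_interval(carriers, cases, z=1.959964):
    """Wilson score interval for the proportion carriers / cases.

    Assumption: the Wilson method stands in for the unspecified
    "95% binomial confidence interval" mentioned in the legend.
    """
    if cases <= 0 or not 0 <= carriers <= cases:
        raise ValueError("require 0 <= carriers <= cases and cases > 0")
    p = carriers / cases
    z2 = z * z
    denom = 1.0 + z2 / cases
    center = (p + z2 / (2.0 * cases)) / denom
    half = z * math.sqrt(p * (1.0 - p) / cases + z2 / (4.0 * cases * cases)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical counts: 10 carriers among 100 cases in one age stratum.
low, high = wilson_interval(10, 100)
```

Unlike the simple normal approximation, this interval stays inside [0, 1] and behaves sensibly when the number of carriers in a stratum is small, which is the typical situation for rare variants.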
Extended Data Fig. 6 Combined assessment of risk conferred by rare variants and a common variant polygenic score among WGS samples.
a-d, In each panel, the y-axis of the bar charts shows the odds ratios as center measure (with error bars showing 95% confidence interval) for a given group as compared to individuals without rare variants within the lowest quintile of the polygenic risk score (PRS); the x-axis shows different groups (stratified by rare variants and PRS strata). a and c show results from all ancestry individuals, while b and d show results for individuals of European (EUR) ancestry only. PRS strata are split by low PRS (bottom quintile of distribution), intermediate PRS (2nd to 4th quintile of distribution) and high PRS (top quintile of PRS distribution). Odds ratios are computed from Firth’s logistic regression.
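The stratification described in this legend can be sketched as follows: individuals are ranked by polygenic score, the bottom quintile is labelled low, the top quintile high and the rest intermediate, and the reference group consists of non-carriers in the low stratum. The function names and the tie-handling rule (ties broken by input order) are illustrative assumptions, not details taken from the study.

```python
def prs_strata(scores):
    """Label each score 'low' (bottom quintile), 'intermediate'
    (2nd to 4th quintile) or 'high' (top quintile) by rank."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])  # stable: ties keep input order
    labels = [None] * n
    for rank, idx in enumerate(order):
        quintile = rank * 5 // n  # integer in 0..4
        if quintile == 0:
            labels[idx] = "low"
        elif quintile == 4:
            labels[idx] = "high"
        else:
            labels[idx] = "intermediate"
    return labels


def reference_group(labels, carrier_flags):
    """Flag the comparison group: non-carriers within the low PRS stratum."""
    return [lab == "low" and not carrier
            for lab, carrier in zip(labels, carrier_flags)]


# Hypothetical scores and rare-variant carrier status for ten individuals.
scores = [5, 1, 9, 3, 7, 0, 8, 2, 6, 4]
carriers = [False, True, False, False, False, False, True, False, False, False]
labels = prs_strata(scores)
reference = reference_group(labels, carriers)
```

Each remaining combination of stratum and carrier status would then be compared against the reference group, for example with the Firth's logistic regression named in the legend.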
Extended Data Fig. 7 Power calculations for sub-threshold cardiomyopathy genes.
This plot shows the estimated power (y-axis) at different numbers of atrial fibrillation (AF) cases (x-axis) for a range of well-known cardiomyopathy genes that reached two-sided P < 0.05 (unadjusted for multiple-testing) in our gene-based analysis of LOF and pDM variants. We assumed these genes represent true AF genes with as yet limited power; we further assumed a 1:3 case:control setting. We used the odds ratio (OR) estimates (computed from inverse-variance-weighted meta-analysis of Firth’s regression results) and the estimated cumulative minor allele frequencies (cMAF; computed among controls) as input for these calculations. The legend shows the gene names and corresponding OR estimates and control cMAF estimates from our data; each gene is annotated with a different color.
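To make the inputs to such a calculation concrete, the sketch below estimates power for a two-sided test comparing allele frequencies between cases and controls, given an odds ratio, a control cMAF and a 1:3 case:control ratio. The legend does not describe the power method or the significance threshold used, so the normal approximation, the allele-level model and the `alpha` argument are all assumptions, and `burden_power` is a hypothetical name.

```python
import math
from statistics import NormalDist


def burden_power(odds_ratio, control_cmaf, n_cases, alpha, controls_per_case=3):
    """Approximate power of a two-sided two-proportion z-test on alleles.

    Assumptions (not from the source): normal approximation, two alleles
    per person, and a caller-supplied significance threshold `alpha`.
    """
    n1 = 2 * n_cases                          # case alleles
    n0 = 2 * n_cases * controls_per_case      # control alleles
    q0 = control_cmaf
    odds1 = odds_ratio * q0 / (1.0 - q0)      # case allele odds implied by the OR
    q1 = odds1 / (1.0 + odds1)
    pooled = (n1 * q1 + n0 * q0) / (n1 + n0)
    se_null = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n0))
    se_alt = math.sqrt(q1 * (1.0 - q1) / n1 + q0 * (1.0 - q0) / n0)
    norm = NormalDist()
    z_crit = norm.inv_cdf(1.0 - alpha / 2.0)
    diff = q1 - q0
    # Sum the rejection probabilities in both tails.
    upper = norm.cdf((diff - z_crit * se_null) / se_alt)
    lower = norm.cdf((-diff - z_crit * se_null) / se_alt)
    return upper + lower


# Hypothetical gene: OR = 2, control cMAF = 0.0005, threshold chosen for illustration.
power_small = burden_power(2.0, 0.0005, 10_000, alpha=2.5e-6)
power_large = burden_power(2.0, 0.0005, 100_000, alpha=2.5e-6)
```

Evaluating the function over a grid of case counts yields a curve per gene of the kind plotted in the figure; genes with a lower cMAF or a smaller odds ratio need more cases to reach the same power.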
Extended Data Fig. 8 Immunofluorescence microscopy of KDM5B in left atrial tissue.
a, Immunofluorescence images of a human left atrial tissue section stained for KDM5B (red), MYOM1 (green) and DAPI (blue). Scale bars: left, 100 μm; right, 25 μm. b, Immunofluorescence images of human iPSC-derived atrial cardiomyocytes stained for KDM5B (green), MYOM1 (yellow) and DAPI (blue). Scale bar: 20 μm. Note: Human tissue samples originated from non-failing deceased donors that were rejected for heart transplantation by Myocardial Applied Genetics Network. Three tissue sections from the same patient were imaged, and all showed the representative pattern.
Extended Data Fig. 9 Electrophysiological studies of KDM5B loss in pluripotent-induced atrial cardiomyocytes using siRNA knockdown.
a, Mean ± s.e.m. of KDM5B mRNA levels in human iPSC-derived atrial cardiomyocytes transfected without (Empty) or with scramble control (CTL) or KDM5B (KDM5B_KD) siRNAs. n = 6 per group. P-values are derived from Tukey’s multiple comparisons test after one-way ANOVA; P-values are two-sided and are summary-adjusted as per Tukey’s honest significance test. b, Top, examples of action potentials in Empty (black tracing), CTL (grey) or KDM5B_KD (green) obtained with optical mapping; bottom, mean ± s.e.m. of action potential durations at 20% (APD20, left), 50% (APD50, middle) or 80% (APD80, right) of repolarization. n = 23 (Empty), 24 (CTL) or 20 (KDM5B_KD) from 3 independent experiments. P-values are derived from Tukey’s multiple comparisons test after one-way ANOVA, where a separate one-way ANOVA was performed within each duration group (that is, APD20, APD50 and APD80); P-values are two-sided and are summary-adjusted (within each duration group) as per Tukey’s honest significance test.
Supplementary information
Supplementary Information: Supplementary Note and Supplementary Figs. 1–15
Supplementary Table 1: Supplementary Tables 1–22
About this article
Cite this article
Choi, S.H., Jurgens, S.J., Xiao, L. et al. Sequencing in over 50,000 cases identifies coding and structural variation underlying atrial fibrillation risk. Nat Genet 57, 548–562 (2025). https://doi.org/10.1038/s41588-025-02074-9
DOI: https://doi.org/10.1038/s41588-025-02074-9