Main

Bradyarrhythmias are common cardiac rhythm abnormalities characterized by pathological slowing of heart rate or electrical conduction and represent a major public health problem1,2. Bradyarrhythmias can lead to syncope, heart failure or sudden cardiac death. The only established therapy for serious bradyarrhythmias is pacemaker (PM) implantation, which carries the risk of complications and morbidity.

Bradyarrhythmias have largely been considered diseases of aging, with fibrosis of the conduction system implicated as a major precipitant. However, the molecular causes of bradyarrhythmias are poorly understood, despite evidence supporting an inherited basis3. Some conduction system disorders manifest early in life or aggregate in families, with candidate gene approaches implicating ion channel or ion channel-associated genes4,5,6. Genome-wide association studies (GWAS) have focused on very broad bradyarrhythmia definitions (for example, PM3) and have had limited power to detect robust associations. One previous GWAS of sinus node dysfunction (SND)—the inability of the sinoatrial node to generate sufficient electrical impulses—identified six loci including a rare susceptibility variant in MYH6, a gene encoding a myosin heavy chain isoform7. There have been no GWAS to date for distal conduction disease (DCD), which reflects disease distal to the sinus node, such as a block within the atrioventricular node or His-Purkinje system8. The recent availability of genetic data from multiple large-scale studies may enable the detection of common and rare variations underlying bradyarrhythmias with greater physiologic specificity.

Here we performed multi-ancestry meta-analyses of GWAS and rare-variant burden tests for SND, DCD and conditions necessitating PM to elucidate the genetic variation underlying these distinct and clinically relevant conditions.

Results

Figure 1 summarizes the study design and depicts the anatomical regions affected by SND and DCD. We first performed common variant analyses, including 1.3 million participants for SND and DCD. We conducted additional analyses examining restrictive definitions of SND or DCD (onset at age ≤75 years resulting in PM) and a combined outcome of PM for SND or DCD. In total, ten studies were included. Depending on the phenotype, 89–94% of cases and 88–89% of referents were of European ancestry. Baseline characteristics are given in Supplementary Tables 15. Manhattan and Miami plots are shown in Fig. 2 and Supplementary Figs. 1 and 2. Quantile–quantile (Q–Q) plots did not suggest systematic test statistic inflation (λ < 1.10; Supplementary Fig. 3).

Fig. 1: Study design.
figure 1

a, Typical anatomical regions in which conduction tissue affected by SND and DCD are localized in the heart. A dual-chamber PM is also demonstrated. Sample sizes are shown by common variant GWAS and rare variant association tests (RVAT) for all three outcomes. b, Overview of common and rare variant analyses for SND, DCD and PM implantation. Common-variant GWAS were performed in ten collaborating studies with genotyped and imputed data, and rare variant burden testing was performed for the same phenotypes in two studies with whole-exome sequencing. The results were combined in meta-analyses, including up to 1.3 million individuals for GWAS and 471k individuals for rare variant association testing. For common variant loci reaching genome-wide significance, follow-up evaluations included analyses of cardiac gene expression profiles, predicted transcriptomes, pleiotropic associations, genetic correlations, Mendelian randomization (MR), polygenic risk scores (PRS) derivation and relations with risk of PM implantation and phenome-wide association study (PheWAS). The interaction of rare variants and polygenic risk was further evaluated.

Fig. 2: Manhattan plot for bradyarrhythmias.
figure 2

ac, GWAS results are shown separately for SND (a; n = 1,258,554), DCD (b; n = 1,314,957) and PM implantation (c; n = 1,304,231). Two-sided P values (on −log10 scale) for each association test between variants and bradyarrhythmias from fixed-effect meta-analyses of multi-ancestry individuals are shown on the y axis. Genome-wide significant association loci (P < 5 × 10−8 after Bonferroni correction; dashed line) are annotated with the name of the gene closest to the index variant.

Common variants associated with SND

We identified 13 genome-wide significant loci (P < 5 × 10−8) associated with SND in 9,511 cases and 1,249,043 referents (Fig. 2, Supplementary Table 6 and Supplementary Fig. 1). In conditional analyses, we observed a secondary signal at two loci (PITX2 and SCN5A/SCN10A; Supplementary Table 7). At the SCN5A/SCN10A locus, both the index variant and the independent variant in the conditional analysis were intronic within SCN5A.

The restrictive SND definition (4,940 cases) yielded six of the above-mentioned genome-wide significant loci and an additional locus close to MTHFSD. Generally, significant index variants were either shared or exhibited high linkage disequilibrium (LD; r2 > 0.6 in 1000 Genomes European samples) between inclusive and restrictive SND, with larger effect sizes for the restrictive definition. Inclusive and restrictive SND were highly genetically correlated (rg = 1.03, P = 1.49 × 10−221; Supplementary Table 8).

Among 13 SND loci, 4 (CCDC141, PITX2, ZFHX3 and SCN5A/SCN10A) have been reported in previous SND GWAS9. Index variants between the two studies were in moderate-to-high LD (r2 = 0.60–1.00) for three loci but distinct for the SCN5A/SCN10A locus (r2 = 0.006). Seven loci have been previously associated with atrial fibrillation (AF)10,11,12, with stronger effects for AF at the PITX2 and ZFHX3 loci and stronger or similar effects for SND at five loci (Supplementary Table 9). We report co-associations between SND and previous GWAS of cardiovascular traits13,14,15,16,17,18, including pulse and systolic blood pressure, PR interval, electrocardiogram morphology, heart rate or heart rate recovery from exercise, heart failure and stroke (Supplementary Table 10).

Significant expression quantitative trait loci (eQTL) are presented in Supplementary Table 11. We observed likely shared causal variants (posterior probabilities (PP) >0.80 in colocalization analyses) for SND and cardiac expression for genes CEP68, CAMK2D, FAM13B and CACUL1. In transcriptome-wide analysis (TWAS), a higher risk of SND was associated with higher predicted cardiac expression of CEP68 and lower predicted cardiac expression of FAM13B, RP11-325L7.2, SRRT, GIGYF1, CACUL1, ZKSCAN1, PRRX1 and SCN10A (Supplementary Table 12).

Common variants associated with DCD

We identified 31 genome-wide significant loci associated with DCD in 37,798 cases and 1,263,549 referents (Fig. 2, Supplementary Table 6 and Supplementary Fig. 2). In conditional analyses, we identified additional signals at the TTN, SCN5A/SCN10A, TBX20 and STN1 loci (Supplementary Table 7). In contrast to SND, the index variant for DCD at the SCN5A/SCN10A locus was intronic within SCN10A, and we observed a weaker independent signal for a synonymous variant within SCN5A.

The restrictive DCD definition yielded six significant loci. Index variants were identical or in moderate LD (r2 > 0.5) between inclusive and restrictive DCD, with larger effect sizes for the restrictive definition. Inclusive and restrictive DCD were highly genetically correlated (rg = 0.91, P = 1.11 × 10−108; Supplementary Table 8).

Next, we evaluated co-associations between DCD and previous GWAS of cardiovascular traits (Supplementary Tables 9 and 10). In total, 11 and 10 of the DCD index variants or their proxies have been previously associated with PR interval or QRS duration, respectively. Index variants or their proxies at several loci have also been associated with AF (six loci, including four with discordant effect directions for DCD and AF), Brugada syndrome and other cardiovascular diseases (hypertrophic cardiomyopathy, heart failure and stroke) in previous GWAS.

In genetic colocalization analyses, we observed likely shared causal variants (PP > 0.80) for DCD and the cardiac expression of 13 genes (Supplementary Table 11). In TWAS, a higher risk of DCD was associated with higher predicted cardiac expression of FOLH1, FKBP7, STRN, MOB3C, C6orf106 and CXCR6 and lower predicted cardiac expression of COQ8A, BCL2, MYOZ1, SCN10A, SYNE2, METTL21B and PLEC (Supplementary Table 12).

Common variants associated with PM implantation

We performed a GWAS meta-analysis combining all 28,899 cases with PM for any bradyarrhythmia and identified 21 genome-wide significant loci. Among these loci, nine overlapped with SND, ten overlapped with DCD and six were not discovered in either subtype-specific analysis (Fig. 2, Supplementary Fig. 4 and Supplementary Table 6). Four loci (CCDC141, SCN5A/SCN10A, CAMK2D and TBX20) were shared among SND, DCD and PM. In conditional analyses, we identified secondary signals in four loci (PLEKHA3, PITX2, ROS1 and TBX20; Supplementary Table 7).

In 14 of 21 PM loci, index variants or their proxies have been previously associated with at least one electrocardiogram (ECG) parameter (including six with heart rate, six with PR interval and three with QRS duration; Supplementary Table 10). In addition, index variants or their proxies in 10 of 21 loci have been previously associated with AF (Supplementary Table 9), and three loci (CCDC141, PITX2 and ZFHX3) have been associated with SND.

In genetic colocalization analyses of PM and eQTLs of cardiac expression, we observed likely shared causal variants (PP > 0.80) at the CEP68, CAMK2D, FAM13B and C6orf106 loci (Supplementary Table 11). In TWAS, a higher risk of PM was associated with higher and lower predicted cardiac expression levels of seven and nine genes, respectively (Supplementary Table 12).

Common variant genetic correlation analyses

We observed a moderate genetic correlation between SND and DCD (rg = 0.63, P = 3.62 × 10−25; Supplementary Table 8 and Supplementary Fig. 5). Correlation was higher for restrictive SND and DCD (rg = 0.85, P = 1.18 × 10−25).

We also evaluated genome-wide correlations between bradyarrhythmias and ECG/vectorcardiographic features. SND (rg = −0.48, P = 3.23 × 10−22), DCD (rg = −0.15, P = 5.76 × 10−6) and PM (rg = −0.31, P = 1.94 × 10−25) were genetically correlated with resting heart rate, whereas DCD was genetically correlated with conduction times including the P-wave duration (rg = 0.36, P = 3.62 × 10−4), PR interval (rg = 0.40, P = 1.39 × 10−29) and QRS duration (rg = 0.32, P = 3.18 × 10−11). DCD was also genetically correlated with the frontal QRS-T angle (fQRSTa) (rg = 0.24, P = 1 × 10−4), a vectorcardiographic measure of global electrical heterogeneity.

For each genetically correlated bradyarrhythmia and ECG trait pair, we further evaluated the concordance of effect directions for associations of the index variants identified for each member of the pair. Consistent with weak-to-moderate overall genetic correlations, a moderate proportion (23–67%) of index variants for bradyarrhythmias had concordant associations for respective ECG traits at nominal significance (P < 0.05; Supplementary Table 13). A moderate proportion (12–50%) of index variants for ECG traits also showed concordant associations with bradyarrhythmias at nominal (P < 0.05) significance.

Common variant polygenic risk score (PRS) analyses

We derived PRSs for SND, DCD and PM in meta-analyses excluding UK Biobank (UKBB) individuals. We then evaluated associations between each PRS and incident PM in 327,702 unrelated UKBB participants with no PM at baseline and observed that, over median 11.2 (Q1–Q3, 10.5–11.7) years, increasing tertile of all three PRSs was associated with incident PM (n = 2,183 events, P = 1.9 × 10−5 to 4.4 × 10−10; Fig. 3a). Compared to the bottom tertile of the SND, DCD and PM PRS, the top tertile had hazard ratios for PM of 1.25 (95% confidence interval (CI): 1.13–1.39), 1.40 (1.26–1.55) and 1.37 (1.24–1.52), respectively (Supplementary Table 14).

Fig. 3: Associations of polygenic risk scores for bradyarrhythmias with outcomes in unrelated UKBB participants.
figure 3

a, Cumulative incidence of PM implantation in UKBB participants stratified by PRSs for SND, DCD and PM implantation. The shaded area surrounding each line refers to the two-sided 95% CI. PRSs were constructed using the clumping and thresholding method separately for each phenotype (nvariants for SND PRS = 28, nvariants for DCD = 57 and nvariants for PM = 51). A total of 327,702 unrelated participants without a history of PM implantation at study enrollment were included in the analyses. Participants were stratified into three groups based on the tertiles of residuals of each PRS after adjustment by the first ten PCs. The statistical significance of the associations between tertiles of each PRS and PM incidence was evaluated using analysis of variance comparing a Cox proportional hazards model with only sex, age and genotyping array as predictors and Cox proportional hazards models with each PRS tertile as an additional predictor. The median follow-up time was 11.2 (Q1–Q3, 10.5–11.7) years. b, Associations of PRS residuals (after adjustment by the first ten PCs) for bradyarrhythmias with a wider set of outcomes based on the phecode system in 350,872 unrelated individuals in the UKBB. The associations were tested by logistic regression, and the P values were based on two-sided tests. Only outcomes with at least one significant association after Bonferroni correction are shown (P < 3.76 × 10−5), and significant associations are highlighted with black borders. A total of 42 outcomes were significantly associated with at least one bradyarrhythmia-related PRS. CHF, congestive heart failure; AV, atrioventricular; NOS, not otherwise specified; NEC, not elsewhere classified.

We then performed a PheWAS by examining associations between each PRS and 1,329 phecode traits19,20. A PRS for SND was associated with multiple traits, including AF, SND, mitral valve disease, valve disorders, cerebrovascular disease, nonhypertensive heart failure and appendicitis (Fig. 3b and Supplementary Table 15). A PRS for PM was associated with a largely overlapping set of cardiovascular conditions. In contrast, despite a higher number of susceptibility loci, a PRS for DCD was only associated with atrioventricular block and bundle branch block. The SND and PM PRS had much stronger associations with AF and other tachyarrhythmias compared to DCD, but the SND PRS also appeared more strongly associated with valve disorders compared to the DCD or PM PRS.

Causal associations

In a bidirectional Mendelian randomization (MR) screen, we examined potential causes of bradyarrhythmias (Supplementary Table 16 and Supplementary Note) and used causal analysis using summary effect estimates (CAUSE, ref. 21) to verify the associations among factors without substantial evidence of directional pleiotropy (Fig. 4 and Supplementary Table 17). For SND, we identified higher AF liability and slower pulse rate as potential causal factors when applying CAUSE (P < 0.017)21. For DCD, we identified higher height, weight and systolic blood pressure as potential causal factors when applying CAUSE (P < 0.017). For PM, higher height, higher systolic blood pressure, and slower pulse rate reached a nominal level of significance (0.0083 < P < 0.05), while higher AF liability, higher weight and higher coronary artery disease liability all reached significance (P < 0.0083) in CAUSE.

Fig. 4: Causal links of bradyarrhythmias with potential causes.
figure 4

ac, In each panel, a bubble plot shows results from the MR screen modeling the given bradyarrhythmia as the outcome, using the weighted median method. a, Results for SND. b, Results for DCD. c, Results for PM. The y axis represents the signed −log10 of the P value from the MR analysis, where each bubble represents a different disease/trait (exposure or outcome), and −log10(P) is signed by the direction of the MR effect estimate; the diseases/traits are ordered by their signed −log10(P) from high (left) to low (right). All plotted P values are two-sided. The full red and blue lines represent the Bonferroni-corrected significance level (P < 0.05/(74 × 2) tests) for a given bradyarrhythmia, while the dotted lines represent P < 0.01. Traits/diseases reaching two-sided Bonferroni significance in the weighted median screen are annotated with their names; traits/diseases annotated in bold with a red or blue color also passed all subsequent filters and reached significance using CAUSE (Methods), while traits/diseases annotated in black passed MR–Egger intercept filtering (two-sided P > 0.025) but showed only a nominal association using CAUSE (one-sided P < 0.05); traits/diseases annotated with smaller font in gray either failed MR–Egger intercept filtering (P < 0.025) or showed no evidence of causality using CAUSE (one-sided P > 0.05). BMI, body mass index; SBP, systolic blood pressure; CAD, coronary artery disease.

As part of our bidirectional MR screen, we assessed bradyarrhythmias as exposures for other outcomes. We did not find evidence of bidirectional or implausible causal relationships involving the exposures nominated for causal effects on bradyarrhythmias. We did find evidence for stroke as a potential consequence of SND liability and stroke, hematocrit and hemoglobin concentration as potential consequences of PM liability, each without significant MR–Egger intercepts (Supplementary Fig. 6), although none were verified by CAUSE (P > 0.05). We did not observe substantial genetic evidence for the potential consequences of DCD liability.

Common variation and cell-type enrichment in the human heart

Using stratified LD score regression (s-LDSC) with single-nucleus RNA-sequencing (snRNA-seq) data from adult human myocardial samples22, we observed enriched GWAS heritability near cardiomyocyte-specific genes for DCD, which may be explained by the inclusion of nonspecific intraventricular conduction delay in DCD, or by the origin of the conduction system from a cardiomyocyte lineage (Extended Data Fig. 1 and Supplementary Table 18)23. In contrast, no individual cell type reached significance for SND or PM.

Rare-variant burden test for bradyarrhythmias

Next, we performed rare variant (minor allele frequency (MAF) <0.1%) burden testing in 460,000 whole-exome sequenced individuals meta-analyzed across UKBB and Mass General Brigham Biobank (MGB; Fig. 1). We included 1,766 SND cases, 12,422 DCD cases, 5,645 PM cases and 459,047 referents. Various rare variant masks were assessed and combined in a layered approach using the Cauchy distribution test (Methods). While the meta-analyses showed some evidence for inflation for SND (λ95% = 1.25) and PM (λ95% = 1.14), cohort-specific test statistics were well-calibrated (Supplementary Fig. 7).

We observed an exome-wide significant (Cauchy P < 2.7 × 10−6) burden of rare protein-disrupting variants in LMNA for SND and in three genes for DCD (LMNA, SMAD6 and SCN5A; Fig. 5 and Supplementary Table 19a). Participants with PM carried a higher burden of rare protein-disrupting variants in LMNA, TTN, MYBPC3 and SCN5A.

Fig. 5: Rare-variant association tests.
figure 5

ac, Gene-level results from rare variant burden tests are shown separately for SND (a; n = 460,813), DCD (b; n = 471,469) and PM implantation (c; n = 464,692). Two-sided Cauchy P values (on −log10 scale) from fixed-effect meta-analysis for each gene are shown on the y axis. Dashed lines indicate exome-wide significance thresholds (P < 2.7 × 10−6 after Bonferroni correction). Significant associations are presented in red, and gene names are annotated.

We performed a further examination of individual variation classes (Supplementary Fig. 8 and Supplementary Table 19b). In LMNA, only missense variants and the combination of missense and loss-of-function (LOF) variants were significantly enriched for all phenotypes (P = 1.4 × 10−6 to 0.08 for SND, 2.4 × 10−10 to 5.9 × 10−5 for DCD and 1.6 × 10−22 to 6.5 × 10−4 for PM). Additionally, LOF variants showed nominal enrichment when examining all exons in patients with DCD or PM (P = 0.02 and 0.006, respectively), or canonical transcripts in PM patients (P = 0.03).

For other exome-wide significant genes (TTN, MYBPC3, SCN5A and SMAD6), LOF variants had larger effect sizes compared with missense variants (odds ratios for LOF variants in canonical transcripts = 2.1–19.3; odds ratios for missense variants in canonical transcripts = 0.8–10.8; Supplementary Fig. 9 and Supplementary Table 19c). In SCN5A and SMAD6, however, we observed additional significant or suggestive enrichment of missense variants (Cauchy P values for missense variants in the canonical transcript = 1.4 × 10−4 to 7.8 × 10−4 for SCN5A and 1.2 × 10−5 for SMAD6 in DCD). In addition, for every exome-wide significant gene, effect estimates of missense variants increased commensurately with the proportion of bioinformatics tools predicting a damaging/deleterious effect. Findings were generally consistent across all eight tested tissue-specific masks.

We then performed replication in the All of Us Research Program (853 SND cases, 4,221 DCD cases, 1,147 PM cases and 152,304 referents)24,25. We excluded samples from Massachusetts (to avoid sample overlap with MGB) and used a representative mask (for example, missense or LOF) for each gene–phenotype pair based on findings from the discovery dataset (Supplementary Note). We observed significant replication (P < 0.05/8 gene–phenotype pairs = 0.00625) for rare protein-disrupting variation in LMNA, SMAD6 and SCN5A with DCD and for MYBPC3 with PM (Supplementary Table 20). At a nominal level (P < 0.05 with the concordant direction of effect), we also replicated evidence for LMNA and TTN with PM.

SMAD6 has been previously associated with congenital cardiovascular malformations that may necessitate invasive cardiac procedures26,27. However, among UKBB participants, LOF variants in SMAD6 remained similarly associated with DCD before (odds ratio = 3.4, P = 1.2 × 10−6) and after excluding 24,683 participants with congenital heart disease, cardiac surgery or stenosis or regurgitation of the aortic, tricuspid or mitral valves (odds ratio = 3.5, P = 3.5 × 10−6), supporting a more direct role in DCD rather than an effect mediated by postprocedural conduction block.

We then assessed associations with PM among carriers of a rare LOF or damaging missense variant in any of the five rare variant genes of interest among 305,633 unrelated UKBB participants (Fig. 6a and Supplementary Table 21). A total of 2.11% (95% CI: 1.76–2.54%) of rare variant carriers had prevalent or incident PM compared with 0.98% (0.94–1.01%) of noncarriers. Among rare variant carriers (Fig. 6b), the proportion with PM was highest in those with LMNA mutations (6.59%, 4.09–10.36) and lowest in those with TTN mutations (1.48%, 1.05–2.06%). When restricted to LOF variants, the proportion of participants with PM was greater for a subset of the genes, albeit with less precision (Supplementary Fig. 10).

Fig. 6: PM implantations in carriers of protein-disrupting variants among 305,633 unrelated UKBB participants.
figure 6

a, Proportion of unrelated UKBB participants with PM implantations at study enrollment or during follow-up in participants who were carriers of a protein-disrupting variant (a LOF variant or a missense variant predicted to be pathogenic by at least 80% of bioinformatics tools) in any of the five genes (LMNA, SCN5A, MYBPC3, SMAD6 and TTN) that were significantly associated with at least one bradyarrhythmia phenotype in rare variant association tests. b, Proportion of participants with PMs in participants who were carriers of protein-disrupting variants in each gene mentioned above. c, Proportion of participants with PMs among carriers of protein-disrupting variants (right) and noncarriers (left), stratified by the tertiles of a PRS for PM implantation adjusted by the first ten PCs. The error bars in ac correspond to binomial 95% CIs calculated using the Agresti–Coull method. Two-sided P values for the PRS were derived with logistic regression using PM implantation as the outcome and the PRS tertiles, sex and age at study enrollment as predictors. Numbers in parentheses are the total sample sizes in genetic categories.

Overlap and interaction between rare and common variations associated with bradyarrhythmias

Among genes with a significant burden of rare protein-disrupting variants in bradyarrhythmias, two genes (SCN5A and TTN) overlapped with GWAS loci. In contrast, no genome-wide significant common variant signal was observed in the loci containing LMNA, MYBPC3 or SMAD6.

We then evaluated the utility of rare variant burden tests in prioritizing causative genes in GWAS loci (regions within ±1 Mb of index variants). Excepting the exome-wide significant SCN5A and TTN, we observed no rare variant signals at a suggestive P-value threshold (Supplementary Table 22). However, the well-known cardiomyopathy genes MYH7 (Cauchy P = 6.6 × 10−3 for DCD, 0.28 for SND and 3.1 × 10−6 for PM) and NKX2-5 (Cauchy P = 1.6 × 10−3 for DCD, 0.51 for SND and 0.07 for PM) were still most strongly prioritized in their respective DCD GWAS loci. In contrast, we observed minimal rare variant signals in some well-established arrhythmia loci, such as the locus containing PITX2 (Cauchy P = 0.07–0.56 for SND, DCD and PM).

We also assessed for common variant associations with ECG traits in the regions of genes identified in rare variant association testing. We observed genome-wide significant associations with at least one ECG trait in the loci of LMNA, TTN, SCN5A and MYPBC3. At the SMAD6 locus, the most significant common variant association was with the PR interval (P = 2.9 × 10−6; Supplementary Table 23).

We then evaluated the interaction of rare and common variation in 305,633 unrelated UKBB participants. Among 5,255 carriers of a protein-disrupting variant (LOF or predicted pathogenic missense variant), a higher tertile of a PM PRS was associated with a higher likelihood of PM during study enrollment or follow-up (P = 7.7 × 10−4; Fig. 6c and Supplementary Table 24). We observed evidence for positive interaction between rare variant status and PRS tertile (P = 0.04).

Integration of in silico approaches from rare and common variant analyses

To prioritize potentially causal genes, we evaluated multiple lines of evidence for all genes within ±500 kb from each index variant—nearest genes, polygenic priority score (PoPS)28, variants to genes (V2G) scores from Open Targets Genetics29, significant cardiac eQTL with colocalization probability ≥0.8, significant cardiac TWAS and suggestive evidence from rare variant association testing. Evidence for each gene is given in Supplementary Table 25. We observed more than one supporting line of in silico evidence for 29 genes in 13 loci for SND, 63 genes in 31 loci for DCD and 38 genes in 21 loci for PM.

Discussion

We performed large-scale meta-analyses of bradyarrhythmias in >1.3 million individuals, including 30,000 cases from ten studies across multiple continents. We identified 13 common variant loci for SND, 31 loci for DCD and 21 loci for PM. Rare-variant association testing in 460,000 participants uncovered five genes influencing susceptibility to bradyarrhythmias. Most associations we identified are new, consistent with expectations, and implicate ion channel function, cardiac development, cellular homeostasis and sarcomeric components in the pathogenesis of bradyarrhythmias. Cardiomyocyte-specific genes contributed significantly to DCD heritability, whereas cell-specific enrichments were less evident for SND and PM.

Our findings support and extend previous analyses by expanding sample sizes, increasing precision for specific bradyarrhythmia subtypes and exploring the shared genetics of cardiac arrhythmias. Although rare familial forms of isolated conduction system disease exist, only ~5% have an identifiable mutation30, most commonly in cardiac ion channels (refs. 6,31,32). Familial conduction diseases can co-occur with cardiomyopathy, in which case mutations are more commonly found in cardiac transcription factors33 or structural genes30. Here we observed robust associations between distinct bradyarrhythmias and rare variations in SCN5A, LMNA, MYH7 and MYBPC3. Moreover, we identified an association between DCD and protein-disrupting variants in SMAD6, which was replicated in an independent dataset. SMAD6 is a structurally distinct member of the SMAD family of proteins within the transforming growth factor-β pathway and a preferential inhibitor of bone morphogenic protein responses34. Although protein-disrupting mutations in SMAD6 have previously been linked with congenital cardiac malformations26,27, the DCD association was robust to the exclusion of participants with congenital or structural heart disease, suggesting that functioning SMAD6 is required for normal development or maintenance of the distal conduction system.

In addition to rare variants, our findings substantially expand our understanding of the contribution of common genetic variation to bradyarrhythmias. Here we identified common variant association signals in known monogenic bradyarrhythmia loci (HCN4 for SND and SCN5A/SCN10A for SND and DCD), replicated four previously reported loci for SND (CCDC141, SCN5A/SCN10A, PITX2 and ZFHX3) and identified multiple new and partly distinct common variant associations for SND and DCD. Despite the convergence of some rare and common variant signals, the GWAS loci were much more numerous. In addition to lower statistical power, the lack of rare variant association signals may also stem from the underrepresentation of individuals with severe bradyarrhythmias given attained age at sample collection (that is, interquartile range = 50–63 in UKBB and 38–66 in MGB) or depletion of damaging variants in the general population due to premature mortality or embryonic lethality, as previously suggested for the essential transcription factor PITX2 (ref. 35). Thus, common variant analyses, enabled by much larger sample sizes, may facilitate broader biological insights.

Although the loci we identified span a wide range of cardiac biology, our results generally highlight a critical role of cardiac ion channels in the development of bradyarrhythmias. We identified several associations implicating ion channel genes, including SCN5A, SCN10A, HCN4, CAMK2D and RNF207. At the SCN5A/SCN10A locus, we observed spatially distinct localization of primary signals within SCN5A for SND and within SCN10A for DCD. Mutations in SCN5A are known causes of familial SND, long QT syndrome type 3 and Brugada syndrome31. Noncoding enhancers within SCN10A have been shown to regulate SCN5A expression36. The locus containing HCN4, a gene that encodes an ion channel responsible for spontaneous sinus PM activity and a cause of familial sinus bradycardia6, was associated with SND. The ion channel-related proteins RNF207, a delayed rectifier and voltage-gated potassium channel regulator, and CAMK2D, a kinase that regulates myocyte calcium homeostasis and excitation–contraction coupling, do not appear to have been previously associated with bradyarrhythmias. The locus containing RNF207 is associated with the QT interval37,38, and knockdown of RNF207 in zebrafish has been reported to result in reduced conduction velocity and occasional 2:1 atrioventricular block39.

Common variations in genes relevant to cardiac development and cellular homeostasis appear to broadly influence the risk of bradyarrhythmias. In addition to the known SND gene CCDC141, involved in the centrosomal function and neural migration, we identify additional loci including CEP68 and ITGA9, encoding proteins involved in centrosomal cohesion and cell–cell and cell–matrix adhesion, respectively, and GIGYF1, participating in insulin-like growth factor 1 signaling. Protein-disrupting mutations in GIGYF1 have been recently implicated in type 2 diabetes, adverse metabolic health and clonal mosaicism40,41, consistent with our observation that higher predicted expression of GIGYF1 is associated with lower risk of SND. Broadly, our findings suggest that genes involved in cellular function and maintenance appear to influence bradyarrhythmia risk at multiple levels of the cardiac conduction system.

Activation or repression of cardiac development and cell fate programs, which have been reported for tachyarrhythmias such as AF and supraventricular tachycardia, also appear important for bradyarrhythmias. Both NKX2-5 and TBX20 encode transcription factors important for the appropriate development of the cardiac septum, which houses critical components of the conduction system. We also observed an association between SND and PITX2, a well-known AF susceptibility locus critically involved in the promotion of correct left–right differentiation in the developing heart and specification of pulmonary vein myocardial sleeves42. Several other new associations, such as ERBB4 and BAG3, primarily with DCD and PM, implicate potentially abnormal cardiac developmental programs as risk factors for bradyarrhythmias.

Despite some common signals, our analyses also highlight distinct genetic mechanisms underlying bradyarrhythmias originating from the sinus node versus the distal conduction system. Of the 13 significant loci identified for SND and 31 loci identified for DCD, only 4 (CAMK2D, CCDC141, SCN5A/SCN10A and TBX20) were shared. Genome-wide correlations with electrocardiographic traits also yielded distinct association profiles, as we observed genetic correlations between cardiac conduction times and DCD but not SND. Moreover, we observed a greater overlap between AF loci and SND versus DCD. Several loci specific to DCD appear to be involved in diverse processes such as cellular apoptosis (for example, BCL2), cellular metabolism (for example, PPARGC1A) and inflammation (for example, IL25).

Our findings raise several considerations for future research efforts and translation of insights into the management of human disease. First, our observation of the substantial contribution of cardiomyocyte genetics to DCD heritability supports current consensus recommendations to maintain a high index of suspicion for a potential cardiomyopathic process even among individuals presenting with apparently isolated conduction system disease31. Second, the identification of specific genetic contributors to bradyarrhythmia subtypes represents an important initial step toward the future development and validation of genetic risk stratification tools. Third, although we used complementary approaches to identify candidate genes, we acknowledge that the potential role of several genes remains poorly defined. Future experimental work to elucidate specific mechanisms by which candidate genes may affect bradyarrhythmia development is warranted. Fourth, especially given the current absence of medical treatments for bradyarrhythmias, our findings prioritize specific genes and respective pathways (for example, SCN5A and SMAD6) and potentially causal risk factors (for example, weight and blood pressure), whose modification may ultimately prove capable of preventing and treating bradyarrhythmias. Fifth, MR analyses examining bradyarrhythmia trait-related variants as exposures were limited in statistical power; future work based on larger sample sizes is needed to understand the potential causal associations of bradyarrhythmias with other outcomes.

Our results should be interpreted in the context of the study design. First, to maximize statistical power for relatively rare diseases, we used all samples for discovery. Generally consistent variant effect sizes across multiple datasets support the validity of our findings, although future replication in even larger external studies is warranted. Second, participants of some cohorts, such as the UKBB, may be healthier than average. Third, study participants were predominantly of European ancestry. We anticipate that larger multi-ancestry biorepositories will enable the assessment of the generalizability of our findings. Fourth, although we used ECG-based diagnoses in the three cohorts in which they were available, we primarily leveraged diagnostic and procedural codes to define SND and DCD, an approach with limited sensitivity for bradyarrhythmia conditions, which may be asymptomatic. Nevertheless, our findings demonstrated a very high correlation between effect sizes from a meta-analysis of all studies versus studies including ECG-based diagnoses (Supplementary Fig. 11). In addition, we observed consistent results in our restrictive analyses requiring PM, which is less subject to misclassification. Fifth, indications for PM and practices regarding device type (for example, single-chamber versus dual-chamber) vary across settings, and analyses stratified by device type are outside the scope of our study. Sixth, to maximize bradyarrhythmia cases, our DCD definition included nonspecific intraventricular conduction delays, which may indicate disease within the specialized conduction system (for example, relatively balanced delay within the bundle branches)43, versus conduction delay at the level of cardiomyocytes. Refined subtyping of DCD may reveal pattern-specific biological insights in the future. Finally, although we report comprehensive genetic discovery analyses, future work is needed to understand the underlying biological pathways and causality of individual genes in the identified loci.

In summary, we performed common variant genetic association testing in ten study populations across two continents, representing >1.3 million individuals and 30,000 cases with either SND, DCD or PM. We identified 13 genome-wide significant loci for SND, 31 loci for DCD and 21 loci for PM, with most being new. Rare-variant burden testing identified five genes, including the new association of SMAD6 with DCD. Our findings suggest that variations in multiple genetic pathways including ion channel function, cardiac developmental programs, sarcomeric structural components and cellular homeostasis are critical to the development of bradyarrhythmias.

Methods

Study sample of common variant GWAS

We performed meta-analyses of common variant GWAS with at least 100 cases from ten participating studies, and we present methodological details of data collection, genotyping, imputation and quality controls in the Supplementary Note and Supplementary Table 26. All participants provided verbal or written consent, and participating studies were approved by their respective ethics committees or institutional review boards.

Phenotype definitions

All phenotypes were determined using relevant International Classification of Diseases, Ninth or Tenth Revision (ICD-9 or ICD-10) codes, presence of relevant procedural codes, medical history or standard 12-lead electrocardiogram data in a study-specific manner as described in the Supplementary Note and Supplementary Table 27. SND was defined as the presence of a diagnosis of sinoatrial node dysfunction or sick sinus syndrome. DCD was defined as atrioventricular nodal disease or more distal conduction disease, including first-degree AV block, second-degree atrioventricular (AV) block, third-degree AV block, bundle branch block or combinations of those conditions. PM included codes corresponding to PM implantation, replacement, removal or interrogation. In secondary analyses, we defined more restrictive early onset SND and DCD definitions focusing on cases with disease-onset age before 75 years and a history of PM implantation. We excluded individuals with valvular heart disease, cardiac surgery or myocardial infarction at or before the time of bradyarrhythmia diagnosis because these conditions may secondarily cause conduction disease.

Common variant GWAS

Ancestry-specific common variant GWAS were performed separately in each participating study site. Common variant genetic association testing assumed an additive genetic model, and study-specific statistical models are described in Supplementary Table 26.

Post-GWAS quality controls were performed centrally using EasyQC (v11.4)44. We removed variants with invalid or mismatched alleles from the reference file (1000 Genomes p1v3 European/African/Admixed American samples), duplicates, variants with poor imputation quality (INFO < 0.3), rare variants (MAF × number of cases × INFO < 10) or variants with invalid summary statistics. To ensure the variant position was consistent across studies, we used LiftOver45 to align each variant to Genome Reference Consortium Human Build 37 positions before meta-analysis.

Meta-analysis of common variant GWAS

We performed separate meta-analyses of six studies of SND (five SND restrictive), nine studies of DCD (five DCD restrictive) and nine studies of PM using a fixed-effects approach implemented in METAL46. To control for inflation due to population structure, we applied genomic control to all studies. We removed variants present in only a single study and insertion–deletion variants to avoid mismatch across studies. We set the GWAS significance threshold at P = 5 × 10−8, and we only report the top (index) variants within a ±1 Mb range or a peak region (if appropriate) with at least one supportive variant nearby (P < 1 × 10−6). We used the qqman47 packages in R v4.0 to generate Manhattan and Q–Q plots.

We applied the conditional and joint analysis approach in genome-wide complex trait analysis (GCTA)-cojo (v1.93.2beta) to the summary statistics to identify independent signals at identified loci48. Based on the suggestions from GCTA) developers (that is, minimum reference sample size >4,000 individuals), we specifically conditioned on all index variants with LD information from 322,987 unrelated European individuals in the UKBB who contributed to all examined phenotypes in this analysis. We included all biallelic hard-call transformed (probability >0.8) common single-nucleotide polymorphisms with a MAF ≥ 0.01.

Effect on gene expression and pleiotropic associations

We assessed the association of index variants and their proxies (r2 ≥ 0.6 in 1000 Genomes p3v5 European participants in a 1-Mb window) with gene expression in two heart tissues (right atrial appendage and left ventricle) from Genotype-Tissue Expression (GTEx, v8)49. We report all significant eQTL with q value < 0.05. We assessed the PP of shared causal variants between GWAS and eQTL results, using the coloc R package50. The testing region comprised all variants in both the eQTL (minimum to maximum position of significant eQTL of the gene) and GWAS (minimum to maximum position of the top GWAS significant variant within ±250 kb) regions and an additional ±5 kb.

We also performed TWAS to test associations between predicted gene expression in the aforementioned two heart tissues and each phenotype using the elastic-net model in S-Predixcan (MetaXcan package, v0.7.4)51. We considered expressed genes significant based on a P-value threshold of 0.05 divided by the total number of tested genes.

We accessed the National Human Genome Research Institute-European Bioinformatics Institute (NHGRI-EBI) GWAS Catalog (accessed on 11 January 2023) to explore whether index variants or their proxies were previously reported for other cardiovascular disorders13. Additionally, we compared associations for bradyarrhythmias and AF based on summary-level data from large AF meta-analyses10,11,12. To examine overall genetic links among bradyarrhythmias and with electrocardiographic and vectorcardiographic endophenotypes as reported in previous GWAS52,53,54,55,56,57, we performed genetic correlation analyses, using LDSC with European LD scores from 1000 Genomes provided by LDSC (v1.0.1) package58,59.

To further clarify the genetic association between bradyarrhythmias and other traits, we performed a meta-analysis without individuals from UKBB and derived PRS for SND, DCD and PM without ambiguous variants (A/T or C/G) or variants only available in one study, using the clumping and thresholding method (P-value cut-off = 5 × 10−8, r2 = 0.5 in 1000 Genome p3v5 European participants, window size = 2 Mb) in PLINK. Due to the high genome-wide correlations of bradyarrhythmia subtypes but distinct locus architecture, we limited PRS to genome-wide significant variants in each GWAS meta-analysis to reduce pleiotropic associations. We calculated PRS by summing the product of effect sizes and allele dosages in the top loci and evaluated the associations of the PRS with incident PM implantation using Cox proportional hazards models with survival time from study enrollment to PM implantation or censoring as the time scale and sex and age as additional covariates. We removed close relatives, prioritizing the inclusion of PM cases. In addition to the exclusion criteria applied to our main analysis (Supplementary Table 27), participants with prevalent SND, DCD or PM at study enrollment were excluded from incident disease analyses, leaving a total of 327,702 unrelated UKBB European individuals (2,183 with PM) for analyses, with a median of 11.2 (Q1–Q3, 10.5–11.7) years of follow-up. Individual-level PRS in these analyses were the residuals of each PRS after adjusting for the first ten principal components (PCs).

We additionally performed logistic regression to assess the associations between SND, DCD and PM PRS residuals (PC adjusted) with 1,329 disease outcomes (with at least 100 cases), defined using v1.2 of the phecode map19,20. Models were fit within 350,872 unrelated UKBB European individuals with complete information on genetic and disease status (regardless of their bradyarrhythmia status at study enrollment). Models were adjusted for age at enrollment, sex and genotyping array. Significant P-value thresholds for genetic correlation and PRS association tests were adjusted for multiple testing using the Bonferroni correction.

Causal associations from bidirectional MR analysis

For each of the bradyarrhythmia GWAS, we performed a bidirectional MR screen for 74 other bradyarrhythmia-related traits to test causality between the given bradyarrhythmia and relevant cardiovascular traits. Because UKBB data are typically included in all contemporary, well-powered GWAS, we used bradyarrhythmia GWAS meta-analyses that excluded UKBB for all MR analyses to minimize sample overlap in two-sample MR. The GWASs for the other 74 traits encompassed a list of relevant phenotypes, including AF10, heart failure60, coronary artery disease61, type 2 diabetes62, chronic kidney disease63, thyroid disease64, smoking, alcohol use65, stroke66 and additional commonly measured quantitative traits (including blood pressure, anthropometry and laboratory values)67. The GWAS summary statistics were chosen such that they were largely of European ancestry, and if European-only summary statistics were available, those were used (this was chosen to make the LD structure most comparable to the bradyarrhythmia GWAS). Furthermore, studies were preferably chosen so that FinnGen was not included in the GWAS to keep sample overlap to a reasonable minimum for two-sample MR. An exception was stroke, for which we used a GWAS that included FinnGen and several non-European datasets, as other publicly available GWAS were relatively underpowered.

The bidirectional MR screen was performed to test genetic liability to each of the three bradyarrhythmia traits (SND, DCD and PM) against genetic liability to 74 other traits (Supplementary Table 16). For each trait pair, genetic liability to the bradyarrhythmia trait was in turn evaluated as the outcome and as the exposure. In total, there were 222 trait pairs (74 for each bradyarrhythmia trait) and 444 tests. For any given MR comparison, we first harmonized summary statistics by (1) lifting over to GRCh37 if on a different build, (2) removing ambiguous variants and InDels, (3) removing variants with <70% of the total case numbers contributing to the bradyarrhythmia GWAS, (4) removing variants with MAF < 1% in either study (if present in the summary statistics), (5) removing variants with imputation accuracy <0.3 in either study (if present in the summary statistics), (6) aligning effect and reference alleles and (7) restricting to variants also present in the LD reference (built from 5K random European ancestry samples from UKBB). After the harmonization, before MR, variants were filtered and pruned based on genome-wide significance (P < 5 × 10−8) and r2 < 0.0005 taking 10-Mb windows. The initial screening step was performed using the weighted median method implemented in the R-package TwoSampleMR (v0.5.6). The weighted median method may give more robust results than the inverse-variance-weighted approach in the case of outliers68. Any signal, for a given bradyarrhythmia GWAS, with P < 3 × 10−4 (0.05/(2 × 74 traits)) was determined significant in the screening phase and proceeded further.

In a second phase, we removed associations with significant MR–Egger intercept (P < 0.025), to exclude results potentially driven by major directional pleiotropy bias in the initial screen68. In the final confirmatory phase, we used the sophisticated mixture model in CAUSE (v1.2.0)21. In short, CAUSE assesses whether GWAS data for two traits are consistent with a causal effect by fitting and comparing two nested models. These include a ‘sharing’ model that allows only a pleiotropic pathway and a ‘causal’ model that additionally estimates a causal pathway. These models are compared using the expected log pointwise posterior density, and a one-sided P value is computed from a z test comparing the ‘causal’ model to the ‘sharing’ model. For step 1 of CAUSE (estimating nuisance parameters), we used default parameters that include using 1 million random genome-wide markers for parameter estimation. For step 2 of CAUSE (estimating causal effects), we used filtered and pruned variants (two-sided P < 0.001 and r2 < 0.0005 over 10-Mb windows) and otherwise default parameters. Significance for CAUSE was corrected for the number of comparisons that reached the final phase (that is, passed all screening and previous filters) for the given exposure/outcome set. For instance, if five potential risk factors were identified for a bradyarrhythmia trait in the screening phase, of which one was removed due to failing the MR–Egger filter, then the significance level for CAUSE was set to P < 0.05/4 for that set.

Cell-type enrichment with s-LDSC

To identify relevant cell types for bradyarrhythmia GWAS, we used s-LDSC69 as described in ref. 70. Using snRNA-seq data from ref. 22, we defined cell-type-specific gene expression profiles by collapsing nuclei into nine major cell types from the human heart. We tested for differentially expressed genes in each cell type compared to all other cell types by summing gene expression counts for each combination of individual, cell type and chamber across all nuclei to create a pseudo-bulk expression profile. If a given combination of individual, cell type and chamber had less than 20 nuclei, it was omitted. Lowly expressed genes were removed using the function filterByExpr() in edgeR71. After DESeq2 normalization72, differential expression testing was performed using the limma-voom framework73 with a design of ~0 + cell_type + individual + chamber and extracting an explicit contrast comparing expression in each cell type to all other cell types. For each cell type, we defined the cell-type-specific profile as the top 10% most upregulated genes based on the t statistic from the differential expression test.

We then annotated SNPs near cell-type-specific genes by building a 100-kb window on either side of the transcribed region of each gene annotated to a particular cell type, as described in ref. 70 All gene coordinates were based on the GRCh38 gene reference used in the snRNA-seq data analysis. To test for enriched heritability in regions near cell-type-specific genes, we mapped GWAS summary statistics to GRCh38 using LiftOver45 and ran s-LDSC with our cell-type-specific annotations along with the baseline model69 using the previously derived 1000 Genomes European ancestry LD reference. To account for the nine cell types tested for each GWAS trait, we applied a Bonferroni significance cut-off by setting significance at 0.05/9 = 0.0056.

Rare-variant association tests

We performed exome-wide rare variant burden tests from whole-exome-sequencing data for SND, DCD and PM in UKBB and MGB (nSND cases, UKBB = 803; nSND cases, MGB = 963; nDCD cases, UKBB = 9,379; nDCD cases, MGB = 3,043; nPM cases, UKBB = 4,091; nPM cases, MGB = 1,554; nreferents, UKBB = 414,360; nreferents, MGB = 44,687). Details of the datasets, quality controls and variant annotations are described in the Supplementary Note and Supplementary Table 28. Within both datasets, we used REGENIE (v3.1.3)74 to perform burden tests across various rare variant masks. The REGENIE null model (step 1) was fit using genome-wide autosomal common variants from genotype array data, adjusting for sex, age, age2, sequencing batch, ancestral PCs 1–4, as well as any component from 5 to 20 if associated with any of the outcomes. In step 2, a logistic regression model was used to test for the association between rare variant burdens and the outcomes; an approximate Firth’s regression—with back-computation of s.e.—was used for associations reaching nominal P < 0.05 (ref. 74). The same covariates were applied for step 2, additionally adding the REGENIE PRS as a fixed effect67,74. Study-specific results were subsequently meta-analyzed using a fixed-effects inverse-variance weighted meta-analysis approach. Before meta-analysis, any results with <10 alternative allele carriers were removed, while after meta-analysis, mask results with <20 alternative allele carriers were removed; this was done to avoid spurious results from low allele count.

For each gene, up to 198 different masks were tested for association with a given outcome. The different masks were based on 2 filters for MAF (<0.1% and <0.001%), 11 filters based on variant annotation (LOF, missense with different predicted-deleteriousness cutoffs40, LOF + missense) and 9 filters based on affected transcripts (canonical transcript, all exons, seven tissue-specific transcripts as determined by pext values). We used a layered approach to combine the many mask-phenotype P values into a single gene–phenotype P value using the Cauchy distribution test75. The Cauchy distribution test allows for valid aggregation of multiple, potentially correlated test statistics into a single omnibus test statistic. Details on this pipeline, including details on the various filters and Cauchy combination layers, are presented in Extended Data Fig. 2.

We used two approaches to identify bradyarrhythmia-related genes. First, we identified exome-wide significant genes with Cauchy P < 2.7 × 10−6 for each of the phenotypes. For all exome-wide significant genes, we further explored the most relevant variation classes contributing to the Cauchy test (that is, the variant class with the lowest nominal P value). Second, we evaluated the intersection of rare and common variation by examining the Cauchy P values of all genes within ±1 Mb of index variants in all GWAS loci. Suggestive significance thresholds for these genes were set by correcting P values for the number of tested genes across GWAS loci for each phenotype.

We performed additional sensitivity analyses for LOF variation in SMAD6, evaluating their association with DCD among UKBB participants using logistic regression and identical covariates to the discovery analyses. Analyses were performed separately in all whole-exome sequenced UKBB participants and in participants without prevalent or incident congenital heart disease, cardiac surgery, or stenosis or regurgitation of the aortic, tricuspid or mitral valves.

To verify the findings from our rare-variant analysis, we examined representative rare variant masks for each of the identified gene–phenotype pairs within the All of Us Research Program. To avoid potential issues of sample overlap between this dataset and MGB, we excluded any individual with a ZIP code from Massachusetts. After applying phenotypic inclusion and exclusion criteria similar to those used in discovery analyses, including additional All of Us-specific data sources (Supplementary Table 29), we then restricted to unrelated participants with available electronic health records and whole-genome sequencing data, leaving 853 SND cases, 4,221 DCD cases, 1,147 PM cases and 152,304 referents. Firth’s logistic regression models were used to assess the associations between rare variant masks and bradyarrhythmia outcomes. Details regarding the All of Us cohort, sample selection, quality control, phenotyping, rare variant masking and association analyses are described in the Supplementary Note.

Finally, we calculated estimates of the proportion of unrelated UKBB participants who had prevalent or incident PM. We derived estimates separately for participants with and without protein-disrupting variants (defined as LOF variants or missense variants predicted to be damaging/deleterious by over 80% of bioinformatics tools) and further stratified analyses by the tertiles of a PRS for PM adjusted for the first ten PCs. We used the Agresti–Coull method to calculate binomial 95% CIs.

Aggregating gene scores from various gene features

To determine the nearest gene, we sorted all genes within ±500 kb from our index variants and determined the closest genes (between the transcription start site and our index variant) and assigned gene annotation (that is, index variant overlap with gene body).

To calculate PoPS scores, we first performed gene region-based analysis with MAGMA (v1.10)76 using the 1000 Genomes Phase 3 European subset as a reference dataset and then computed PoPS scores for 18,383 genes using the full set of features provided with PoPS (v0.2)28.

We accessed the overall V2G score from Open Targets Genetics29. The scores were the sum of weighted data, including in silico functional prediction, various QTL, interaction and distance.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.