Introduction

Rural regions in the U.S. have unique health challenges, including higher socioeconomic disparities1, greater comorbid burden2, and limited healthcare access3. As highlighted in Rural Healthy People 20304, rural residents face distinct health disparities in many areas, such as mental health5, substance use disorders6, obesity7, diabetes8, and preventive care9. These and other disparities contribute to the growing divide in rural-urban health outcomes and what has collectively been described as the rural mortality penalty10. This term refers to the higher mortality rates experienced in the U.S. by rural communities relative to urban communities, including shorter life expectancies11. This phenomenon contrasts with the longer life expectancy in rural areas observed before the mid-twentieth century12.

The COVID-19 pandemic has exacerbated the existing rural mortality penalty, resulting in even greater loss of life in rural areas since 202013. Higher rates of acute COVID-19 mortality have been observed in rural areas during the Delta and Omicron periods13, driven by factors such as limited access to healthcare, higher prevalence of comorbidities, and social determinants of health (SDOH). COVID-19 has increased rates of social isolation, substance use, and mental health disorders6,14, placing more strain on rural communities as rural hospital closures continue to increase15. These issues, combined with limited access to specialized and critical care resources in rural facilities, underscore the heightened risk of COVID-19-associated deaths in rural areas during the acute phase of infection, even after adjustment for background risk factors16.

While we know from prior work that COVID-19 has a profound impact on outcomes in the acute period after infection16, including among vaccinated persons17 and after accounting for therapeutic interventions18, the long-term impact of COVID-19 in rural communities and its effect on the rural mortality penalty have yet to be studied. This study seeks to understand differences in short- and long-term mortality among rural and urban dwellers with SARS-CoV-2 infection, controlling for demographic differences and comorbid burden, and to quantify differences in outcomes based on region and degree of rurality. We hypothesized that rural dwellers would have worse outcomes than their urban counterparts, even after adjusting for demographic differences, comorbid conditions, vaccination status, and infection time period. We also hypothesized that these differences would attenuate with time. This study aims to systematically understand the rural-urban mortality gap during and after SARS-CoV-2 infection to inform targeted public health interventions addressing these disparities.

Results

After applying inclusion/exclusion criteria, our cohort included 3,082,978 SARS-CoV-2-infected patients (Table 1, Fig. 1). This included 2,466,473 urban (80%), 505,382 urban-adjacent rural (UAR; 16.4%), and 111,123 nonurban-adjacent rural (NAR; 3.6%) dwellers, which is similar to the US distribution of urban-rural population density as of the 2020 Census (~20% rural)19. The sample includes patients from all US Census Divisions, with the highest representation from the East North Central and South Atlantic regions across all rurality groups (Supplementary Fig. 1). Median age was higher in NAR (51 years) and UAR (49 years) areas than in urban areas (46 years). Most patients were white non-Hispanic across all areas, but this proportion was higher in rural (UAR 85%, NAR 91%) than in urban areas (67%). Urban areas had higher proportions of Black or African American non-Hispanic (14%) and Hispanic or Latino (10%) patients compared to UAR and NAR areas.

Fig. 1: Cohort Selection Flow Diagram.

Figure 1 shows the inclusion and exclusion criteria applied to National COVID Cohort Collaborative (N3C) release v188 (February 6, 2025) to reach the analytic dataset used in this study.

Table 1 Descriptive Statistics for Patients with a SARS-CoV-2 Infection between April 2020 and December 2022 by Rurality

The distribution of infections across different SARS-CoV-2 variant periods was consistent across all areas. Urban areas had a higher percentage of patients with primary (15%) and additional vaccination doses before infection (12%) compared to UAR (13% and 8.0%, respectively) and NAR (12% and 7.2%, respectively). Myocardial infarction, heart failure, vascular, cerebrovascular, rheumatic, metabolic, neurodegenerative, and renal diseases were more prevalent in rural areas, while pulmonary disease and HIV were more common in urban areas. Obesity prevalence was slightly higher among urban (39%) than rural dwellers (37%), but greater BMI missingness in rural residents (44% vs 33%) may contribute to underestimation. Tobacco usage and substance use disorder were higher among urban residents, while rural residents had lower Social Vulnerability Index (SVI) scores (UAR 0.43 SVI and NAR 0.37 SVI) relative to urban residents (0.48 SVI).

Kaplan-Meier cumulative incidence

Kaplan-Meier cumulative incidence curves (Fig. 2) revealed a significant association between rurality and two-year cumulative mortality post-infection (p < 0.001), with a risk gradient by rurality. In month 1, the mortality difference between urban and UAR areas was 0.62% (95% confidence interval [CI]: 0.58%–0.68%), and between urban and NAR areas was 0.80% (0.71%–0.90%). These gaps widened over time, reaching 0.81% (0.74%–0.87%) and 1.00% (0.89%–1.12%) at month 3; 1.17% (1.10%–1.25%) and 1.46% (1.33%–1.60%) at one year; and 1.62% (1.54%–1.71%) and 1.86% (1.70%–2.02%) at two years for UAR and NAR relative to urban residents, respectively. This demonstrates a growing mortality gap, with rural areas experiencing higher absolute and relative mortality rates than urban areas.
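To make the cumulative incidence gaps above concrete, the following is a minimal sketch of the Kaplan-Meier product-limit calculation in Python. It is illustrative only and is not the study's analysis code. The function names `km_cumulative_incidence` and `incidence_at` and the toy follow-up data are hypothetical.

```python
from collections import Counter

def km_cumulative_incidence(times, events):
    """Return [(day, 1 - S(day))] at each distinct death time (product-limit estimator).

    times  : follow-up in days from infection to death or censoring
    events : 1 if death was observed, 0 if the patient was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)          # everyone leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            survival *= 1.0 - d / at_risk
            curve.append((t, 1.0 - survival))
        at_risk -= exits[t]
    return curve

def incidence_at(curve, day):
    """Cumulative incidence at `day`: the value of the last step at or before it."""
    value = 0.0
    for t, ci in curve:
        if t <= day:
            value = ci
    return value

# Hypothetical toy cohorts of 10 patients each (not study data).
urban_times = [30, 90, 365, 730, 730, 730, 730, 730, 730, 730]
urban_events = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
rural_times = [20, 60, 200, 400, 730, 730, 730, 730, 730, 730]
rural_events = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

urban_curve = km_cumulative_incidence(urban_times, urban_events)
rural_curve = km_cumulative_incidence(rural_times, rural_events)

# Absolute rural-urban gap in cumulative mortality at two years (day 730).
gap_730 = incidence_at(rural_curve, 730) - incidence_at(urban_curve, 730)
print(f"two-year gap: {gap_730:.4f}")
```

The gap reported at each time point in the text corresponds to a difference of this kind between two curves evaluated on the same day.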

Fig. 2: Kaplan-Meier cumulative incidence curves for two-year mortality following SARS-CoV-2 infection from April 2020 through December 2022 by rurality.

Figure 2 shows Kaplan-Meier cumulative incidence from day 0 through day 730 in the period after diagnosis of COVID-19. Patients were included if they had a documented SARS-CoV-2 infection between April 1, 2020, and December 31, 2022, with follow-up through December 31, 2024. Patients are followed from their initial SARS-CoV-2 infection for up to 730 days, ending at 1) death, 2) end of observation window, or 3) censoring (e.g., end of available data). This analysis uses a three-level rural categorization: urban, urban-adjacent rural, and nonurban-adjacent rural, to assess long-term manifestations of COVID-19 based on proximity and commuting patterns to urban centers. Data are presented as Kaplan-Meier cumulative incidence estimates, with shaded bands representing 95% confidence intervals around the mean cumulative incidence function.

Cumulative and excess mortality

The cumulative mortality and associated excess mortality revealed a consistent trend of increasing rural-versus-urban mortality over time (Fig. 3, and Supplementary Data 1). Within the first month, the age-adjusted overall mortality was 1005.67 per 100,000 persons (95% CI: 995.99–1015.36). Urban areas had a lower mortality of 943.44 (932.75–954.13). Rural areas experienced significantly higher mortality, with UAR at 1213.09 (1188.35–1237.83) and NAR at 1249.57 (1197.36–1301.78), showing excess mortality of 269.65 (216.82–322.47) and 306.13 (201.67–410.59), respectively, per 100,000 persons relative to urban.

Fig. 3: Age-Standardized Cumulative Risk and Excess Mortality Following SARS-CoV-2 Infection by Rurality.

Figure 3 shows age-adjusted differences in cumulative mortality risk (per 100,000) after SARS-CoV-2 infection, stratified by rurality. Risks were estimated at 1, 3, 12, and 24 months post-infection, with risk differences compared to the urban cohort. Point estimates are displayed with 95% confidence intervals, calculated assuming a Poisson distribution. Data are presented as age-standardized mean risk, with whiskers indicating 95% confidence intervals. Because these are population-level standardized estimates, individual data distributions are not shown. Estimates were standardized to the 2020 U.S. Standard Population to account for age differences. Full risks, rates, and confidence intervals are available in Supplementary Data 1.

This difference increased gradually at three months, one year, and two years after infection. By three months, age-adjusted overall mortality increased to 1387.50 (1376.12–1398.87); at one year, it rose to 2158.71 (2144.52–2172.91); and by two years, the overall mortality reached 2970.37 (2953.71–2987.04) per 100,000 persons. Two years post-infection, urban mortality stood at 2823.77 (2805.27–2842.27), while rural areas were much higher, with UAR at 3484.71 (3442.60–3526.82) and NAR at 3436.15 (3348.90–3523.40) per 100,000 persons. The excess mortality at two years was 660.94 (570.78–751.09) for UAR and 612.38 (437.57–787.19) for NAR per 100,000 persons compared to the urban mortality rate. Crude and age-adjusted risk and rates per 100,000 person-months are provided in Supplementary Data 1.
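The arithmetic behind these figures can be sketched in two steps: direct age standardization, then a risk difference against the urban reference. The Python below is a simplified illustration, not the study's code. The age strata, counts, and weights are hypothetical, and only the two-year point estimates are taken from the text. Confidence intervals are not reproduced because their exact construction is not described here.

```python
def age_standardized_risk(deaths, persons, std_weights, per=100_000):
    """Direct standardization: weight each age stratum's crude risk by the
    standard population's share of that stratum."""
    total_weight = sum(std_weights)
    return per * sum(
        (w / total_weight) * (d / n)
        for d, n, w in zip(deaths, persons, std_weights)
    )

def excess_vs_reference(risks, reference="urban"):
    """Excess mortality per 100,000 relative to the reference group."""
    ref = risks[reference]
    return {g: round(r - ref, 2) for g, r in risks.items() if g != reference}

# Hypothetical three-stratum example (young, middle, old); not study data.
std_weights = [0.5, 0.3, 0.2]
urban_std = age_standardized_risk([10, 40, 150], [10_000, 8_000, 5_000], std_weights)
rural_std = age_standardized_risk([12, 60, 200], [6_000, 8_000, 8_000], std_weights)
print(round(urban_std, 1), round(rural_std, 1))

# Age-adjusted two-year mortality per 100,000 persons as reported in the text.
reported_two_year = {"urban": 2823.77, "UAR": 3484.71, "NAR": 3436.15}
excess = excess_vs_reference(reported_two_year)
print(excess)
```

Applied to the reported two-year values, the subtraction recovers the stated excess mortality point estimates for UAR and NAR.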

Cox proportional hazards for post-COVID-19 death

To understand the progressive risk of death after SARS-CoV-2 infection over time (Fig. 4, Supplementary Data 2), we assessed one-month, three-month, one-year, and two-year mortality risk by rurality after adjusting for background risk at the time of infection. After inverse probability weighting (IPW), we observed a stable adjusted mortality risk across all periods in unweighted-unadjusted and weighted-adjusted models. At one month post-COVID-19, UAR was associated with an adjusted hazard ratio (aHR) of 1.19 (95% CI 1.16–1.22) for all-cause mortality, while NAR was associated with an aHR of 1.36 (1.30–1.42) relative to urban dwellers. We found a similar increased risk for rural dwellers at three months (UAR aHR 1.18 [1.15–1.20] and NAR aHR 1.31 [1.26–1.37]), one year (UAR aHR 1.18 [1.16–1.20] and NAR aHR 1.29 [1.25–1.34]), and two years (UAR aHR 1.19 [1.18–1.21] and NAR aHR 1.26 [1.22–1.29]) after infection relative to urban dwellers.
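As a rough illustration of the weighting step, the sketch below computes stabilized inverse probability weights for a three-level exposure (urban, UAR, NAR). This section does not state how the weights were constructed, so the stabilized form, the function name `stabilized_ipw`, and the propensity values are all assumptions made for illustration. In practice the propensities would come from a fitted multinomial model.

```python
from collections import Counter

def stabilized_ipw(groups, propensities):
    """Stabilized weight for each patient: P(group) / P(group | covariates).

    groups       : observed rurality group per patient
    propensities : per-patient dict mapping each group to its estimated
                   probability given that patient's covariates
    """
    n = len(groups)
    marginal = {g: count / n for g, count in Counter(groups).items()}
    return [marginal[g] / p[g] for g, p in zip(groups, propensities)]

# Hypothetical patients and propensity estimates (not study data).
groups = ["urban", "urban", "UAR", "NAR"]
propensities = [
    {"urban": 0.80, "UAR": 0.15, "NAR": 0.05},
    {"urban": 0.50, "UAR": 0.30, "NAR": 0.20},
    {"urban": 0.60, "UAR": 0.25, "NAR": 0.15},
    {"urban": 0.70, "UAR": 0.20, "NAR": 0.10},
]

weights = stabilized_ipw(groups, propensities)
print([round(w, 3) for w in weights])
```

Patients whose observed group was unlikely given their covariates receive larger weights, which is what pushes the weighted groups toward covariate balance before the Cox models are fitted.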

Fig. 4: Unadjusted and adjusted hazard ratio for all-cause mortality among patients with an acute SARS-CoV-2 infection from April 2020 through December 2022 by rurality.

Figure 4 contains Cox Proportional Hazards models for all-cause mortality after an acute SARS-CoV-2 infection by rurality. Deaths are captured from month 0 (day 0) through month 24 (day 730) in the period after diagnosis of COVID-19 at intervals of A) one month, B) three months, C) one year, D) two years, E) two years among patients surviving the initial month after infection, F) two years among patients surviving the initial three months after infection, and G) two years among patients surviving the initial one year after infection. Adjustments were made for sex, age, race/ethnicity, SARS-CoV-2 variant-dominant period at the time of infection, vaccination status, Social Vulnerability Index, U.S. Census Division, and a history of obesity, hypertension, myocardial infarction, congestive heart failure, peripheral vascular disease, rheumatic disease, cerebrovascular disease, dementia, chronic pulmonary disease, peptic ulcer disease, hemiplegia or paraplegia, diabetes, liver disease, renal disease, cancer, HIV, tobacco usage, and substance use disorder documented before COVID-19. Forest plots display hazard ratios (points) on a log scale, with horizontal lines representing 95% confidence intervals (error bars). The vertical line indicates the null value of 1.0. Full model specifications are available in Supplementary Data 2.

Older age, male sex, non-white race/ethnicity, and a history of cardiovascular, pulmonary, neurological, metabolic, rheumatic, hepatic, renal, or malignant disease, as well as HIV, tobacco use, and substance use disorder, were consistently associated with increased post-COVID-19 all-cause mortality (Supplementary Data 2). Patients who received SARS-CoV-2 vaccination exhibited a significantly reduced mortality risk at all observed time points post-COVID-19, with the greatest protective effect observed in the immediate post-COVID-19 period (e.g., month one versus two years). The Delta and ancestral variant dominant periods had a higher risk of death than the Alpha, Beta, and Gamma variant periods. In contrast, both Omicron periods had a lower risk. Patients residing in communities with higher SVI values had an increased risk of adverse outcomes up to two years post-infection (aHR 1.23 [1.21–1.26]). This means that for every one-unit increase in SVI, there is a 23% increase in the risk of the event occurring, even after adjusting for other factors. Since SVI ranges from 0 to 1, smaller increments in SVI (e.g., from 0.5 to 0.6) would correspond to a proportionally smaller increase in risk.
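Because a Cox model is linear on the log-hazard scale, the hazard ratio for a partial-unit change in SVI is the per-unit hazard ratio raised to the size of the change. The short Python sketch below applies this to the reported per-unit aHR of 1.23. The function name `rescale_hazard_ratio` is hypothetical, and the calculation assumes SVI entered the model as a single continuous, untransformed term.

```python
import math

def rescale_hazard_ratio(hr_per_unit, delta):
    """Hazard ratio implied by a change of `delta` units in a continuous
    covariate, given the hazard ratio for a one-unit change."""
    return math.exp(math.log(hr_per_unit) * delta)

hr_svi = 1.23                                 # per one-unit SVI increase, from the text
hr_tenth = rescale_hazard_ratio(hr_svi, 0.1)  # e.g., SVI moving from 0.5 to 0.6

print(f"aHR for a 0.1 increase in SVI: {hr_tenth:.4f}")
print(f"approximate increase in hazard: {(hr_tenth - 1) * 100:.1f}%")
```

Under these assumptions, a 0.1 increase in SVI corresponds to roughly a 2% higher hazard, and ten such steps compound back to the full one-unit hazard ratio.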

To evaluate the relative risk of death among patients who survived various periods post-COVID-19 (Fig. 4, Supplementary Data 2), we examined two-year survival among those who survived to landmarks at one month, three months, and one year post-infection. The association between rurality and two-year all-cause mortality was consistent among patients who survived the acute period of COVID-19. We observed a similar increased risk for rural dwellers who were alive after month one (UAR aHR 1.20 [1.18–1.22] and NAR aHR 1.20 [1.16–1.24]), month three (UAR aHR 1.21 [1.18–1.23] and NAR aHR 1.20 [1.16–1.25]), and one year (UAR aHR 1.22 [1.19–1.25] and NAR aHR 1.16 [1.10–1.22]) after initial infection relative to urban dwellers.
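The landmark approach described above can be sketched as a simple cohort filter: patients who died or were censored on or before the landmark day are dropped, and the remaining patients are followed to the two-year horizon. In this Python illustration, the function name `landmark_cohort` and the records are hypothetical. Measuring follow-up time from the landmark rather than from infection is an assumption, since the text does not specify the time origin.

```python
def landmark_cohort(records, landmark_day, horizon_day=730):
    """Build a landmark cohort from (days_to_exit, died) records.

    days_to_exit : days from infection to death or censoring
    died         : 1 if the exit was a death, 0 if censored
    Returns (time_since_landmark, event) pairs for patients still alive and
    under observation after `landmark_day`, truncated at `horizon_day`.
    """
    cohort = []
    for days, died in records:
        if days <= landmark_day:
            continue  # died or censored by the landmark: not at risk afterwards
        follow_up = min(days, horizon_day)
        event = bool(died) and days <= horizon_day
        cohort.append((follow_up - landmark_day, int(event)))
    return cohort

# Hypothetical records (not study data).
records = [(10, 1), (25, 0), (45, 1), (400, 1), (730, 0), (900, 1)]

one_month = landmark_cohort(records, landmark_day=30)
one_year = landmark_cohort(records, landmark_day=365)
print(one_month)
print(one_year)
```

Deaths before the landmark never enter the landmark models, which is why these analyses isolate mortality risk among survivors of the acute period.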

New-Onset Diagnoses Among Decedents Within Two Years of COVID-19

Among patients who died within two years of infection (Fig. 5, and Supplementary Data 3), rural dwellers exhibited lower documented rates of new-onset diagnoses across multiple systems—including respiratory, cardiovascular, immune, nervous, digestive, endocrine, musculoskeletal, integumentary, and genitourinary—as well as reduced diagnoses of neoplasms, constitutional symptoms, blood and blood-forming tissue abnormalities, metabolic homeostasis disturbances, and a combined variable indicating any post-COVID condition (p < 0.001). Rural residents had similar rates of SARS-CoV-2 reinfection and long COVID among decedents compared with urban residents (p = 0.11 and p = 0.15, respectively).
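For readers less familiar with the test behind these p values, the sketch below computes a Pearson chi-squared statistic for a contingency table of rurality group by presence of a new-onset diagnosis. It is illustrative only. The counts are hypothetical, and treating rurality as three groups (which gives 2 degrees of freedom) is an assumption, since the grouping used for each comparison is not stated in this section.

```python
import math

def pearson_chi2(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

def p_value_df2(stat):
    """Upper-tail p value for a chi-squared statistic with exactly 2 degrees
    of freedom, where the survival function reduces to exp(-x / 2)."""
    return math.exp(-stat / 2.0)

# Hypothetical decedent counts: [with new-onset diagnosis, without]; not study data.
table = [
    [300, 700],  # urban
    [50, 150],   # UAR
    [20, 80],    # NAR
]

stat, dof = pearson_chi2(table)
print(f"chi2 = {stat:.3f}, dof = {dof}, p = {p_value_df2(stat):.4f}")
```

With cohorts as large as those in this study, even small differences in proportions yield very small p values, so the size of each difference matters alongside its significance.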

Fig. 5: New-Onset Post-COVID-19 Conditions and Events Among Patients Dying Within 2 Years by Rurality.

Figure 5 provides new-onset conditions and events for patients with a documented SARS-CoV-2 infection between April 1, 2020, and December 31, 2022, who died within two years. Outcomes and conditions are documented following acute COVID-19 but before death by rurality. P values are calculated using two-sided Pearson’s chi-squared tests. Full results are available in Supplementary Data 3.

Post-COVID-19 mortality among patients hospitalized with acute COVID-19

Among the 230,342 patients hospitalized with COVID-19 (-3 to +14 days of infection), 178,542 (78%) were urban, 41,857 (18%) were UAR, and 9943 (4.3%) were NAR dwellers. Compared to the overall cohort, hospitalized rural patients were more likely to have several comorbid conditions (Supplementary Data 4) but were still less likely to be vaccinated. Those hospitalized had an age-adjusted overall two-year mortality risk of 11,416.18 (95% CI 11,291.81–11,540.55) per 100,000 persons, with an urban risk of 11,097.53 (10,959.82–11,235.23), UAR risk of 12,490.37 (12,174.17–12,806.56), and NAR risk of 12,905.96 (12,167.04–13,644.87). The excess mortality at two years among hospitalized patients was 1,392.84 (716.88–2,068.79) for UAR and 1,808.43 (335.24–3,281.61) for NAR per 100,000 persons compared to the urban mortality rate. Hospitalized rural dwellers had a higher mortality across all periods.

After IPW within the hospitalized cohort, which achieved good covariate balance (Supplementary Fig. 3), we observed a similar increased risk across all post-infection periods as in the overall cohort (Supplementary Data 6, Fig. 6). At one month post-COVID-19, UAR was associated with an aHR of 1.18 (95% CI 1.15–1.22) for all-cause mortality, while NAR was associated with an aHR of 1.22 (1.15–1.29) relative to urban dwellers. We found a similar increased risk for rural dwellers at three months (UAR aHR 1.16 [1.13–1.19] and NAR aHR 1.22 [1.16–1.28]), one year (UAR aHR 1.16 [1.13–1.19] and NAR aHR 1.21 [1.16–1.26]), and two years (UAR aHR 1.16 [1.13–1.18] and NAR aHR 1.19 [1.15–1.24]) after infection relative to urban dwellers. We observed similar associations between rurality and two-year mortality in patients who survived to landmarks at one month, three months, and one year after infection. We also observed a similar pattern of post-COVID conditions among hospitalized decedents, with fewer new-onset conditions documented among rural residents (Supplementary Data 7).

Fig. 6: Unadjusted and adjusted hazard ratio for all-cause mortality among patients hospitalized after an acute SARS-CoV-2 infection from April 2020 through December 2022 by rurality.

Figure 6 contains Cox Proportional Hazards models for all-cause mortality among patients hospitalized within two weeks (-3/+14 days) after an acute SARS-CoV-2 infection by rurality. Deaths are captured from month 0 (day 0) through month 24 (day 730) in the period after diagnosis of COVID-19 at intervals of A) one month, B) three months, C) one year, D) two years, E) two years among patients surviving the initial month after infection, F) two years among patients surviving the initial three months after infection, and G) two years among patients surviving the initial one year after infection. Adjustments were made for sex, age, race/ethnicity, SARS-CoV-2 variant-dominant period at the time of infection, vaccination status, Social Vulnerability Index, U.S. Census Division, and a history of obesity, hypertension, myocardial infarction, congestive heart failure, peripheral vascular disease, rheumatic disease, cerebrovascular disease, dementia, chronic pulmonary disease, peptic ulcer disease, hemiplegia or paraplegia, diabetes, liver disease, renal disease, cancer, HIV, tobacco usage, and substance use disorder documented before COVID-19. Forest plots display hazard ratios (points) on a log scale, with horizontal lines representing 95% confidence intervals (error bars). The vertical line indicates the null value of 1.0. Full model specifications are available in Supplementary Data 6.

Post-COVID-19 Mortality Differences Stratified by Rurality

In models stratified by rural residency (Fig. 7, and Supplementary Data 8), we observed significant differences in risk across key demographic and clinical factors. Males in rural and urban settings had a higher two-year mortality risk than females, with rural males having an aHR of 1.21 (95% CI: 1.18–1.24) and urban males showing a slightly higher aHR of 1.25 (1.23–1.27). Age remained a strong determinant of mortality, with older populations showing steep increases in risk. In rural areas, individuals aged 75 and older had an aHR of 21.98 (20.88–23.14), compared to 18.27 (17.74–18.81) in urban areas.

Fig. 7: Adjusted Hazard Ratio for 2-Year Death Among Patients with an Acute SARS-CoV-2 Infection from April 2020 through December 2022 Stratified by Rural-Urban Residency.

Figure 7 contains Cox Proportional Hazards models for all-cause mortality among patients with an acute SARS-CoV-2 infection from April 2020 through December 2022, stratified by rurality (rural residents on the left and urban residents on the right). Adjustments were made for sex, age, race/ethnicity, SARS-CoV-2 variant-dominant period at the time of infection, vaccination status, Social Vulnerability Index, U.S. Census Division, and a history of obesity, hypertension, myocardial infarction, congestive heart failure, peripheral vascular disease, rheumatic disease, cerebrovascular disease, dementia, chronic pulmonary disease, peptic ulcer disease, hemiplegia or paraplegia, diabetes, liver disease, renal disease, cancer, HIV, tobacco usage, and substance use disorder documented before COVID-19. Forest plots display hazard ratios (points) on a log scale, with horizontal lines representing 95% confidence intervals (error bars). The vertical line indicates the null value of 1.0. Full model specifications are available in Supplementary Data 8.

Social vulnerability was associated with higher mortality in rural and urban settings, with SVI having an aHR of 1.29 (1.22–1.36) for rural residents and 1.31 (1.27–1.34) for urban residents. These trends persisted for patients with a COVID-19-associated hospitalization (Fig. 8, and Supplementary Data 9), except that social vulnerability was associated with higher all-cause mortality in rural than urban settings.

Fig. 8: Adjusted Hazard Ratio for 2-Year Death Among Patients Hospitalized After an Acute SARS-CoV-2 Infection from April 2020 through December 2022 Stratified by Rural-Urban Residency.

Figure 8 contains Cox Proportional Hazards models for all-cause mortality among patients hospitalized within two weeks (-3/+14 days) after an acute SARS-CoV-2 infection, stratified by rurality (rural residents on the left and urban residents on the right). Adjustments were made for sex, age, race/ethnicity, SARS-CoV-2 variant-dominant period at the time of infection, vaccination status, Social Vulnerability Index, U.S. Census Division, and a history of obesity, hypertension, myocardial infarction, congestive heart failure, peripheral vascular disease, rheumatic disease, cerebrovascular disease, dementia, chronic pulmonary disease, peptic ulcer disease, hemiplegia or paraplegia, diabetes, liver disease, renal disease, cancer, HIV, tobacco usage, and substance use disorder documented before COVID-19. Forest plots display hazard ratios (points) on a log scale, with horizontal lines representing 95% confidence intervals (error bars). The vertical line indicates the null value of 1.0. Full model specifications are available in Supplementary Data 9.

Post-COVID-19 mortality differences by dichotomous rurality

We conducted a secondary analysis using a binary rural indicator instead of the three-level rural classification scheme and found that the main findings were consistent with the original analysis. Detailed results of this analysis are available in Supplementary Data 10-11, which further support the robustness of the findings across different rural classification approaches.

COVID-19-Negative Comparison Cohort

To assess whether rural mortality disparities were specific to SARS-CoV-2 infection or reflected broader rural mortality differences, we examined a COVID-19-negative comparison cohort. This cohort included 4,153,216 individuals without documented SARS-CoV-2 infection, comprising 3,480,053 urban, 535,415 UAR, and 137,748 NAR residents (Supplementary Data 12). In models restricted to this cohort, two-year all-cause mortality was higher among rural dwellers (Supplementary Data 13), with UAR and NAR residents having aHRs of 1.17 (95% CI: 1.15–1.19) and 1.21 (1.17–1.25), respectively, compared to urban dwellers.

Interaction Between Rurality and COVID-19

To assess whether COVID-19 modified the rural mortality penalty, we included a rurality-by-COVID-19 interaction term in weighted Cox models that included both SARS-CoV-2-positive and SARS-CoV-2-negative patients (Supplementary Data 14). At two years, we observed an interaction aHR of 1.06 (95% CI, 1.04–1.08) for UAR and 1.06 (1.04–1.08) for NAR between rural residency and COVID-19. These findings suggest an interaction, with COVID-19 slightly increasing associations between rurality and death.

To quantify the mortality burden of SARS-CoV-2 infection, we calculated the age-adjusted excess mortality risk per 100,000 persons by rurality (Supplementary Data 15) among infected versus uninfected patients. Overall, COVID-19 was associated with an excess risk of 1668.4 (1668.4–1668.4) deaths per 100,000 persons. This risk was higher in UAR (1794.9 [1794.9–1794.9]) and NAR (1755.7 [1755.7–1755.7]) than among urban residents (1603.6 [1603.6–1603.6]).

In a combined model that included both SARS-CoV-2-positive and -negative individuals (Supplementary Data 16), but with no interaction term, rurality remained independently associated with higher two-year mortality (UAR aHR 1.21 [1.19–1.22]; NAR aHR 1.26 [1.23–1.29]). SARS-CoV-2 infection was a strong independent predictor of mortality, with an aHR of 1.63 (1.62–1.65) relative to those without documented infection, with the risk reducing with greater time from infection.

Reinfection as a Time-Varying Covariate

We also conducted a sensitivity analysis incorporating SARS-CoV-2 reinfection as a time-varying covariate to account for potential changes in risk over time (Supplementary Data 17). We observed 234,397 reinfections in our cohort, and the findings were consistent with our main models. Rural dwellers had a similarly elevated mortality risk (UAR aHR 1.19 [1.17–1.21]; NAR aHR 1.26 [1.22–1.29]) relative to urban residents. SARS-CoV-2 reinfection was strongly associated with two-year mortality, with an aHR of 1.95 (1.89–2.00).
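One common way to encode a time-varying covariate for a Cox model is the counting-process layout, in which each patient's follow-up is split into intervals at the moment the covariate changes. The Python sketch below shows that data preparation step for reinfection. It is an illustration under assumptions: the text does not describe the data layout used, and the function name `split_at_reinfection`, the row format, and the example patients are hypothetical.

```python
def split_at_reinfection(patient_id, follow_days, died, reinfection_day=None):
    """Return (id, start, stop, reinfected, event) rows in counting-process format.

    A patient with no reinfection during follow-up contributes one row.
    A reinfected patient contributes two rows: before reinfection
    (reinfected=0, no event) and after it (reinfected=1, carrying the outcome).
    """
    if reinfection_day is None or reinfection_day >= follow_days:
        return [(patient_id, 0, follow_days, 0, int(died))]
    return [
        (patient_id, 0, reinfection_day, 0, 0),
        (patient_id, reinfection_day, follow_days, 1, int(died)),
    ]

# Hypothetical patients (not study data).
rows = []
rows += split_at_reinfection("p1", follow_days=500, died=True, reinfection_day=200)
rows += split_at_reinfection("p2", follow_days=730, died=False)
rows += split_at_reinfection("p3", follow_days=90, died=True, reinfection_day=300)

for row in rows:
    print(row)
```

Splitting follow-up this way attributes time before reinfection to the non-reinfected state, which avoids crediting survival time to reinfection that occurred later.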

Discussion

Our study identified a rural-urban gradient in two-year all-cause mortality following SARS-CoV-2 infection, with adjusted risks 19% higher for urban-adjacent rural (UAR) and 26% higher for nonurban-adjacent rural (NAR) dwellers than urban residents, even after accounting for confounding factors and community vulnerability. This increased risk of death was present for both patients requiring hospitalization for COVID-19 and those managed solely as outpatients. Previous research focusing on acute outcomes at the individual16 and population levels20 has shown higher COVID-associated mortality in rural versus urban areas. Our data expand upon this work by showing that the COVID-exacerbated rural mortality penalty persists for at least two years after adjusting for background risk factors.

The rural mortality penalty in the US is a significant public health issue marked by disparities in mortality rates between rural and urban populations. This phenomenon, exacerbated by healthcare access, socioeconomic status, educational inequality, and lifestyle choices, has been the subject of much recent scholarship. Cosby et al. first described the evolution of the rural mortality penalty in 200821, exploring the underlying causes and identifying factors such as healthcare access, economic challenges, and educational disparities as key contributors to this penalty3.

The rural mortality penalty varies significantly across different regions in the US, demonstrating diverse health outcomes2, differences in life expectancy11, and marked regional22 and socioeconomic disparities1 across the rural spectrum. Additionally, several studies have analyzed lifespan variation across racial and ethnic groups and documented race-specific penalties, notably among rural Black and white populations23.

These persistent disparities underscore the urgent need for targeted public health strategies in rural communities. The significant rural-urban divide in health risks emphasizes the importance of addressing the underlying factors contributing to these disparities, ensuring equitable healthcare access, and improving health outcomes for all populations, irrespective of geographical location. While studies have assessed the role of place-based mortality12 and the characteristics of rural areas that contribute to higher mortality rates from preventable causes24, a COVID-specific rural mortality penalty that persists beyond the immediate acute phase has not, to our knowledge, been quantified. Our study examines this and highlights a COVID-associated rural mortality penalty extending two years after initial infection.

We examined a COVID-19-negative comparison cohort of over 4.1 million individuals to further contextualize the observed rural mortality disparities. Within this cohort, rural dwellers had modestly higher two-year all-cause mortality compared to urban residents, with adjusted risks 17% higher for UAR and 21% higher for NAR. These findings align with known patterns of elevated mortality in rural populations. The rural mortality gradient was slightly more pronounced in the COVID-19 cohort (19% UAR and 26% NAR).

In a combined model that included COVID-19-positive and -negative individuals, we observed a modest interaction between rurality and COVID-19. In subsequent analysis, SARS-CoV-2 infection was associated with a 63% increased risk of two-year mortality, and rural-urban differences in mortality persisted. Although the relative hazard associated with rurality remained stable, the absolute number of deaths attributable to COVID-19 was higher among rural residents. We also found a similar risk for rurality in a sensitivity analysis with SARS-CoV-2 reinfection as a time-varying covariate. These findings indicate that while rural-urban mortality differences exist independently of COVID-19, infection with SARS-CoV-2 exacerbates this gap.

Prior findings have demonstrated that COVID-19 treatments are effective in both urban and rural dwellers, yet the risk of mortality at 90 days is 36% greater among rural residents18. This finding, together with the persistence of the mortality penalty for rural dwellers long after surviving an episode of COVID-19, raises the question of why this disparity persists. Behavioral factors such as vaccine hesitancy may also have contributed, as prior research25,26,27 has shown greater skepticism toward COVID-19 vaccination among rural residents. Understanding and addressing the reasons for this penalty may ultimately improve health outcomes in rural areas beyond the COVID-19 pandemic.

Rural health outcome disparities have been reported for multiple conditions in addition to COVID-19, including colorectal cancer28 and diabetes8. Rural residency presents unique challenges in care coordination29. The lack of health care resources in rural areas is well documented. Even if rural dwellers can access primary care physicians and rehabilitation services, those professionals may lack access to information about care received at the referral center, leading to potentially worse outcomes.

We found that outcomes are worse for rural versus urban dwellers across all time points post-COVID. Still, paradoxically, rural dwellers have lower rates of new diagnoses of other post-COVID conditions after their SARS-CoV-2 infection. The combination of lower rates of new diagnoses and higher mortality suggests that under-diagnosis may occur in the rural population. This under-diagnosis hypothesis is consistent with rural communities’ well-known lack of healthcare resources compared to urban communities. While further work is needed to determine the factors contributing to the persistent mortality penalty suffered by survivors of acute COVID-19 in rural areas, the existence of this disparity highlights a broader crisis in rural healthcare.

Another important consideration in interpreting our findings is the role of social vulnerability. SVI is a composite of 16 variables in four broad areas (i.e., socioeconomic status, household characteristics, racial and ethnic minority status, and housing type and transportation)30. Our models adjusted for social vulnerability; however, social vulnerability may not be entirely distinct from rurality. Social vulnerability may be one of the deeper factors driving the rural effect on mortality, meaning that our inclusion of social vulnerability likely led to underestimating the impact of rural residency on post-infection mortality. High social vulnerability, which reflects cumulative disadvantage across socioeconomic and demographic factors, was associated with significantly higher mortality risks, particularly among rural dwellers. Recent research demonstrates that counties with high social vulnerability have higher rates of chronic diseases31, reduced access to healthcare32, and more barriers to preventive care33. By adjusting for both rurality and social vulnerability, this study underscores the compounded effects of living in a geographically isolated and socioeconomically disadvantaged area.

We also identified an important interaction between social vulnerability and rurality, further highlighting these compounded effects. However, this interaction did not substantially attenuate the rural effect when social vulnerability was excluded from the original models. As a result, we opted to present stratified models to better illustrate the differences in outcomes between rural and urban settings. The stratified analysis provides valuable insights into the differential impact of patient- and community-level factors on long-term mortality. Specifically, social vulnerability was associated with higher mortality in both rural and urban dwellers, with a stronger association among rural dwellers hospitalized with COVID-19.

Limitations

This study has several limitations. First, not all sites in N3C opt in to PPRL, so some sites lack externally validated death records. Because this study seeks to understand long-term differences in all-cause mortality, we excluded the approximately half of N3C-contributing sites that do not participate in PPRL for externally validated mortality data34. At the patient level, data on key variables – such as 5-digit ZIP Codes, comorbidities, and vaccination status – were missing for many patients in the N3C Enclave and were systematically missing from multiple sites. The reliance on EHR data, with its varying reporting practices across sites, introduces the potential for missingness bias, which is a well-known concern when dealing with real-world data35. We mitigated this by excluding patients with insufficient pre-COVID visit histories and employing robust statistical methods, including weighting and adjusting for site-specific random effects. However, unmeasured confounders, such as undocumented health behaviors and environmental exposures, may still influence our findings.

Second, referral-in bias poses another limitation, particularly for rural patients, and may take two forms. In the first form, among rural patients with COVID-19 requiring hospitalization, the more severely ill may be more likely to be transferred to tertiary referral centers, which are more likely to contribute data to N3C than local community hospitals that potentially care for less severely ill or less comorbid COVID-19 patients. N3C data-contributing sites may therefore be enriched for more severe COVID-19 cases among rural-dwelling patients. Despite adjusting for multiple comorbidities and factors associated with the severity of illness, residual confounding may persist from this potential referral-in bias. In the second form, rural residents requiring specialized care are often transferred to urban centers, which may result in a post-referral-in bias or effect. Specifically, being discharged from an urban care setting to a rural residence may affect healthcare access, care coordination, and follow-up, thus influencing long-term mortality outcomes. These factors may translate into a real disadvantage in rural dweller outcomes, which is explanatory rather than a bias from confounding or a study limitation. The paradoxically lower incidence of new diagnoses in follow-up among rural dwellers is consistent with this mechanistic supposition. Nonetheless, we attempted to address this by including adjustments for rurality and degree of rurality (UAR vs. NAR) and using site-specific random effects. Still, residual long-term mortality differences may remain due to explanatory disparities in healthcare access and delivery, adversely affecting rural compared to urban dwellers.

Third, although the N3C includes a diverse population from various U.S. regions, with proportions of rural and urban dwellers comparable to those of the U.S. population, it may not fully capture the experiences of all rural populations, particularly those in underrepresented areas or healthcare systems that did not participate in the N3C. This limits the generalizability of our findings, especially in regions with limited healthcare access, where mortality rates could be higher.

Lastly, while we adjusted for a wide range of demographic, clinical, and variant-period factors, residual confounding by socioeconomic status, health behaviors, and environmental exposures – factors often not captured in EHR data – may still contribute to the observed rural-urban disparities. These limitations highlight the complexity of understanding the full scope of rural health disparities and emphasize the need for future research that addresses these unmeasured factors.

Despite these limitations, this study had several strengths – a large sample size, adjustments for background risk, and an extensive follow-up period, including all-cause mortality linkage – that contribute to our understanding of the persistent disparities in long-term mortality between rural and urban populations following SARS-CoV-2 infection.

In conclusion, rural dwellers have significantly higher mortality than urban dwellers after SARS-CoV-2 infection, and this poorer prognosis persists for at least two years. The increased long-term mortality for rural dwellers, coupled with a paradoxically lower incidence of new diagnoses during the same period, underscores the critical need for improved healthcare access and follow-up care in rural communities. Greater social vulnerability was also associated with higher mortality, further highlighting the importance of tailored public health strategies to address ongoing health disparities in rural populations.

Methods

This retrospective cohort study utilized data from the National Clinical Cohort Collaborative (N3C) COVID-19 Enclave36, a longitudinal electronic health record (EHR) repository of SARS-CoV-2-infected individuals. N3C is a next-generation registry37 containing EHR data contributed from sites across the U.S., including externally linked data through a process called privacy-preserving record linkage (PPRL)34. This includes de-duplication of patients across sites, externally validated death records, and linkage with external patient records and registries, including GenBank-linked SARS-CoV-2 viral variants, Medicaid/Medicare claims data, and SEER registry data38. The Regenstrief Institute serves as the linkage honest broker39. Over half of N3C sites (N = 50) participate in PPRL, with 36 supporting external mortality linkage.

The University of Nebraska Medical Center’s IRB (0176-21-EP) and the N3C Data Access Committee (RP-924490) approved the study, which adheres to RECORD reporting guidelines (https://www.equator-network.org/reporting-guidelines/record/). Each participating site’s IRB also approved this study. N3C operates under the authority of the National Institutes of Health IRB, with Johns Hopkins University serving as the central IRB (IRB00309495). No informed consent was obtained from individual patients because the study used a limited data set among sites opting into PPRL with enhanced mortality linkage, stripping direct identifiers in compliance with the HIPAA Privacy Rule.

Cohort definition

Persons were classified based on rurality, identified by mapping 5-digit ZIP Codes to the 2010 Rural-Urban Commuting Area (RUCA) codes40. We grouped RUCA codes into a binary urban-rural distinction (RUCA codes 1-3 are urban and 4-10 are rural). Further, we subclassified their degree of rurality into urban-adjacent rural (UAR) and nonurban-adjacent rural (NAR), using a previously described methodology based on commuting flows to urban clusters16,17, which aligns with the Federal Office for Rural Health Policy guidance on rural-qualifying ZIP Codes41.
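The binary grouping can be sketched in a few lines of Python. This is an illustration only, not the study's code: the function name, the lookup table, and its ZIP Codes are hypothetical, and treating a secondary RUCA code by its integer part (for example, 4.1 as 4) is an assumption. The UAR/NAR subclassification is not shown because it depends on the previously described methodology.

```python
from typing import Optional


def classify_ruca(ruca_code: float) -> Optional[str]:
    """Map a RUCA code to the binary urban-rural grouping (1-3 urban, 4-10 rural)."""
    primary = int(ruca_code)  # assumption: drop any secondary-code decimal, e.g. 4.1 -> 4
    if 1 <= primary <= 3:
        return "urban"
    if 4 <= primary <= 10:
        return "rural"
    return None  # codes outside 1-10 (e.g. 99, not coded) stay unclassified


# Hypothetical ZIP-to-RUCA lookup; real values would come from the 2010 RUCA ZIP Code file.
zip_to_ruca = {"00001": 1.0, "00002": 4.1, "00003": 10.0, "00004": 99.0}
labels = {zip_code: classify_ruca(code) for zip_code, code in zip_to_ruca.items()}
```

Returning `None` for unmapped codes keeps such records identifiable so they can be handled by the exclusion criteria rather than silently assigned to a group.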

We included weighting or adjustments for age at COVID-19 diagnosis, sex, race/ethnicity, receipt of any COVID-19 vaccination before incident COVID-19 (categorized as no documented vaccination before infection or completion of a primary series [≥2 mRNA doses or 1 viral vector dose, with or without additional doses]), tobacco usage, obesity (both diagnosed and identified by body mass index >30), and comorbid conditions diagnosed by a provider, including substance use disorder, obesity, hypertension, and individual categories from the Charlson Comorbidity Index (CCI)42. We also adjusted for community risk using the Social Vulnerability Index (SVI)30, a composite measure of 15 social and environmental factors developed by the U.S. Centers for Disease Control and Prevention (CDC) and Agency for Toxic Substances and Disease Registry (ATSDR) and mapped to U.S. residency at the county level. In addition, we adjusted for the nine U.S. Census divisions, to understand the differential impact of rural residence by region43, and for SARS-CoV-2 variant-dominant periods based on the predominant strain from CDC reporting (see Supplementary Methods for CDC variant-dominant period ascertainment process and timing). The variant-dominant periods were ancestral COVID-19; Alpha (B.1.1.7), Beta (B.1.351), Gamma (P.1); Delta (B.1.617.2); early Omicron (B.1.1.529, BA.2, BA.2.12.1); and later Omicron (BA.5, BQ.1.1, XBB.1.5)44.

These variables were selected a priori based on established associations with COVID-19 outcomes and their relevance to post-acute mortality risk. Individual-level variables such as age, comorbidities, and vaccination status were modeled at the patient level, while ecological variables such as SVI and Census Division were assigned based on ZIP Code or county of residence.

Our primary cohort was all patients infected during the study period (April 2020 – December 2022). To understand differences based on initial COVID-19 severity (i.e., patients hospitalized for COVID-19), we also assessed mortality differences among patients hospitalized within -3 to +14 days of their initial SARS-CoV-2 infection. This timeframe was selected to account for possible delays in post-admission testing and to include the typical incubation period of the virus45, thereby capturing hospitalizations directly associated with the acute phase of COVID-19. We excluded patients with no pre-COVID-19 visit history (required to understand pre-COVID risk factors), without a 5-digit ZIP Code, pediatric patients (<19 years at infection date), centenarians (100+ years at infection date), and those from sites not participating in PPRL (required to assess deaths occurring outside the contributing hospital system) or with limited data reporting in the condition domain. These exclusions were primarily due to limitations in site-level reporting, as many sites did not provide key structured data, such as ZIP Code or externally verified mortality data, necessary for cohort construction. We include detailed inclusion and exclusion criteria and site characteristics in the Supplementary Methods.
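The hospitalization window and age criteria above can be expressed as simple predicates. The sketch below is illustrative: the function names are hypothetical, and treating both ends of the -3 to +14 day window as inclusive is an assumption.

```python
from datetime import date


def hospitalized_for_covid(infection: date, admission: date) -> bool:
    """True if admission falls within -3 to +14 days of the infection date (inclusive, by assumption)."""
    offset_days = (admission - infection).days
    return -3 <= offset_days <= 14


def age_eligible(age_years: int) -> bool:
    """Exclude pediatric patients (<19 years) and centenarians (100+ years)."""
    return 19 <= age_years < 100
```

For example, a patient infected on January 10 and admitted on January 7 falls inside the window, while an admission on January 6 does not.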

Outcome definition

We analyzed two-year all-cause mortality among patients infected with SARS-CoV-2 between April 1, 2020, and December 31, 2022, with follow-up until December 31, 2024. The data were extracted from N3C on February 2, 2025 (N3C Release 188). Mortality was assessed at one month (30 days), three months (91 days), one year (365 days), and two years (730 days) after initial SARS-CoV-2 infection, using hospital-documented deaths and enhanced mortality records obtained through PPRL34, which enabled linkage with the Social Security Administration’s Death Master File.

We employed a piecewise landmark analysis to explore long-term mortality trends further, focusing on patients who survived to the end of each critical time point: month one (>30 days), month three (>91 days), and year one (>365 days) post-infection. This method was chosen to provide a more detailed understanding of mortality risk among those who survived the immediate acute phase of COVID-19. By segmenting the analysis into these distinct intervals, we aimed to capture potential changes in mortality risk over time, particularly concerning rural residency, which may influence outcomes due to varying healthcare access and social determinants of health.

This piecewise approach complements the primary Cox proportional hazards models by allowing us to focus on specific post-acute phases, where the risk profile may differ significantly from the acute phase. We applied this granular analysis to the overall cohort and specifically to those hospitalized during the acute infection phase, providing insights into the persistence of risk in different subgroups over time.
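A landmark cohort of this kind can be constructed as follows. This is a minimal sketch rather than the study's implementation: the record layout and function name are hypothetical, and dropping patients censored before the landmark, restarting the clock at the landmark, and censoring at day 730 are all assumptions about how such an analysis is typically set up.

```python
from typing import List, Tuple

# (days from infection to death or end of follow-up, died flag)
Record = Tuple[int, bool]


def landmark_cohort(records: List[Record], landmark_day: int,
                    horizon_day: int = 730) -> List[Record]:
    """Keep patients still under follow-up beyond landmark_day and restart time at the landmark."""
    cohort = []
    for days, died in records:
        if days <= landmark_day:
            continue  # died or left follow-up by the landmark: not at risk in this window
        if days > horizon_day:
            days, died = horizon_day, False  # administratively censor at the two-year horizon
        cohort.append((days - landmark_day, died))
    return cohort


records = [(10, True), (30, True), (31, True), (200, False), (800, True)]
month_one_survivors = landmark_cohort(records, landmark_day=30)
```

The resulting records can then be passed to any survival routine; the same function with `landmark_day=91` or `landmark_day=365` yields the other two landmark cohorts.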

Secondary outcomes

We explored new-onset intermediate events that occurred after SARS-CoV-2 infection but before death among those dying in the two years after COVID-19. New-onset post-COVID-19 conditions were classified as those not documented using the same condition code in the pre-COVID period and occurring between SARS-CoV-2 infection and death. We assessed for SARS-CoV-2 reinfection (2 PCR or antigen [Ag] tests at least 90 days apart), long COVID (ICD-10 U09.9), and the following phenotypic abnormalities using the Human Phenotype Ontology46 mapped to OMOP: respiratory system, cardiovascular system, immune system, nervous system, metabolism/homeostasis, blood and blood-forming tissues, digestive system, endocrine system, musculoskeletal system, integument, and genitourinary system, as well as constitutional symptoms and neoplasms47.
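The new-onset and reinfection rules can be illustrated with the sketch below. It is not the study's computable phenotype: the function names and inputs are hypothetical, the boundary handling (strictly after infection, on or before death) is an assumption, and the reinfection check assumes the supplied dates are positive test dates measured against the earliest one.

```python
from datetime import date
from typing import List, Set, Tuple


def new_onset_conditions(pre_covid_codes: Set[str],
                         post_events: List[Tuple[str, date]],
                         infection: date, death: date) -> Set[str]:
    """Condition codes first documented between infection and death, absent pre-COVID."""
    return {
        code
        for code, recorded_on in post_events
        if infection < recorded_on <= death and code not in pre_covid_codes
    }


def has_reinfection(positive_test_dates: List[date], min_gap_days: int = 90) -> bool:
    """True if any positive test falls at least min_gap_days after the earliest one."""
    if not positive_test_dates:
        return False
    first = min(positive_test_dates)
    return any((d - first).days >= min_gap_days for d in positive_test_dates)
```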

Statistical analysis

Mortality differences by rurality were assessed using Kaplan-Meier survival analysis with a log-rank test. We calculated mortality risks per 100,000 persons at 1, 3, 12, and 24 months post-infection. Risks were calculated as cumulative incidence proportions, and 95% confidence intervals (95% CI) were derived under a Poisson assumption for the event count. Excess mortality was expressed as the risk difference between each rural-urban group and the overall cohort, standardized using the 2020 U.S. Standard Population. We report both crude and age-adjusted rates, the latter standardized to the same population to account for differences in age distribution across rural-urban groups48. We also report event rates per person-month in the Supplementary Data.
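As a rough illustration of these calculations, the sketch below computes a risk per 100,000 with a Poisson-based interval and a directly age-standardized risk. The function names and the example numbers are hypothetical. The interval uses a normal approximation to the Poisson count, which is an assumption; the text does not state whether an exact or approximate interval was used.

```python
import math
from typing import Dict, Tuple


def risk_per_100k(deaths: int, persons: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Cumulative incidence per 100,000 with a normal-approximation Poisson interval."""
    scale = 100_000 / persons
    se = math.sqrt(deaths)  # Poisson: the variance of the count equals its mean
    lower = max(0.0, deaths - z * se) * scale
    upper = (deaths + z * se) * scale
    return deaths * scale, lower, upper


def age_standardized_risk(stratum_risk: Dict[str, float],
                          std_weights: Dict[str, float]) -> float:
    """Direct standardization: weight each age-stratum risk by the standard population share."""
    total = sum(std_weights.values())
    return sum(stratum_risk[age] * std_weights[age] / total for age in std_weights)


# Hypothetical example: 100 deaths among 10,000 persons.
point, low, high = risk_per_100k(deaths=100, persons=10_000)
```

A risk difference between a rural-urban group and the overall cohort is then simply the difference of two standardized risks computed with the same weights.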

We assessed associations between rurality and two-year mortality using multivariable Cox proportional hazards models with inverse probability of treatment weighting (IPW). Stabilized weights were calculated as the marginal probability of each rural group divided by the predicted probability from a multinomial logistic regression model. Weighting included age group, sex, race/ethnicity, SARS-CoV-2 variant-dominant period, and SARS-CoV-2 vaccination status. We chose these variables to represent universal confounders of rural residence and mortality risk, while avoiding covariates that might reflect downstream effects or inherent components of rurality itself (e.g., comorbid burden, social vulnerability). These downstream variables are potential mediators, not colliders, as they may be on the pathway between rurality and mortality. We excluded them from the propensity score model to avoid blocking these pathways in total-effect estimation, but included them in the outcome model when estimating the direct effect. This approach also maintains the temporal order between exposure and confounders, which prevents meaningful rural health disparities from being adjusted away.
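The stabilized weight for a patient in group a with covariates x is P(A = a) / P(A = a | X = x). The sketch below shows that arithmetic only. It assumes the multinomial model has already been fitted elsewhere and that its predicted probabilities are supplied as dictionaries; the function name, group labels, and example values are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def stabilized_weights(groups: List[str],
                       predicted: List[Dict[str, float]]) -> List[float]:
    """Marginal probability of each patient's observed group over its predicted probability."""
    n = len(groups)
    marginal = {group: count / n for group, count in Counter(groups).items()}
    return [marginal[g] / probs[g] for g, probs in zip(groups, predicted)]


# Hypothetical patients with predicted probabilities from a fitted multinomial model.
groups = ["urban", "urban", "UAR", "NAR"]
predicted = [
    {"urban": 0.5, "UAR": 0.3, "NAR": 0.2},
    {"urban": 0.8, "UAR": 0.1, "NAR": 0.1},
    {"urban": 0.5, "UAR": 0.25, "NAR": 0.25},
    {"urban": 0.3, "UAR": 0.2, "NAR": 0.5},
]
weights = stabilized_weights(groups, predicted)
```

Using the marginal probability in the numerator keeps the weights centered near 1, which tends to reduce the variance inflation seen with unstabilized weights.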

In the weighted regression model, we adjusted for additional variables excluded from the propensity score model, including comorbid conditions, SVI, and Census Division, to estimate the direct association of rurality with mortality conditional on these pathways. This specification avoids overlap in covariates between the weighting and regression stages and maintains the doubly robust property49, whereby the association is consistently estimated if either the propensity score or the outcome model is correctly specified. We included a random effect for the data-contributing site to account for data-reporting differences.

We repeated all analyses using a dichotomous rural-urban variable to assess the robustness of the findings with a more stable sample size across all periods. Additionally, we performed a sensitivity analysis that accounted for multiple infections by treating reinfections as a time-varying covariate in a case-level approach instead of a person-level approach based on the initial infection.

To assess whether the effect of rurality on mortality is specific to COVID-19, consistent across COVID-positive and COVID-negative cohorts, or exacerbated by COVID-19, we conducted additional analyses using a cohort of COVID-negative individuals from the same data-contributing sites. COVID-negative individuals were defined as those without a documented SARS-CoV-2 infection during the study period, with index dates assigned to align with the distribution of first positive dates in the primary cohort. We analyzed two-year mortality within the COVID-negative cohort alone and in a combined model including both cohorts, incorporating COVID-19 status as a covariate. To formally assess effect modification, we included an interaction term between rurality and COVID-19 status in the combined model. All methods and adjustments were consistent with the main analysis (excluding COVID-19 vaccinations and variant-dominant period), allowing for a comparison of rurality’s impact across cohorts.
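One way to read the combined model is as a Cox model with a product term. The notation below is illustrative and not taken from the study: R is a rural indicator (dichotomous here for simplicity), C is COVID-19 status, X collects the remaining covariates, and b_j is a random effect for data-contributing site j.

```latex
h_{ij}(t) = h_0(t)\,\exp\!\left(\beta_1 R_{ij} + \beta_2 C_{ij} + \beta_3 R_{ij} C_{ij}
            + \boldsymbol{\gamma}^{\top}\mathbf{X}_{ij} + b_j\right)
```

Under this parameterization, the rural-versus-urban hazard ratio is exp(β1) among COVID-negative individuals and exp(β1 + β3) among COVID-positive individuals, so β3 = 0 corresponds to no effect modification on the multiplicative scale.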

All statistical tests were 2-sided at a significance level of p < .05, and estimates are presented with 95% confidence intervals. The Supplementary Methods describe methodological details, including all computable phenotype definitions and an overview of the N3C environment and sampling approach. Supplementary Data 1-15 and Figs. 17 present full model specifications and sensitivity analyses. Statistical analyses were performed in R v4.1.3, Python, and SQL within the N3C Enclave, and the analysis code is available on GitHub50.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.