Abstract
Genome-wide association studies provide a powerful means of identifying loci and genes contributing to disease, but in many cases, the related cell types/states through which genes confer disease risk remain unknown. Deciphering such relationships is important for identifying pathogenic processes and developing therapeutics. In the present study, we introduce sc-linker, a framework for integrating single-cell RNA-sequencing, epigenomic SNP-to-gene maps and genome-wide association study summary statistics to infer the underlying cell types and processes by which genetic variants influence disease. The inferred disease enrichments recapitulated known biology and highlighted notable cell–disease relationships, including γ-aminobutyric acid-ergic neurons in major depressive disorder, a disease-dependent M-cell program in ulcerative colitis and a disease-specific complement cascade process in multiple sclerosis. In autoimmune disease, both healthy and disease-dependent immune cell-type programs were associated, whereas only disease-dependent epithelial cell programs were prominent, suggesting a role in disease response rather than initiation. Our framework provides a powerful approach for identifying the cell types and cellular processes by which genetic variants influence disease.
Data availability
All postprocessed scRNA-seq data (except for AD; see below) are available through the original publications with PMIDs: 28091601, 33208946, 31316211, 31097668, 31042697, 31348891, 32832598, 31209336, 31604275, 33654293, 32403949 and 30355494. In addition, gene programs, enhancer–gene-linking annotations, supplementary data files and high-resolution figures are publicly available online at https://data.broadinstitute.org/alkesgroup/LDSCORE/Jagadeesh_Dey_sclinker. The AD scRNA-seq data (ref. 30) are available exclusively at https://www.radc.rush.edu/docs/omics.htm per its data usage terms. This work used summary statistics from the UK Biobank study (http://www.ukbiobank.ac.uk). The summary statistics for UK Biobank used in this paper are available at https://data.broadinstitute.org/alkesgroup/UKBB. The 1000 Genomes Project Phase 3 data are available at ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/2013050. The baseline-LD annotations are available at https://data.broadinstitute.org/alkesgroup/LDSCORE. We provide a web interface to visualize the enrichment results for different programs used in our analysis at https://share.streamlit.io/karthikj89/scgenetics/www/scgwas.py.
Code availability
This work uses the S-LDSC software (https://github.com/bulik/ldsc) to process GWAS summary statistics, and both S-LDSC and MAGMA v.1.08 (https://ctg.cncr.nl/software/magma) for post-hoc analysis. Code for constructing cell-type, disease-dependent and cellular process gene programs from scRNA-seq data and performing the healthy and disease-shared NMF can be found at https://github.com/karthikj89/scgenetics (https://doi.org/10.5281/zenodo.6516048) (ref. 106). Code for processing gene programs and combining them with enhancer–gene links can be found at https://github.com/kkdey/GSSG (https://doi.org/10.5281/zenodo.6513166) (ref. 107).
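As a rough illustration of what combining a gene program with enhancer–gene links can look like, the sketch below assigns each SNP the largest program score among the genes linked to an enhancer that overlaps it. This is a minimal sketch under stated assumptions, not the code in the repositories above: the function name `annotate_snps`, the tuple layouts, the half-open interval convention and the max-over-linked-genes rule are all illustrative choices, and the gene and SNP identifiers are made up.

```python
from typing import Dict, List, Tuple

# Hypothetical layouts, chosen only for this illustration.
Snp = Tuple[str, int]             # (chromosome, position)
Link = Tuple[str, int, int, str]  # (chromosome, start, end, gene), half-open [start, end)


def annotate_snps(
    snps: Dict[str, Snp],
    links: List[Link],
    program: Dict[str, float],
) -> Dict[str, float]:
    """Give each SNP the highest program score among its linked genes.

    A SNP is considered linked to a gene when it falls inside an
    enhancer interval that is linked to that gene. Genes absent from
    the program contribute a score of 0.0 (an assumption).
    """
    annotations: Dict[str, float] = {}
    for snp_id, (chrom, pos) in snps.items():
        score = 0.0
        for link_chrom, start, end, gene in links:
            if link_chrom == chrom and start <= pos < end:
                score = max(score, program.get(gene, 0.0))
        annotations[snp_id] = score
    return annotations


if __name__ == "__main__":
    snps = {"rs1": ("chr1", 150), "rs2": ("chr1", 500)}
    links = [("chr1", 100, 200, "GENE_A"), ("chr1", 120, 180, "GENE_B")]
    program = {"GENE_A": 0.2, "GENE_B": 0.9}
    print(annotate_snps(snps, links, program))
```

The linear scan over links keeps the example short; a real genome-scale annotation would normally use an interval index rather than a nested loop.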
References
Schizophrenia Working Group of the Psychiatric Genomics Consortium et al. Biological insights from 108 schizophrenia-associated genetic loci. Nature 511, 421–427 (2014).
Visscher, P. M. et al. 10 years of GWAS discovery: biology, function, and translation. Am. J. Hum. Genet. 101, 5–22 (2017).
Buniello, A. et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 47, D1005–D1012 (2019).
Maurano, M. T. et al. Systematic localization of common disease-associated variation in regulatory DNA. Science 337, 1190–1195 (2012).
Price, A. L., Spencer, C. C. A. & Donnelly, P. Progress and promise in understanding the genetic basis of common diseases. Proc. R. Soc. B Biol. Sci. 282, 20151684 (2015).
Shendure, J., Findlay, G. M. & Snyder, M. W. Genomic medicine—progress, pitfalls, and promise. Cell 177, 45–57 (2019).
Zeggini, E., Gloyn, A. L., Barton, A. C. & Wain, L. V. Translational genomics and precision medicine: moving from the lab to the clinic. Science 365, 1409–1413 (2019).
Hekselman, I. & Yeger-Lotem, E. Mechanisms of tissue and cell-type specificity in heritable traits and diseases. Nat. Rev. Genet. 21, 137–150 (2020).
Trynka, G. et al. Chromatin marks identify critical cell types for fine mapping complex trait variants. Nat. Genet. 45, 124–130 (2013).
Pickrell, J. K. Joint analysis of functional genomic data and genome-wide association studies of 18 human traits. Am. J. Hum. Genet. 94, 559–573 (2014).
Finucane, H. K. et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet. 47, 1228–1235 (2015).
Zhou, J. et al. Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk. Nat. Genet. 50, 1171–1179 (2018).
Zhu, X. & Stephens, M. Large-scale genome-wide enrichment analyses identify new trait-associated genes and pathways across 31 human phenotypes. Nat. Commun. 9, 4361 (2018).
Wang, Q. et al. A Bayesian framework that integrates multi-omics data and gene networks predicts risk genes from schizophrenia GWAS data. Nat. Neurosci. 22, 691–699 (2019).
Fang, H. et al. A genetics-led approach defines the drug target landscape of 30 immune-related traits. Nat. Genet. 51, 1082–1091 (2019).
Calderon, D. et al. Inferring relevant cell types for complex traits by using single-cell gene expression. Am. J. Hum. Genet. 101, 686–691 (2017).
Ongen, H. et al. Estimating the causal tissues for complex traits and diseases. Nat. Genet. 49, 1676–1683 (2017).
Finucane, H. K. et al. Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types. Nat. Genet. 50, 621–629 (2018).
Ernst, J. et al. Systematic analysis of chromatin state dynamics in nine human cell types. Nature 473, 43–49 (2011).
Roadmap Epigenomics Consortium et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
Liu, Y., Sarkar, A., Kheradpour, P., Ernst, J. & Kellis, M. Evidence of reduced recombination rate in human regulatory domains. Genome Biol. 18, 193 (2017).
Fulco, C. P. et al. Activity-by-Contact model of enhancer-promoter regulation from thousands of CRISPR perturbations. Nat. Genet. 51, 1664–1669 (2019).
Nasser, J. et al. Genome-wide enhancer maps link risk variants to disease genes. Nature 593, 238–243 (2021).
Tanay, A. & Regev, A. Scaling single-cell genomics from phenomenology to mechanism. Nature 541, 331–338 (2017).
Tucker, N. et al. Transcriptional and cellular diversity of the human heart. Circulation 142, 466–482 (2020).
Travaglini, K. J. et al. A molecular cell atlas of the human lung from single-cell RNA sequencing. Nature 587, 619–625 (2020).
Kowalczyk, M. S. Census of immune cells. Human Cell Atlas Data Portal https://data.humancellatlas.org/explore/projects/cc95ff89-2e68-4a08-a234-480eca21ce79 (2018).
Sunkin, S. M. et al. Allen Brain Atlas: an integrated spatio-temporal portal for exploring the central nervous system. Nucleic Acids Res. 41, D996 (2013).
Habermann, A. C. et al. Single-cell RNA sequencing reveals profibrotic roles of distinct epithelial and mesenchymal lineages in pulmonary fibrosis. Sci. Adv. 6, eaba1972 (2020).
Mathys, H. et al. Single-cell transcriptomic analysis of Alzheimer’s disease. Nature 570, 332–337 (2019).
Jerby-Arnon, L. et al. A cancer cell program promotes T cell exclusion and resistance to checkpoint blockade. Cell 175, 984–997.e24 (2018).
Montoro, D. T. et al. A revised airway epithelial hierarchy includes CFTR-expressing ionocytes. Nature 560, 319–324 (2018).
Peng, Y.-R. et al. Molecular classification and comparative taxonomics of foveal and peripheral cells in primate retina. Cell 176, 1222–1237.e22 (2019).
Smillie, C. S. et al. Intra- and Inter-cellular rewiring of the human colon during ulcerative colitis. Cell 178, 714–730.e22 (2019).
Watanabe, K., Umićević Mirkov, M., de Leeuw, C. A., van den Heuvel, M. P. & Posthuma, D. Genetic mapping of cell type specificity for complex traits. Nat. Commun. 10, 3222 (2019).
Bryois, J. et al. Genetic identification of cell types underlying brain complex traits yields insights into the etiology of Parkinson’s disease. Nat. Genet. 52, 482–493 (2020).
Corces, M. R. et al. Single-cell epigenomic analyses implicate candidate causal variants at inherited risk loci for Alzheimer’s and Parkinson’s diseases. Nat. Genet. 52, 1158–1168 (2020).
Drokhlyansky, E. et al. The human and mouse enteric nervous system at single-cell resolution. Cell 182, 1606–1622.e23 (2020).
de Leeuw, C. A., Mooij, J. M., Heskes, T. & Posthuma, D. MAGMA: generalized gene-set analysis of GWAS data. PLoS Comput. Biol. 11, e1004219 (2015).
Gazal, S. et al. Linkage disequilibrium dependent architecture of human complex traits shows action of negative selection. Nat. Genet. 49, 1421–1427 (2017).
Gazal, S., Marquez-Luna, C., Finucane, H. K. & Price, A. L. Reconciling S-LDSC and LDAK functional enrichment estimates. Nat. Genet. 51, 1202–1204 (2019).
Zheng, G. X. Y. et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049 (2017).
Stewart, B. J. et al. Spatio-temporal immune zonation of the human kidney. Science 365, 1461–1466 (2019).
Muus, C. et al. Single-cell meta-analysis of SARS-CoV-2 entry genes across tissues and demographics. Nat. Med. 27, 546–559 (2021).
Cheng, J. B. et al. Transcriptional programming of normal and inflamed human epidermis at single-cell resolution. Cell Rep. 25, 871–883 (2018).
Schirmer, L. et al. Neuronal vulnerability and multilineage diversity in multiple sclerosis. Nature 573, 75–82 (2019).
Braga, F. et al. A cellular census of human lungs identifies novel cell states in health and in asthma. Nat. Med. 25, 1153–1163 (2019).
Liao, M. et al. Single-cell landscape of bronchoalveolar immune cells in patients with COVID-19. Nat. Med. 26, 842–844 (2020).
Ulirsch, J. C. et al. Interrogation of human hematopoiesis at single-cell and single-variant resolution. Nat. Genet. 51, 683–693 (2019).
Chen, M.-H. et al. Trans-ethnic and ancestry-specific blood-cell genetics in 746,667 individuals from 5 global populations. Cell 182, 1198–1213.e14 (2020).
Biedermann, T., Skabytska, Y., Kaesler, S. & Volz, T. Regulation of T cell immunity in atopic dermatitis by microbes: the yin and yang of cutaneous inflammation. Front. Immunol. 6, 353 (2015).
Hennino, A. et al. Skin-infiltrating CD8+ T cells initiate atopic dermatitis lesions. J. Immunol. 178, 5571–5577 (2007).
Thériault, P., ElAli, A. & Rivest, S. The dynamics of monocytes and microglia in Alzheimer’s disease. Alzheimers Res. Ther. 7, 41 (2015).
Nuyts, A. H., Lee, W. P., Bashir-Dar, R., Berneman, Z. N. & Cools, N. Dendritic cells in multiple sclerosis: key players in the immunopathogenesis, key players for new cellular immunotherapies? Mult. Scler. 19, 995–1002 (2013).
Haschka, D. et al. Expansion of neutrophils and classical and nonclassical monocytes as a hallmark in relapsing–remitting multiple sclerosis. Front. Immunol. 11, 594 (2020).
Momeni, A. et al. Fingolimod and changes in hematocrit, hemoglobin and red blood cells of patients with multiple sclerosis. Am. J. Clin. Exp. Immunol. 8, 27–31 (2019).
Yeung, M. et al. Characterisation of mucosal lymphoid aggregates in ulcerative colitis: immune cell phenotype and TcR-γδ expression. Gut 47, 215–227 (2000).
Mouly, E. et al. The Ets-1 transcription factor controls the development and function of natural regulatory T cells. J. Exp. Med. 207, 2113–2125 (2010).
Mayassi, T. et al. Chronic inflammation permanently reshapes tissue-resident immunity in celiac disease. Cell 176, 967–981.e19 (2019).
Pandey, A. et al. Cloning of a receptor subunit required for signaling by thymic stromal lymphopoietin. Nat. Immunol. 1, 59–64 (2000).
Gao, P.-S. et al. Genetic variants in TSLP are associated with atopic dermatitis and eczema herpeticum. J. Allergy Clin. Immunol. 125, 1403–1407.e4 (2010).
Altin, J. A. et al. Ndfip1 mediates peripheral tolerance to self and exogenous antigen by inducing cell cycle exit in responding CD4+ T cells. Proc. Natl Acad. Sci. USA 111, 2067–2074 (2014).
Yip, K. H. et al. The Nedd4-2/Ndfip1 axis is a negative regulator of IgE-mediated mast cell activation. Nat. Commun. 7, 13198 (2016).
Villegas-Llerena, C., Phillips, A., Garcia-Reitboeck, P., Hardy, J. & Pocock, J. M. Microglial genes regulating neuroinflammation in the progression of Alzheimer’s disease. Curr. Opin. Neurobiol. 36, 74–81 (2016).
Efthymiou, A. G. & Goate, A. M. Late onset Alzheimer’s disease genetics implicates microglial pathways in disease risk. Mol. Neurodegener. 12, 43 (2017).
Luscher, B., Shen, Q. & Sahir, N. The GABAergic deficit hypothesis of major depressive disorder. Mol. Psychiatry 16, 383–406 (2011).
Mossakowska-Wójcik, J., Orzechowska, A., Talarowska, M., Szemraj, J. & Gałecki, P. The importance of TCF4 gene in the etiology of recurrent depressive disorders. Prog. Neuropsychopharmacol. Biol. Psychiatry 80, 304–308 (2018).
Li, L. et al. Disruption of TCF4 regulatory networks leads to abnormal cortical development and mental disabilities. Mol. Psychiatry 24, 1235–1246 (2019).
Mbarek, H. et al. Genome-wide significance for PCLO as a gene for major depressive disorder. Twin Res. Hum. Genet. 20, 267–270 (2017).
Ciarimboli, G. et al. Proximal tubular secretion of creatinine by organic cation transporter OCT2 in cancer patients. Clin. Cancer Res. 18, 1101–1108 (2012).
Zhang, X. et al. Tubular secretion of creatinine and kidney function: an observational study. BMC Nephrol. 21, 108 (2020).
Cui, C., König, J., Leier, I., Buchholz, U. & Keppler, D. Hepatic uptake of bilirubin and its conjugates by the human organic anion transporter SLC21A6. J. Biol. Chem. 276, 9626–9630 (2001).
Wang, X., Chowdhury, J. R. & Chowdhury, N. R. Bilirubin metabolism: applied physiology. Curr. Paediatr. 16, 70–74 (2006).
Barth, A. S. & Tomaselli, G. F. Cardiac metabolism and arrhythmias. Circ. Arrhythm. Electrophysiol. 2, 327–335 (2009).
Yamazaki, T. & Mukouyama, Y. Tissue specific origin, development, and pathological perspectives of pericytes. Front. Cardiovasc. Med. 5, 78 (2018).
Deckers, J., Hammad, H. & Hoste, E. Langerhans cells: sensing the environment in health and disease. Front. Immunol. 9, 93 (2018).
Hsieh, K. H., Chou, C. C. & Huang, S. F. Interleukin 2 therapy in severe atopic dermatitis. J. Clin. Immunol. 11, 22–28 (1991).
Kuleshov, M. V. et al. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 44, W90–W97 (2016).
Attie, A. D. & Scherer, P. E. Adipocyte metabolism and obesity. J. Lipid Res. 50, S395–S399 (2009).
Xia, B. Adipose tissue deficiency of hormone-sensitive lipase causes fatty liver in mice. PLoS Genet. 13, e1007110 (2017).
Rossi, S. et al. Inflammation inhibits GABA transmission in multiple sclerosis. Mult. Scler. 18, 1633–1635 (2012).
Cannella, B. et al. The neuregulin, glial growth factor 2, diminishes autoimmune demyelination and enhances remyelination in a chronic relapsing model for multiple sclerosis. Proc. Natl Acad. Sci. USA 95, 10100–10105 (1998).
Horstmann, L. et al. Inflammatory demyelination induces glia alterations and ganglion cell loss in the retina of an experimental autoimmune encephalomyelitis model. J. Neuroinflammation 10, 120 (2013).
Healy, L. M. et al. MerTK-mediated regulation of myelin phagocytosis by macrophages generated from patients with MS. Neurol. Neuroimmunol. Neuroinflamm. 4, e402 (2017).
Cignarella, F. et al. TREM2 activation on microglia promotes myelin debris clearance and remyelination in a model of multiple sclerosis. Acta Neuropathol. 140, 513–534 (2020).
Hemonnot, A.-L., Hua, J., Ulmann, L. & Hirbec, H. Microglia in Alzheimer disease: well-known targets and new opportunities. Front. Aging Neurosci. 11, 233 (2019).
Cromer, W. E., Mathis, J. M., Granger, D. N., Chaitanya, G. V. & Alexander, J. S. Role of the endothelium in inflammatory bowel diseases. World J. Gastroenterol. 17, 578–593 (2011).
Ruder, B., Atreya, R. & Becker, C. Tumour necrosis factor alpha in intestinal homeostasis and gut related diseases. Int. J. Mol. Sci. 20, 1887 (2019).
Graham, D. B. & Xavier, R. J. Pathway paradigms revealed from the genetics of inflammatory bowel disease. Nature 578, 527–539 (2020).
Bianco, A. M., Girardelli, M. & Tommasini, A. Genetics of inflammatory bowel disease from multifactorial to monogenic forms. World J. Gastroenterol. 21, 12296–12310 (2015).
Dixit, A. et al. Perturb-seq: dissecting molecular circuits with scalable single cell RNA profiling of pooled genetic screens. Cell 167, 1853–1866.e17 (2016).
Jin, X. et al. In vivo Perturb-Seq reveals neuronal and glial abnormalities associated with autism risk genes. Science 370, eaaz6063 (2020).
Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 19, 15 (2018).
Korsunsky, I. et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods 16, 1289–1296 (2019).
Traag, V. A., Waltman, L. & van Eck, N. J. From Louvain to Leiden: guaranteeing well-connected communities. Sci. Rep. 9, 5233 (2019).
McInnes, L., Healy, J. & Melville, J. UMAP: Uniform Manifold Approximation and Projection for dimension reduction. Preprint at https://doi.org/10.48550/arXiv.1802.03426 (2020).
Lee, D. D. & Seung, H. S. Learning the parts of objects by non-negative matrix factorization. Nature 401, 788–791 (1999).
Auton, A. et al. A global reference for human genetic variation. Nature 526, 68–74 (2015).
Kundaje, A. et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
Nasser, J. et al. Genome-wide enhancer maps link risk variants to disease genes. Nature 593, 238–243 (2021).
Dey, K. K. et al. SNP-to-gene linking strategies reveal contributions of enhancer-related and candidate master-regulator genes to autoimmune disease. Cell Genomics 2, 100145 (2022).
Hormozdiari, F. et al. Leveraging molecular quantitative trait loci to understand the genetic architecture of diseases and complex traits. Nat. Genet. 50, 1041–1047 (2018).
Storey, J. D. The positive false discovery rate: a Bayesian interpretation and the q-value. Ann. Stat. 31, 2013–2035 (2003).
van de Geijn, B. et al. Annotations capturing cell type-specific TF binding explain a large fraction of disease heritability. Hum. Mol. Genet. 29, 1057–1067 (2020).
The COVID-19 Host Genetics Initiative. The COVID-19 Host Genetics Initiative, a global initiative to elucidate the role of host genetic factors in susceptibility and severity of the SARS-CoV-2 virus pandemic. Eur. J. Hum. Genet. 28, 715–718 (2020). https://doi.org/10.1038/s41431-020-0636-6
Jagadeesh, K., Dey, K. K. & Mohan, R. karthikj89/scgenetics: v1.0.0. Zenodo https://doi.org/10.5281/zenodo.6516048 (2022).
Dey, K. K. & Jagadeesh, K. A. kkdey/GSSG: sclinker_NatGenet. Zenodo https://doi.org/10.5281/zenodo.6513166 (2022).
Acknowledgements
We thank L. Gaffney for assistance with preparing figures as well as S. Chen, C. Smillie, B. Eraslan, A. Jaiswal and the entire groups of A.L.P. and A.R. for helpful scientific discussions. This work was funded through the National Institutes of Health (NIH) F32 Fellowship (to K.A.J.), NIH Pathway to Independence K99/R00 award K99HG012203 (to K.K.D.), NHGRI Genomic Innovator award (R35HG011324), by Gordon and Betty Moore, the BASE Research Initiative at the Lucile Packard Children’s Hospital at Stanford University, NIH Pathway to Independence award (R00HG009917) (to J.M.E.), NIH grants (nos. U01 HG009379, R01 MH101244, R37 MH107649, R01 HG006399, R01 MH115676 and R01 MH109978) to A.L.P. and Klarman Cell Observatory, HHMI, the Manton Foundation and NIH grant (no. 5U24AI118672) to A.R. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
Author information
Contributions
K.A.J., K.K.D., A.L.P. and A.R. designed the study. K.A.J. and K.K.D. developed statistical methodologies and performed all computational analyses. A.L.P. and A.R. provided expert guidance and feedback on analysis and results. D.T.M. interpreted biological signals and guided K.A.J. and K.K.D. on highlighting biological insights. K.A.J. and R.M. designed and developed the web interface to visualize the results. J.M.E. provided ABC mappings. S.G. provided guidance on enhancer–gene-linking strategies. R.J.X. provided guidance on biological interpretations. K.A.J., K.K.D., A.L.P. and A.R. wrote the manuscript with detailed input from D.T.M. and feedback from all authors.
Ethics declarations
Competing interests
A.R. is a co-founder and equity holder of Celsius Therapeutics and an equity holder in Immunitas and was an SAB member of Thermo Fisher Scientific, Syros Pharmaceuticals, Neogene Therapeutics and Asimov. From 1 August 2020, A.R. has been an employee of Genentech. The remaining authors declare no competing interests.
Peer review
Peer review information
Nature Genetics thanks Danielle Posthuma, Yukinori Okada, Rachel Brouwer and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Single-cell RNA-seq datasets.
UMAP embedding of scRNA-seq profiles (dots) colored by cell type annotations from 12 datasets (labels on top).
Extended Data Fig. 2 Standardized effect sizes of immune and brain cell type programs.
Standardized effect size (τ*) (dot size) and significance (-log10(P-value), dot color) of the heritability enrichment of immune (a,b) or brain (c) cell type programs (columns) for blood cell traits (a), immune disease traits (b), or neurological/psychological related traits (c), based on SNP annotations generated with the Roadmap∪ABC-immune (a,b) or Roadmap∪ABC-brain (c) enhancer-gene linking strategy. Numerical results are reported in Supplementary Data 1. Details for all traits analyzed are in Supplementary Table 2.
Extended Data Fig. 3 Linking cell type programs to diseases and traits across all analyzed tissues.
Magnitude (E-score, dot size) and significance (-log10(P-value), dot color) of the heritability enrichment of cell type programs (columns) from each of nine tissues (color code, legend) for GWAS summary statistics of diverse traits and diseases (rows), based on the Roadmap∪ABC enhancer-gene linking strategy for the corresponding tissue. Details for all traits analyzed are in Supplementary Table 2. See Data Availability for higher resolution version of this figure.
Extended Data Fig. 4 Cross trait analysis of cell type enrichments.
Pearson correlation coefficient (colorbar) between the cell type enrichment profiles of each pair of traits (rows, columns), clustered (dashed lines) hierarchically. Trait clusters labeled by their overall cell type enrichments.
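The pairwise comparison described in this legend can be illustrated with a small sketch that computes the Pearson correlation between the cell-type enrichment profiles of each pair of traits. It is an illustration under assumptions rather than the analysis code: the names `pearson` and `correlation_matrix`, the dictionary layout (trait name mapped to a list of per-cell-type enrichment values in a fixed order) and the toy numbers are hypothetical, and the hierarchical clustering step is not shown.

```python
import math
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("vectors must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(profiles: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Correlate every pair of traits; profiles maps trait -> enrichment vector."""
    traits = list(profiles)
    return {
        a: {b: pearson(profiles[a], profiles[b]) for b in traits}
        for a in traits
    }


if __name__ == "__main__":
    # Toy enrichment profiles over three cell types (made-up values).
    toy = {"trait_1": [1.0, 2.0, 3.0], "trait_2": [2.0, 4.0, 6.0], "trait_3": [3.0, 2.0, 1.0]}
    for trait, row in correlation_matrix(toy).items():
        print(trait, row)
```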
Extended Data Fig. 5 Linking cellular process programs to relevant diseases and traits in each of six tissues.
Magnitude (E-score, dot size) and significance (-log10(P-value), dot color) of the heritability enrichment of cellular process programs (columns; obtained by NMF) in each of seven tissues (label on top) for traits relevant in that tissue (rows) using the Roadmap∪ABC strategy for the corresponding tissue. Details for all traits analyzed are in Supplementary Table 2.
Extended Data Fig. 6 Analysis of cell type programs using a non-tissue-specific enhancer-gene linking strategy.
Magnitude (E-score, dot size) and significance (-log10(P-value), dot color) of the heritability enrichment of immune (a), brain (b), lung (c), heart (d), colon (e), adipose (f) and skin (g) cell type programs (columns) for traits relevant in that tissue (rows) using a non-tissue-specific Roadmap∪ABC strategy. Details for all traits analyzed are in Supplementary Table 2.
Extended Data Fig. 7 Disease-dependent programs have low correlations with healthy and disease cell type programs.
Pearson correlation coefficient (color bar) of gene program membership vectors between healthy cell type, disease cell type and disease-dependent programs in scRNA-seq studies from a disease tissue (label on top) and the corresponding healthy tissue.
Extended Data Fig. 8 Disease specificity of disease-dependent programs.
Proportion of disease-dependent programs with a -log10(P-value) of enrichment score (p.E-score) > 3 in IBD, MS and asthma GWAS summary statistics (column) for disease-dependent programs from IBD, MS and asthma (columns), when combined with tissue-specific Roadmap∪ABC (row).
Extended Data Fig. 9 Analysis of disease-dependent programs using alternative Roadmap∪ABC enhancer-gene linking strategies.
Magnitude (E-score, dot size) and significance (-log10(P-value), dot color) of the heritability enrichment of disease-dependent programs (columns) in UC (colon cells) using Roadmap∪ABC-immune (a), asthma (lung cells) using Roadmap∪ABC-immune (b), and MS (brain cells) using Roadmap∪ABC-brain (c). Details for all traits analyzed are in Supplementary Table 2.
Extended Data Fig. 10 Analysis of disease-dependent programs across all tissues and traits.
Magnitude (E-score, dot size) and significance (-log10(P-value), dot color) of the heritability enrichment of disease-dependent programs (columns) from UC, MS, Alzheimer’s, asthma and pulmonary fibrosis (labels on top, color code, legend), for GWAS summary statistics of diverse traits and diseases (rows), based on the Roadmap∪ABC enhancer-gene linking strategy for the corresponding tissue. Details for all traits analyzed are in Supplementary Table 2. See Data Availability for higher resolution version of this figure.
Supplementary information
Supplementary Information
Supplementary Note, Tables 1 and 2, and Figs. 1–7.
Supplementary Data 1
Healthy cell-type program heritability enrichment results. Numerical values for E-score and significance are reported for all cell-type programs and traits analyzed.
Supplementary Data 2
Disease-dependent program heritability enrichment results. Numerical values for E-score and significance are reported for all disease-dependent programs and traits analyzed.
Supplementary Data 3
Cellular process program heritability enrichment results. Numerical values for E-score and significance are reported for all healthy, disease and shared cellular processes and traits analyzed.
Supplementary Data 4
List of genes driving each enrichment. Up to 50 genes with the strongest MAGMA gene score and membership in the gene program.
Supplementary Data 5
Heritability enrichment results from eQTL, PCHi-C and other alternative enhancer–gene-linking strategies. Numerical values for E-score and significance are reported for all traits analyzed with alternative enhancer–gene-linking strategies.
Supplementary Data 6
Heritability enrichment results from alternative approaches for constructing cell-type gene programs. Numerical values for E-score and significance are reported for all traits analyzed with the alternative cell-type programs.
Supplementary Data 7
MAGMA analysis with alternative input representations. Sensitivity/specificity index, s.e.m., average sensitivity and average specificity for various binarization thresholds (0.20–0.95) and continuous variable approaches (probability scale or −log(odds) of the probability scale), for the analysis of both five blood cell traits and four major categories of diseases/traits.
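The two families of input representation mentioned here (binarized versus continuous) can be sketched as follows for a gene program whose scores lie on a probability scale. This is a hedged illustration only: the function names `binarize` and `neg_log_odds` are hypothetical, the use of `>=` at the threshold and the clipping constant `eps` are assumptions, and −log(odds) is taken literally as −log(p / (1 − p)).

```python
import math
from typing import Dict


def binarize(scores: Dict[str, float], threshold: float) -> Dict[str, int]:
    """Map each gene score to 1 if it meets the threshold, else 0.

    Using >= (rather than >) at the boundary is an assumption.
    """
    return {gene: int(score >= threshold) for gene, score in scores.items()}


def neg_log_odds(p: float, eps: float = 1e-6) -> float:
    """Return -log(p / (1 - p)), clipping p away from 0 and 1.

    The clipping constant eps is an illustrative choice that keeps the
    transform finite for scores of exactly 0 or 1.
    """
    p = min(max(p, eps), 1.0 - eps)
    return -math.log(p / (1.0 - p))


if __name__ == "__main__":
    program = {"GENE_A": 0.95, "GENE_B": 0.50, "GENE_C": 0.10}  # made-up scores
    for threshold in (0.20, 0.95):
        print(threshold, binarize(program, threshold))
    print({gene: round(neg_log_odds(p), 3) for gene, p in program.items()})
```

Sweeping the threshold over a grid such as 0.20 to 0.95 yields one binary gene set per threshold, whereas the continuous forms keep a single real-valued covariate per gene.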
Supplementary Data 8
FUMA enrichments for blood cell traits and immune cell-type programs. Numerical values for beta, s.e.m. and P value for all cell types and traits analyzed.
Supplementary Data 9
MAGMA GSEA results for all cell-type programs. MAGMA scores across all traits analyzed.
Supplementary Data 10
Pathway enrichment analysis for each disease-dependent program. Gene overlap, P value and gene list for each of the enriched pathway ontology terms across KEGG, Wikipathways and Reactome.
Supplementary Data 11
Composition of cell types in each tissue. Proportion of cells observed for each cell type and condition in each of the single-cell datasets.
Supplementary Data 12
Correlation between disease-dependent and healthy cell-type program.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Jagadeesh, K.A., Dey, K.K., Montoro, D.T. et al. Identifying disease-critical cell types and cellular processes by integrating single-cell RNA-sequencing and human genetics. Nat Genet 54, 1479–1492 (2022). https://doi.org/10.1038/s41588-022-01187-9
DOI: https://doi.org/10.1038/s41588-022-01187-9