Translation efficiency covariation identifies conserved coordination patterns across cell types

Abstract

Characterizing shared patterns of RNA expression between genes across conditions has led to the discovery of regulatory networks and biological functions. However, it is unclear if such coordination extends to translation. In this study, we uniformly analyze 3,819 ribosome profiling datasets from 117 human and 94 mouse tissues and cell lines. We introduce the concept of translation efficiency covariation (TEC), identifying coordinated translation patterns across cell types. We nominate candidate mechanisms driving shared patterns of translation regulation. TEC is conserved across human and mouse cells and uncovers gene functions that are not evident from RNA or protein co-expression. Moreover, our observations indicate that proteins that physically interact are highly enriched for positive covariation at both translational and transcriptional levels. Our findings establish TEC as a conserved organizing principle of mammalian transcriptomes. TEC has potential as a predictive marker for gene function and may offer a framework for designing gene expression systems in synthetic biology and biotechnological applications.

Fig. 1: RiboBase: a comprehensive ribosome profiling database with thousands of experiments.
Fig. 2: TE defined using a compositional linear regression model is conserved across cell types and species.
Fig. 3: TE covariation is conserved between human and mouse.
Fig. 4: Genes associated with certain biological functions exhibit higher similarity patterns in TE than in RNA expression.
Fig. 5: TEC enables the prediction of gene functions.
Fig. 6: Physically interacting proteins display TEC.

Data availability

Metadata about RiboBase can be found in Supplementary Table 1. Ribo files for the HeLa cell line are accessible via Zenodo at https://doi.org/10.5281/zenodo.15660080 (ref. 103). Full TEC and RNA co-expression matrices are accessible via Zenodo at https://doi.org/10.5281/zenodo.10373032 (ref. 127). A RiboFlow configuration file and processed ribo files for RBP knockout can be accessed via Zenodo at https://doi.org/10.5281/zenodo.11388478 (ref. 135). Sequencing data and ribo files for the RBP knockout experiments are available under GEO accession code GSE269734.

Code availability

The code and data used in this study are available via Zenodo at https://doi.org/10.5281/zenodo.10373032 (ref. 127) and via GitHub at https://github.com/CenikLab/TE_model. The code and data used to generate figures can be found via Zenodo at https://doi.org/10.5281/zenodo.15337774 (ref. 136) and via GitHub at https://github.com/CenikLab/coTE_paper.

References

  1. Tang, F. et al. mRNA-Seq whole-transcriptome analysis of a single cell. Nat. Methods 6, 377–382 (2009).

  2. Nagalakshmi, U. et al. The transcriptional landscape of the yeast genome defined by RNA sequencing. Science 320, 1344–1349 (2008).

  3. Mortazavi, A., Williams, B. A., McCue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat. Methods 5, 621–628 (2008).

  4. Schena, M., Shalon, D., Davis, R. W. & Brown, P. O. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 270, 467–470 (1995).

  5. Chen, K. H., Boettiger, A. N., Moffitt, J. R., Wang, S. & Zhuang, X. RNA imaging. Spatially resolved, highly multiplexed RNA profiling in single cells. Science 348, aaa6090 (2015).

  6. Combs, P. A. & Eisen, M. B. Sequencing mRNA from cryo-sliced Drosophila embryos to determine genome-wide spatial patterns of gene expression. PLoS ONE 8, e71820 (2013).

  7. Achim, K. et al. High-throughput spatial mapping of single-cell RNA-seq data to tissue of origin. Nat. Biotechnol. 33, 503–509 (2015).

  8. Langfelder, P. & Horvath, S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9, 559 (2008).

  9. Eisen, M. B., Spellman, P. T., Brown, P. O. & Botstein, D. Cluster analysis and display of genome-wide expression patterns. Proc. Natl Acad. Sci. USA 95, 14863–14868 (1998).

  10. Skinnider, M. A., Squair, J. W. & Foster, L. J. Evaluating measures of association for single-cell transcriptomics. Nat. Methods 16, 381–386 (2019).

  11. Stuart, J. M., Segal, E., Koller, D. & Kim, S. K. A gene-coexpression network for global discovery of conserved genetic modules. Science 302, 249–255 (2003).

  12. Marcotte, E. M., Pellegrini, M., Thompson, M. J., Yeates, T. O. & Eisenberg, D. A combined algorithm for genome-wide prediction of protein function. Nature 402, 83–86 (1999).

  13. DeRisi, J. L., Iyer, V. R. & Brown, P. O. Exploring the metabolic and genetic control of gene expression on a genomic scale. Science 278, 680–686 (1997).

  14. Jansen, R., Greenbaum, D. & Gerstein, M. Relating whole-genome expression data with protein–protein interactions. Genome Res. 12, 37–46 (2002).

  15. Szklarczyk, D. et al. The STRING database in 2023: protein–protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Res. 51, D638–D646 (2023).

  16. Tavazoie, S., Hughes, J. D., Campbell, M. J., Cho, R. J. & Church, G. M. Systematic determination of genetic network architecture. Nat. Genet. 22, 281–285 (1999).

  17. Roth, F. P., Hughes, J. D., Estep, P. W. & Church, G. M. Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantitation. Nat. Biotechnol. 16, 939–945 (1998).

  18. Nusinow, D. P. et al. Quantitative proteomics of the Cancer Cell Line Encyclopedia. Cell 180, 387–402 (2020).

  19. Gonçalves, E. et al. Pan-cancer proteomic map of 949 human cell lines. Cancer Cell 40, 835–849 (2022).

  20. Ryan, C. J., Kennedy, S., Bajrami, I., Matallanas, D. & Lord, C. J. A compendium of co-regulated protein complexes in breast cancer reveals collateral loss events. Cell Syst. 5, 399–409 (2017).

  21. Singh, G., Pratt, G., Yeo, G. W. & Moore, M. J. The clothes make the mRNA: past and present trends in mRNP fashion. Annu. Rev. Biochem. 84, 325–354 (2015).

  22. Keene, J. D. & Tenenbaum, S. A. Eukaryotic mRNPs may represent posttranscriptional operons. Mol. Cell 9, 1161–1167 (2002).

  23. Keene, J. D. RNA regulons: coordination of post-transcriptional events. Nat. Rev. Genet. 8, 533–543 (2007).

  24. Li, G.-W., Burkhardt, D., Gross, C. & Weissman, J. S. Quantifying absolute protein synthesis rates reveals principles underlying allocation of cellular resources. Cell 157, 624–635 (2014).

  25. Taggart, J. C. & Li, G.-W. Production of protein-complex components is stoichiometric and lacks general feedback regulation in eukaryotes. Cell Syst. 7, 580–589 (2018).

  26. Amirbeigiarab, S. et al. Invariable stoichiometry of ribosomal proteins in mouse brain tissues with aging. Proc. Natl Acad. Sci. USA 116, 22567–22572 (2019).

  27. Soto, I. et al. Balanced mitochondrial and cytosolic translatomes underlie the biogenesis of human respiratory complexes. Genome Biol. 23, 170 (2022).

  28. Natan, E. et al. Cotranslational protein assembly imposes evolutionary constraints on homomeric proteins. Nat. Struct. Mol. Biol. 25, 279–288 (2018).

  29. Li, G.-W., Oh, E. & Weissman, J. S. The anti-Shine–Dalgarno sequence drives translational pausing and codon choice in bacteria. Nature 484, 538–541 (2012).

  30. Bertolini, M. et al. Interactions between nascent proteins translated by adjacent ribosomes drive homomer assembly. Science 371, 57–64 (2021).

  31. Ozadam, H., Geng, M. & Cenik, C. RiboFlow, RiboR and RiboPy: an ecosystem for analyzing ribosome profiling data at read length resolution. Bioinformatics 36, 2929–2931 (2020).

  32. Gerashchenko, M. V. & Gladyshev, V. N. Ribonuclease selection for ribosome profiling. Nucleic Acids Res. 45, e6 (2017).

  33. Mohammad, F., Green, R. & Buskirk, A. R. A systematically-revised ribosome profiling method for bacteria reveals pauses at single-codon resolution. eLife 8, e42591 (2019).

  34. Ingolia, N. T., Ghaemmaghami, S., Newman, J. R. S. & Weissman, J. S. Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science 324, 218–223 (2009).

  35. Larsson, O., Sonenberg, N. & Nadon, R. Identification of differential translation in genome wide studies. Proc. Natl Acad. Sci. USA 107, 21487–21492 (2010).

  36. van den Boogaart, K. G., Filzmoser, P., Hron, K., Templ, M. & Tolosana-Delgado, R. Classical and robust regression analysis with compositional data. Math. Geosci. 53, 823–858 (2021).

  37. Quinn, T. P. et al. A field guide for the compositional analysis of any-omics data. Gigascience 8, giz107 (2019).

  38. Quinn, T. P., Richardson, M. F., Lovell, D. & Crowley, T. M. propr: an R-package for identifying proportionally abundant features using compositional data analysis. Sci. Rep. 7, 16252 (2017).

  39. Sudmant, P. H., Alexis, M. S. & Burge, C. B. Meta-analysis of RNA-seq expression data across species, tissues and studies. Genome Biol. 16, 287 (2015).

  40. Wang, Z.-Y. et al. Transcriptome and translatome co-evolution in mammals. Nature 588, 642–647 (2020).

  41. Lu, P., Takai, K., Weaver, V. M. & Werb, Z. Extracellular matrix degradation and remodeling in development and disease. Cold Spring Harb. Perspect. Biol. 3, a005058 (2011).

  42. Artieri, C. G. & Fraser, H. B. Evolution at two levels of gene expression in yeast. Genome Res. 24, 411–421 (2014).

  43. McManus, C. J., May, G. E., Spealman, P. & Shteyman, A. Ribosome profiling reveals post-transcriptional buffering of divergent gene expression in yeast. Genome Res. 24, 422–430 (2014).

  44. Breschi, A., Gingeras, T. R. & Guigó, R. Comparative transcriptomics in human and mouse. Nat. Rev. Genet. 18, 425–440 (2017).

  45. Crow, M., Suresh, H., Lee, J. & Gillis, J. Coexpression reveals conserved gene programs that co-vary with cell type across kingdoms. Nucleic Acids Res. 50, 4302–4314 (2022).

  46. Thoreen, C. C. et al. A unifying model for mTORC1-mediated regulation of mRNA translation. Nature 485, 109–113 (2012).

  47. Wurth, L. et al. UNR/CSDE1 drives a post-transcriptional program to promote melanoma invasion and metastasis. Cancer Cell 36, 337 (2019).

  48. Pierson, E. et al. Sharing and specificity of co-expression networks across 35 human tissues. PLoS Comput. Biol. 11, e1004220 (2015).

  49. Kershaw, C. J. et al. Translation factor and RNA binding protein mRNA interactomes support broader RNA regulons for posttranscriptional control. J. Biol. Chem. 299, 105195 (2023).

  50. Hentze, M. W., Castello, A., Schwarzl, T. & Preiss, T. A brave new world of RNA-binding proteins. Nat. Rev. Mol. Cell Biol. 19, 327–341 (2018).

  51. Liu, Y. The number of genes whose TE significantly correlates with an RBP’s expression. Zenodo https://doi.org/10.5281/zenodo.11359114 (2024).

  52. Korbel, J. O., Jensen, L. J., von Mering, C. & Bork, P. Analysis of genomic context: prediction of functional associations from conserved bidirectionally transcribed gene pairs. Nat. Biotechnol. 22, 911–917 (2004).

  53. Szklarczyk, R. et al. WeGET: predicting new genes for molecular systems by weighted co-expression. Nucleic Acids Res. 44, D567–D573 (2016).

  54. Zhang, M. et al. RNA-binding protein IMP3 is a novel regulator of MEK1/ERK signaling pathway in the progression of colorectal cancer through the stabilization of MEKK1 mRNA. J. Exp. Clin. Cancer Res. 40, 200 (2021).

  55. Bodén, M. & Bailey, T. L. Associating transcription factor-binding site motifs with target GO terms and target genes. Nucleic Acids Res. 36, 4108–4117 (2008).

  56. Machanick, P. & Bailey, T. L. MEME-ChIP: motif analysis of large DNA datasets. Bioinformatics 27, 1696–1697 (2011).

  57. Eichhorn, S. W. et al. mRNA destabilization is the dominant effect of mammalian microRNAs by the time substantial repression ensues. Mol. Cell 56, 104–115 (2014).

  58. Bartel, D. P. Metazoan microRNAs. Cell 173, 20–51 (2018).

  59. Mecham, R. The Extracellular Matrix: An Overview (Springer Science & Business Media, 2011).

  60. Kagan, H. M. & Li, W. Lysyl oxidase: properties, specificity, and biological roles inside and outside of the cell. J. Cell. Biochem. 88, 660–672 (2003).

  61. Kikuchi, A. et al. Structural basis for activation of DNMT1. Nat. Commun. 13, 7130 (2022).

  62. Wu, Y.-Y. et al. The hTERT-p50 homodimer inhibits PLEKHA7 expression to promote gastric cancer invasion and metastasis. Oncogene 42, 1144–1156 (2023).

  63. Kurita, S., Yamada, T., Rikitsu, E., Ikeda, W. & Takai, Y. Binding between the junctional proteins afadin and PLEKHA7 and implication in the formation of adherens junction in epithelial cells. J. Biol. Chem. 288, 29356–29368 (2013).

  64. Pulimeno, P., Paschoud, S. & Citi, S. A role for ZO-1 and PLEKHA7 in recruiting paracingulin to tight and adherens junctions of epithelial cells. J. Biol. Chem. 286, 16743–16750 (2011).

  65. Jeung, H.-C. et al. PLEKHA7 signaling is necessary for the growth of mutant KRAS driven colorectal cancer. Exp. Cell. Res. 409, 112930 (2021).

  66. Tavano, S. et al. Insm1 induces neural progenitor delamination in developing neocortex via downregulation of the adherens junction belt-specific protein Plekha7. Neuron 97, 1299–1314 (2018).

  67. Sukonina, V. et al. FOXK1 and FOXK2 regulate aerobic glycolysis. Nature 566, 279–283 (2019).

  68. Kobe, B. & Kajava, A. V. The leucine-rich repeat as a protein recognition motif. Curr. Opin. Struct. Biol. 11, 725–732 (2001).

  69. Evans, R. et al. Protein complex prediction with AlphaFold-Multimer. Preprint at bioRxiv https://doi.org/10.1101/2021.10.04.463034 (2021).

  70. Carlsson, P. & Mahlapuu, M. Forkhead transcription factors: key players in development and metabolism. Dev. Biol. 250, 1–23 (2002).

  71. Lambert, S. A. et al. The human transcription factors. Cell 172, 650–665 (2018).

  72. Kustatscher, G. et al. Co-regulation map of the human proteome enables identification of protein functions. Nat. Biotechnol. 37, 1361–1371 (2019).

  73. Szklarczyk, D. et al. STRING v10: protein–protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 43, D447–D452 (2015).

  74. Shiber, A. et al. Cotranslational assembly of protein complexes in eukaryotes revealed by ribosome profiling. Nature 561, 268–272 (2018).

  75. Ewing, R. M. et al. Large-scale mapping of human protein–protein interactions by mass spectrometry. Mol. Syst. Biol. 3, 89 (2007).

  76. Drew, K., Wallingford, J. B. & Marcotte, E. M. hu.MAP 2.0: integration of over 15,000 proteomic experiments builds a global compendium of human multiprotein assemblies. Mol. Syst. Biol. 17, e10016 (2021).

  77. Heider, M. R. et al. Subunit connectivity, assembly determinants and architecture of the yeast exocyst complex. Nat. Struct. Mol. Biol. 23, 59–66 (2016).

  78. Kee, Y. et al. Subunit structure of the mammalian exocyst complex. Proc. Natl Acad. Sci. USA 94, 14438–14443 (1997).

  79. Lalanne, J.-B. et al. Evolutionary convergence of pathway-specific enzyme expression stoichiometry. Cell 173, 749–761 (2018).

  80. Bicknell, A. A. et al. Attenuating ribosome load improves protein output from mRNA by limiting translation-dependent mRNA decay. Cell Rep. 43, 114098 (2024).

  81. Liu, T.-Y. et al. Time-resolved proteomics extends ribosome profiling-based measurements of protein synthesis dynamics. Cell Syst. 4, 636–644 (2017).

  82. Wang, M., Herrmann, C. J., Simonovic, M., Szklarczyk, D. & von Mering, C. Version 4.0 of PaxDb: protein abundance data, integrated across model organisms, tissues, and cell-lines. Proteomics 15, 3163–3168 (2015).

  83. Piepoli, A. et al. The expression of leucine-rich repeat gene family members in colorectal cancer. Exp. Biol. Med. 237, 1123–1128 (2012).

  84. Liu, Y. et al. Identification of differential expression of genes in hepatocellular carcinoma by suppression subtractive hybridization combined cDNA microarray. Oncol. Rep. 18, 943–951 (2007).

  85. Chen, H. et al. miR-218 contributes to drug resistance in multiple myeloma via targeting LRRC28. J. Cell. Biochem. 122, 305–314 (2021).

  86. Vander Heiden, M. G., Cantley, L. C. & Thompson, C. B. Understanding the Warburg effect: the metabolic requirements of cell proliferation. Science 324, 1029–1033 (2009).

  87. Liu, Y. et al. Histone H2AX promotes metastatic progression by preserving glycolysis via hexokinase-2. Sci. Rep. 12, 3758 (2022).

  88. Zheng, D. et al. Predicting the translation efficiency of messenger RNA in mammalian cells. Nat. Biotechnol. https://doi.org/10.1038/s41587-025-02712-x (2025).

  89. Rodriguez, J. M. et al. APPRIS: annotation of principal and alternative splice isoforms. Nucleic Acids Res. 41, D110–D117 (2013).

  90. Rao, S. et al. Genes with 5′ terminal oligopyrimidine tracts preferentially escape global suppression of translation by the SARS-CoV-2 Nsp1 protein. RNA 27, 1025–1045 (2021).

  91. Mills, E. W. & Green, R. Ribosomopathies: there’s strength in numbers. Science 358, eaan2755 (2017).

  92. Ozadam, H. et al. Single-cell quantification of ribosome occupancy in early mouse development. Nature 618, 1057–1064 (2023).

  93. VanInsberghe, M., van den Berg, J., Andersson-Rolf, A., Clevers, H. & van Oudenaarden, A. Single-cell Ribo-seq reveals cell cycle-dependent translational pausing. Nature 597, 561–565 (2021).

  94. Benoit Bouvrette, L. P., Bovaird, S., Blanchette, M. & Lécuyer, E. oRNAment: a database of putative RNA binding protein target sites in the transcriptomes of model species. Nucleic Acids Res. 48, D166–D173 (2020).

  95. Krismer, K. et al. Transite: a computational motif-based analysis platform that identifies RNA-binding proteins modulating changes in gene expression. Cell Rep. 32, 108064 (2020).

  96. Van Nostrand, E. L. et al. A large-scale binding and functional map of human RNA-binding proteins. Nature 583, 711–719 (2020).

  97. Hou, Y., Xie, T., He, L., Tao, L. & Huang, J. Topological links in predicted protein complex structures reveal limitations of AlphaFold. Commun. Biol. 6, 1098 (2023).

  98. Burke, D. F. et al. Towards a structurally resolved human protein interaction network. Nat. Struct. Mol. Biol. 30, 216–225 (2023).

  99. Bryant, P., Pozzati, G. & Elofsson, A. Improved prediction of protein-protein interactions using AlphaFold2. Nat. Commun. 13, 1265 (2022).

  100. National Center for Biotechnology Information. SRA Tools. GitHub https://github.com/ncbi/sra-tools (2018).

  101. Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 17, 10–12 (2011).

  102. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).

  103. Liu, Y. HeLa ribosome profiling data. Zenodo https://doi.org/10.5281/zenodo.15660080 (2024).

  104. Gerashchenko, M. V. & Gladyshev, V. N. Translation inhibitors cause abnormalities in ribosome profiling experiments. Nucleic Acids Res. 42, e134 (2014).

  105. Wu, C. C.-C., Zinshteyn, B., Wehner, K. A. & Green, R. High-resolution ribosome profiling defines discrete ribosome elongation states and translational regulation during cellular stress. Mol. Cell 73, 959–970 (2019).

  106. Wolin, S. L. & Walter, P. Ribosome pausing and stacking during translation of a eukaryotic mRNA. EMBO J. 7, 3559–3569 (1988).

  107. Sharma, J. et al. A small molecule that induces translational readthrough of CFTR nonsense mutations by eRF1 depletion. Nat. Commun. 12, 4358 (2021).

  108. Tukey, J. W. The future of data analysis. Ann. Math. Stat. 33, 1–67 (1962).

  109. Zhang, X.-O., Yin, Q.-F., Chen, L.-L. & Yang, L. Gene expression profiling of non-polyadenylated RNA-seq across species. Genom. Data 2, 237–241 (2014).

  110. Yang, L., Duff, M. O., Graveley, B. R., Carmichael, G. G. & Chen, L.-L. Genomewide characterization of non-polyadenylated RNAs. Genome Biol. 12, R16 (2011).

  111. van den Boogaart, K. G. & Tolosana-Delgado, R. Analyzing Compositional Data with R (Springer, 2013).

  112. Cenik, C. et al. Integrative analysis of RNA, translation, and protein levels reveals distinct regulatory variation across humans. Genome Res. 25, 1610–1621 (2015).

  113. Greenacre, M. Compositional data analysis. Annu. Rev. Stat. Appl. 8, 271–299 (2021).

  114. Ramsköld, D., Wang, E. T., Burge, C. B. & Sandberg, R. An abundance of ubiquitously expressed genes revealed by tissue transcriptome sequence data. PLoS Comput. Biol. 5, e1000598 (2009).

  115. Csárdi, G., Franks, A., Choi, D. S., Airoldi, E. M. & Drummond, D. A. Accounting for experimental noise reveals that mRNA levels, amplified by post-transcriptional processes, largely determine steady-state protein levels in yeast. PLoS Genet. 11, e1005206 (2015).

  116. Schilder, B. M. & Skene, N. G. orthogene: An R package for easy mapping of orthologous genes across hundreds of species. R package version 3.21 https://doi.org/10.18129/B9.bioc.orthogene (2022).

  117. van den Boogaart, K. G. & Tolosana-Delgado, R. ‘compositions’: a unified R package to analyze compositional data. Comput. Geosci. 34, 320–338 (2008).

  118. Kim, S. ppcor: an R package for a fast calculation to semi-partial correlation coefficients. Commun. Stat. Appl. Methods 22, 665–674 (2015).

  119. Berriz, G. F., Beaver, J. E., Cenik, C., Tasan, M. & Roth, F. P. Next generation software for functional trend analysis. Bioinformatics 25, 3043–3044 (2009).

  120. Buttrey, S. & Whitaker, L. TreeClust: an R package for tree-based clustering dissimilarities. R J. 7, 227 (2015).

  121. Wainberg, M. et al. A genome-wide atlas of co-essential modules assigns function to uncharacterized genes. Nat. Genet. 53, 638–649 (2021).

  122. Gene Ontology Consortium et al. The Gene Ontology knowledgebase in 2023. Genetics 224, iyad031 (2023).

  123. Philippe, L., van den Elzen, A. M. G., Watson, M. J. & Thoreen, C. C. Global analysis of LARP1 translation targets reveals tunable and dynamic features of 5′ TOP motifs. Proc. Natl Acad. Sci. USA 117, 5319–5328 (2020).

  124. Ballouz, S., Weber, M., Pavlidis, P. & Gillis, J. EGAD: ultra-fast functional analysis of gene networks. Bioinformatics 33, 612–614 (2017).

  125. Carlson, M. org.Mm.eg.db: Genome wide annotation for mouse. R package version 3.21 https://doi.org/10.18129/B9.bioc.org.Mm.eg.db (2025).

  126. Carlson, M. org.Hs.eg.db: Genome wide annotation for human. R package version 3.21 https://doi.org/10.18129/B9.bioc.org.Hs.eg.db (2025).

  127. Liu, Y. Intermediate data for TE calculation. Zenodo https://doi.org/10.5281/zenodo.10373032 (2024).

  128. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).

  129. The UniProt Consortium. UniProt: the Universal Protein Knowledgebase in 2023. Nucleic Acids Res. 51, D523–D531 (2023).

  130. Hu, Y. et al. Paralog Explorer: a resource for mining information about paralogs in common research organisms. Comput. Struct. Biotechnol. J. 20, 6570–6577 (2022).

  131. Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).

  132. Ho, D., Imai, K., King, G. & Stuart, E. A. MatchIt: nonparametric preprocessing for parametric causal inference. J. Stat. Softw. https://doi.org/10.18637/jss.v042.i08 (2011).

  133. Sanson, K. R. et al. Optimized libraries for CRISPR–Cas9 genetic screens with multiple modalities. Nat. Commun. 9, 5416 (2018).

  134. Sanjana, N. E., Shalem, O. & Zhang, F. Improved vectors and genome-wide libraries for CRISPR screening. Nat. Methods 11, 783–784 (2014).

  135. Liu, Y. KO_validation_RiboBase. Zenodo https://doi.org/10.5281/zenodo.11388478 (2024).

  136. Yue, L. coTE_paper: code and to generate main figures. Zenodo https://doi.org/10.5281/zenodo.15337774 (2025).

Acknowledgements

We thank all contributors to metadata curation: H. Chiang, A. Hoffman, T. Tonn, A. Segura, C. Tante, E. Vasquez and L. Xu. We also thank Y. Shin and V. D. Chapman for their help with the experiments. We appreciate M. Miladi for providing critical feedback. The original text in this paper was written by the authors. A large language model was used to suggest edits for clarity and grammar (OpenAI ChatGPT, https://chat.openai.com). The authors acknowledge the Texas Advanced Computing Center at The University of Texas at Austin (http://www.tacc.utexas.edu) for providing high-performance computing and storage resources that contributed to the research results reported within this paper.

Research reported in this publication was supported in part by the National Institute of General Medical Sciences of the National Institutes of Health (NIH) under award numbers R35GM150667 (C.C.) and R35GM138340 (E.S.C.). This work was also supported by NIH grant HD110096 (C.C.) and Welch Foundation grants F-2027-20230405 (C.C.) and F-2133-20230405 (E.S.C.). C.C. was a Cancer Prevention and Research Institute of Texas (CPRIT) Scholar in Cancer Research, supported by CPRIT grant RR180042.

Author information

Contributions

Y.L., I.H. and C.C. co-wrote the original manuscript. Y.L., I.H. and S.R. generated the figures for the manuscript. H.O., M.G. and J.C. downloaded all the data from the GEO and processed raw sequencing data. Y.L. and C.C. developed the TE calculation pipeline. J.C. and C.C. designed and implemented the winsorization method. Y.Z. performed the deduplication comparison. Y.L. and C.C. developed the TE covariation analysis and function prediction pipelines. Y.L., K.Q. and H.O. performed the quality control analysis for all sequencing data. Y.L. carried out covariation analysis, gene function prediction and AlphaFold2 analysis. I.H. conducted the RBP analysis. L.P., J.W., D.Z. and V.A. assessed the quality of TE measurements by developing machine learning approaches. H.O., J.W., D.Z., V.A., Q.Z. and E.S.C. provided suggestions for the manuscript. Y.L., Q.Z. and E.S.C. conducted the literature search to evaluate gene function predictions. S.R. conducted the experimental validation of LRRC28. S.R., I.H., V.G. and D.P. performed other experiments. C.C. provided study oversight, conceptualized the study and acquired funding. All authors approved the final manuscript.

Corresponding author

Correspondence to Can Cenik.

Ethics declarations

Competing interests

D.Z., J.W. and V.A. are employees of Sanofi and may hold shares and/or stock options in the company. H.O. is an employee of Sail Biomedicines. I.H. is an employee of Monoceros Biosystems. The remaining authors declare no competing interests.

Peer review

Peer review information

Nature Biotechnology thanks the anonymous reviewers for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Sequencing quality of ribosome profiling data.

a, Distribution of read counts for 2,195 human and 1,624 mouse ribosome profiling datasets in RiboBase. In all figure panels, the horizontal line corresponds to the median, the box represents the interquartile range (IQR) and the whiskers extend to 1.5 times the IQR. b, Distribution plot similar to panel a for 1,282 human and 995 mouse ribosome profiling datasets with matched RNA-seq. c, Distribution of the proportion of reads aligned to transcripts, reads with high-quality alignments, and reads remaining after PCR deduplication, relative to the total number of reads from panel a. d, Similar plot as panel c for ribosome profiling with matched RNA-seq. e, The read length distribution of RPFs aligned to coding sequences for all human experiments. The color in the heatmap represents the z-score adjusted RPF counts (Methods). Each experiment in which more than 70% of RPFs mapped to CDS and transcript coverage was sufficient (>= 0.1X) was annotated as QC-pass. f, Similar to panel a for mouse samples.

Extended Data Fig. 2 Quality control and RPF length selection.

a, RPFs shorter than 21 nucleotides were removed, then we identified the RPF length with the highest number of reads mapping to CDS to serve as the starting point. Subsequently, we compared the lengths one nucleotide longer and one nucleotide shorter than the current selection and added the one with more reads. This looping process continued until at least 85% of the total CDS-mapping RPFs were included. b, We compared the usable reads selected with two different boundary cutoffs (y-axis) and the proportion of these selected reads that map to the coding regions (x-axis) for each ribosome profiling experiment. c, The percentage of ribosome profiling experiments from GEO that pass or fail quality control (QC pass: more than 70% of RPFs mapping to CDS and at least 0.1X coverage of the transcript).
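The length-selection loop in panel a can be illustrated with a minimal sketch. The function name, the dictionary input and the tie-breaking rule (ties extend toward longer reads) are assumptions made for illustration and are not taken from the authors' pipeline; only the 21-nucleotide floor and the 85% target come from the legend.

```python
def select_rpf_length_range(cds_counts, min_length=21, min_fraction=0.85):
    """Greedy selection of an RPF length window (hypothetical helper).

    cds_counts maps RPF length (nt) to the number of reads mapping to CDS.
    Returns the (shortest, longest) length of the selected window.
    """
    # Remove RPFs shorter than the minimum length.
    counts = {length: n for length, n in cds_counts.items() if length >= min_length}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no CDS-mapping reads at or above min_length")
    max_length = max(counts)

    # Start from the length with the most CDS-mapping reads.
    low = high = max(counts, key=counts.get)
    covered = counts[low]

    # Extend the window by one nucleotide toward the neighbor with more reads
    # until the window holds the requested fraction of CDS-mapping reads.
    while covered / total < min_fraction:
        # A value of -1 marks a side that cannot be extended any further.
        left = counts.get(low - 1, 0) if low - 1 >= min_length else -1
        right = counts.get(high + 1, 0) if high + 1 <= max_length else -1
        if left > right:
            low -= 1
            covered += left
        else:  # assumption: ties extend toward longer reads
            high += 1
            covered += right
    return low, high


example = {20: 500, 26: 5, 27: 10, 28: 30, 29: 40, 30: 10, 31: 5}
print(select_rpf_length_range(example))  # (27, 30)
```

In the example, the 20-nucleotide reads are discarded first, the window starts at 29 nucleotides and grows to 27-30 nucleotides, which covers 90 of the 100 remaining reads.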

Extended Data Fig. 3 Assessment of periodicity and data matching for TE estimation.

a-d, In ribosome profiling experiments from RiboBase, samples were classified according to distinct periodicity patterns (Methods). For all figure panels, we added error bars to represent the standard deviation across samples. Statistical significance was assessed using the Wilcoxon test, and the p-values were subsequently adjusted for all 33 comparisons using the Benjamini-Hochberg method. We considered the Group 1 pattern as indicative of the expected three-nucleotide periodicity patterns. Human samples that pass quality control (a), human samples that fail quality control (b), mouse samples that pass quality control (c), mouse samples that fail quality control (d). e, We calculated the coefficient of determination (R²) between a specific ribosome profiling experiment and its corresponding RNA-seq from RiboBase. Additionally, we determined the average R² for all other pairings of the same ribosome profiling sample with other RNA-seq data from the same study. The matching score represents the difference between these two R² values (x-axis; Methods). f, A dashed line at 0.188 serves as the threshold to identify samples with poor matching (Methods). In each figure panel containing boxplots, the horizontal line corresponds to the median, the box represents the IQR and the whiskers extend to 1.5 times the IQR. g, Distribution of standard error of TE values across tissue and cell lines (y-axis) for genes with polyA and without polyA tails.
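The matching score in panel e can be sketched as follows. This is an illustration under stated assumptions: R² is computed here as the squared Pearson correlation of per-gene values, the function names are hypothetical, and the exact transformation applied to the counts beforehand is described in Methods and is not reproduced.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length vectors (assumed R² definition)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def matching_score(ribo, matched_rna, other_rnas):
    """R² with the annotated RNA-seq minus the mean R² with other RNA-seq from the same study."""
    own = r_squared(ribo, matched_rna)
    others = mean([r_squared(ribo, rna) for rna in other_rnas])
    return own - others


# Toy per-gene values for one ribosome profiling sample and two RNA-seq samples.
ribo = [1.0, 2.0, 3.0, 4.0]
matched = [2.0, 4.0, 6.0, 8.0]
others = [[1.0, 1.0, 2.0, 2.0]]
print(matching_score(ribo, matched, others))
```

A score near zero means the annotated RNA-seq sample fits no better than other RNA-seq samples from the same study; the legend places the cutoff for poor matching at 0.188.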

Extended Data Fig. 4 Detailed workflow of data processing for TE and TEC calculations.

a, We selected ribosome profiling data with matched RNA-seq and removed duplicated reads with identical positions and lengths (PCR-deduplication). We set the RPF read length range for individual samples with our dynamic cutoff and filtered out ribosome profiling experiments that failed quality control. After selecting high-quality samples, we reprocessed all these ribosome profiling experiments using the winsorization method with non-deduplicated data. We removed genes without polyA tails and kept genes with sufficient counts per million RPFs. After obtaining RPF counts from the coding regions for both ribosome profiling and RNA-seq, we performed CLR normalization and compositional linear regression, defining the residuals as TE for each gene in each sample. We averaged this sample-level TE based on cell lines and tissues. TEC is further calculated with rho scores (ref. 38). To build an RNA co-expression matrix, we transformed CDS counts from RNA-seq experiments using CLR, averaged them based on cell lines and tissue, and calculated pairwise proportionalities (rho scores).
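The CLR normalization, regression residuals and rho scores in this workflow can be illustrated with a simplified sketch. Two assumptions apply: ordinary least squares on CLR-transformed counts stands in for the compositional linear regression described in Methods, and rho is written in the commonly used proportionality form 1 − var(x − y) / (var(x) + var(y)). All function names are hypothetical and counts are assumed to be strictly positive.

```python
import math
from statistics import mean, variance


def clr(counts):
    """Centered log-ratio transform of strictly positive counts for one sample."""
    logs = [math.log(c) for c in counts]
    center = mean(logs)
    return [v - center for v in logs]


def te_residuals(rpf_counts, rna_counts):
    """Residuals from regressing CLR(RPF) on CLR(RNA) across genes in one sample.

    Simplified stand-in (ordinary least squares) for compositional regression.
    """
    y, x = clr(rpf_counts), clr(rna_counts)
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


def rho(u, v):
    """Proportionality between two genes' values across cell types."""
    diff = [a - b for a, b in zip(u, v)]
    return 1 - variance(diff) / (variance(u) + variance(v))


# Per-gene counts in one sample (toy values).
rpf = [10, 25, 40, 60]
rna = [5, 10, 20, 40]
print(te_residuals(rpf, rna))

# TE of two genes across four cell types (toy values).
gene_a = [0.5, -0.2, 0.1, -0.4]
gene_b = [0.4, -0.1, 0.2, -0.5]
print(rho(gene_a, gene_b))
```

Under these assumptions rho ranges from −1 to 1: a gene paired with itself gives 1, and a gene paired with its sign-flipped profile gives −1.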

Extended Data Fig. 5 Spearman correlation between TE and protein abundance.

a, The correlation between protein abundance and clr-transformed RPF counts from ribosome profiling (left), clr-transformed read counts from RNA-seq (middle), or TE calculated with winsorized RPF counts using the linear regression model (right). Individual dots indicate specific experiments colored according to study (HEK293: 68 samples from 11 studies; HeLa: 86 samples from 10 studies; U2OS: 58 samples from 4 studies; A549: 29 samples from 5 studies; MCF7: 5 samples from 2 studies; K562: 7 samples from 2 studies; HepG2: 10 samples from 2 studies). In the boxplot, the horizontal line corresponds to the median, the box represents the IQR and the whiskers extend to 1.5 times the IQR. b, TE was calculated with winsorized RPF counts without deduplication or with deduplication based on position and fragment length. The Spearman correlation coefficient between TE calculated with winsorized RPF counts and protein abundance (ref. 82) (y-axis) was plotted against “delta correlation” (x-axis), defined by subtracting the correlation values obtained with PCR deduplication from those obtained with the method using winsorized RPF counts without deduplication.

Extended Data Fig. 6 PCR vs UMI deduplication comparison for GSE144140.

a, Metagene plots centered on the start codon for samples GSM4282032 (RPF range: 28-36 nt), GSM4282033 (RPF range: 28-36 nt), and GSM4282034 (RPF range: 26-35 nt) were plotted using three different deduplication methods: non-deduplication (ND), UMI-deduplication (UMI), and PCR-deduplication (PCR). b, Correlation of gene counts for GSM4282032 between the three deduplication methods. A blue diagonal line represents a 1:1 ratio in all figure panels. c,d, Same analysis as panel b for GSM4282033 (c) and GSM4282034 (d).

Extended Data Fig. 7 Conservation of gene expression between human and mouse.

a, The relationship between the mean RNA expression (clr-transformed counts) of 9,194 orthologous genes across the two species is plotted. Dots represent genes in all figure panels. b, The variability of genes’ RNA expression was quantified with the metric standard deviation (msd; Methods) across different cell lines and tissues in either human or mouse. To account for the correlation between mean RNA expression and its variability, we adjusted the msd values with their mean values (Methods). c, The scatter plot shows the adjusted msd values (y-axis; Methods) and the average TE across different cell types (x-axis) for human genes. d, Similar analysis as in panel c for mouse genes.

Extended Data Fig. 8 Evaluation of TEC calculation methods and TEC patterns.

a, The AUROCs for biological functions were calculated using the similarity scores among genes at the ribosome occupancy level, determined by eight distinct methods with 1,794 human ribosome profiling datasets (Methods). In the boxplot, the horizontal line corresponds to the median, the box represents the IQR and the whiskers extend to the largest value within 1.5 times the IQR from the hinge. The dot in this figure represents the AUROC for human 5’ TOP mRNAs. b, For each gene, TE values were randomly reassigned from the original data (shuffled) and TEC was then calculated. In the figure panel, we plotted the number of orthologous gene pairs within specified ranges. Each dot represents the aggregated log10-transformed counts of these gene pairs. The dashed line captures 95% of the data. c, Distribution of absolute TEC among 110 TOP motif-containing mRNAs (ref. 123) and 83 transcripts targeted by CSDE1 (Supplementary Table 22 (ref. 47); Methods) in comparison to all 11,149 human genes as background. Statistical significance between the groups was assessed using a Wilcoxon two-tailed test.

Extended Data Fig. 9 TEC and RNA co-expression among genes with shared functions.

a, A comparison between the number of human GO terms that have an AUROC of 0.8 or higher with either TEC or RNA co-expression. b, Motif enrichment in human GO terms. RNA binding proteins (RBPs) from oRNAment (ref. 94) or Transite (ref. 95) are indicated. P-values were corrected using the Holm method and those kmers with a p-value < 0.05 are shown. c, Venn diagram for mouse GO terms that achieve an AUROC of 0.8 or higher with proportionality scores (rho) among genes at either the TE or the RNA expression level. d, The AUROC plot was calculated with genes associated with mannosyltransferase activity in mice. e, The connections represent absolute rho values above 0.1 in either TE pattern alone (green) from d, in both RNA co-expression and TE pattern (blue), or RNA co-expression alone (gray). f, Motif enrichment in mouse GO terms. RNA binding proteins (RBPs) from oRNAment (ref. 94) or Transite (ref. 95) are indicated. P-values were corrected using the Holm method and those kmers with a p-value < 0.05 are shown. g, We summarized GO terms where genes exhibit greater similarity at the TE level than at the RNA expression level (AUROC with TEC > 0.8, and difference in AUROC between TEC and RNA co-expression > 0.1) in mice. We visualized the distribution of absolute rho scores for gene pairs within each specific GO term (bottom; gene pairs with abs(rho) > 0.1) at the TE level.

Extended Data Fig. 10 3D structure of the interaction between LRRC28 and FOXK1.

a, AlphaFold2-multimer predicted binding between LRRC28 and FOXK1. b,c, Kinetic ECAR response of MCF-7 (b; n = 6, stable overexpression) and HEK293T (c; n = 6, stable overexpression) cell lines overexpressing LRRC28 or LRRC42 to 10 mM glucose and 100 mM 2-DG. Unpaired two-sided Student’s t-test (MCF-7: measurement 4, p = 0.06; measurement 5, p = 0.1; measurement 6, p = 0.3; HEK293T: measurement 4, p = 0.6; measurement 5, p = 0.8; measurement 6, p = 0.4). Panels b and c show mean ± s.d.; n indicates biologically independent experiments.

Supplementary information

Supplementary Information

Supplementary Information and Figs. 1–7.

Reporting Summary

Supplementary Tables 1–23

Twenty-three supplementary tables support the results in this paper. The data can also serve as source data for reproducing the analyses.

  • Supplementary Table 1: Curated metadata for human and mouse datasets in RiboBase.
  • Supplementary Table 2: Quality control of human ribosome profiling data.
  • Supplementary Table 3: Quality control of mouse ribosome profiling data.
  • Supplementary Table 4: Quality control of human RNA-seq data.
  • Supplementary Table 5: Quality control of mouse RNA-seq data.
  • Supplementary Table 6: RPF boundaries for human ribosome profiling data.
  • Supplementary Table 7: RPF boundaries for mouse ribosome profiling data.
  • Supplementary Table 8: Non-poly(A) gene list for human.
  • Supplementary Table 9: Non-poly(A) gene list for mouse.
  • Supplementary Table 10: Human linear regression-based TE.
  • Supplementary Table 11: Mouse linear regression-based TE.
  • Supplementary Table 12: Homologous genes between human and mouse.
  • Supplementary Table 13: AUROC values for human GO terms with either ribosome profiling or RNA-seq data.
  • Supplementary Table 14: AUROC values for mouse GO terms with either ribosome profiling or RNA-seq data.
  • Supplementary Table 15: Literature-supported predictions of genes exhibiting TEC with genes associated with specific human and mouse GO terms.
  • Supplementary Table 16: All predictions of a new gene exhibiting TEC with genes associated with specific human GO terms.
  • Supplementary Table 17: All predictions of a new gene exhibiting TEC with genes associated with specific mouse GO terms.
  • Supplementary Table 18: pDockQ and ipTM+PTM scores between human LRRC proteins and members of the forkhead family transcription factors.
  • Supplementary Table 19: Two-sided chi-square test on the direction of similarity (same or different) among human gene pairs from GO terms and the STRING database.
  • Supplementary Table 20: gRNA sequences for RBP validation.
  • Supplementary Table 21: Primer sequences for human LRRC28 and LRRC42.
  • Supplementary Table 22: Human 5′ TOP mRNA list and CSDE1 target gene list.
  • Supplementary Table 23: Example comparing TE calculation using the canonical log-ratio method and the linear regression approach.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, Y., Rao, S., Hoskins, I. et al. Translation efficiency covariation identifies conserved coordination patterns across cell types. Nat Biotechnol (2025). https://doi.org/10.1038/s41587-025-02718-5

  • DOI: https://doi.org/10.1038/s41587-025-02718-5
