Translation efficiency covariation identifies conserved coordination patterns across cell types

Liu, Yue; Rao, Shilpa; Hoskins, Ian; Geng, Michael; Zhao, Qiuxia; Chacko, Jonathan; Ghatpande, Vighnesh; Qi, Kangsheng; Persyn, Logan; Wang, Jun; Zheng, Dinghai; Zhong, Yochen; Park, Dayea; Sarinay Cenik, Elif; Agarwal, Vikram; Ozadam, Hakan; Cenik, Can

doi:10.1038/s41587-025-02718-5

Article
Published: 25 July 2025

Nature Biotechnology (2025)

Abstract

Characterizing shared patterns of RNA expression between genes across conditions has led to the discovery of regulatory networks and biological functions. However, it is unclear if such coordination extends to translation. In this study, we uniformly analyze 3,819 ribosome profiling datasets from 117 human and 94 mouse tissues and cell lines. We introduce the concept of translation efficiency covariation (TEC), identifying coordinated translation patterns across cell types. We nominate candidate mechanisms driving shared patterns of translation regulation. TEC is conserved across human and mouse cells and uncovers gene functions that are not evident from RNA or protein co-expression. Moreover, our observations indicate that proteins that physically interact are highly enriched for positive covariation at both translational and transcriptional levels. Our findings establish TEC as a conserved organizing principle of mammalian transcriptomes. TEC has potential as a predictive marker for gene function and may offer a framework for designing gene expression systems in synthetic biology and biotechnological applications.

**Fig. 1: RiboBase: a comprehensive ribosome profiling database with thousands of experiments.**

**Fig. 2: TE defined using a compositional linear regression model is conserved across cell types and species.**

**Fig. 3: TE covariation is conserved between human and mouse.**

**Fig. 4: Genes associated with certain biological functions exhibit higher similarity patterns in TE than in RNA expression.**

**Fig. 5: TEC enables the prediction of gene functions.**

**Fig. 6: Physically interacting proteins display TEC.**

Data availability

Metadata about RiboBase can be found in Supplementary Table 1. Ribo files for the HeLa cell line are accessible via Zenodo at https://doi.org/10.5281/zenodo.15660080 (ref. ¹⁰³). Full TEC and RNA co-expression matrices are accessible via Zenodo at https://doi.org/10.5281/zenodo.10373032 (ref. ¹²⁷). A RiboFlow configuration file and processed ribo files for RBP knockout can be accessed via Zenodo at https://doi.org/10.5281/zenodo.11388478 (ref. ¹³⁵). Sequencing data and ribo files for the RBP knockout experiments are available under GEO accession code GSE269734.

Code availability

The code and data used in this study are available via Zenodo at https://doi.org/10.5281/zenodo.10373032 (ref. ¹²⁷) and via GitHub at https://github.com/CenikLab/TE_model. The code and data used to generate figures can be found via Zenodo at https://doi.org/10.5281/zenodo.15337774 (ref. ¹³⁶) and via GitHub at https://github.com/CenikLab/coTE_paper.

References

Tang, F. et al. mRNA-Seq whole-transcriptome analysis of a single cell. Nat. Methods 6, 377–382 (2009).
Nagalakshmi, U. et al. The transcriptional landscape of the yeast genome defined by RNA sequencing. Science 320, 1344–1349 (2008).
Mortazavi, A., Williams, B. A., McCue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat. Methods 5, 621–628 (2008).
Schena, M., Shalon, D., Davis, R. W. & Brown, P. O. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 270, 467–470 (1995).
Chen, K. H., Boettiger, A. N., Moffitt, J. R., Wang, S. & Zhuang, X. RNA imaging. Spatially resolved, highly multiplexed RNA profiling in single cells. Science 348, aaa6090 (2015).
Combs, P. A. & Eisen, M. B. Sequencing mRNA from cryo-sliced Drosophila embryos to determine genome-wide spatial patterns of gene expression. PLoS ONE 8, e71820 (2013).
Achim, K. et al. High-throughput spatial mapping of single-cell RNA-seq data to tissue of origin. Nat. Biotechnol. 33, 503–509 (2015).
Langfelder, P. & Horvath, S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9, 559 (2008).
Eisen, M. B., Spellman, P. T., Brown, P. O. & Botstein, D. Cluster analysis and display of genome-wide expression patterns. Proc. Natl Acad. Sci. USA 95, 14863–14868 (1998).
Skinnider, M. A., Squair, J. W. & Foster, L. J. Evaluating measures of association for single-cell transcriptomics. Nat. Methods 16, 381–386 (2019).
Stuart, J. M., Segal, E., Koller, D. & Kim, S. K. A gene-coexpression network for global discovery of conserved genetic modules. Science 302, 249–255 (2003).
Marcotte, E. M., Pellegrini, M., Thompson, M. J., Yeates, T. O. & Eisenberg, D. A combined algorithm for genome-wide prediction of protein function. Nature 402, 83–86 (1999).
DeRisi, J. L., Iyer, V. R. & Brown, P. O. Exploring the metabolic and genetic control of gene expression on a genomic scale. Science 278, 680–686 (1997).
Jansen, R., Greenbaum, D. & Gerstein, M. Relating whole-genome expression data with protein–protein interactions. Genome Res. 12, 37–46 (2002).
Szklarczyk, D. et al. The STRING database in 2023: protein–protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Res. 51, D638–D646 (2023).
Tavazoie, S., Hughes, J. D., Campbell, M. J., Cho, R. J. & Church, G. M. Systematic determination of genetic network architecture. Nat. Genet. 22, 281–285 (1999).
Roth, F. P., Hughes, J. D., Estep, P. W. & Church, G. M. Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantitation. Nat. Biotechnol. 16, 939–945 (1998).
Nusinow, D. P. et al. Quantitative proteomics of the Cancer Cell Line Encyclopedia. Cell 180, 387–402 (2020).
Gonçalves, E. et al. Pan-cancer proteomic map of 949 human cell lines. Cancer Cell 40, 835–849 (2022).
Ryan, C. J., Kennedy, S., Bajrami, I., Matallanas, D. & Lord, C. J. A compendium of co-regulated protein complexes in breast cancer reveals collateral loss events. Cell Syst. 5, 399–409 (2017).
Singh, G., Pratt, G., Yeo, G. W. & Moore, M. J. The clothes make the mRNA: past and present trends in mRNP fashion. Annu. Rev. Biochem. 84, 325–354 (2015).
Keene, J. D. & Tenenbaum, S. A. Eukaryotic mRNPs may represent posttranscriptional operons. Mol. Cell 9, 1161–1167 (2002).
Keene, J. D. RNA regulons: coordination of post-transcriptional events. Nat. Rev. Genet. 8, 533–543 (2007).
Li, G.-W., Burkhardt, D., Gross, C. & Weissman, J. S. Quantifying absolute protein synthesis rates reveals principles underlying allocation of cellular resources. Cell 157, 624–635 (2014).
Taggart, J. C. & Li, G.-W. Production of protein-complex components is stoichiometric and lacks general feedback regulation in eukaryotes. Cell Syst. 7, 580–589 (2018).
Amirbeigiarab, S. et al. Invariable stoichiometry of ribosomal proteins in mouse brain tissues with aging. Proc. Natl Acad. Sci. USA 116, 22567–22572 (2019).
Soto, I. et al. Balanced mitochondrial and cytosolic translatomes underlie the biogenesis of human respiratory complexes. Genome Biol. 23, 170 (2022).
Natan, E. et al. Cotranslational protein assembly imposes evolutionary constraints on homomeric proteins. Nat. Struct. Mol. Biol. 25, 279–288 (2018).
Li, G.-W., Oh, E. & Weissman, J. S. The anti-Shine–Dalgarno sequence drives translational pausing and codon choice in bacteria. Nature 484, 538–541 (2012).
Bertolini, M. et al. Interactions between nascent proteins translated by adjacent ribosomes drive homomer assembly. Science 371, 57–64 (2021).
Ozadam, H., Geng, M. & Cenik, C. RiboFlow, RiboR and RiboPy: an ecosystem for analyzing ribosome profiling data at read length resolution. Bioinformatics 36, 2929–2931 (2020).
Gerashchenko, M. V. & Gladyshev, V. N. Ribonuclease selection for ribosome profiling. Nucleic Acids Res. 45, e6 (2017).
Mohammad, F., Green, R. & Buskirk, A. R. A systematically-revised ribosome profiling method for bacteria reveals pauses at single-codon resolution. eLife 8, e42591 (2019).
Ingolia, N. T., Ghaemmaghami, S., Newman, J. R. S. & Weissman, J. S. Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science 324, 218–223 (2009).
Larsson, O., Sonenberg, N. & Nadon, R. Identification of differential translation in genome wide studies. Proc. Natl Acad. Sci. USA 107, 21487–21492 (2010).
van den Boogaart, K. G., Filzmoser, P., Hron, K., Templ, M. & Tolosana-Delgado, R. Classical and robust regression analysis with compositional data. Math. Geosci. 53, 823–858 (2021).
Quinn, T. P. et al. A field guide for the compositional analysis of any-omics data. Gigascience 8, giz107 (2019).
Quinn, T. P., Richardson, M. F., Lovell, D. & Crowley, T. M. propr: an R-package for identifying proportionally abundant features using compositional data analysis. Sci. Rep. 7, 16252 (2017).
Sudmant, P. H., Alexis, M. S. & Burge, C. B. Meta-analysis of RNA-seq expression data across species, tissues and studies. Genome Biol. 16, 287 (2015).
Wang, Z.-Y. et al. Transcriptome and translatome co-evolution in mammals. Nature 588, 642–647 (2020).
Lu, P., Takai, K., Weaver, V. M. & Werb, Z. Extracellular matrix degradation and remodeling in development and disease. Cold Spring Harb. Perspect. Biol. 3, a005058 (2011).
Artieri, C. G. & Fraser, H. B. Evolution at two levels of gene expression in yeast. Genome Res. 24, 411–421 (2014).
McManus, C. J., May, G. E., Spealman, P. & Shteyman, A. Ribosome profiling reveals post-transcriptional buffering of divergent gene expression in yeast. Genome Res. 24, 422–430 (2014).
Breschi, A., Gingeras, T. R. & Guigó, R. Comparative transcriptomics in human and mouse. Nat. Rev. Genet. 18, 425–440 (2017).
Crow, M., Suresh, H., Lee, J. & Gillis, J. Coexpression reveals conserved gene programs that co-vary with cell type across kingdoms. Nucleic Acids Res. 50, 4302–4314 (2022).
Thoreen, C. C. et al. A unifying model for mTORC1-mediated regulation of mRNA translation. Nature 485, 109–113 (2012).
Wurth, L. et al. UNR/CSDE1 drives a post-transcriptional program to promote melanoma invasion and metastasis. Cancer Cell 36, 337 (2019).
Pierson, E. et al. Sharing and specificity of co-expression networks across 35 human tissues. PLoS Comput. Biol. 11, e1004220 (2015).
Kershaw, C. J. et al. Translation factor and RNA binding protein mRNA interactomes support broader RNA regulons for posttranscriptional control. J. Biol. Chem. 299, 105195 (2023).
Hentze, M. W., Castello, A., Schwarzl, T. & Preiss, T. A brave new world of RNA-binding proteins. Nat. Rev. Mol. Cell Biol. 19, 327–341 (2018).
Liu, Y. The number of genes whose TE significantly correlates with an RBP’s expression. Zenodo https://doi.org/10.5281/zenodo.11359114 (2024).
Korbel, J. O., Jensen, L. J., von Mering, C. & Bork, P. Analysis of genomic context: prediction of functional associations from conserved bidirectionally transcribed gene pairs. Nat. Biotechnol. 22, 911–917 (2004).
Szklarczyk, R. et al. WeGET: predicting new genes for molecular systems by weighted co-expression. Nucleic Acids Res. 44, D567–D573 (2016).
Zhang, M. et al. RNA-binding protein IMP3 is a novel regulator of MEK1/ERK signaling pathway in the progression of colorectal cancer through the stabilization of MEKK1 mRNA. J. Exp. Clin. Cancer Res. 40, 200 (2021).
Bodén, M. & Bailey, T. L. Associating transcription factor-binding site motifs with target GO terms and target genes. Nucleic Acids Res. 36, 4108–4117 (2008).
Machanick, P. & Bailey, T. L. MEME-ChIP: motif analysis of large DNA datasets. Bioinformatics 27, 1696–1697 (2011).
Eichhorn, S. W. et al. mRNA destabilization is the dominant effect of mammalian microRNAs by the time substantial repression ensues. Mol. Cell 56, 104–115 (2014).
Bartel, D. P. Metazoan microRNAs. Cell 173, 20–51 (2018).
Mecham, R. The Extracellular Matrix: An Overview (Springer Science & Business Media, 2011).
Kagan, H. M. & Li, W. Lysyl oxidase: properties, specificity, and biological roles inside and outside of the cell. J. Cell. Biochem. 88, 660–672 (2003).
Kikuchi, A. et al. Structural basis for activation of DNMT1. Nat. Commun. 13, 7130 (2022).
Wu, Y.-Y. et al. The hTERT-p50 homodimer inhibits PLEKHA7 expression to promote gastric cancer invasion and metastasis. Oncogene 42, 1144–1156 (2023).
Kurita, S., Yamada, T., Rikitsu, E., Ikeda, W. & Takai, Y. Binding between the junctional proteins afadin and PLEKHA7 and implication in the formation of adherens junction in epithelial cells. J. Biol. Chem. 288, 29356–29368 (2013).
Pulimeno, P., Paschoud, S. & Citi, S. A role for ZO-1 and PLEKHA7 in recruiting paracingulin to tight and adherens junctions of epithelial cells. J. Biol. Chem. 286, 16743–16750 (2011).
Jeung, H.-C. et al. PLEKHA7 signaling is necessary for the growth of mutant KRAS driven colorectal cancer. Exp. Cell. Res. 409, 112930 (2021).
Tavano, S. et al. Insm1 induces neural progenitor delamination in developing neocortex via downregulation of the adherens junction belt-specific protein Plekha7. Neuron 97, 1299–1314 (2018).
Sukonina, V. et al. FOXK1 and FOXK2 regulate aerobic glycolysis. Nature 566, 279–283 (2019).
Kobe, B. & Kajava, A. V. The leucine-rich repeat as a protein recognition motif. Curr. Opin. Struct. Biol. 11, 725–732 (2001).
Evans, R. et al. Protein complex prediction with AlphaFold-Multimer. Preprint at bioRxiv https://doi.org/10.1101/2021.10.04.463034 (2021).
Carlsson, P. & Mahlapuu, M. Forkhead transcription factors: key players in development and metabolism. Dev. Biol. 250, 1–23 (2002).
Lambert, S. A. et al. The human transcription factors. Cell 172, 650–665 (2018).
Kustatscher, G. et al. Co-regulation map of the human proteome enables identification of protein functions. Nat. Biotechnol. 37, 1361–1371 (2019).
Szklarczyk, D. et al. STRING v10: protein–protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 43, D447–D452 (2015).
Shiber, A. et al. Cotranslational assembly of protein complexes in eukaryotes revealed by ribosome profiling. Nature 561, 268–272 (2018).
Ewing, R. M. et al. Large-scale mapping of human protein–protein interactions by mass spectrometry. Mol. Syst. Biol. 3, 89 (2007).
Drew, K., Wallingford, J. B. & Marcotte, E. M. hu.MAP 2.0: integration of over 15,000 proteomic experiments builds a global compendium of human multiprotein assemblies. Mol. Syst. Biol. 17, e10016 (2021).
Heider, M. R. et al. Subunit connectivity, assembly determinants and architecture of the yeast exocyst complex. Nat. Struct. Mol. Biol. 23, 59–66 (2016).
Kee, Y. et al. Subunit structure of the mammalian exocyst complex. Proc. Natl Acad. Sci. USA 94, 14438–14443 (1997).
Lalanne, J.-B. et al. Evolutionary convergence of pathway-specific enzyme expression stoichiometry. Cell 173, 749–761 (2018).
Bicknell, A. A. et al. Attenuating ribosome load improves protein output from mRNA by limiting translation-dependent mRNA decay. Cell Rep. 43, 114098 (2024).
Liu, T.-Y. et al. Time-resolved proteomics extends ribosome profiling-based measurements of protein synthesis dynamics. Cell Syst. 4, 636–644 (2017).
Wang, M., Herrmann, C. J., Simonovic, M., Szklarczyk, D. & von Mering, C. Version 4.0 of PaxDb: protein abundance data, integrated across model organisms, tissues, and cell-lines. Proteomics 15, 3163–3168 (2015).
Piepoli, A. et al. The expression of leucine-rich repeat gene family members in colorectal cancer. Exp. Biol. Med. 237, 1123–1128 (2012).
Liu, Y. et al. Identification of differential expression of genes in hepatocellular carcinoma by suppression subtractive hybridization combined cDNA microarray. Oncol. Rep. 18, 943–951 (2007).
Chen, H. et al. miR-218 contributes to drug resistance in multiple myeloma via targeting LRRC28. J. Cell. Biochem. 122, 305–314 (2021).
Vander Heiden, M. G., Cantley, L. C. & Thompson, C. B. Understanding the Warburg effect: the metabolic requirements of cell proliferation. Science 324, 1029–1033 (2009).
Liu, Y. et al. Histone H2AX promotes metastatic progression by preserving glycolysis via hexokinase-2. Sci. Rep. 12, 3758 (2022).
Zheng, D. et al. Predicting the translation efficiency of messenger RNA in mammalian cells. Nat. Biotechnol. https://doi.org/10.1038/s41587-025-02712-x (2025).
Rodriguez, J. M. et al. APPRIS: annotation of principal and alternative splice isoforms. Nucleic Acids Res. 41, D110–D117 (2013).
Rao, S. et al. Genes with 5′ terminal oligopyrimidine tracts preferentially escape global suppression of translation by the SARS-CoV-2 Nsp1 protein. RNA 27, 1025–1045 (2021).
Mills, E. W. & Green, R. Ribosomopathies: there’s strength in numbers. Science 358, eaan2755 (2017).
Ozadam, H. et al. Single-cell quantification of ribosome occupancy in early mouse development. Nature 618, 1057–1064 (2023).
VanInsberghe, M., van den Berg, J., Andersson-Rolf, A., Clevers, H. & van Oudenaarden, A. Single-cell Ribo-seq reveals cell cycle-dependent translational pausing. Nature 597, 561–565 (2021).
Benoit Bouvrette, L. P., Bovaird, S., Blanchette, M. & Lécuyer, E. oRNAment: a database of putative RNA binding protein target sites in the transcriptomes of model species. Nucleic Acids Res. 48, D166–D173 (2020).
Krismer, K. et al. Transite: a computational motif-based analysis platform that identifies RNA-binding proteins modulating changes in gene expression. Cell Rep. 32, 108064 (2020).
Van Nostrand, E. L. et al. A large-scale binding and functional map of human RNA-binding proteins. Nature 583, 711–719 (2020).
Hou, Y., Xie, T., He, L., Tao, L. & Huang, J. Topological links in predicted protein complex structures reveal limitations of AlphaFold. Commun. Biol. 6, 1098 (2023).
Burke, D. F. et al. Towards a structurally resolved human protein interaction network. Nat. Struct. Mol. Biol. 30, 216–225 (2023).
Bryant, P., Pozzati, G. & Elofsson, A. Improved prediction of protein-protein interactions using AlphaFold2. Nat. Commun. 13, 1265 (2022).
National Center for Biotechnology Information. SRA Tools. GitHub https://github.com/ncbi/sra-tools (2018).
Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 17, 10–12 (2011).
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Liu, Y. HeLa ribosome profiling data. Zenodo https://doi.org/10.5281/zenodo.15660080 (2024).
Gerashchenko, M. V. & Gladyshev, V. N. Translation inhibitors cause abnormalities in ribosome profiling experiments. Nucleic Acids Res. 42, e134 (2014).
Wu, C. C.-C., Zinshteyn, B., Wehner, K. A. & Green, R. High-resolution ribosome profiling defines discrete ribosome elongation states and translational regulation during cellular stress. Mol. Cell 73, 959–970 (2019).
Wolin, S. L. & Walter, P. Ribosome pausing and stacking during translation of a eukaryotic mRNA. EMBO J. 7, 3559–3569 (1988).
Sharma, J. et al. A small molecule that induces translational readthrough of CFTR nonsense mutations by eRF1 depletion. Nat. Commun. 12, 4358 (2021).
Tukey, J. W. The future of data analysis. Ann. Math. Stat. 33, 1–67 (1962).
Zhang, X.-O., Yin, Q.-F., Chen, L.-L. & Yang, L. Gene expression profiling of non-polyadenylated RNA-seq across species. Genom. Data 2, 237–241 (2014).
Yang, L., Duff, M. O., Graveley, B. R., Carmichael, G. G. & Chen, L.-L. Genomewide characterization of non-polyadenylated RNAs. Genome Biol. 12, R16 (2011).
van den Boogaart, K. G. & Tolosana-Delgado, R. Analyzing Compositional Data with R (Springer, 2013).
Cenik, C. et al. Integrative analysis of RNA, translation, and protein levels reveals distinct regulatory variation across humans. Genome Res. 25, 1610–1621 (2015).
Greenacre, M. Compositional data analysis. Annu. Rev. Stat. Appl. 8, 271–299 (2021).
Ramsköld, D., Wang, E. T., Burge, C. B. & Sandberg, R. An abundance of ubiquitously expressed genes revealed by tissue transcriptome sequence data. PLoS Comput. Biol. 5, e1000598 (2009).
Csárdi, G., Franks, A., Choi, D. S., Airoldi, E. M. & Drummond, D. A. Accounting for experimental noise reveals that mRNA levels, amplified by post-transcriptional processes, largely determine steady-state protein levels in yeast. PLoS Genet. 11, e1005206 (2015).
Schilder, B. M. & Skene, N. G. orthogene: An R package for easy mapping of orthologous genes across hundreds of species. R package version 3.21 https://doi.org/10.18129/B9.bioc.orthogene (2022).
van den Boogaart, K. G. & Tolosana-Delgado, R. ‘compositions’: a unified R package to analyze compositional data. Comput. Geosci. 34, 320–338 (2008).
Kim, S. ppcor: an R package for a fast calculation to semi-partial correlation coefficients. Commun. Stat. Appl. Methods 22, 665–674 (2015).
Berriz, G. F., Beaver, J. E., Cenik, C., Tasan, M. & Roth, F. P. Next generation software for functional trend analysis. Bioinformatics 25, 3043–3044 (2009).
Buttrey, S. & Whitaker, L. TreeClust: an R package for tree-based clustering dissimilarities. R J. 7, 227 (2015).
Wainberg, M. et al. A genome-wide atlas of co-essential modules assigns function to uncharacterized genes. Nat. Genet. 53, 638–649 (2021).
Gene Ontology Consortium et al. The Gene Ontology knowledgebase in 2023. Genetics 224, iyad031 (2023).
Philippe, L., van den Elzen, A. M. G., Watson, M. J. & Thoreen, C. C. Global analysis of LARP1 translation targets reveals tunable and dynamic features of 5′ TOP motifs. Proc. Natl Acad. Sci. USA 117, 5319–5328 (2020).
Ballouz, S., Weber, M., Pavlidis, P. & Gillis, J. EGAD: ultra-fast functional analysis of gene networks. Bioinformatics 33, 612–614 (2017).
Carlson, M. org.Mm.eg.db: Genome wide annotation for mouse. R package version 3.21 https://doi.org/10.18129/B9.bioc.org.Mm.eg.db (2025).
Carlson, M. org.Hs.eg.db: Genome wide annotation for human. R package version 3.21 https://doi.org/10.18129/B9.bioc.org.Hs.eg.db (2025).
Liu, Y. Intermediate data for TE calculation. Zenodo https://doi.org/10.5281/zenodo.10373032 (2024).
Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
The UniProt Consortium. UniProt: the Universal Protein Knowledgebase in 2023. Nucleic Acids Res. 51, D523–D531 (2023).
Hu, Y. et al. Paralog Explorer: a resource for mining information about paralogs in common research organisms. Comput. Struct. Biotechnol. J. 20, 6570–6577 (2022).
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Ho, D., Imai, K., King, G. & Stuart, E. A. MatchIt: nonparametric preprocessing for parametric causal inference. J. Stat. Softw. https://doi.org/10.18637/jss.v042.i08 (2011).
Sanson, K. R. et al. Optimized libraries for CRISPR–Cas9 genetic screens with multiple modalities. Nat. Commun. 9, 5416 (2018).
Sanjana, N. E., Shalem, O. & Zhang, F. Improved vectors and genome-wide libraries for CRISPR screening. Nat. Methods 11, 783–784 (2014).
Liu, Y. KO_validation_RiboBase. Zenodo https://doi.org/10.5281/zenodo.11388478 (2024).
Yue, L. coTE_paper: code and to generate main figures. Zenodo https://doi.org/10.5281/zenodo.15337774 (2025).

Acknowledgements

We thank all contributors to metadata curation: H. Chiang, A. Hoffman, T. Tonn, A. Segura, C. Tante, E. Vasquez and L. Xu. We also thank Y. Shin and V. D. Chapman for their help with the experiments. We appreciate M. Miladi for providing critical feedback. The original text in this paper was written by the authors. A large language model was used to suggest edits for clarity and grammar (OpenAI ChatGPT, https://chat.openai.com). The authors acknowledge the Texas Advanced Computing Center at The University of Texas at Austin (http://www.tacc.utexas.edu) for providing high-performance computing and storage resources that contributed to the research results reported within this paper.

Research reported in this publication was supported in part by the National Institute of General Medical Sciences of the National Institutes of Health (NIH) under award numbers R35GM150667 (C.C.) and R35GM138340 (E.S.C.). This work was also supported by NIH grant HD110096 (C.C.) and Welch Foundation grants F-2027-20230405 (C.C.) and F-2133-20230405 (E.S.C.). C.C. was a Cancer Prevention and Research Institute of Texas (CPRIT) Scholar in Cancer Research, supported by CPRIT grant RR180042.

Author information

Hakan Ozadam
Present address: Sail Biomedicines, Cambridge, MA, USA

Authors and Affiliations

Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX, USA
Yue Liu, Shilpa Rao, Ian Hoskins, Michael Geng, Qiuxia Zhao, Jonathan Chacko, Vighnesh Ghatpande, Kangsheng Qi, Logan Persyn, Yochen Zhong, Dayea Park, Elif Sarinay Cenik, Hakan Ozadam & Can Cenik
mRNA Center of Excellence, Sanofi, Waltham, MA, USA
Jun Wang, Dinghai Zheng & Vikram Agarwal

Contributions

Y.L., I.H. and C.C. co-wrote the original manuscript. Y.L., I.H. and S.R. generated the figures for the manuscript. H.O., M.G. and J.C. downloaded all the data from the GEO and processed raw sequencing data. Y.L. and C.C. developed the TE calculation pipeline. J.C. and C.C. designed and implemented the winsorization method. Y.Z. performed the deduplication comparison. Y.L. and C.C. developed the TE covariation analysis and function prediction pipelines. Y.L., K.Q. and H.O. performed the quality control analysis for all sequencing data. Y.L. carried out covariation analysis, gene function prediction and AlphaFold2 analysis. I.H. conducted the RBP analysis. L.P., J.W., D.Z. and V.A. assessed the quality of TE measurements by developing machine learning approaches. H.O., J.W., D.Z., V.A., Q.Z. and E.S.C. provided suggestions for the manuscript. Y.L., Q.Z. and E.S.C. conducted the literature search to evaluate gene function predictions. S.R. conducted the experimental validation of LRRC28. S.R., I.H., V.G. and D.P. performed other experiments. C.C. provided study oversight, conceptualized the study and acquired funding. All authors approved the final manuscript.

Corresponding author

Correspondence to Can Cenik.

Ethics declarations

Competing interests

D.Z., J.W. and V.A. are employees of Sanofi and may hold shares and/or stock options in the company. H.O. is an employee of Sail Biomedicines. I.H. is an employee of Monoceros Biosystems. The remaining authors declare no competing interests.

Peer review

Peer review information

Nature Biotechnology thanks the anonymous reviewers for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Sequencing quality of ribosome profiling data.

a, Distribution of read counts for 2,195 human and 1,624 mouse ribosome profiling datasets in RiboBase. In all figure panels, the horizontal line corresponds to the median, the box represents the interquartile range (IQR) and the whiskers extend to 1.5 times the IQR. b, Distribution plot similar to panel a for 1,282 human and 995 mouse ribosome profiling datasets with matched RNA-seq. c, Distribution of the proportion of reads aligned to transcripts, the proportion of reads with high-quality alignments and the percentage of reads remaining after PCR deduplication, relative to the total number of reads from panel a. d, Similar plot as panel c for ribosome profiling with matched RNA-seq. e, The read length distribution of RPFs aligned to coding sequences for all human experiments. The color in the heatmap represents the z-score adjusted RPF counts (Methods). Each experiment in which more than 70% of RPFs mapped to CDS and transcript coverage was sufficient (≥0.1X) was annotated as QC-pass. f, Similar to panel e for mouse samples.

Extended Data Fig. 2 Quality control and RPF length selection.

a, RPFs shorter than 21 nucleotides were removed, and the RPF length with the highest number of reads mapping to CDS was identified as the starting point. Subsequently, we compared the lengths one nucleotide longer and one nucleotide shorter than the current selection and added the length with more reads. This process was repeated until at least 85% of the total CDS-mapping RPFs were included. b, We compared the usable reads selected with two different boundary cutoffs (y-axis) and the proportion of these selected reads that map to the coding regions (x-axis) for each ribosome profiling experiment. c, The percentage of ribosome profiling experiments from GEO that pass or fail quality control (QC pass: more than 70% of RPFs mapping to CDS and at least 0.1X coverage of the transcript).
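
The length-selection loop in panel a can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the function name, the input format (a dictionary of CDS-mapping read counts per read length), the choice to compute the 85% target after removing short reads and the tie-breaking rule (ties extend toward longer reads) are all hypothetical.

```python
def select_rpf_length_range(cds_counts, min_length=21, target_fraction=0.85):
    """Return (lower, upper) read-length bounds chosen by greedy extension.

    cds_counts maps read length (nt) to the number of CDS-mapping RPFs.
    """
    # Drop RPFs shorter than the minimum length (21 nt in the caption).
    counts = {length: n for length, n in cds_counts.items() if length >= min_length}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no CDS-mapping reads at or above the minimum length")

    # Start from the single length with the most CDS-mapping reads.
    start = max(counts, key=counts.get)
    lower = upper = start
    included = counts[start]
    shortest, longest = min(counts), max(counts)

    # Extend by one nucleotide at a time toward the side with more reads
    # until the selected range holds the target fraction of reads.
    while included / total < target_fraction:
        below = counts.get(lower - 1, 0) if lower > shortest else -1
        above = counts.get(upper + 1, 0) if upper < longest else -1
        if above >= below:  # assumption: ties extend toward longer reads
            upper += 1
            included += above
        else:
            lower -= 1
            included += below
    return lower, upper


# Hypothetical read-length histogram for one experiment.
example = {20: 500, 26: 5, 27: 10, 28: 30, 29: 40, 30: 10, 31: 5}
print(select_rpf_length_range(example))
```

Because the range grows one nucleotide at a time, the selected lengths are always contiguous, which matches the per-sample boundary format implied by the RPF boundary tables.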

Extended Data Fig. 3 Assessment of periodicity and data matching for TE estimation.

a-d, In ribosome profiling experiments from RiboBase, samples were classified according to distinct periodicity patterns (Methods). In all figure panels, error bars represent the standard deviation across samples. Statistical significance was assessed using the Wilcoxon test, and the p-values were adjusted for all 33 comparisons using the Benjamini-Hochberg method. We considered the Group 1 pattern as indicative of the expected three-nucleotide periodicity. Human samples that pass quality control (a), human samples that fail quality control (b), mouse samples that pass quality control (c), mouse samples that fail quality control (d). e, We calculated the coefficient of determination (R²) between a specific ribosome profiling experiment and its corresponding RNA-seq from RiboBase. Additionally, we determined the average R² across all pairings of the same ribosome profiling sample with the other RNA-seq data from the same study. The matching score represents the difference between these two R² values (x-axis; Methods). f, A dashed line at 0.188 serves as the threshold to identify samples with poor matching (Methods). In each figure panel containing boxplots, the horizontal line corresponds to the median, the box represents the IQR and the whiskers extend to 1.5 times the IQR. g, Distribution of the standard error of TE values across tissues and cell lines (y-axis) for genes with and without poly(A) tails.
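
The matching score in panel e can be made concrete with a short Python sketch. It assumes that R² is the squared Pearson correlation between per-gene values of two samples; the function names and the toy values are hypothetical, and the exact inputs used by the authors are described in the Methods.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def matching_score(ribo, matched_rna, other_rnas):
    """R² with the annotated RNA-seq minus the mean R² with the other
    RNA-seq samples from the same study."""
    matched = r_squared(ribo, matched_rna)
    background = sum(r_squared(ribo, rna) for rna in other_rnas) / len(other_rnas)
    return matched - background


# Toy per-gene values: the annotated RNA-seq tracks the ribosome profiling
# sample perfectly, while the other RNA-seq sample from the study does not.
ribo = [1.0, 2.0, 3.0, 4.0]
matched_rna = [2.0, 4.0, 6.0, 8.0]
other_rnas = [[1.0, 3.0, 2.0, 4.0]]
print(matching_score(ribo, matched_rna, other_rnas))
```

The resulting score can then be compared with the 0.188 threshold shown in panel f.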

Extended Data Fig. 4 Detailed workflow of data processing for TE and TEC calculations.

a, We selected ribosome profiling data with matched RNA-seq and removed duplicated reads with identical positions and lengths (PCR deduplication). We set the RPF read length range for individual samples with our dynamic cutoff and filtered out ribosome profiling experiments that failed quality control. After selecting high-quality samples, we reprocessed all of these ribosome profiling experiments using the winsorization method with non-deduplicated data. We removed genes without poly(A) tails and kept genes with sufficient counts per million RPFs. After obtaining counts from the coding regions for both ribosome profiling and RNA-seq, we performed CLR normalization and compositional linear regression, defining the residuals as TE for each gene in each sample. We averaged this sample-level TE by cell line and tissue. TEC was then calculated with rho scores³⁸. To build an RNA co-expression matrix, we transformed CDS counts from RNA-seq experiments using CLR, averaged them by cell line and tissue, and calculated pairwise proportionalities (rho scores).
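
The normalization, residual and proportionality steps of this workflow can be illustrated with a minimal Python sketch. It is not the authors' pipeline: ordinary least squares on CLR-transformed counts stands in for the compositional linear regression named in the caption, the rho formula follows the commonly used proportionality definition rho = 1 − var(a − b) / (var(a) + var(b)) and is assumed here rather than taken from the Methods, and all function names are hypothetical.

```python
from math import log
from statistics import variance


def clr(counts):
    """Centered log-ratio transform of strictly positive counts for one sample."""
    if any(c <= 0 for c in counts):
        raise ValueError("CLR requires strictly positive counts")
    logs = [log(c) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def regression_residuals(y, x):
    """Residuals of an ordinary least-squares fit of y on x.

    Simplified stand-in for the compositional regression: with y the
    CLR-transformed RPF counts and x the CLR-transformed RNA counts of one
    sample, the per-gene residuals play the role of TE.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


def rho(a, b):
    """Proportionality between two genes' values across cell types."""
    diff = [p - q for p, q in zip(a, b)]
    return 1 - variance(diff) / (variance(a) + variance(b))


# Toy example: one sample with four genes (hypothetical counts).
rpf = clr([120, 30, 400, 55])
rna = clr([100, 60, 250, 40])
te = regression_residuals(rpf, rna)
print([round(v, 3) for v in te])

# Toy example: TE of two genes across four cell types (hypothetical values).
gene_a = [0.1, 0.5, -0.3, 0.2]
gene_b = [0.2, 0.6, -0.1, 0.1]
print(round(rho(gene_a, gene_b), 3))
```

Under this definition rho ranges from −1 to 1, equals 1 for identical profiles and equals −1 for profiles that are exact mirror images.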

Extended Data Fig. 5 Spearman correlation between TE and protein abundance.

a, The correlation between protein abundance and clr-transformed RPF counts from ribosome profiling (left), clr-transformed read counts from RNA-seq (middle), or TE calculated with winsorized RPF counts using the linear regression model (right). Individual dots indicate specific experiments colored according to study (HEK293: 68 samples from 11 studies; HeLa: 86 samples from 10 studies; U2OS: 58 samples from 4 studies; A549: 29 samples from 5 studies; MCF7: 5 samples from 2 studies; K562: 7 samples from 2 studies; HepG2: 10 samples from 2 studies). In the boxplot, the horizontal line corresponds to the median, the box represents the IQR and the whiskers extend to 1.5 times the IQR. b, TE was calculated with winsorized RPF counts without deduplication or with deduplication based on position and fragment length. The Spearman correlation coefficient between TE calculated with winsorized RPF counts and protein abundance⁸² (y-axis) was plotted against the “delta correlation” (x-axis), defined by subtracting the correlation obtained with PCR deduplication from that obtained with winsorized RPF counts without deduplication.
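
The “delta correlation” in panel b is a difference of two Spearman coefficients, which the following Python sketch spells out. The function names and toy values are hypothetical; Spearman correlation is implemented here as the Pearson correlation of average ranks, which is the standard definition.

```python
from math import sqrt


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    return pearson(average_ranks(x), average_ranks(y))


def delta_correlation(te_winsorized, te_pcr_dedup, protein):
    """Correlation with winsorized, non-deduplicated TE minus the
    correlation with PCR-deduplicated TE."""
    return spearman(te_winsorized, protein) - spearman(te_pcr_dedup, protein)


# Toy per-gene values (hypothetical).
protein = [1.0, 2.0, 3.0, 4.0]
te_winsorized = [0.1, 0.2, 0.3, 0.4]
te_pcr_dedup = [0.1, 0.3, 0.2, 0.4]
print(delta_correlation(te_winsorized, te_pcr_dedup, protein))
```

A positive value means that TE from winsorized, non-deduplicated counts tracks protein abundance more closely than TE from PCR-deduplicated counts for that sample.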

Extended Data Fig. 6 PCR vs UMI deduplication comparison for GSE144140.

a, Metagene plots centered on the start codon for samples GSM4282032 (RPF range: 28-36 nt), GSM4282033 (RPF range: 28-36 nt) and GSM4282034 (RPF range: 26-35 nt) were plotted using three different deduplication methods: non-deduplication (ND), UMI-deduplication (UMI) and PCR-deduplication (PCR). b, Correlation of gene counts for GSM4282032 between the three deduplication methods. A blue diagonal line represents a 1:1 ratio in all figure panels. c,d, Same analysis as panel b for GSM4282033 (c) and GSM4282034 (d).

Extended Data Fig. 7 Conservation of gene expression between human and mouse.

a, The relationship between the mean RNA expression (clr-transformed counts) of 9,194 orthologous genes in the two species is plotted. Dots represent genes in all figure panels. b, The variability of each gene’s RNA expression was quantified with the metric standard deviation (msd; Methods) across different cell lines and tissues in either human or mouse. To account for the correlation between mean RNA expression and its variability, we adjusted the msd values with their mean values (Methods). c, The scatter plot shows the adjusted msd values (y-axis; Methods) and the average TE across different cell types (x-axis) for human genes. d, Similar analysis as in panel c for mouse genes.

Extended Data Fig. 8 Evaluation of TEC calculation methods and TEC patterns.

a, The AUROCs for biological functions were calculated using the similarity scores among genes at the ribosome occupancy level, determined by eight distinct methods with 1,794 human ribosome profiling datasets (Methods). In the boxplot, the horizontal line corresponds to the median, the box represents the IQR and the whiskers extend to the largest value within 1.5 times the IQR from the hinge. The dot in this figure represents the AUROC for human 5’ TOP mRNAs. b, TE values were randomly reassigned from the original data for each gene (shuffled), and TEC was calculated from the shuffled values. In the figure panel, we plotted the number of orthologous gene pairs within specified ranges. Each dot represents the aggregated log₁₀-transformed counts of these gene pairs. The dashed line captures 95% of the data. c, Distribution of absolute TEC among 110 TOP motif-containing mRNAs¹²³ and 83 transcripts targeted by CSDE1 (Supplementary Table 22 (ref. ⁴⁷); Methods) in comparison to all 11,149 human genes as background. Statistical significance between the groups was assessed using a Wilcoxon two-tailed test.

Extended Data Fig. 9 TEC and RNA co-expression among genes with shared functions.

a, A comparison between the number of human GO terms that have an AUROC of 0.8 or higher with either TEC or RNA co-expression. b, Motif enrichment in human GO terms. RNA binding proteins (RBPs) from oRNAment⁹⁴ or Transite⁹⁵ are indicated. P-values were corrected using the Holm method and those k-mers with a p-value < 0.05 are shown. c, Venn diagram for mouse GO terms that achieve an AUROC of 0.8 or higher with proportionality scores (rho) among genes at either the TE or RNA expression level. d, The AUROC plot was calculated with genes associated with mannosyltransferase activity in mice. e, The connections represent absolute rho values above 0.1 in the TE pattern alone (green) from d, in both RNA co-expression and the TE pattern (blue), or in RNA co-expression alone (gray). f, Motif enrichment in mouse GO terms. RNA binding proteins (RBPs) from oRNAment⁹⁴ or Transite⁹⁵ are indicated. P-values were corrected using the Holm method and those k-mers with a p-value < 0.05 are shown. g, We summarized GO terms where genes exhibit greater similarity at the TE level than at the RNA expression level (AUROC with TEC > 0.8, and difference in AUROC between TEC and RNA co-expression > 0.1) in mice. We visualized the distribution of absolute rho scores for gene pairs within each specific GO term (bottom; gene pairs with abs(rho) > 0.1) at the TE level.
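
The Holm correction applied to the motif enrichment p-values in panels b and f is a step-down procedure that can be written out in a few lines. The sketch below is a generic implementation of the standard Holm method with a hypothetical function name and toy p-values; it is not taken from the authors' code.

```python
def holm_adjust(pvalues):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * pvalues[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Toy p-values for three k-mers (hypothetical).
raw = [0.01, 0.04, 0.03]
adjusted = holm_adjust(raw)
significant = [p < 0.05 for p in adjusted]
print(adjusted, significant)
```

In this toy case only the first k-mer remains below 0.05 after adjustment, although all three raw p-values were below that level.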

Extended Data Fig. 10 3D structure of the interaction between LRRC28 and FOXK1.

a, AlphaFold2-multimer predicted binding between LRRC28 and FOXK1. b,c, Kinetic ECAR response of the MCF-7 cell line (b; n = 6, stable overexpression) and the HEK293T cell line (c; n = 6, stable overexpression) overexpressing LRRC28 or LRRC42 to 10 mM glucose and 100 mM 2-DG. Unpaired two-sided Student’s t-test (MCF-7: measurement 4, p = 0.06; measurement 5, p = 0.1; measurement 6, p = 0.3; HEK293T: measurement 4, p = 0.6; measurement 5, p = 0.8; measurement 6, p = 0.4). Panels b and c show mean ± s.d.; n denotes biologically independent experiments.

Supplementary information

Supplementary Information

Supplementary Information and Figs. 1–7.

Reporting Summary

Supplementary Tables 1–23

Twenty-three supplementary tables support the results in this paper. The data can also serve as the source data to repeat the analysis results.

Supplementary Table 1: Curated metadata for human and mouse datasets in RiboBase.
Supplementary Table 2: Quality control of human ribosome profiling data.
Supplementary Table 3: Quality control of mouse ribosome profiling data.
Supplementary Table 4: Quality control of human RNA-seq data.
Supplementary Table 5: Quality control of mouse RNA-seq data.
Supplementary Table 6: RPF boundaries for human ribosome profiling data.
Supplementary Table 7: RPF boundaries for mouse ribosome profiling data.
Supplementary Table 8: Non-poly(A) gene list for human.
Supplementary Table 9: Non-poly(A) gene list for mouse.
Supplementary Table 10: Human linear regression-based TE.
Supplementary Table 11: Mouse linear regression-based TE.
Supplementary Table 12: Homologous genes between human and mouse.
Supplementary Table 13: AUROC values for human GO terms with either ribosome profiling or RNA-seq data.
Supplementary Table 14: AUROC values for mouse GO terms with either ribosome profiling or RNA-seq data.
Supplementary Table 15: Literature-supported prediction of gene exhibiting TEC with genes associated with specific human and mouse GO terms.
Supplementary Table 16: All predictions of a new gene exhibiting TEC with genes associated with specific human GO terms.
Supplementary Table 17: All predictions of a new gene exhibiting TEC with genes associated with specific mouse GO terms.
Supplementary Table 18: pDockQ and ipTM+PTM scores between human LRRC proteins and members of the forkhead family transcription factors.
Supplementary Table 19: Two-sided chi-square test on the direction of similarity (same or different) among human gene pairs from GO terms and the STRING database.
Supplementary Table 20: gRNA sequences for RBP validation.
Supplementary Table 21: Primer sequences for human LRRC28 and LRRC42.
Supplementary Table 22: Human 5′ TOP mRNA list and CSDE1 target gene list.
Supplementary Table 23: Example comparing TE calculation using the canonical log-ratio method and the linear regression approach.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, Y., Rao, S., Hoskins, I. et al. Translation efficiency covariation identifies conserved coordination patterns across cell types. Nat Biotechnol (2025). https://doi.org/10.1038/s41587-025-02718-5

Received: 13 November 2024
Accepted: 23 May 2025
Published: 25 July 2025
Version of record: 25 July 2025
DOI: https://doi.org/10.1038/s41587-025-02718-5