  • Article
  • Published:

Predicting the translation efficiency of messenger RNA in mammalian cells

Abstract

The mechanisms by which mRNA sequences specify translational control remain poorly understood in mammalian cells. Here we generate a transcriptome-wide atlas of translation efficiency (TE) measurements encompassing more than 140 human and mouse cell types from 3,819 ribosomal profiling datasets. We develop RiboNN, a state-of-the-art multitask deep convolutional neural network, and classic machine learning models to predict TEs in hundreds of cell types from sequence-encoded mRNA features. While most earlier models solely considered the 5′ untranslated region (UTR) sequence, RiboNN integrates how the spatial positioning of low-level dinucleotide and trinucleotide features (that is, including codons) influences TE, capturing mechanistic principles such as how ribosomal processivity and tRNA abundance control translational output. RiboNN predicts the translational behavior of base-modified therapeutic RNA and explains evolutionary selection pressures in human 5′ UTRs. Finally, it detects a common language governing mRNA regulatory control and highlights the interconnectedness of mRNA translation, stability and localization in mammalian organisms.

Fig. 1: Integrative analysis of thousands of human and mouse ribosomal profiling datasets measuring TE.
Fig. 2: A classical ML approach to predict mammalian TEs from mRNA sequence.
Fig. 3: Performance and interpretation of deep learning models predicting mammalian TEs from mRNA sequence.
Fig. 4: RiboNN predicts the impact of RNA modifications, genetic variants and reporter constructs on translation.
Fig. 5: Interrelationships between mRNA translation, turnover and subcellular localization.

Data availability

We provide the processed data without restriction in supplementary tables herein.

Code availability

Code and pretrained models are available on Zenodo (ref. 127) and GitHub (https://github.com/Sanofi-Public/RiboNN/). Our classic ML model code is available on Zenodo (ref. 128) and GitHub (https://github.com/CenikLab/TE_classic_ML).

References

  1. Agarwal, V. & Shendure, J. Predicting mRNA abundance directly from genomic sequence using deep convolutional neural networks. Cell Rep. 31, 107663 (2020).

  2. Zhou, J. et al. Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk. Nat. Genet. 50, 1171–1179 (2018).

  3. Avsec, Ž. et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nat. Methods 18, 1196–1203 (2021).

  4. Kelley, D. R. et al. Sequential regulatory activity prediction across chromosomes with convolutional neural networks. Genome Res. 28, 739–750 (2018).

  5. Wang, J. & Agarwal, V. How DNA encodes the start of transcription. Science 384, 382–383 (2024).

  6. Linder, J., Srivastava, D., Yuan, H., Agarwal, V. & Kelley, D. R. Predicting RNA-seq coverage from DNA sequence as a unifying model of gene regulation. Nat. Genet. 57, 949–961 (2025).

  7. Agarwal, V. & Kelley, D. R. The genetic and biochemical determinants of mRNA degradation rates in mammals. Genome Biol. 23, 245 (2022).

  8. Gingold, H. & Pilpel, Y. Determinants of translation efficiency and accuracy. Mol. Syst. Biol. 7, 481 (2011).

  9. Zur, H. & Tuller, T. Predictive biophysical modeling and understanding of the dynamics of mRNA translation and its evolution. Nucleic Acids Res. 44, 9031–9049 (2016).

  10. Nieuwkoop, T. et al. Revealing determinants of translation efficiency via whole-gene codon randomization and machine learning. Nucleic Acids Res. 51, 2363–2376 (2023).

  11. Shao, B. et al. Riboformer: a deep learning framework for predicting context-dependent translation dynamics. Nat. Commun. 15, 2011 (2024).

  12. Tian, T., Li, S., Lang, P., Zhao, D. & Zeng, J. Full-length ribosome density prediction by a multi-input and multi-output model. PLoS Comput. Biol. 17, e1008842 (2021).

  13. Tunney, R. et al. Accurate design of translational output by a neural network model of ribosome distribution. Nat. Struct. Mol. Biol. 25, 577–582 (2018).

  14. Sample, P. J. et al. Human 5′ UTR design and variant effect prediction from a massively parallel translation assay. Nat. Biotechnol. 37, 803–809 (2019).

  15. Cao, J. et al. High-throughput 5′ UTR engineering for enhanced protein production in non-viral gene therapies. Nat. Commun. 12, 4138 (2021).

  16. Karollus, A., Avsec, Ž. & Gagneur, J. Predicting mean ribosome load for 5′UTR of any length using deep learning. PLoS Comput. Biol. 17, e1008982 (2021).

  17. Bazzini, A. A. et al. Codon identity regulates mRNA stability and translation efficiency during the maternal-to-zygotic transition. EMBO J. 35, 2087–2103 (2016).

  18. Hanson, G. & Coller, J. Codon optimality, bias and usage in translation and mRNA decay. Nat. Rev. Mol. Cell Biol. 19, 20–30 (2018).

  19. Li, S. et al. CodonBERT large language model for mRNA vaccines. Genome Res. 34, 1027–1035 (2024).

  20. Szostak, E. & Gebauer, F. Translational control by 3′-UTR-binding proteins. Brief. Funct. Genomics 12, 58–65 (2013).

  21. Floor, S. N. & Doudna, J. A. Tunable protein synthesis by transcript isoforms in human cells. eLife 5, e10921 (2016).

  22. Schlusser, N., González, A., Pandey, M. & Zavolan, M. Current limitations in predicting mRNA translation with deep learning models. Genome Biol. 25, 227 (2024).

  23. Li, S. et al. mRNA-LM: full-length integrated SLM for mRNA analysis. Nucleic Acids Res. 53, gkaf044 (2025).

  24. Vogel, C. et al. Sequence signatures and mRNA concentration can explain two-thirds of protein abundance variation in a human cell line. Mol. Syst. Biol. 6, 400 (2010).

  25. Eraslan, B. et al. Quantification and discovery of sequence determinants of protein-per-mRNA amount in 29 human tissues. Mol. Syst. Biol. 15, e8513 (2019).

  26. Eisen, T. J., Li, J. J. & Bartel, D. P. The interplay between translational efficiency, poly(A) tails, microRNAs, and neuronal activation. RNA 28, 808–831 (2022).

  27. Li, J. J., Chew, G.-L. & Biggin, M. D. Quantitative principles of cis-translational control by general mRNA sequence features in eukaryotes. Genome Biol. 20, 162 (2019).

  28. Battle, A. et al. Genomic variation. Impact of regulatory variation from RNA to protein. Science 347, 664–667 (2015).

  29. Cenik, C. et al. Integrative analysis of RNA, translation, and protein levels reveals distinct regulatory variation across humans. Genome Res. 25, 1610–1621 (2015).

  30. Schwanhäusser, B. et al. Global quantification of mammalian gene expression control. Nature 473, 337–342 (2011).

  31. Jovanovic, M. et al. Immunogenetics. Dynamic profiling of the protein life cycle in response to pathogens. Science 347, 1259038 (2015).

  32. Hernandez-Alias, X., Benisty, H., Radusky, L. G., Serrano, L. & Schaefer, M. H. Using protein-per-mRNA differences among human tissues in codon optimization. Genome Biol. 24, 34 (2023).

  33. Spies, N., Burge, C. B. & Bartel, D. P. 3′UTR-isoform choice has limited influence on the stability and translational efficiency of most mRNAs in mouse fibroblasts. Genome Res. 23, 2078–2090 (2013).

  34. Ingolia, N. T., Ghaemmaghami, S., Newman, J. R. S. & Weissman, J. S. Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science 324, 218–223 (2009).

  35. Li, J. J., Bickel, P. J. & Biggin, M. D. System wide analyses have underestimated protein abundances and the importance of transcription in mammals. PeerJ 2, e270 (2014).

  36. Gorgoni, B., Marshall, E., McFarland, M. R., Romano, M. C. & Stansfield, I. Controlling translation elongation efficiency: tRNA regulation of ribosome flux on the mRNA. Biochem. Soc. Trans. 42, 160–165 (2014).

  37. Sonenberg, N. & Hinnebusch, A. G. Regulation of translation initiation in eukaryotes: mechanisms and biological targets. Cell 136, 731–745 (2009).

  38. Jackson, R. J., Hellen, C. U. T. & Pestova, T. V. The mechanism of eukaryotic translation initiation and principles of its regulation. Nat. Rev. Mol. Cell Biol. 11, 113–127 (2010).

  39. Hinnebusch, A. G., Ivanov, I. P. & Sonenberg, N. Translational control by 5′-untranslated regions of eukaryotic mRNAs. Science 352, 1413–1416 (2016).

  40. Sharp, P. M. & Li, W. H. An evolutionary perspective on synonymous codon usage in unicellular organisms. J. Mol. Evol. 24, 28–38 (1986).

  41. Presnyak, V. et al. Codon optimality is a major determinant of mRNA stability. Cell 160, 1111–1124 (2015).

  42. Torrent, M., Chalancon, G., de Groot, N. S., Wuster, A. & Madan Babu, M. Cells alter their tRNA abundance to selectively regulate protein synthesis during stress conditions. Sci. Signal. 11, eaat6409 (2018).

  43. Weinberg, D. E. et al. Improved ribosome-footprint and mRNA measurements provide insights into dynamics and regulation of yeast translation. Cell Rep. 14, 1787–1799 (2016).

  44. Gamble, C. E., Brule, C. E., Dean, K. M., Fields, S. & Grayhack, E. J. Adjacent codons act in concert to modulate translation efficiency in yeast. Cell 166, 679–690 (2016).

  45. Mauger, D. M. et al. mRNA structure regulates protein expression through changes in functional half-life. Proc. Natl Acad. Sci. USA 116, 24075–24083 (2019).

  46. Verma, M. et al. A short translational ramp determines the efficiency of protein synthesis. Nat. Commun. 10, 5774 (2019).

  47. Burke, P. C., Park, H. & Subramaniam, A. R. A nascent peptide code for translational control of mRNA stability in human cells. Nat. Commun. 13, 6829 (2022).

  48. Narula, A., Ellis, J., Taliaferro, J. M. & Rissland, O. S. Coding regions affect mRNA stability in human cells. RNA 25, 1751–1764 (2019).

  49. Forrest, M. E. et al. Codon and amino acid content are associated with mRNA stability in mammalian cells. PLoS ONE 15, e0228730 (2020).

  50. Wu, Q. et al. Translation affects mRNA stability in a codon-dependent manner in human cells. eLife 8, e45396 (2019).

  51. Hia, F. et al. Codon bias confers stability to human mRNAs. EMBO Rep. 20, e48220 (2019).

  52. Zhu, X., Cruz, V. E., Zhang, H., Erzberger, J. P. & Mendell, J. T. Specific tRNAs promote mRNA decay by recruiting the CCR4-NOT complex to translating ribosomes. Science 386, eadq8587 (2024).

  53. Ozadam, H., Geng, M. & Cenik, C. RiboFlow, RiboR and RiboPy: an ecosystem for analyzing ribosome profiling data at read length resolution. Bioinformatics 36, 2929–2931 (2020).

  54. Liu, Y. et al. Translation efficiency covariation across cell types is a conserved organizing principle of mammalian transcriptomes. Preprint at bioRxiv https://doi.org/10.1101/2024.08.11.607360 (2024).

  55. Larsson, O., Sonenberg, N. & Nadon, R. Identification of differential translation in genome wide studies. Proc. Natl Acad. Sci. USA 107, 21487–21492 (2010).

  56. Guo, J. U. & Bartel, D. P. RNA G-quadruplexes are globally unfolded in eukaryotic cells and depleted in bacteria. Science 353, aaf5371 (2016).

  57. Wang, D. et al. A deep proteome and transcriptome abundance atlas of 29 healthy human tissues. Mol. Syst. Biol. 15, e8503 (2019).

  58. Rogers, D. W., Böttcher, M. A., Traulsen, A. & Greig, D. Ribosome reinitiation can explain length-dependent translation of messenger RNA. PLoS Comput. Biol. 13, e1005592 (2017).

  59. Fernandes, L. D., de Moura, A. P. S. & Ciandrini, L. Gene length as a regulator for ribosome recruitment and protein synthesis: theoretical insights. Sci. Rep. 7, 17409 (2017).

  60. Witte, F. et al. A trans locus causes a ribosomopathy in hypertrophic hearts that affects mRNA translation in a protein length-dependent fashion. Genome Biol. 22, 191 (2021).

  61. Thompson, M. K., Rojas-Duran, M. F., Gangaramani, P. & Gilbert, W. V. The ribosomal protein Asc1/RACK1 is required for efficient translation of short mRNAs. eLife 5, e11154 (2016).

  62. Dever, T. E., Ivanov, I. P. & Hinnebusch, A. G. Translational regulation by uORFs and start codon selection stringency. Genes Dev. 37, 474–489 (2023).

  63. Lewis, C. J. T. et al. Quantitative profiling of human translation initiation reveals elements that potently regulate endogenous and therapeutically modified mRNAs. Mol. Cell 85, 445–445 (2024).

  64. Strayer, E. C. et al. NaP-TRAP reveals the regulatory grammar in 5′UTR-mediated translation regulation during zebrafish development. Nat. Commun. 15, 10898 (2024).

  65. Alqaraawi, A., Schuessler, M., Weiß, P., Costanza, E. & Berthouze, N. Evaluating saliency map explanations for convolutional neural networks: a user study. Preprint at https://arxiv.org/abs/2002.00772 (2020).

  66. Simonyan, K., Vedaldi, A. & Zisserman, A. Deep inside convolutional networks: visualising image classification models and saliency maps. Preprint at https://arxiv.org/abs/1312.6034 (2013).

  67. Shrikumar, A. et al. Technical note on transcription factor motif discovery from importance scores (TF-MoDISco) version 0.5.6.5. Preprint at https://arxiv.org/abs/1811.00416 (2018).

  68. Chu, D. et al. Translation elongation can control translation initiation on eukaryotic mRNAs. EMBO J. 33, 21–34 (2014).

  69. Wu, C. C.-C., Zinshteyn, B., Wehner, K. A. & Green, R. High-resolution ribosome profiling defines discrete ribosome elongation states and translational regulation during cellular stress. Mol. Cell 73, 959–970 (2019).

  70. Gogakos, T. et al. Characterizing expression and processing of precursor and mature human tRNAs by hydro-tRNAseq and PAR-CLIP. Cell Rep. 20, 1463–1475 (2017).

  71. Sterne-Weiler, T. et al. Frac-seq reveals isoform-specific recruitment to polyribosomes. Genome Res. 23, 1615–1623 (2013).

  72. Ritter, A. J., Draper, J. M., Vollmers, C. & Sanford, J. R. Long-read subcellular fractionation and sequencing reveals the translational fate of full-length mRNA isoforms during neuronal differentiation. Genome Res. 34, 2000–2011 (2024).

  73. Nachtergaele, S. & He, C. Chemical modifications in the life of an mRNA transcript. Annu. Rev. Genet. 52, 349–372 (2018).

  74. Whiffin, N. et al. Characterising the loss-of-function impact of 5′ untranslated region variants in 15,708 individuals. Nat. Commun. 11, 2523 (2020).

  75. Sevilla, T. et al. Mutations in the MORC2 gene cause axonal Charcot–Marie–Tooth disease. Brain 139, 62–72 (2015).

  76. Dueñas Rey, A. et al. Combining a prioritization strategy and functional studies nominates 5′UTR variants underlying inherited retinal disease. Genome Med. 16, 7 (2024).

  77. Liu, L. et al. Mutation of the CDKN2A 5′ UTR creates an aberrant initiation codon and predisposes to melanoma. Nat. Genet. 21, 128–132 (1999).

  78. Damjanovich, K. et al. 5′UTR mutations of ENG cause hereditary hemorrhagic telangiectasia. Orphanet J. Rare Dis. 6, 85 (2011).

  79. Pan, X. et al. 5′-UTR SNP of FGF13 causes translational defect and intellectual disability. eLife 10, e63021 (2021).

  80. Lee, D. S. M. et al. Disrupting upstream translation in mRNAs is associated with human disease. Nat. Commun. 12, 1515 (2021).

  81. Lim, Y. et al. Multiplexed functional genomic analysis of 5′ untranslated region mutations across the spectrum of prostate cancer. Nat. Commun. 12, 4217 (2021).

  82. Stephens, S. B. & Nicchitta, C. V. Divergent regulation of protein synthesis in the cytosol and endoplasmic reticulum compartments of mammalian cells. Mol. Biol. Cell 19, 623–632 (2008).

  83. Horste, E. L. et al. Subcytoplasmic location of translation controls protein output. Mol. Cell 83, 4509–4523 (2023).

  84. Hubstenberger, A. et al. P-body purification reveals the condensation of repressed mRNA regulons. Mol. Cell 68, 144–157 (2017).

  85. Chew, G.-L., Pauli, A. & Schier, A. F. Conservation of uORF repressiveness and sequence features in mouse, human and zebrafish. Nat. Commun. 7, 11663 (2016).

  86. Jia, L. et al. Decoding mRNA translatability and stability from the 5′ UTR. Nat. Struct. Mol. Biol. 27, 814–821 (2020).

  87. Akirtava, C., May, G. E. & McManus, C. J. Deciphering the landscape of cis-acting sequences in natural yeast transcript leaders. Nucleic Acids Res. 53, gkaf165 (2025).

  88. Choi, Y. et al. Time-resolved profiling of RNA binding proteins throughout the mRNA life cycle. Mol. Cell 84, 1764–1782 (2024).

  89. Singh, G., Pratt, G., Yeo, G. W. & Moore, M. J. The clothes make the mRNA: past and present trends in mRNP fashion. Annu. Rev. Biochem. 84, 325–354 (2015).

  90. May, G. E. et al. Unraveling the influences of sequence and position on yeast uORF activity using massively parallel reporter systems and machine learning. eLife 12, e69611 (2023).

  91. Arribere, J. A. et al. Translation readthrough mitigation. Nature 534, 719–723 (2016).

  92. Kramarski, L. & Arbely, E. Translational read-through promotes aggregation and shapes stop codon identity. Nucleic Acids Res. 48, 3747–3760 (2020).

  93. Yordanova, M. M. et al. AMD1 mRNA employs ribosome stalling as a mechanism for molecular memory formation. Nature 553, 356–360 (2018).

  94. Hashimoto, S., Nobuta, R., Izawa, T. & Inada, T. Translation arrest as a protein quality control system for aberrant translation of the 3′-UTR in mammalian cells. FEBS Lett. 593, 777–787 (2019).

  95. Sherlock, M. E., Baquero Galvis, L., Vicens, Q., Kieft, J. S. & Jagannathan, S. Principles, mechanisms, and biological implications of translation termination–reinitiation. RNA 29, 865–884 (2023).

  96. Wu, Q. et al. Translation of small downstream ORFs enhances translation of canonical main open reading frames. EMBO J. 39, e104763 (2020).

  97. Mayr, C. Evolution and biological roles of alternative 3′UTRs. Trends Cell Biol. 26, 227–237 (2016).

  98. Subtelny, A. O., Eichhorn, S. W., Chen, G. R., Sive, H. & Bartel, D. P. Poly(A)-tail profiling reveals an embryonic switch in translational control. Nature 508, 66–71 (2014).

  99. Ozadam, H. et al. Single-cell quantification of ribosome occupancy in early mouse development. Nature 618, 1057–1064 (2023).

  100. Gruber, A. R. et al. Global 3′ UTR shortening has a limited effect on protein abundance in proliferating T cells. Nat. Commun. 5, 5465 (2014).

  101. Requião, R. D., Barros, G. C., Domitrovic, T. & Palhano, F. L. Influence of nascent polypeptide positive charges on translation dynamics. Biochem. J. 477, 2921–2934 (2020).

  102. Dao Duc, K. & Song, Y. S. The impact of ribosomal interference, codon usage, and exit tunnel interactions on translation elongation rate variation. PLoS Genet. 14, e1007166 (2018).

  103. Ahmed, N. et al. Pairs of amino acids at the P- and A-sites of the ribosome predictably and causally modulate translation–elongation rates. J. Mol. Biol. 432, 166696 (2020).

  104. Kirchner, S. & Ignatova, Z. Emerging roles of tRNA in adaptive translation, signalling dynamics and disease. Nat. Rev. Genet. 16, 98–112 (2015).

  105. Ingolia, N. T., Lareau, L. F. & Weissman, J. S. Ribosome profiling of mouse embryonic stem cells reveals the complexity and dynamics of mammalian proteomes. Cell 147, 789–802 (2011).

  106. Riba, A. et al. Protein synthesis rates and ribosome occupancies reveal determinants of translation elongation rates. Proc. Natl Acad. Sci. USA 116, 15023–15032 (2019).

  107. Barrington, C. L. et al. Synonymous codon usage regulates translation initiation. Cell Rep. 42, 113413 (2023).

  108. Lyons, E. F. et al. Translation elongation as a rate limiting step of protein production. Preprint at bioRxiv https://doi.org/10.1101/2023.11.27.568910 (2024).

  109. Chen, K. Y., Park, H. & Subramaniam, A. R. Massively parallel identification of sequence motifs triggering ribosome-associated mRNA quality control. Nucleic Acids Res. 52, 7171–7187 (2024).

  110. Bicknell, A. A. & Ricci, E. P. When mRNA translation meets decay. Biochem. Soc. Trans. 45, 339–351 (2017).

  111. Bicknell, A. A. et al. Attenuating ribosome load improves protein output from mRNA by limiting translation-dependent mRNA decay. Cell Rep. 43, 114098 (2024).

  112. Mishima, Y., Han, P., Ishibashi, K., Kimura, S. & Iwasaki, S. Ribosome slowdown triggers codon-mediated mRNA decay independently of ribosome quality control. EMBO J. 41, e109256 (2022).

  113. Bae, H. & Coller, J. Codon optimality-mediated mRNA degradation: linking translational elongation to mRNA stability. Mol. Cell 82, 1467–1476 (2022).

  114. Inada, T. Quality controls induced by aberrant translation. Nucleic Acids Res. 48, 1084–1096 (2020).

  115. Matsuo, Y. et al. RQT complex dissociates ribosomes collided on endogenous RQC substrate SDD1. Nat. Struct. Mol. Biol. 27, 323–332 (2020).

  116. Mercier, B. C. et al. Translation-dependent and -independent mRNA decay occur through mutually exclusive pathways defined by ribosome density during T cell activation. Genome Res. 34, 394–409 (2024).

  117. Leppek, K., Das, R. & Barna, M. Functional 5′ UTR mRNA structures in eukaryotic translation regulation and how to find them. Nat. Rev. Mol. Cell Biol. 19, 158–174 (2018).

  118. Liu, T.-Y. et al. Time-resolved proteomics extends ribosome profiling-based measurements of protein synthesis dynamics. Cell Syst. 4, 636–644 (2017).

  119. Shah, P., Ding, Y., Niemczyk, M., Kudla, G. & Plotkin, J. B. Rate-limiting steps in yeast protein translation. Cell 153, 1589–1601 (2013).

  120. The UniProt Consortium. UniProt: the universal protein knowledgebase in 2021. Nucleic Acids Res. 49, D480–D489 (2021).

  121. Gerashchenko, M. V. & Gladyshev, V. N. Translation inhibitors cause abnormalities in ribosome profiling experiments. Nucleic Acids Res. 42, e134 (2014).

  122. Rodriguez, J. M. et al. APPRIS: selecting functionally important isoforms. Nucleic Acids Res. 50, D54–D59 (2022).

  123. Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).

  124. Ke, G. et al. LightGBM: a highly efficient gradient boosting decision tree. In Proc. 31st International Conference on Neural Information Processing Systems (eds von Luxburg, U. & Guyon, I.) 3146–3154 (Curran Associates, 2017).

  125. Kokhlikyan, N. et al. Captum: a unified and generic model interpretability library for PyTorch. Preprint at https://arxiv.org/abs/2009.07896 (2020).

  126. Gudmundsson, S. et al. Addendum: the mutational constraint spectrum quantified from variation in 141,456 humans. Nature 597, E3–E4 (2021).

  127. Zheng, D., Wang, J. & Agarwal, V. RiboNN: a deep learning model to predict translation efficiency from mRNA sequence. Zenodo https://doi.org/10.5281/zenodo.15360345 (2025).

  128. Persyn, L., Liu, Y. & Cenik, C. Classic TE prediction model. Zenodo https://doi.org/10.5281/zenodo.15360966 (2025).

  129. Pagès, H., Aboyoun, P., Gentleman, R. & DebRoy, S. Biostrings: efficient manipulation of biological strings. Bioconductor https://doi.org/10.18129/B9.bioc.Biostrings (2025).

Acknowledgements

We thank I. Hoskins (UT Austin) for the code and data to generate secondary structure features and M. Miladi (Sanofi) for providing critical feedback. We thank C. Thoreen and W. Gilbert (Yale University) for sharing their data before publication. Research reported in this publication was supported in part by the National Institute of General Medical Sciences of the National Institutes of Health under award R35GM150667 (to C.C.). This work was also supported by the National Institutes of Health (grant HD110096) and the Welch Foundation (grant F-2027-20230405 to C.C.). C.C. was a CPRIT Scholar in Cancer Research supported by the CPRIT (grant RR180042).

Author information

Contributions

D.Z. trained RiboNN models, validated model predictions with public datasets and contributed to model interpretation. J.W. interpreted RiboNN, performed comparisons between TE and third-party measurements, and analyzed genetic variant data. L.P. trained and interpreted classic ML models. Y.L. helped synthesize the data compendia and developed the compositional approach to calculate TE. F.U.-M., C.C. and V.A. supervised the study. C.C. and V.A. conceptualized and designed the study.

Corresponding authors

Correspondence to Can Cenik or Vikram Agarwal.

Ethics declarations

Competing interests

D.Z., J.W., F.U.-M. and V.A. are employees of Sanofi and may hold shares and/or stock options in the company. The other authors declare no competing interests.

Peer review

Peer review information

Nature Biotechnology thanks the anonymous reviewers for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Intercomparison of mouse cell types.

a, Same as Fig. 1b, but displaying results for 68 mouse cell types. b, Scatter plot showing the correlation between mouse mean TE from this study and ribosomal loading measured in 3T3 cells (ref. 33). Pearson (r) and Spearman (ρ) correlation coefficients are indicated.
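The two coefficients reported in panel b can be computed as in the following minimal sketch. It is an illustration in plain Python rather than the authors' analysis code; the function names and the toy values are hypothetical. Spearman's ρ is taken here as Pearson's r applied to rank-transformed data, with ties receiving the mean of their ranks.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient r between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values, invented purely for illustration.
mean_te = [0.1, 0.4, 0.9, 1.6, 2.5]
ribosome_load = [1.0, 2.0, 3.0, 4.0, 5.0]
r = pearson(mean_te, ribosome_load)
rho = spearman(mean_te, ribosome_load)
```

Because the toy relationship is monotonic but not linear, ρ equals 1 while r falls slightly below 1, which is why reporting both coefficients can be informative.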

Extended Data Fig. 2 Visualization of the RiboNN model architecture.

Shown is a layer-by-layer graph of the RiboNN architecture, with input/output dimensions labeled for each layer. The ConvBlock in the broken-line box was applied 10 times in total to compress the sequence length. Light yellow nodes reflect input/output tensors, light blue nodes reflect functions, light green nodes reflect modules and numbers in parentheses reflect tensor dimensions.

Extended Data Fig. 3 Performance of deep learning models on all human cell types.

These panels mirror those shown in Supplementary Fig. 4, except that they show the performance of multitask deep learning models with one of four architectures: (1) our final RiboNN architecture, (2) our RiboNN architecture, ablating the input channel recording codon positions, (3) our RiboNN architecture, anchoring all mRNAs at their 5′ end instead of at start codons and (4) the Saluki architecture (ref. 7), removing the splice site input channel. For each architecture, the r² values were measured on ten held-out CV folds (n = 10,242 total genes among the ten folds). The center of each box corresponds to the median (the 50th percentile). The lower and upper hinges correspond to the first and third quartiles (the 25th and 75th percentiles). The upper whisker extends from the hinge to the largest value no further than 1.5× IQR (interquartile range, or distance between the first and third quartiles) from the hinge. The lower whisker extends from the hinge to the smallest value no further than 1.5× IQR from the hinge. Data beyond the ends of the whiskers are plotted individually.
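The box plot conventions described above can be illustrated with a minimal Python sketch. It is not the authors' plotting code: the function name `tukey_box_stats` is hypothetical, and the quartiles are computed with linear interpolation (the `inclusive` method of the standard library), which is an assumption because quantile conventions differ between plotting packages.

```python
from statistics import quantiles

def tukey_box_stats(values):
    """Summarize one box: median, hinges, whisker ends and outliers.

    Assumption: quartiles use linear interpolation ("inclusive" method).
    """
    x = sorted(float(v) for v in values)
    q1, med, q3 = quantiles(x, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers end at the most extreme data points that lie within the fences.
    inside = [v for v in x if low_fence <= v <= high_fence]
    outliers = [v for v in x if v < low_fence or v > high_fence]
    return {
        "median": med,
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,  # plotted individually
    }
```

Applied to the ten per-fold r² values of one architecture, the returned dictionary would hold the quantities drawn for one box.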

Extended Data Fig. 4 Performance of RiboNN on mouse cell types.

a, These panels mirror those shown in Supplementary Fig. 4, except that they show the performance of our multitask RiboNN model on mouse cell types using r² measured on ten held-out CV folds (n = 10,242 total genes among the ten folds). The center of each box corresponds to the median (the 50th percentile). The lower and upper hinges correspond to the first and third quartiles (the 25th and 75th percentiles). The upper whisker extends from the hinge to the largest value no further than 1.5× IQR (interquartile range, or distance between the first and third quartiles) from the hinge. The lower whisker extends from the hinge to the smallest value no further than 1.5× IQR from the hinge. Data beyond the ends of the whiskers are plotted individually. b,c, Scatter plots showing the relationship between our mouse RiboNN predictions and the observed mean TEs for human mRNAs (b), as well as the relationship between our human RiboNN predictions and the observed mean TEs for mouse mRNAs (c). Pearson (r) and Spearman (ρ) correlation coefficients are also shown. d,e, Scatter plots showing the relationship between sequence homology, considering the interspecies pair of mRNAs with the maximum homology, and the residual prediction error between the TE from one species and the TE predicted from the alternative species. This is shown for human (d) and mouse (e) mean TE data. ‘Max homology %’ was computed as follows: (1) all human–mouse mRNA pairs were locally aligned using the ‘pairwiseAlignment’ function from the Biostrings (version 2.70.2) R package (ref. 129) (‘match: 1, mismatch: −3, gap open: −2 and gap extend: −1’) and (2) for each mRNA, the final value was computed using the highest scoring alignment from the other species, calculating the maximum homology score divided by mRNA length.
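The ‘Max homology %’ calculation can be sketched in Python as follows. This is an illustrative reimplementation, not the Biostrings code the authors used: the function names are hypothetical, the alignment is a plain Smith-Waterman with affine gaps, a gap of length k is assumed to cost (gap open) + k × (gap extend), and the multiplication by 100 to obtain a percentage is also an assumption.

```python
def local_alignment_score(a, b, match=1, mismatch=-3, gap_open=-2, gap_extend=-1):
    """Best local alignment score (Smith-Waterman with affine gaps).

    Assumption: a gap of length k scores gap_open + k * gap_extend.
    """
    neg = float("-inf")
    prev_h = [0] * (len(b) + 1)    # best score of alignments ending at (i-1, j)
    prev_f = [neg] * (len(b) + 1)  # same, but ending with a gap opposite a[i-1]
    best = 0
    for i in range(1, len(a) + 1):
        h = [0] * (len(b) + 1)
        f = [neg] * (len(b) + 1)
        e = neg                    # ending with a gap opposite b[j-1]
        for j in range(1, len(b) + 1):
            e = max(h[j - 1] + gap_open + gap_extend, e + gap_extend)
            f[j] = max(prev_h[j] + gap_open + gap_extend, prev_f[j] + gap_extend)
            diag = prev_h[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            h[j] = max(0, diag, e, f[j])  # 0 restarts the local alignment
            best = max(best, h[j])
        prev_h, prev_f = h, f
    return best

def max_homology_percent(query, other_species_seqs):
    """Highest local alignment score against the other species, per query length."""
    best = max(local_alignment_score(query, s) for s in other_species_seqs)
    return 100.0 * best / len(query)
```

Because each match scores 1, the best score cannot exceed the query length, so the value stays between 0 and 100. The quadratic cost per pair makes this pure Python version suitable only for small examples.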

Extended Data Fig. 5 Interpretation of human and mouse RiboNN models.

a, Attribution score plot for the human RiboNN model focusing on specific regions along valid mRNAs (defined in Methods). The windows include the first 50 nt of the 5′ UTR, 50 nt upstream to 250 nt downstream of the start codon, 250 nt upstream and 250 nt downstream of the stop codon and the last 250 nt of the 3′ UTR. The absolute values of attribution scores were averaged across all valid mRNAs, which were grouped into one of four equally sized bins according to their mean TE. b, Metagene plot for the absolute value of attribution scores derived from the mouse RiboNN model, averaged across all mRNAs, for percentiles along the 5′ UTR, CDS and 3′ UTR. mRNAs were grouped into one of four equally sized bins according to their mean TE. c, Same as a, except that it reflects results from the mouse RiboNN model. d,e, Enriched motifs learned by human (d) and mouse (e) RiboNN models for each functional region of mRNA. Motifs are ranked by the number of seqlets (ref. 67) supporting each motif.

Extended Data Fig. 6 Amino acid-level-based correlation among codon influence scores.

a–c, Scatter plots showing the relationship between the amino acid-level-based codon influence (that is, the predicted effect size of each inserted codon, averaged across all positional bins and across codons for each amino acid) from the human RiboNN model and the mouse model (a), A-site ribosome occupancy scores (ref. 69) (b) and mean codon stability coefficients (ref. 49) (c). Pearson (r) and Spearman (ρ) correlation coefficients are also shown. The properties of amino acids are labeled by different colors for hydrophobicity and by different shapes for charge. The error bar represents the standard error across codons encoding the same amino acid. To compute amino acid-level scores, we computed the mean score among codons encoding the same amino acid. All 61 non-stop codons were included in the amino acid-based analysis.
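The aggregation from codon-level to amino acid-level scores can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `amino_acid_scores` is hypothetical, the standard genetic code is used, and the standard error is taken as the sample standard deviation divided by the square root of the number of synonymous codons. Amino acids encoded by a single codon (Met, Trp) have no defined standard error under this assumption, so `None` is returned for them.

```python
from itertools import product
from math import sqrt
from statistics import mean, stdev

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def amino_acid_scores(codon_scores):
    """Map {codon: score} to {amino acid: (mean, standard error or None)}."""
    by_aa = {}
    for codon, score in codon_scores.items():
        aa = CODON_TO_AA[codon.upper().replace("U", "T")]
        if aa == "*":  # stop codons are excluded, leaving 61 codons
            continue
        by_aa.setdefault(aa, []).append(score)
    result = {}
    for aa, scores in by_aa.items():
        # Assumed definition: sample standard deviation / sqrt(n).
        sem = stdev(scores) / sqrt(len(scores)) if len(scores) > 1 else None
        result[aa] = (mean(scores), sem)
    return result
```

The input would be one score per codon, for example a codon influence score already averaged across positional bins.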

Extended Data Fig. 7 In silico mutagenesis of disease-associated gene 5′ UTRs.

a–c, In silico mutagenesis of 5′ UTR regions of RDH12 (a), ENG (b) and FGF13 (c). Positions of wild-type uAUG are highlighted in purple at the top. The known disease-associated variants are boxed. Single-point mutations resulting in predicted TE differences are shown alongside annotations reflecting the corresponding gain or loss of TE.
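The general idea of in silico mutagenesis can be sketched in Python as follows: every single-nucleotide substitution in a 5′ UTR is scored with a predictor, and the difference from the wild-type prediction is recorded. All names here are hypothetical, and `toy_predict_te` is a toy stand-in that only counts upstream AUGs; it is not RiboNN and is included solely so that the sketch runs. Sequences are assumed to use the RNA alphabet (A, C, G, U).

```python
def single_point_mutants(utr):
    """Yield (position, ref, alt, mutant sequence) for every substitution."""
    for pos, ref in enumerate(utr):
        for alt in "ACGU":
            if alt != ref:
                yield pos, ref, alt, utr[:pos] + alt + utr[pos + 1:]

def mutagenesis_scan(utr, predict_te):
    """Return (position, ref, alt, predicted TE change) for each mutant."""
    baseline = predict_te(utr)
    return [
        (pos, ref, alt, predict_te(mutant) - baseline)
        for pos, ref, alt, mutant in single_point_mutants(utr)
    ]

def toy_predict_te(utr):
    """Toy stand-in, NOT RiboNN: each upstream AUG lowers the score by 1."""
    return -float(sum(utr[i:i + 3] == "AUG" for i in range(len(utr) - 2)))
```

With a real model, `predict_te` would wrap the model's prediction for the full mRNA carrying the mutated 5′ UTR; positive differences would correspond to a predicted gain of TE and negative differences to a predicted loss.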

Extended Data Fig. 8 In silico mutagenesis of disease-associated gene 5′ UTRs.

Continuation of results from Extended Data Fig. 7, except with BCL2L13.

Extended Data Fig. 9 In silico mutagenesis of cancer-associated gene 5′ UTRs.

a–d, In silico mutagenesis of 5′ UTR regions of ADAM32 (a), NUMA1 (b), COMT (c) and QARS (d). Positions of wild-type uAUG are highlighted in purple at the top. The known cancer-associated variants are boxed. Single-point mutations resulting in predicted TE differences are shown alongside annotations reflecting the corresponding gain or loss of TE.

Extended Data Fig. 10 In silico mutagenesis of cancer-associated gene 5′ UTRs.

Continuation of results from Extended Data Fig. 9, except with AKT3.

Supplementary information

Supplementary Information

Supplementary Discussion, Methods and Figs. 1–12.

Reporting Summary

Supplementary Table 1

Feature sizes, sequences, CV folds and TEs of human genes.

Supplementary Table 2

Feature sizes, sequences, CV folds and TEs of mouse genes.

Supplementary Table 3

Feature sizes, sequences, CV folds and TEs predicted by the human RiboNN models.

Supplementary Table 4

Feature sizes, sequences, CV folds and TEs predicted by the mouse RiboNN models.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zheng, D., Persyn, L., Wang, J. et al. Predicting the translation efficiency of messenger RNA in mammalian cells. Nat Biotechnol (2025). https://doi.org/10.1038/s41587-025-02712-x

  • DOI: https://doi.org/10.1038/s41587-025-02712-x
