Predicting the translation efficiency of messenger RNA in mammalian cells

Zheng, Dinghai; Persyn, Logan; Wang, Jun; Liu, Yue; Ulloa-Montoya, Fernando; Cenik, Can; Agarwal, Vikram

doi:10.1038/s41587-025-02712-x

Article
Published: 25 July 2025

Nature Biotechnology (2025)

Abstract

The mechanisms by which mRNA sequences specify translational control remain poorly understood in mammalian cells. Here we generate a transcriptome-wide atlas of translation efficiency (TE) measurements encompassing more than 140 human and mouse cell types from 3,819 ribosomal profiling datasets. We develop RiboNN, a state-of-the-art multitask deep convolutional neural network, and classic machine learning models to predict TEs in hundreds of cell types from sequence-encoded mRNA features. While most earlier models solely considered the 5′ untranslated region (UTR) sequence, RiboNN integrates how the spatial positioning of low-level dinucleotide and trinucleotide features (that is, including codons) influences TE, capturing mechanistic principles such as how ribosomal processivity and tRNA abundance control translational output. RiboNN predicts the translational behavior of base-modified therapeutic RNA and explains evolutionary selection pressures in human 5′ UTRs. Finally, it detects a common language governing mRNA regulatory control and highlights the interconnectedness of mRNA translation, stability and localization in mammalian organisms.

**Fig. 1: Integrative analysis of thousands of human and mouse ribosomal profiling datasets measuring TE.**

**Fig. 2: A classical ML approach to predict mammalian TEs from mRNA sequence.**

**Fig. 3: Performance and interpretation of deep learning models predicting mammalian TEs from mRNA sequence.**

**Fig. 4: RiboNN predicts the impact of RNA modifications, genetic variants and reporter constructs on translation.**

**Fig. 5: Interrelationships between mRNA translation, turnover and subcellular localization.**

Data availability

We provide the processed data without restriction in supplementary tables herein.

Code availability

Code and pretrained models are available on Zenodo¹²⁷ and GitHub (https://github.com/Sanofi-Public/RiboNN/). Our classic ML model code is available on Zenodo¹²⁸ and GitHub (https://github.com/CenikLab/TE_classic_ML).

References

Agarwal, V. & Shendure, J. Predicting mRNA abundance directly from genomic sequence using deep convolutional neural networks. Cell Rep. 31, 107663 (2020).
Zhou, J. et al. Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk. Nat. Genet. 50, 1171–1179 (2018).
Avsec, Ž. et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nat. Methods 18, 1196–1203 (2021).
Kelley, D. R. et al. Sequential regulatory activity prediction across chromosomes with convolutional neural networks. Genome Res. 28, 739–750 (2018).
Wang, J. & Agarwal, V. How DNA encodes the start of transcription. Science 384, 382–383 (2024).
Linder, J., Srivastava, D., Yuan, H., Agarwal, V. & Kelley, D. R. Predicting RNA-seq coverage from DNA sequence as a unifying model of gene regulation. Nat. Genet. 57, 949–961 (2025).
Agarwal, V. & Kelley, D. R. The genetic and biochemical determinants of mRNA degradation rates in mammals. Genome Biol. 23, 245 (2022).
Gingold, H. & Pilpel, Y. Determinants of translation efficiency and accuracy. Mol. Syst. Biol. 7, 481 (2011).
Zur, H. & Tuller, T. Predictive biophysical modeling and understanding of the dynamics of mRNA translation and its evolution. Nucleic Acids Res. 44, 9031–9049 (2016).
Nieuwkoop, T. et al. Revealing determinants of translation efficiency via whole-gene codon randomization and machine learning. Nucleic Acids Res. 51, 2363–2376 (2023).
Shao, B. et al. Riboformer: a deep learning framework for predicting context-dependent translation dynamics. Nat. Commun. 15, 2011 (2024).
Tian, T., Li, S., Lang, P., Zhao, D. & Zeng, J. Full-length ribosome density prediction by a multi-input and multi-output model. PLoS Comput. Biol. 17, e1008842 (2021).
Tunney, R. et al. Accurate design of translational output by a neural network model of ribosome distribution. Nat. Struct. Mol. Biol. 25, 577–582 (2018).
Sample, P. J. et al. Human 5′ UTR design and variant effect prediction from a massively parallel translation assay. Nat. Biotechnol. 37, 803–809 (2019).
Cao, J. et al. High-throughput 5′ UTR engineering for enhanced protein production in non-viral gene therapies. Nat. Commun. 12, 4138 (2021).
Karollus, A., Avsec, Ž. & Gagneur, J. Predicting mean ribosome load for 5′UTR of any length using deep learning. PLoS Comput. Biol. 17, e1008982 (2021).
Bazzini, A. A. et al. Codon identity regulates mRNA stability and translation efficiency during the maternal-to-zygotic transition. EMBO J. 35, 2087–2103 (2016).
Hanson, G. & Coller, J. Codon optimality, bias and usage in translation and mRNA decay. Nat. Rev. Mol. Cell Biol. 19, 20–30 (2018).
Li, S. et al. CodonBERT large language model for mRNA vaccines. Genome Res. 34, 1027–1035 (2024).
Szostak, E. & Gebauer, F. Translational control by 3′-UTR-binding proteins. Brief. Funct. Genomics 12, 58–65 (2013).
Floor, S. N. & Doudna, J. A. Tunable protein synthesis by transcript isoforms in human cells. eLife 5, e10921 (2016).
Schlusser, N., González, A., Pandey, M. & Zavolan, M. Current limitations in predicting mRNA translation with deep learning models. Genome Biol. 25, 227 (2024).
Li, S. et al. mRNA-LM: full-length integrated SLM for mRNA analysis. Nucleic Acids Res. 53, gkaf044 (2025).
Vogel, C. et al. Sequence signatures and mRNA concentration can explain two-thirds of protein abundance variation in a human cell line. Mol. Syst. Biol. 6, 400 (2010).
Eraslan, B. et al. Quantification and discovery of sequence determinants of protein-per-mRNA amount in 29 human tissues. Mol. Syst. Biol. 15, e8513 (2019).
Eisen, T. J., Li, J. J. & Bartel, D. P. The interplay between translational efficiency, poly(A) tails, microRNAs, and neuronal activation. RNA 28, 808–831 (2022).
Li, J. J., Chew, G.-L. & Biggin, M. D. Quantitative principles of cis-translational control by general mRNA sequence features in eukaryotes. Genome Biol. 20, 162 (2019).
Battle, A. et al. Genomic variation. Impact of regulatory variation from RNA to protein. Science 347, 664–667 (2015).
Cenik, C. et al. Integrative analysis of RNA, translation, and protein levels reveals distinct regulatory variation across humans. Genome Res. 25, 1610–1621 (2015).
Schwanhäusser, B. et al. Global quantification of mammalian gene expression control. Nature 473, 337–342 (2011).
Jovanovic, M. et al. Immunogenetics. Dynamic profiling of the protein life cycle in response to pathogens. Science 347, 1259038 (2015).
Hernandez-Alias, X., Benisty, H., Radusky, L. G., Serrano, L. & Schaefer, M. H. Using protein-per-mRNA differences among human tissues in codon optimization. Genome Biol. 24, 34 (2023).
Spies, N., Burge, C. B. & Bartel, D. P. 3′UTR-isoform choice has limited influence on the stability and translational efficiency of most mRNAs in mouse fibroblasts. Genome Res. 23, 2078–2090 (2013).
Ingolia, N. T., Ghaemmaghami, S., Newman, J. R. S. & Weissman, J. S. Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science 324, 218–223 (2009).
Li, J. J., Bickel, P. J. & Biggin, M. D. System wide analyses have underestimated protein abundances and the importance of transcription in mammals. PeerJ 2, e270 (2014).
Gorgoni, B., Marshall, E., McFarland, M. R., Romano, M. C. & Stansfield, I. Controlling translation elongation efficiency: tRNA regulation of ribosome flux on the mRNA. Biochem. Soc. Trans. 42, 160–165 (2014).
Sonenberg, N. & Hinnebusch, A. G. Regulation of translation initiation in eukaryotes: mechanisms and biological targets. Cell 136, 731–745 (2009).
Jackson, R. J., Hellen, C. U. T. & Pestova, T. V. The mechanism of eukaryotic translation initiation and principles of its regulation. Nat. Rev. Mol. Cell Biol. 11, 113–127 (2010).
Hinnebusch, A. G., Ivanov, I. P. & Sonenberg, N. Translational control by 5′-untranslated regions of eukaryotic mRNAs. Science 352, 1413–1416 (2016).
Sharp, P. M. & Li, W. H. An evolutionary perspective on synonymous codon usage in unicellular organisms. J. Mol. Evol. 24, 28–38 (1986).
Presnyak, V. et al. Codon optimality is a major determinant of mRNA stability. Cell 160, 1111–1124 (2015).
Torrent, M., Chalancon, G., de Groot, N. S., Wuster, A. & Madan Babu, M. Cells alter their tRNA abundance to selectively regulate protein synthesis during stress conditions. Sci. Signal. 11, eaat6409 (2018).
Weinberg, D. E. et al. Improved ribosome-footprint and mRNA measurements provide insights into dynamics and regulation of yeast translation. Cell Rep. 14, 1787–1799 (2016).
Gamble, C. E., Brule, C. E., Dean, K. M., Fields, S. & Grayhack, E. J. Adjacent codons act in concert to modulate translation efficiency in yeast. Cell 166, 679–690 (2016).
Mauger, D. M. et al. mRNA structure regulates protein expression through changes in functional half-life. Proc. Natl Acad. Sci. USA 116, 24075–24083 (2019).
Verma, M. et al. A short translational ramp determines the efficiency of protein synthesis. Nat. Commun. 10, 5774 (2019).
Burke, P. C., Park, H. & Subramaniam, A. R. A nascent peptide code for translational control of mRNA stability in human cells. Nat. Commun. 13, 6829 (2022).
Narula, A., Ellis, J., Taliaferro, J. M. & Rissland, O. S. Coding regions affect mRNA stability in human cells. RNA 25, 1751–1764 (2019).
Forrest, M. E. et al. Codon and amino acid content are associated with mRNA stability in mammalian cells. PLoS ONE 15, e0228730 (2020).
Wu, Q. et al. Translation affects mRNA stability in a codon-dependent manner in human cells. eLife 8, e45396 (2019).
Hia, F. et al. Codon bias confers stability to human mRNAs. EMBO Rep. 20, e48220 (2019).
Zhu, X., Cruz, V. E., Zhang, H., Erzberger, J. P. & Mendell, J. T. Specific tRNAs promote mRNA decay by recruiting the CCR4-NOT complex to translating ribosomes. Science 386, eadq8587 (2024).
Ozadam, H., Geng, M. & Cenik, C. RiboFlow, RiboR and RiboPy: an ecosystem for analyzing ribosome profiling data at read length resolution. Bioinformatics 36, 2929–2931 (2020).
Liu, Y. et al. Translation efficiency covariation across cell types is a conserved organizing principle of mammalian transcriptomes. Preprint at bioRxiv https://doi.org/10.1101/2024.08.11.607360 (2024).
Larsson, O., Sonenberg, N. & Nadon, R. Identification of differential translation in genome wide studies. Proc. Natl Acad. Sci. USA 107, 21487–21492 (2010).
Guo, J. U. & Bartel, D. P. RNA G-quadruplexes are globally unfolded in eukaryotic cells and depleted in bacteria. Science 353, aaf5371 (2016).
Wang, D. et al. A deep proteome and transcriptome abundance atlas of 29 healthy human tissues. Mol. Syst. Biol. 15, e8503 (2019).
Rogers, D. W., Böttcher, M. A., Traulsen, A. & Greig, D. Ribosome reinitiation can explain length-dependent translation of messenger RNA. PLoS Comput. Biol. 13, e1005592 (2017).
Fernandes, L. D., de Moura, A. P. S. & Ciandrini, L. Gene length as a regulator for ribosome recruitment and protein synthesis: theoretical insights. Sci. Rep. 7, 17409 (2017).
Witte, F. et al. A trans locus causes a ribosomopathy in hypertrophic hearts that affects mRNA translation in a protein length-dependent fashion. Genome Biol. 22, 191 (2021).
Thompson, M. K., Rojas-Duran, M. F., Gangaramani, P. & Gilbert, W. V. The ribosomal protein Asc1/RACK1 is required for efficient translation of short mRNAs. eLife 5, e11154 (2016).
Dever, T. E., Ivanov, I. P. & Hinnebusch, A. G. Translational regulation by uORFs and start codon selection stringency. Genes Dev. 37, 474–489 (2023).
Lewis, C. J. T. et al. Quantitative profiling of human translation initiation reveals elements that potently regulate endogenous and therapeutically modified mRNAs. Mol. Cell 85, 445–445 (2024).
Strayer, E. C. et al. NaP-TRAP reveals the regulatory grammar in 5′UTR-mediated translation regulation during zebrafish development. Nat. Commun. 15, 10898 (2024).
Alqaraawi, A., Schuessler, M., Weiß, P., Costanza, E. & Berthouze, N. Evaluating saliency map explanations for convolutional neural networks: a user study. Preprint at https://arxiv.org/abs/2002.00772 (2020).
Simonyan, K., Vedaldi, A. & Zisserman, A. Deep inside convolutional networks: visualising image classification models and saliency maps. Preprint at https://arxiv.org/abs/1312.6034 (2013).
Shrikumar, A. et al. Technical note on transcription factor motif discovery from importance scores (TF-MoDISco) version 0.5.6.5. Preprint at https://arxiv.org/abs/1811.00416 (2018).
Chu, D. et al. Translation elongation can control translation initiation on eukaryotic mRNAs. EMBO J. 33, 21–34 (2014).
Wu, C. C.-C., Zinshteyn, B., Wehner, K. A. & Green, R. High-resolution ribosome profiling defines discrete ribosome elongation states and translational regulation during cellular stress. Mol. Cell 73, 959–970 (2019).
Gogakos, T. et al. Characterizing expression and processing of precursor and mature human tRNAs by hydro-tRNAseq and PAR-CLIP. Cell Rep. 20, 1463–1475 (2017).
Sterne-Weiler, T. et al. Frac-seq reveals isoform-specific recruitment to polyribosomes. Genome Res. 23, 1615–1623 (2013).
Ritter, A. J., Draper, J. M., Vollmers, C. & Sanford, J. R. Long-read subcellular fractionation and sequencing reveals the translational fate of full-length mRNA isoforms during neuronal differentiation. Genome Res. 34, 2000–2011 (2024).
Nachtergaele, S. & He, C. Chemical modifications in the life of an mRNA transcript. Annu. Rev. Genet. 52, 349–372 (2018).
Whiffin, N. et al. Characterising the loss-of-function impact of 5′ untranslated region variants in 15,708 individuals. Nat. Commun. 11, 2523 (2020).
Sevilla, T. et al. Mutations in the MORC2 gene cause axonal Charcot–Marie–Tooth disease. Brain 139, 62–72 (2015).
Dueñas Rey, A. et al. Combining a prioritization strategy and functional studies nominates 5′UTR variants underlying inherited retinal disease. Genome Med. 16, 7 (2024).
Liu, L. et al. Mutation of the CDKN2A 5′ UTR creates an aberrant initiation codon and predisposes to melanoma. Nat. Genet. 21, 128–132 (1999).
Damjanovich, K. et al. 5′UTR mutations of ENG cause hereditary hemorrhagic telangiectasia. Orphanet J. Rare Dis. 6, 85 (2011).
Pan, X. et al. 5′-UTR SNP of FGF13 causes translational defect and intellectual disability. eLife 10, e63021 (2021).
Lee, D. S. M. et al. Disrupting upstream translation in mRNAs is associated with human disease. Nat. Commun. 12, 1515 (2021).
Lim, Y. et al. Multiplexed functional genomic analysis of 5′ untranslated region mutations across the spectrum of prostate cancer. Nat. Commun. 12, 4217 (2021).
Stephens, S. B. & Nicchitta, C. V. Divergent regulation of protein synthesis in the cytosol and endoplasmic reticulum compartments of mammalian cells. Mol. Biol. Cell 19, 623–632 (2008).
Horste, E. L. et al. Subcytoplasmic location of translation controls protein output. Mol. Cell 83, 4509–4523 (2023).
Hubstenberger, A. et al. P-body purification reveals the condensation of repressed mRNA regulons. Mol. Cell 68, 144–157 (2017).
Chew, G.-L., Pauli, A. & Schier, A. F. Conservation of uORF repressiveness and sequence features in mouse, human and zebrafish. Nat. Commun. 7, 11663 (2016).
Jia, L. et al. Decoding mRNA translatability and stability from the 5′ UTR. Nat. Struct. Mol. Biol. 27, 814–821 (2020).
Akirtava, C., May, G. E. & McManus, C. J. Deciphering the landscape of cis-acting sequences in natural yeast transcript leaders. Nucleic Acids Res. 53, gkaf165 (2025).
Choi, Y. et al. Time-resolved profiling of RNA binding proteins throughout the mRNA life cycle. Mol. Cell 84, 1764–1782 (2024).
Singh, G., Pratt, G., Yeo, G. W. & Moore, M. J. The clothes make the mRNA: past and present trends in mRNP fashion. Annu. Rev. Biochem. 84, 325–354 (2015).
May, G. E. et al. Unraveling the influences of sequence and position on yeast uORF activity using massively parallel reporter systems and machine learning. eLife 12, e69611 (2023).
Arribere, J. A. et al. Translation readthrough mitigation. Nature 534, 719–723 (2016).
Kramarski, L. & Arbely, E. Translational read-through promotes aggregation and shapes stop codon identity. Nucleic Acids Res. 48, 3747–3760 (2020).
Yordanova, M. M. et al. AMD1 mRNA employs ribosome stalling as a mechanism for molecular memory formation. Nature 553, 356–360 (2018).
Hashimoto, S., Nobuta, R., Izawa, T. & Inada, T. Translation arrest as a protein quality control system for aberrant translation of the 3′-UTR in mammalian cells. FEBS Lett. 593, 777–787 (2019).
Sherlock, M. E., Baquero Galvis, L., Vicens, Q., Kieft, J. S. & Jagannathan, S. Principles, mechanisms, and biological implications of translation termination–reinitiation. RNA 29, 865–884 (2023).
Wu, Q. et al. Translation of small downstream ORFs enhances translation of canonical main open reading frames. EMBO J. 39, e104763 (2020).
Mayr, C. Evolution and biological roles of alternative 3′UTRs. Trends Cell Biol. 26, 227–237 (2016).
Subtelny, A. O., Eichhorn, S. W., Chen, G. R., Sive, H. & Bartel, D. P. Poly(A)-tail profiling reveals an embryonic switch in translational control. Nature 508, 66–71 (2014).
Ozadam, H. et al. Single-cell quantification of ribosome occupancy in early mouse development. Nature 618, 1057–1064 (2023).
Gruber, A. R. et al. Global 3′ UTR shortening has a limited effect on protein abundance in proliferating T cells. Nat. Commun. 5, 5465 (2014).
Requião, R. D., Barros, G. C., Domitrovic, T. & Palhano, F. L. Influence of nascent polypeptide positive charges on translation dynamics. Biochem. J. 477, 2921–2934 (2020).
Dao Duc, K. & Song, Y. S. The impact of ribosomal interference, codon usage, and exit tunnel interactions on translation elongation rate variation. PLoS Genet. 14, e1007166 (2018).
Ahmed, N. et al. Pairs of amino acids at the P- and A-sites of the ribosome predictably and causally modulate translation–elongation rates. J. Mol. Biol. 432, 166696 (2020).
Kirchner, S. & Ignatova, Z. Emerging roles of tRNA in adaptive translation, signalling dynamics and disease. Nat. Rev. Genet. 16, 98–112 (2015).
Ingolia, N. T., Lareau, L. F. & Weissman, J. S. Ribosome profiling of mouse embryonic stem cells reveals the complexity and dynamics of mammalian proteomes. Cell 147, 789–802 (2011).
Riba, A. et al. Protein synthesis rates and ribosome occupancies reveal determinants of translation elongation rates. Proc. Natl Acad. Sci. USA 116, 15023–15032 (2019).
Barrington, C. L. et al. Synonymous codon usage regulates translation initiation. Cell Rep. 42, 113413 (2023).
Lyons, E. F. et al. Translation elongation as a rate limiting step of protein production. Preprint at bioRxiv https://doi.org/10.1101/2023.11.27.568910 (2024).
Chen, K. Y., Park, H. & Subramaniam, A. R. Massively parallel identification of sequence motifs triggering ribosome-associated mRNA quality control. Nucleic Acids Res. 52, 7171–7187 (2024).
Bicknell, A. A. & Ricci, E. P. When mRNA translation meets decay. Biochem. Soc. Trans. 45, 339–351 (2017).
Bicknell, A. A. et al. Attenuating ribosome load improves protein output from mRNA by limiting translation-dependent mRNA decay. Cell Rep. 43, 114098 (2024).
Mishima, Y., Han, P., Ishibashi, K., Kimura, S. & Iwasaki, S. Ribosome slowdown triggers codon-mediated mRNA decay independently of ribosome quality control. EMBO J. 41, e109256 (2022).
Bae, H. & Coller, J. Codon optimality-mediated mRNA degradation: linking translational elongation to mRNA stability. Mol. Cell 82, 1467–1476 (2022).
Inada, T. Quality controls induced by aberrant translation. Nucleic Acids Res. 48, 1084–1096 (2020).
Matsuo, Y. et al. RQT complex dissociates ribosomes collided on endogenous RQC substrate SDD1. Nat. Struct. Mol. Biol. 27, 323–332 (2020).
Mercier, B. C. et al. Translation-dependent and -independent mRNA decay occur through mutually exclusive pathways defined by ribosome density during T cell activation. Genome Res. 34, 394–409 (2024).
Leppek, K., Das, R. & Barna, M. Functional 5′ UTR mRNA structures in eukaryotic translation regulation and how to find them. Nat. Rev. Mol. Cell Biol. 19, 158–174 (2018).
Liu, T.-Y. et al. Time-resolved proteomics extends ribosome profiling-based measurements of protein synthesis dynamics. Cell Syst. 4, 636–644 (2017).
Shah, P., Ding, Y., Niemczyk, M., Kudla, G. & Plotkin, J. B. Rate-limiting steps in yeast protein translation. Cell 153, 1589–1601 (2013).
The UniProt Consortium. UniProt: the universal protein knowledgebase in 2021. Nucleic Acids Res. 49, D480–D489 (2021).
Gerashchenko, M. V. & Gladyshev, V. N. Translation inhibitors cause abnormalities in ribosome profiling experiments. Nucleic Acids Res. 42, e134 (2014).
Rodriguez, J. M. et al. APPRIS: selecting functionally important isoforms. Nucleic Acids Res. 50, D54–D59 (2022).
Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Ke, G. et al. LightGBM: a highly efficient gradient boosting decision tree. In Proc. 31st International Conference on Neural Information Processing Systems (eds von Luxburg, U. & Guyon, I.) 3146–3154 (Curran Associates, 2017).
Kokhlikyan, N. et al. Captum: a unified and generic model interpretability library for PyTorch. Preprint at https://arxiv.org/abs/2009.07896 (2020).
Gudmundsson, S. et al. Addendum: the mutational constraint spectrum quantified from variation in 141,456 humans. Nature 597, E3–E4 (2021).
Zheng, D., Wang, J. & Agarwal, V. RiboNN: a deep learning model to predict translation efficiency from mRNA sequence. Zenodo https://doi.org/10.5281/zenodo.15360345 (2025).
Persyn, L., Liu, Y. & Cenik, C. Classic TE prediction model. Zenodo https://doi.org/10.5281/zenodo.15360966 (2025).
Pagès, H., Aboyoun, P., Gentleman, R. & DebRoy, S. Biostrings: efficient manipulation of biological strings. Bioconductor https://doi.org/10.18129/B9.bioc.Biostrings (2025).

Acknowledgements

We thank I. Hoskins (UT Austin) for the code and data to generate secondary structure features and M. Miladi (Sanofi) for providing critical feedback. We thank C. Thoreen and W. Gilbert (Yale University) for sharing their data before publication. Research reported in this publication was supported in part by the National Institute of General Medical Sciences of the National Institutes of Health under award R35GM150667 (to C.C.). This work was also supported by the National Institutes of Health (grant HD110096) and the Welch Foundation (grant F-2027-20230405 to C.C.). C.C. was a CPRIT Scholar in Cancer Research supported by the CPRIT (grant RR180042).

Author information

These authors contributed equally: Dinghai Zheng, Logan Persyn, Jun Wang.

Authors and Affiliations

mRNA Center of Excellence, Sanofi, Waltham, MA, USA
Dinghai Zheng, Jun Wang, Fernando Ulloa-Montoya & Vikram Agarwal
Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, USA
Logan Persyn, Yue Liu & Can Cenik

Authors

Dinghai Zheng
Logan Persyn
Jun Wang
Yue Liu
Fernando Ulloa-Montoya
Can Cenik
Vikram Agarwal

Contributions

D.Z. trained RiboNN models, validated model predictions with public datasets and contributed to model interpretation. J.W. interpreted RiboNN, performed comparisons between TE and third-party measurements, and analyzed genetic variant data. L.P. trained and interpreted classic ML models. Y.L. helped synthesize the data compendia and developed the compositional approach to calculate TE. F.U.-M., C.C. and V.A. supervised the study. C.C. and V.A. conceptualized and designed the study.

Corresponding authors

Correspondence to Can Cenik or Vikram Agarwal.

Ethics declarations

Competing interests

D.Z., J.W., F.U.-M. and V.A. are employees of Sanofi and may hold shares and/or stock options in the company. The other authors declare no competing interests.

Peer review

Peer review information

Nature Biotechnology thanks the anonymous reviewers for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Intercomparison of mouse cell types.

a, Same as Fig. 1b, except that results are displayed for 68 mouse cell types. b, Scatter plot showing the correlation between mouse mean TE from this study and ribosomal loading measured in 3T3 cells³³. Pearson (r) and Spearman (ρ) correlation coefficients are indicated.
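
As an illustration of the two statistics reported in panel b, the following Python sketch computes Pearson's r and Spearman's ρ for paired measurements. The function names are hypothetical and are not taken from the study's code, and the use of average ranks for ties is an assumption about the convention rather than something the caption states.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient r for two equal-length sequences.

    Raises ZeroDivisionError if either input is constant.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their ranks (assumed convention)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because ρ depends only on ranks, a monotonic but nonlinear relationship between mean TE and another measurement gives ρ = 1 while r stays below 1, which is one reason both coefficients are commonly reported together.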

Extended Data Fig. 2 Visualization of the RiboNN model architecture.

Shown is a layer-by-layer graph of the RiboNN architecture, with input/output dimensions labeled for each layer. The ConvBlock in the broken-line box was applied 10 times in total to compress the sequence length. Light yellow nodes reflect input/output tensors, light blue nodes reflect functions, light green nodes reflect modules and numbers in parentheses reflect tensor dimensions.

Extended Data Fig. 3 Performance of deep learning models on all human cell types.

These panels mirror those shown in Supplementary Fig. 4, except that they show the performance of multitask deep learning models with one of four architectures: (1) our final RiboNN architecture, (2) our RiboNN architecture, ablating the input channel recording codon positions, (3) our RiboNN architecture, anchoring all mRNAs at their 5′ end instead of at start codons and (4) the Saluki architecture⁷, removing the splice site input channel. For each architecture, the r² values were measured on ten held-out CV folds (n = 10,242 total genes among the ten folds). The center of each box corresponds to the median (the 50th percentile). The lower and upper hinges correspond to the first and third quartiles (the 25th and 75th percentiles). The upper whisker extends from the hinge to the largest value no further than 1.5× IQR (interquartile range, or distance between the first and third quartiles) from the hinge. The lower whisker extends from the hinge to the smallest value no further than 1.5× IQR from the hinge. Data beyond the ends of the whiskers are plotted individually.
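
The box plot conventions described above can be sketched in a few lines of Python. This is a minimal illustration, not the plotting code used for the figure: the function name `box_stats` is hypothetical, and the quartile interpolation rule (`method="inclusive"`) is an assumption, since the caption does not say which quantile definition was used.

```python
from statistics import quantiles


def box_stats(values):
    """Summarize values (for example, per-fold r² scores) as box plot elements."""
    # Quartiles; the interpolation method is an assumed convention.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme data points that lie within the fences.
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "median": median,
        "lower_hinge": q1,
        "upper_hinge": q3,
        "lower_whisker": min(inside),
        "upper_whisker": max(inside),
        # Points beyond the whiskers are the ones plotted individually.
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }
```

Note that the whiskers end at observed data points, not at the fences themselves, so a whisker can be shorter than 1.5× IQR.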

Extended Data Fig. 4 Performance of RiboNN on mouse cell types.

a, These panels mirror those shown in Supplementary Fig. 4, except they show the performance of our multitask RiboNN model on mouse cell types using r² measured on ten held-out CV folds (n = 10,242 total genes among the ten folds). The center of the boxes corresponds to the median (the 50th percentile). The lower and upper hinges correspond to the first and third quartiles (the 25th and 75th percentiles). The upper whisker extends from the hinge to the largest value no further than 1.5× IQR (interquartile range, or distance between the first and third quartiles) from the hinge. The lower whisker extends from the hinge to the smallest value no further than 1.5× IQR from the hinge. Data beyond the end of the whiskers are plotted individually. b,c, Scatter plots showing the relationships between our mouse RiboNN predictions and the observed mean TEs for human mRNAs (b) as well as the relationships between our human RiboNN predictions and the observed mean TEs for mouse mRNAs (c). Pearson (r) and Spearman (ρ) correlation coefficients are also shown. d,e, Scatter plots showing the relationships between sequence homology, considering the interspecies pair of mRNAs with the maximum homology, and the residual prediction error between the TE from one species and the TE predicted from the alternative species. This is shown for human (d) and mouse (e) mean TE data. ‘Max homology %’ was computed as follows: (1) all human–mouse mRNA pairs were locally aligned using the ‘pairwiseAlignment’ function from the Biostrings (version 2.70.2) R package¹²⁹ (‘match: 1, mismatch: −3, gap open: −2 and gap extend: −1’) and (2) for each mRNA, the final value was computed using the highest scoring alignment from the other species, calculating the maximum homology score divided by mRNA length.
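The ‘Max homology %’ computation described in this legend can be illustrated with a small pure-Python local aligner. This is a sketch under stated assumptions, not the Biostrings implementation: the function names are hypothetical, a gap of length k is assumed to cost gap_open + k × gap_extend, and multiplying the ratio by 100 to express a percentage is also an assumption.

```python
def local_alignment_score(a, b, match=1, mismatch=-3, gap_open=-2, gap_extend=-1):
    """Best local (Smith-Waterman) score with affine gaps (Gotoh recurrences).

    Assumption: a gap of length k costs gap_open + k * gap_extend.
    """
    neg = float("-inf")
    cols = len(b) + 1
    h_prev = [0] * cols      # best local scores for the previous row of a
    f_prev = [neg] * cols    # previous-row scores ending with a gap in b
    best = 0
    for i in range(1, len(a) + 1):
        h_cur = [0] * cols
        f_cur = [neg] * cols
        e = neg              # score ending with a gap in a (b residue unpaired)
        for j in range(1, cols):
            e = max(e + gap_extend, h_cur[j - 1] + gap_open + gap_extend)
            f_cur[j] = max(f_prev[j] + gap_extend, h_prev[j] + gap_open + gap_extend)
            diag = h_prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: scores are floored at zero.
            h_cur[j] = max(0, diag, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


def max_homology_percent(query, other_species_mrnas):
    """Highest local alignment score against the other species, per query nucleotide."""
    top = max(local_alignment_score(query, t) for t in other_species_mrnas)
    # Assumption: the ratio is scaled by 100 to give a percentage.
    return 100.0 * top / len(query)
```

With a match score of 1, an mRNA that aligns perfectly along its full length to an mRNA from the other species reaches 100 under these assumptions. The quadratic running time makes this sketch suitable only for short toy sequences, not for all-versus-all transcriptome comparisons.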

Extended Data Fig. 5 Interpretation of human and mouse RiboNN models.

a, Attribution score plot for the human RiboNN model focusing on specific regions along valid mRNAs (defined in Methods). The windows include the first 50 nt of the 5′ UTR, 50 nt upstream to 250 nt downstream of the start codon, 250 nt upstream and 250 nt downstream of the stop codon and the last 250 nt of the 3′ UTR. The absolute values of attribution scores were averaged across all valid mRNAs, which were grouped into one of four equally sized bins according to their mean TE. b, Metagene plot for the absolute value of attribution scores derived from the mouse RiboNN model, averaged across all mRNAs, for percentiles along the 5′ UTR, CDS and 3′ UTR. mRNAs were grouped into one of four equally sized bins according to their mean TE. c, Same as a, except it reflects results from the mouse RiboNN model. d,e, Enriched motifs learned by human (d) and mouse (e) RiboNN models for each functional region of mRNA. Motifs are ranked by the number of seqlets⁶⁷ supporting each motif.
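The grouping and averaging step described for panels a–c can be sketched as follows. This is a minimal illustration, assuming attribution scores have already been extracted for one fixed-width window per mRNA; the function name and the toy inputs are hypothetical and do not come from the study.

```python
def mean_abs_attribution_by_te_bin(mean_te, attributions, n_bins=4):
    """Average |attribution| per window position within equally sized TE bins.

    mean_te: one mean TE value per mRNA.
    attributions: one list of per-position scores per mRNA (same window width).
    Requires at least n_bins mRNAs so that no bin is empty.
    """
    # Rank mRNAs by mean TE, then slice the ranking into equally sized bins.
    order = sorted(range(len(mean_te)), key=lambda i: mean_te[i])
    width = len(attributions[0])
    profiles = []
    for b in range(n_bins):
        start = b * len(order) // n_bins
        stop = (b + 1) * len(order) // n_bins
        members = order[start:stop]
        profiles.append([
            sum(abs(attributions[i][p]) for i in members) / len(members)
            for p in range(width)
        ])
    return profiles  # profiles[0] is the lowest-TE bin

# Hypothetical toy data: four mRNAs, two window positions each.
profiles = mean_abs_attribution_by_te_bin(
    [0.1, 0.9, 0.5, 0.3],
    [[1, -1], [2, -2], [-3, 3], [4, -4]],
)
```

Taking absolute values before averaging means that positions with strong positive and strong negative effects both register as important rather than cancelling out.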

Extended Data Fig. 6 Amino acid-level-based correlation among codon influence scores.

a–c, Scatter plots showing the relationship between the amino acid-level-based codon influence (that is, the predicted effect size of each inserted codon, averaged across all positional bins and across codons for each amino acid) from the human RiboNN model and the mouse model (a), A-site ribosome occupancy scores⁶⁹ (b) and mean codon stability coefficients⁴⁹ (c). Pearson (r) and Spearman (ρ) correlation coefficients are also shown. The properties of amino acids are labeled by different colors for hydrophobicity and by different shapes for charge. The error bar represents the standard error across codons encoding the same amino acid. To compute amino acid-level scores, we computed the mean score among codons encoding the same amino acid. All 61 non-stop codons were included in the amino acid-based analysis.
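The amino acid-level aggregation described in this legend (mean score among synonymous codons, with the standard error across those codons) can be sketched as below. The function name, the codon scores and the partial codon table are hypothetical placeholders, not values from the study.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev

def amino_acid_level_scores(codon_scores, codon_to_aa):
    """Map each amino acid to (mean score, standard error) across its codons."""
    grouped = defaultdict(list)
    for codon, score in codon_scores.items():
        grouped[codon_to_aa[codon]].append(score)
    summary = {}
    for aa, scores in grouped.items():
        # The standard error is undefined when a single codon encodes the amino acid.
        se = stdev(scores) / sqrt(len(scores)) if len(scores) > 1 else None
        summary[aa] = (mean(scores), se)
    return summary

# Hypothetical scores for three codons (lysine has two codons, methionine one).
toy_scores = {"AAA": 1.0, "AAG": 3.0, "AUG": 0.5}
toy_table = {"AAA": "K", "AAG": "K", "AUG": "M"}
summary = amino_acid_level_scores(toy_scores, toy_table)
```

Amino acids encoded by a single codon (methionine and tryptophan in the standard genetic code) have no across-codon spread, so this sketch returns `None` for their standard error.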

Extended Data Fig. 7 In silico mutagenesis of disease-associated gene 5′ UTRs.

a–c, In silico mutagenesis of 5′ UTR regions of RDH12 (a), ENG (b) and FGF13 (c). Positions of wild-type uAUG are highlighted in purple at the top. The known disease-associated variants are boxed. Single-point mutations resulting in predicted TE differences are shown alongside annotations reflecting the corresponding gain or loss of TE.
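The single-point in silico mutagenesis described here can be sketched generically: every 5′ UTR position is substituted with each alternative nucleotide, and the mutant prediction is compared with the wild-type prediction. In this sketch `predict_te` stands in for any sequence-to-TE model, and `toy_predictor` is a deliberately simplistic, hypothetical scorer that only counts upstream AUGs. It is not RiboNN, and all names are assumptions for illustration.

```python
def find_uaugs(utr5):
    """0-based start positions of AUG triplets within a 5' UTR (RNA alphabet)."""
    return [i for i in range(len(utr5) - 2) if utr5[i:i + 3] == "AUG"]

def single_point_mutagenesis(utr5, rest, predict_te):
    """Predicted TE change for every single-nucleotide substitution in the 5' UTR.

    rest: the remainder of the transcript (CDS and 3' UTR), left unmodified.
    predict_te: callable mapping a full mRNA sequence to a predicted TE.
    """
    wild_type = predict_te(utr5 + rest)
    effects = {}
    for pos, ref in enumerate(utr5):
        for alt in "ACGU":
            if alt == ref:
                continue
            mutant = utr5[:pos] + alt + utr5[pos + 1:]
            # Positive values indicate a predicted gain of TE, negative a loss.
            effects[(pos, ref, alt)] = predict_te(mutant + rest) - wild_type
    return effects

def toy_predictor(utr5_len):
    """Hypothetical scorer: each upstream AUG lowers the score by one unit."""
    def predict(seq):
        return -float(len(find_uaugs(seq[:utr5_len])))
    return predict

utr5, rest = "GCAUGC", "AUGGCC"
effects = single_point_mutagenesis(utr5, rest, toy_predictor(len(utr5)))
```

In this toy case, any substitution that disrupts the upstream AUG at position 2 yields a positive predicted change, whereas substitutions elsewhere leave the score unchanged.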

Extended Data Fig. 8 In silico mutagenesis of disease-associated gene 5′ UTRs.

Continuation of results from Extended Data Fig. 7, except with BCL2L13.

Extended Data Fig. 9 In silico mutagenesis of cancer-associated gene 5′ UTRs.

a–d, In silico mutagenesis of 5′ UTR regions of ADAM32 (a), NUMA1 (b), COMT (c) and QARS (d). Positions of wild-type uAUG are highlighted in purple at the top. The known cancer-associated variants are boxed. Single-point mutations resulting in predicted TE differences are shown alongside annotations reflecting the corresponding gain or loss of TE.

Extended Data Fig. 10 In silico mutagenesis of cancer-associated gene 5′ UTRs.

Continuation of results from Extended Data Fig. 9, except with AKT3.

Supplementary information

Supplementary Information

Supplementary Discussion, Methods and Figs. 1–12.

Reporting Summary

Supplementary Table 1

Feature sizes, sequences, CV folds and TEs of human genes.

Supplementary Table 2

Feature sizes, sequences, CV folds and TEs of mouse genes.

Supplementary Table 3

Feature sizes, sequences, CV folds and TEs predicted by the human RiboNN models.

Supplementary Table 4

Feature sizes, sequences, CV folds and TEs predicted by the mouse RiboNN models.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zheng, D., Persyn, L., Wang, J. et al. Predicting the translation efficiency of messenger RNA in mammalian cells. Nat Biotechnol (2025). https://doi.org/10.1038/s41587-025-02712-x

Received: 13 November 2024
Accepted: 21 May 2025
Published: 25 July 2025
Version of record: 25 July 2025
DOI: https://doi.org/10.1038/s41587-025-02712-x

This article is cited by

Translation efficiency covariation identifies conserved coordination patterns across cell types
- Yue Liu
- Shilpa Rao
- Can Cenik
Nature Biotechnology (2025)