
News & Views

Neuromorphic computing

Overcoming computational bottlenecks in large language models through analog in-memory computing

A recent study demonstrates the potential of an analog in-memory computing architecture for implementing large language models, improving computational efficiency in both time and energy while maintaining high accuracy.
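
As background, during autoregressive generation with a transformer, each newly generated token requires, for every attention head, two matrix–vector products against the cached key and value matrices: attention(q, K, V) = softmax(q Kᵀ / √d) V, where q is the query of the current token, K and V hold the keys and values of all previous tokens, and d is the head dimension. Because this key–value cache grows with sequence length, the cost of these products is dominated by moving K and V between memory and processor rather than by the arithmetic itself (refs 3,4). Analog in-memory computing instead carries out matrix–vector products inside the memory array that stores the operands (refs 5,6,8), which is how the new study reduces both latency and energy.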


Fig. 1: Accelerating the attention mechanism with analog in-memory computing.
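
To make this concrete, the minimal sketch below (Python/NumPy) is illustrative only and is not the implementation of Leroux et al.: the toy dimensions, the helper analog_mvm and its Gaussian noise model are all assumptions. It computes single-token attention as two matrix–vector products over the cached keys and values, the operation an analog crossbar evaluates directly where the data are stored, and compares the noisy result with the ideal digital one; the softmax is assumed to remain in the digital periphery.

import numpy as np

rng = np.random.default_rng(0)

def analog_mvm(matrix, vector, noise_std=0.02):
    # Matrix-vector product with additive Gaussian noise standing in for
    # analog crossbar read noise (a placeholder, not a device model).
    ideal = matrix @ vector
    return ideal + noise_std * np.abs(ideal).mean() * rng.standard_normal(ideal.shape)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Toy sizes: 16 cached tokens, head dimension 8.
seq_len, d_head = 16, 8
K = rng.standard_normal((seq_len, d_head))   # cached keys, resident in the array
V = rng.standard_normal((seq_len, d_head))   # cached values, resident in the array
q = rng.standard_normal(d_head)              # query of the token being generated

scores = analog_mvm(K, q) / np.sqrt(d_head)  # first in-memory product: q against all keys
weights = softmax(scores)                    # normalization, assumed digital
out = analog_mvm(V.T, weights)               # second in-memory product: weighted sum of values

ref = softmax(K @ q / np.sqrt(d_head)) @ V   # ideal digital reference
print("max deviation from digital attention:", float(np.abs(out - ref).max()))

In this picture K and V never leave the array, so the data movement that dominates attention on digital hardware (ref. 3) disappears, at the price of the analog noise modelled crudely by analog_mvm; this trade-off is why the accuracy result highlighted in the standfirst matters.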

References

  1. de Vries, A. Joule 7, 2191–2194 (2023).

  2. Leroux, N. et al. Nat. Comput. Sci. https://doi.org/10.1038/s43588-025-00854-1 (2025).

  3. Horowitz, M. Computing’s energy problem (and what we can do about it). In Proc. 2014 IEEE International Solid-State Circuits Conference 10–14 (IEEE, 2014).

  4. Liu, Z. et al. Scissorhands: exploiting the persistence of importance hypothesis for LLM KV cache compression at test time. In Proc. Advances in Neural Information Processing Systems (eds Oh, A. et al.) 52342–52364 (NeurIPS, 2023).

  5. Wan, W. et al. Nature 608, 504–512 (2022).

  6. Yao, P. et al. Nature 577, 641–646 (2020).

  7. Lin, Y. et al. Nat. Comput. Sci. 5, 27–36 (2025).

  8. Yang, X., Yan, B., Li, H. & Chen, Y. ReTransformer: ReRAM-based processing-in-memory architecture for transformer acceleration. In Proc. 39th International Conference on Computer-Aided Design (Association for Computing Machinery, 2020).

  9. Yang, H. et al. Monolithic 3D integration of analog RRAM-based fully weight stationary and novel CFET 2T0C-based partially weight stationary for accelerating transformer. In Proc. 2024 IEEE Symposium on VLSI Technology and Circuits (IEEE, 2024).

  10. Sridharan, S., Stevens, J. R., Roy, K. & Raghunathan, A. IEEE Trans. Very Large Scale Integr. VLSI Syst. 31, 1223–1233 (2023).


Author information

Corresponding author

Correspondence to Jianshi Tang.

Ethics declarations

Competing interests

The authors declare no competing interests.

About this article

Cite this article

Lin, Y. & Tang, J. Overcoming computational bottlenecks in large language models through analog in-memory computing. Nat. Comput. Sci. 5, 711–712 (2025). https://doi.org/10.1038/s43588-025-00860-3

