Shifting the Lens: Detecting Malware in npm Ecosystem with Large Language Models

Zahan, Nusrat; Burckhardt, Philipp; Lysenko, Mikola; Aboukhadijeh, Feross; Williams, Laurie

arXiv:2403.12196v1 (cs)

[Submitted on 18 Mar 2024 (this version), latest version 6 Jan 2025 (v4)]

Abstract: The Gartner 2022 report predicts that 45% of organizations worldwide will encounter software supply chain attacks by 2025, highlighting the urgency of improving software supply chain security for community and national interests. Current malware detection techniques aid the manual review process by filtering benign and malicious packages, yet such techniques have high false-positive rates and limited automation support. Malware detection could therefore benefit from more advanced, automated approaches that produce accurate results with few false positives. The goal of this study is to assist security analysts in identifying malicious packages through an empirical study of large language models (LLMs) for detecting potential malware in the npm ecosystem.
We present SocketAI Scanner, a multi-stage decision-maker malware detection workflow that uses iterative self-refinement and zero-shot role-play Chain of Thought (CoT) prompting techniques for ChatGPT. We studied 5,115 npm packages (of which 2,180 are malicious) and performed a baseline comparison of the GPT-3 and GPT-4 models with a static analysis tool. Our findings showed promising results for GPT models with low misclassification alert rates. Our baseline comparison demonstrates a notable improvement over static analysis: more than 25% in precision and more than 15% in F1 score. We attained precision and F1 scores of 91% and 94%, respectively, for the GPT-3 model. Overall, GPT-4 demonstrates superior performance in precision (99%) and F1 (97%) scores, while GPT-3 presents a cost-effective balance between performance and expenditure.
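
The abstract reports precision and F1 but not recall. As a reading aid, the following minimal sketch shows how the three metrics relate and how an approximate recall can be backed out of the two reported figures. The function names are hypothetical and not from the paper, and because the inputs are the rounded percentages quoted above, the derived recall values are only approximate.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Standard definitions from confusion-matrix counts.

    Assumes tp + fp > 0 and tp + fn > 0 (no zero-division handling).
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def implied_recall(precision: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for R, giving R = F1 * P / (2P - F1).

    Assumes 2 * precision > f1, which holds for any valid (P, F1) pair.
    """
    return f1 * precision / (2 * precision - f1)


if __name__ == "__main__":
    # Rounded precision and F1 values as quoted in the abstract.
    reported = [("GPT-3", 0.91, 0.94), ("GPT-4", 0.99, 0.97)]
    for model, p, f1 in reported:
        print(f"{model}: implied recall is about {implied_recall(p, f1):.2f}")
```

Under these rounded inputs the implied recall comes out at roughly 0.97 for GPT-3 and 0.95 for GPT-4, which suggests that GPT-4's advantage lies mainly in precision, that is, in raising fewer false alerts.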

Comments:	13 pages, 1 Figure, 7 tables
Subjects:	Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2403.12196 [cs.CR]
	(or arXiv:2403.12196v1 [cs.CR] for this version)
	https://doi.org/10.48550/arXiv.2403.12196

Submission history

From: Nusrat Zahan
[v1] Mon, 18 Mar 2024 19:10:12 UTC (175 KB)
[v2] Fri, 9 Aug 2024 16:29:44 UTC (381 KB)
[v3] Fri, 13 Dec 2024 04:41:50 UTC (280 KB)
[v4] Mon, 6 Jan 2025 16:29:32 UTC (280 KB)
