3LM: Bridging Arabic, STEM, and Code through Benchmarking

Boussaha, Basma El Amel; AlQadi, Leen; Farooq, Mugariya; Alsuwaidi, Shaikha; Campesan, Giulia; Alzubaidi, Ahmed; Alyafeai, Mohammed; Hacid, Hakim

arXiv:2507.15850 (cs)

[Submitted on 21 Jul 2025 (v1), last revised 22 Jul 2025 (this version, v2)]

Abstract: Arabic is one of the most widely spoken languages in the world, yet efforts to develop and evaluate Large Language Models (LLMs) for Arabic remain relatively limited. Most existing Arabic benchmarks focus on linguistic, cultural, or religious content, leaving a significant gap in domains such as STEM and code, which are increasingly relevant for real-world LLM applications. To help bridge this gap, we present 3LM, a suite of three benchmarks designed specifically for Arabic. The first is a set of STEM-related question-answer pairs, naturally sourced from Arabic textbooks and educational worksheets. The second consists of synthetically generated STEM questions, created using the same sources. The third benchmark focuses on code generation, built through a careful translation of two widely used code benchmarks, incorporating a human-in-the-loop process with several rounds of review to ensure high-quality and faithful translations. We release all three benchmarks publicly to support the growth of Arabic LLM research in these essential but underrepresented areas.

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2507.15850 [cs.CL]
	(or arXiv:2507.15850v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2507.15850

Submission history

From: Basma El Amel Boussaha
[v1] Mon, 21 Jul 2025 17:58:27 UTC (1,481 KB)
[v2] Tue, 22 Jul 2025 18:43:45 UTC (1,480 KB)
